Serialization Exception #134

Open
@artemiusgreat

Hi.

Not sure what happened, but something happened :)

  1. I updated all projects from .NET Core 3.0 to 3.1
  2. Made some changes in MySQL DB

Now I get various errors from GenericXmlDataContractSerializer. Surprisingly, the exception only happens when I build and run the project the second time; after the first build it works fine. I mention the DB because I serialize the trained model to a MemoryStream and save it as a byte[] in a MySQL column of type LONGBLOB. I also use models from ML.NET and export / import them from the DB the same way, and they work fine, so the DB is probably not the issue. All projects in the solution are built as x64. The serializer fails on any model, either RandomForest or AdaBoost, with the same exception.

The issue

  1. build the project and start debugging
  2. create and train a model, then save it to the DB as a byte array using the GetPredictor method below
  3. select the byte array from the DB, deserialize it to a model, provide test data and get an estimate - OK
  4. stop debugging and repeat steps 1-3; now the prediction method fails with the exception below - NOT OK
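
To rule out the DB in step 3, one option is to compare the bytes produced in GetPredictor with the bytes read back before GetEstimate. This is only a minimal sketch using the base class library; BlobCheck, Fingerprint and SameBlob are hypothetical names and are not part of SharpLearning or ML.NET.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class BlobCheck
{
    // Returns a hyphen-separated hex SHA-256 digest of the serialized model bytes.
    public static string Fingerprint(byte[] bytes)
    {
        using (var sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(bytes));
        }
    }

    // True when the bytes read back from the DB are identical to what was written.
    public static bool SameBlob(byte[] written, byte[] readBack)
    {
        return written.Length == readBack.Length && written.SequenceEqual(readBack);
    }
}
```

Logging Fingerprint(memoryStream.ToArray()) when saving and Fingerprint(predictor.ToArray()) when loading should give the same string on both runs if the LONGBLOB round trip leaves the data untouched.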

The question

Does anybody know why the serializer might fail with this exception? Also, can I serialize a trained model to a MemoryStream using a different serializer, without GenericXmlDataContractSerializer?

Most common exception

System.Runtime.Serialization.SerializationException: Element 'http://schemas.datacontract.org/2004/07/Core.Learners.SharpLearning.EngineSpace:Model' contains data from a type that maps to the name 'SharpLearning.RandomForest.Models:ClassificationForestModel'. The deserializer has no knowledge of any type that maps to this name. Consider changing the implementation of the ResolveName method on your DataContractResolver to return a non-null value for name 'ClassificationForestModel' and namespace 'SharpLearning.RandomForest.Models' 
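
For reference, the message points at DataContractResolver.ResolveName. The sketch below shows what a resolver that returns a non-null type for such a name could look like. It is an assumption-laden illustration, not SharpLearning's own resolver: LoadedAssemblyResolver is a hypothetical name, and it assumes the XML namespace and name are simply the CLR namespace and type name, as they appear in the message above.

```csharp
using System;
using System.Linq;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical resolver, for illustration only.
public sealed class LoadedAssemblyResolver : DataContractResolver
{
    // Called during deserialization: map (namespace, name) back to a CLR type.
    public override Type ResolveName(string typeName, string typeNamespace,
        Type declaredType, DataContractResolver knownTypeResolver)
    {
        // Let the built-in known-type resolver try first.
        var known = knownTypeResolver?.ResolveName(typeName, typeNamespace, declaredType, null);
        if (known != null) return known;

        var fullName = string.IsNullOrEmpty(typeNamespace)
            ? typeName
            : typeNamespace + "." + typeName;

        // Only assemblies already loaded into the AppDomain are searched,
        // so a type in a not-yet-loaded assembly resolves to null.
        return AppDomain.CurrentDomain.GetAssemblies()
            .Select(a => a.GetType(fullName, false))
            .FirstOrDefault(t => t != null);
    }

    // Called during serialization: map a CLR type to (namespace, name).
    public override bool TryResolveType(Type type, Type declaredType,
        DataContractResolver knownTypeResolver,
        out XmlDictionaryString typeName, out XmlDictionaryString typeNamespace)
    {
        if (knownTypeResolver != null &&
            knownTypeResolver.TryResolveType(type, declaredType, null, out typeName, out typeNamespace))
        {
            return true;
        }

        var dictionary = new XmlDictionary();
        typeName = dictionary.Add(type.Name);
        typeNamespace = dictionary.Add(type.Namespace ?? string.Empty);
        return true;
    }
}
```

Such a resolver would be passed through DataContractSerializerSettings.DataContractResolver. Note that this sketch does not handle generic or nested types, and its lookup can only find types whose assembly has already been loaded at the time of deserialization.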

After updating all NuGet packages, I got a different exception, but only once

Invalid XML at line 1 or something like that

Serializing the trained model to a byte array to save it to the DB

public virtual ResponseModel<byte> GetPredictor(IDictionary<int, string> columns, IDataView inputs)
{
  var responseModel = new ResponseModel<byte>();

  using (var memoryStream = new MemoryStream())
  {
    var processor = GetInput(columns, inputs, nameof(PredictorLabelsEnum.Emotion));
    var learner = new ClassificationRandomForestLearner();
    var serializer = new GenericXmlDataContractSerializer();
    var container = new MapModel<int, string>
    {
      Map = processor.Map,
      Model = learner.Learn(processor.Input.Observations, processor.Input.Targets)
    };
    
    serializer.Serialize(container, () => new StreamWriter(memoryStream));
    responseModel.Items = memoryStream.ToArray().ToList();
  }

  return responseModel;
}

Deserializing the model from the DB bytes and getting a prediction

public virtual ResponseModel<string> GetEstimate(IEnumerable<byte> predictor, IDictionary<int, string> columns, IDataView inputs)
{
  var responseModel = new ResponseModel<string>();

  using (var memoryStream = new MemoryStream(predictor.ToArray()))
  {
    var processor = GetInput(columns, inputs);
    var serializer = new GenericXmlDataContractSerializer();
    var model = serializer.Deserialize<MapModel<int, string>>(() => new StreamReader(memoryStream));
    var predictions = model.Predict(processor.Input.Observations);

    responseModel.Items.Add(predictions.OrderByDescending(o => o.Key).First().Value);
  }

  return responseModel;
}

The GetInput method in the code above just converts from the IDataView format in ML.NET to the ObservationSet format in SharpLearning. MapModel is a wrapper that stores text labels alongside the numeric ones.
