Description
Hi.
Not sure what happened, but something happened :)
- I updated all projects from .NET Core 3.0 to 3.1
- Made some changes in MySQL DB
Now I get various errors from GenericXmlDataContractSerializer. Surprisingly, the exception happens only when I build the project a second time; after the first build it works fine. I mention the DB because I serialize the trained model using a MemoryStream and save it as a byte[] to a MySQL column of type LongBlob. I also use models from ML.NET and export/import them from the DB the same way, and they work fine, so the DB is probably not the issue. All projects in the solution are built as x64. The serializer fails on any model, either RandomForest or AdaBoost, with the same exception.
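For context, the byte[] is written to and read back from the LongBlob column roughly like the simplified sketch below (not my exact code; the table and column names are made up, and it uses MySqlConnector):

using MySqlConnector;

// Simplified sketch, not the exact production code; "predictors" and "model"
// are placeholder table/column names for a LONGBLOB column.
public static class ModelStore
{
    public static void SaveModelBytes(string connectionString, int modelId, byte[] modelBytes)
    {
        using var connection = new MySqlConnection(connectionString);
        connection.Open();

        using var command = connection.CreateCommand();
        command.CommandText =
            "INSERT INTO predictors (id, model) VALUES (@id, @model) " +
            "ON DUPLICATE KEY UPDATE model = @model";
        command.Parameters.AddWithValue("@id", modelId);
        command.Parameters.AddWithValue("@model", modelBytes); // byte[] maps to LONGBLOB
        command.ExecuteNonQuery();
    }

    public static byte[] LoadModelBytes(string connectionString, int modelId)
    {
        using var connection = new MySqlConnection(connectionString);
        connection.Open();

        using var command = connection.CreateCommand();
        command.CommandText = "SELECT model FROM predictors WHERE id = @id";
        command.Parameters.AddWithValue("@id", modelId);
        return (byte[])command.ExecuteScalar();
    }
}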
The issue
1. Build the project and start debugging.
2. Create and train a model, then save it to the DB as a byte array using the GetPredictor method below.
3. Select the byte array from the DB, deserialize it to a model, provide test data, and get an estimate - OK.
4. Stop debugging, then repeat steps 1-3; now the prediction method fails with the exception below - NOT OK.
The question
Maybe somebody knows what could cause the serializer to fail with this exception? Also, can I serialize the trained model to a MemoryStream with a different serializer, i.e. without GenericXmlDataContractSerializer?
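For the second part, would something like the plain BCL DataContractSerializer, with the concrete model type passed as a known type, be a reasonable replacement? This is an untested sketch; it assumes MapModel and the SharpLearning model types round-trip through DataContractSerializer, which I haven't verified:

using System.IO;
using System.Runtime.Serialization;
using SharpLearning.RandomForest.Models;

// Untested sketch: serialize/deserialize the wrapper with the BCL
// DataContractSerializer, declaring the concrete model type as a known type.
public static class ModelSerialization
{
    private static DataContractSerializer CreateSerializer() =>
        new DataContractSerializer(
            typeof(MapModel<int, string>),
            new[] { typeof(ClassificationForestModel) });

    public static byte[] Serialize(MapModel<int, string> container)
    {
        using var memoryStream = new MemoryStream();
        CreateSerializer().WriteObject(memoryStream, container);
        return memoryStream.ToArray();
    }

    public static MapModel<int, string> Deserialize(byte[] bytes)
    {
        using var memoryStream = new MemoryStream(bytes);
        return (MapModel<int, string>)CreateSerializer().ReadObject(memoryStream);
    }
}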
Most common exception
System.Runtime.Serialization.SerializationException: Element 'http://schemas.datacontract.org/2004/07/Core.Learners.SharpLearning.EngineSpace:Model' contains data from a type that maps to the name 'SharpLearning.RandomForest.Models:ClassificationForestModel'. The deserializer has no knowledge of any type that maps to this name. Consider changing the implementation of the ResolveName method on your DataContractResolver to return a non-null value for name 'ClassificationForestModel' and namespace 'SharpLearning.RandomForest.Models'
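Regarding the ResolveName hint in the message: is a custom resolver along these lines what it is asking for? This is only a sketch against the plain DataContractSerializer API; I don't know whether GenericXmlDataContractSerializer accepts a custom DataContractResolver at all:

using System;
using System.Runtime.Serialization;
using System.Xml;
using SharpLearning.RandomForest.Models;

// Sketch of a resolver that maps the failing name/namespace pair from the
// exception back to the concrete model type.
public class ModelTypeResolver : DataContractResolver
{
    public override Type ResolveName(string typeName, string typeNamespace,
        Type declaredType, DataContractResolver knownTypeResolver)
    {
        if (typeName == "ClassificationForestModel" &&
            typeNamespace == "SharpLearning.RandomForest.Models")
        {
            return typeof(ClassificationForestModel);
        }

        // Fall back to the default resolution.
        return knownTypeResolver.ResolveName(typeName, typeNamespace, declaredType, null);
    }

    public override bool TryResolveType(Type type, Type declaredType,
        DataContractResolver knownTypeResolver,
        out XmlDictionaryString typeName, out XmlDictionaryString typeNamespace)
    {
        if (type == typeof(ClassificationForestModel))
        {
            var dictionary = new XmlDictionary();
            typeName = dictionary.Add("ClassificationForestModel");
            typeNamespace = dictionary.Add("SharpLearning.RandomForest.Models");
            return true;
        }

        return knownTypeResolver.TryResolveType(type, declaredType, null,
            out typeName, out typeNamespace);
    }
}

With the plain serializer it would be plugged in via new DataContractSerializer(typeof(MapModel<int, string>), new DataContractSerializerSettings { DataContractResolver = new ModelTypeResolver() }).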
After updating all NuGet packages, I got another exception, but only once:
Invalid XML at line 1 (or something like that).
Serializing the trained model to a byte array and saving it to the DB
public virtual ResponseModel<byte> GetPredictor(IDictionary<int, string> columns, IDataView inputs)
{
    var responseModel = new ResponseModel<byte>();

    using (var memoryStream = new MemoryStream())
    {
        // Convert the ML.NET IDataView inputs to SharpLearning observations and targets.
        var processor = GetInput(columns, inputs, nameof(PredictorLabelsEnum.Emotion));
        var learner = new ClassificationRandomForestLearner();
        var serializer = new GenericXmlDataContractSerializer();

        // Wrap the trained model together with the label map.
        var container = new MapModel<int, string>
        {
            Map = processor.Map,
            Model = learner.Learn(processor.Input.Observations, processor.Input.Targets)
        };

        // Serialize the wrapper into the memory stream and expose it as a byte list.
        serializer.Serialize(container, () => new StreamWriter(memoryStream));
        responseModel.Items = memoryStream.ToArray().ToList();
    }

    return responseModel;
}
Deserializing the model from the DB byte array and getting a prediction
public virtual ResponseModel<string> GetEstimate(IEnumerable<byte> predictor, IDictionary<int, string> columns, IDataView inputs)
{
    var responseModel = new ResponseModel<string>();

    using (var memoryStream = new MemoryStream(predictor.ToArray()))
    {
        // Convert the ML.NET IDataView inputs to SharpLearning observations.
        var processor = GetInput(columns, inputs);
        var serializer = new GenericXmlDataContractSerializer();

        // Deserialize the wrapper holding the trained model and the label map.
        var model = serializer.Deserialize<MapModel<int, string>>(() => new StreamReader(memoryStream));

        // Predict and return the label of the highest-keyed prediction.
        var predictions = model.Predict(processor.Input.Observations);
        responseModel.Items.Add(predictions.OrderByDescending(o => o.Key).First().Value);
    }

    return responseModel;
}
The GetInput method in the code above is just a conversion from the IDataView format in ML.NET to the ObservationSet format in SharpLearning. MapModel is a wrapper that allows saving text labels along with the numeric ones.
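One thing I considered is decorating the wrapper with explicit data-contract attributes instead of relying on the serializer's name resolution. This is only a sketch of that idea, not how MapModel currently looks, and the Predict wrapper is left out:

using System.Collections.Generic;
using System.Runtime.Serialization;
using SharpLearning.RandomForest.Models;

// Sketch only: declare the concrete model type up front via [KnownType],
// so the deserializer does not have to resolve it by name.
[DataContract]
[KnownType(typeof(ClassificationForestModel))]
public class MapModel<TKey, TValue>
{
    // Text labels keyed by their numeric encoding.
    [DataMember]
    public IDictionary<TKey, TValue> Map { get; set; }

    // The trained SharpLearning model; stored as object, so the concrete
    // type must be known to the serializer at deserialization time.
    [DataMember]
    public object Model { get; set; }
}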