Minutes Data Working Group 30 Jul 2020
Brad edited this page Jul 31, 2020 · 2 revisions
- Review MONAI GitHub wiki for minutes and notes
- Review previous meeting minutes
- Review discussions/presentations from joint working group and steering committee
- Plan next steps
- The joint effort between the Data workgroup and the Evaluation, Reproducibility & Benchmarking workgroup is to model data samples that come from challenges and papers
- E.g., review the surgical data examples from papers referenced in previous minutes page
- Data workgroup should create a prototype of structure and schema in FHIR
- (Brad) Synthesize a FHIR resource based on the papers
- MONAI should evaluate which Python FHIR library can effectively convert "FHIR to Tensor"
- (Brad) Look into Python libraries and share
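As a starting point for the "FHIR to Tensor" discussion, here is a minimal sketch that uses only the Python standard library rather than any specific FHIR library. The bundle is hand-written for illustration and is not taken from the referenced papers; the feature layout (`gender code, age, tumor volume`), the observation name `tumor_volume_ml`, and the helper `bundle_to_rows` are all assumptions.

```python
import json

# Hypothetical FHIR-style bundle; contents are illustrative only.
BUNDLE_JSON = """
{
  "resourceType": "Bundle",
  "type": "collection",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "p1",
                  "gender": "female", "birthDate": "1970-05-01"}},
    {"resource": {"resourceType": "Observation", "id": "o1",
                  "subject": {"reference": "Patient/p1"},
                  "code": {"text": "tumor_volume_ml"},
                  "valueQuantity": {"value": 12.5, "unit": "mL"}}}
  ]
}
"""

# FHIR administrative gender codes mapped to numbers (the mapping is an assumption).
GENDER_CODES = {"female": 0.0, "male": 1.0, "other": 2.0, "unknown": 3.0}


def bundle_to_rows(bundle, reference_year=2020):
    """Flatten Patient and Observation resources into one numeric row per patient."""
    patients, volumes = {}, {}
    for entry in bundle.get("entry", []):
        resource = entry["resource"]
        if resource["resourceType"] == "Patient":
            birth_year = int(resource["birthDate"][:4])
            age = float(reference_year - birth_year)
            patients[resource["id"]] = [GENDER_CODES[resource["gender"]], age]
        elif resource["resourceType"] == "Observation":
            patient_id = resource["subject"]["reference"].split("/")[-1]
            volumes[patient_id] = float(resource["valueQuantity"]["value"])
    # Missing observations become NaN so every row has the same width.
    return [patients[pid] + [volumes.get(pid, float("nan"))] for pid in sorted(patients)]


rows = bundle_to_rows(json.loads(BUNDLE_JSON))
print(rows)
```

The nested list `rows` has a fixed width per patient, so it could be handed to a tensor constructor in whichever framework is chosen; a real FHIR library would replace the manual dictionary access.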
- MLflow
- Explore MLflow as a potential model lifecycle management tool; details from the website (https://mlflow.org/) include:
- MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
- MLflow Tracking: Record and query experiments: code, data, config, and results
- MLflow Projects: Package data science code in a format to reproduce runs on any platform
- MLflow Models: Deploy machine learning models in diverse serving environments
- Model Registry: Store, annotate, discover, and manage models in a central repository
- Question: How does MLflow intersect with MONAI?
- Question: Does MONAI plug into MLflow, and how does it integrate with the ecosystem?
- Which group should explore this? It might be something for the reproducibility group
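One possible answer to the integration question, sketched under assumptions: a training workflow could report to MLflow Tracking through the `log_params` and `log_metrics` calls that the `mlflow` module exposes. The function `report_training`, the stand-in class `InMemoryTracker`, and the parameter and metric names are hypothetical, and the metric values are placeholders rather than real results.

```python
def report_training(tracker, params, val_metrics_per_epoch):
    """Log hyperparameters once, then one dict of validation metrics per epoch."""
    tracker.log_params(params)
    for epoch, metrics in enumerate(val_metrics_per_epoch):
        tracker.log_metrics(metrics, step=epoch)


class InMemoryTracker:
    """Stand-in with the same two methods, so the sketch runs without MLflow."""

    def __init__(self):
        self.params = {}
        self.metrics = []

    def log_params(self, params):
        self.params.update(params)

    def log_metrics(self, metrics, step=None):
        self.metrics.append((step, dict(metrics)))


tracker = InMemoryTracker()
report_training(
    tracker,
    params={"val_frac": 0.2, "seed": 0},               # placeholder settings
    val_metrics_per_epoch=[{"val_dice": 0.71}, {"val_dice": 0.78}],  # placeholder values
)
print(tracker.params, tracker.metrics)
```

With MLflow installed, the `mlflow` module itself could be passed as `tracker` inside a `with mlflow.start_run():` block, since it provides `log_params` and `log_metrics` with compatible arguments. Whether MONAI should own such a hook is the open question above.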
- Integrations and Partners
- What about H2O.ai? H2O.ai has AutoML and reproducibility tooling
- Should there be a "partners" ad-hoc workgroup for MONAI to look at integrations within the broader community (e.g., what about AWS)?
- Feedback from engineering
- Dev team should look at slide 7 of the joint working group content
- Should look at cross-validation: how to stratify the data and repeat the training workflow
- Validation here means reporting the results of the trained model
- E.g., repeat validation 5 times to get an unbiased validation of model quality
- MONAI 0.2 currently supports only a validation fraction of the data, which is the only parameter available for now; this needs to be expanded so users can generate a fixed set (not just via a random seed)
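To make the cross-validation request concrete, here is a minimal standard-library sketch of stratified 5-fold splitting. It is not MONAI's API; the function names `stratified_folds` and `cross_validation_splits` and the toy labels are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, num_folds=5, seed=0):
    """Assign sample indices to folds so each class is spread across all folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(num_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % num_folds].append(index)
    return [sorted(fold) for fold in folds]


def cross_validation_splits(labels, num_folds=5, seed=0):
    """Yield (train_indices, val_indices), holding out each fold once."""
    folds = stratified_folds(labels, num_folds, seed)
    for held_out in range(num_folds):
        val = folds[held_out]
        train = sorted(i for k, fold in enumerate(folds) if k != held_out for i in fold)
        yield train, val


# Toy labels: 10 samples per class (illustrative only).
labels = ["tumor"] * 10 + ["healthy"] * 10
splits = list(cross_validation_splits(labels, num_folds=5, seed=0))
for train, val in splits:
    print(len(train), val)
```

Because the folds are plain index lists, they could be saved to a file and reloaded, which would give users a fixed set that does not depend on re-running with the same seed.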
- MSD (Medical Segmentation Decathlon): only the JSON file originally provided by the challenge provider is used
- Look at the proposal for a FHIR specification; e.g., a converter is needed to take MSD to a FHIR format (could be a utility library)
- Can the underlying data representation be normalized? Need to run an experiment
- Evaluation: MSD is read-only; how do you filter studies based on predictive match?
- Search terms which would subset the data into a virtual collection (e.g., by patient age range)
- What about holdout data? How can this be done consistently?
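The two questions above (virtual collections over read-only data, and consistent holdout) can be sketched as follows. This is one possible approach, not a decided design: the record fields `study_id` and `patient_age`, the helper names, and the hash-based holdout rule are all assumptions.

```python
import hashlib

# Hypothetical study records; field names and values are illustrative.
STUDIES = [
    {"study_id": "s1", "patient_age": 34},
    {"study_id": "s2", "patient_age": 47},
    {"study_id": "s3", "patient_age": 58},
    {"study_id": "s4", "patient_age": 71},
]


def virtual_collection(studies, min_age, max_age):
    """Return IDs of matching studies without copying or modifying the source data."""
    return [s["study_id"] for s in studies if min_age <= s["patient_age"] <= max_age]


def is_holdout(study_id, holdout_percent=20):
    """Assign a study to the holdout set from a hash of its ID.

    The same ID always gets the same answer, regardless of data order,
    random seed, or which other studies are present.
    """
    digest = hashlib.sha256(study_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < holdout_percent


subset = virtual_collection(STUDIES, min_age=40, max_age=60)
holdout_ids = [s["study_id"] for s in STUDIES if is_holdout(s["study_id"])]
print(subset, holdout_ids)
```

The virtual collection is only a list of identifiers, so the underlying dataset stays untouched. Hashing the identifier is one way to keep holdout membership stable as data is added; a stored list of holdout IDs would be an alternative.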
- (Brad) Synthesize a FHIR resource based on the papers
- (Brad) Look into Python libraries and share
- (Brad) Share PowerPoint slides and create GitHub issues to represent directions to grow