
Meeting Notes June 16 2018

srinivasannambi edited this page Jun 24, 2018 · 1 revision

Where we are today and what our next steps are:

For now, we’ll continue to focus on the project goals so we can get outcomes for the client.

  • Change needed in the leaderboard:
    • RMSE depends on whether or not we scale the target. If one model works on a range of 0 to 1 and another on 0 to 100, RMSE will be much larger on the 0 to 100 range.
    • Can we scale them back before comparing? R2 should still apply, so we should be able to compare on that, but raw errors should not be compared across scales. RMSE needs to be positive. A negative RMSE is pretty much garbage as an error value, but it is an artifact of the scikit-learn implementation (its error-based scorers are negated so that higher is better), so a more negative value indicates a lower-performing model.
  • We are still looking for new models and new analysis.
  • While we are starting to adapt to the pipeline, we can keep making modifications so that it works for everyone. This is a continuous process. We need to ensure our implementations work in the class structure of the pipeline.
  • Next few weeks – the implementations will begin to mature.
  • Aside from improving the pipeline implementation, let’s establish a benchmark each week and make necessary modifications to improve the benchmark.
  • Our next goal: we should be able to circle back in 2 weeks and say, "here is our level of performance," and it should hopefully be a decent improvement. Let's target an R2 of 0.25. That would be a pretty good jump from the 0.19 we have currently. Not working so hard to get 1 increase in R2 (?)
  • Classification: Roopa started with classification. There are only 3 true labels, so for now let's continue with the regressors. It makes more sense to do the regression, as we are determining the probability of being potent; having a 1 or 0 is not going to be more helpful (?). So let's keep tackling the regression, and at another point we may decide to go back to classification.
  • What we are defining in the modeling section. Keep the feature importance intact. (?)
  • We are not going to make much progress unless we have more data. Did we hear back from Dr. : Remind Mike | Action for Chris
  • They are interested in using the Selleck compounds, as they have purchased them from a third party as potential compounds. The worst thing we can have is missing columns in the data, which would limit our features further (?)
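
To make the leaderboard point about scaling concrete, here is a minimal sketch in plain Python. The target and prediction values are made up for illustration and are not from the project data, and the helper names `rmse` and `r2` are our own. It shows that RMSE grows with the scale of the target, that R2 stays the same, and how a negated error score (the convention scikit-learn uses for scorers such as `neg_mean_squared_error`) should be read.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error; always non-negative."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values on a 0-to-1 scale (illustrative only).
y_true_01 = [0.10, 0.40, 0.35, 0.80, 0.60]
y_pred_01 = [0.20, 0.30, 0.40, 0.70, 0.65]

# The same targets and predictions expressed on a 0-to-100 scale.
y_true_100 = [100 * v for v in y_true_01]
y_pred_100 = [100 * v for v in y_pred_01]

rmse_01 = rmse(y_true_01, y_pred_01)
rmse_100 = rmse(y_true_100, y_pred_100)
r2_01 = r2(y_true_01, y_pred_01)
r2_100 = r2(y_true_100, y_pred_100)

print("RMSE on 0-1 scale:  ", rmse_01)
print("RMSE on 0-100 scale:", rmse_100)  # 100 times larger
print("R2 on 0-1 scale:    ", r2_01)
print("R2 on 0-100 scale:  ", r2_100)    # unchanged by rescaling

# A negated error score: higher (closer to zero) is better,
# so a more negative value means a worse model.
worse_pred_01 = [0.5] * len(y_true_01)  # hypothetical constant-prediction baseline
neg_rmse_model = -rmse(y_true_01, y_pred_01)
neg_rmse_baseline = -rmse(y_true_01, worse_pred_01)
print("Negated RMSE, model:   ", neg_rmse_model)
print("Negated RMSE, baseline:", neg_rmse_baseline)
```

So for the leaderboard, either every entry reports errors on the same target scale (scaling predictions back before computing RMSE), or the entries are compared on R2 only. If a score comes back negative from a scikit-learn scorer, flipping the sign before recording it keeps the RMSE column positive.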