In step 1, tables are now rendered in extended (Pandoc) Markdown, so Pandoc can be used to produce well-formatted PDF or HTML output.
We are now converting from LaTeX to Markdown.
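As a minimal sketch of what rendering a table in Pandoc Markdown could look like, the example below uses `knitr::kable` with `format = "pandoc"`. The built-in `iris` data set and the caption are placeholders chosen for illustration; the project's own step 1 code may work differently.

```r
library(knitr)

# Placeholder data: the first rows of the built-in iris data set.
tbl_data <- head(iris)

# Render the data frame as a Pandoc Markdown table with a caption.
md_table <- kable(tbl_data, format = "pandoc",
                  caption = "First rows of iris")

# Printing shows the Markdown source that Pandoc would convert.
print(md_table)
```

Assuming the knitted output is saved as `report.md` (a hypothetical file name), it could then be converted with `pandoc report.md -o report.html`.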
R scripts for data mining.
This project aims to implement the CRISP-DM data mining cycle efficiently. Initially we focus on supervised models that predict a dichotomous (two-class) target.
The project combines R scripts with knitr in RStudio so that we can walk through the CRISP-DM phases and deliver a scoring model as efficiently as possible. At the same time, we want to document our steps in a report that is polished enough to serve as a reference for the work that has been done.
From Wikipedia: CRISP-DM breaks the process of data mining into six major phases:
- Business Understanding
- Data Understanding
- Data Preparation
- Modelling
- Evaluation
- Deployment
In this project we focus on automating four of these phases: Data Understanding, Data Preparation, Modelling, and Evaluation.
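The four automated phases can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data are synthetic, logistic regression stands in for whatever model the project actually uses, and the 70/30 split and the 0.5 cut-off are arbitrary choices.

```r
set.seed(42)

# Synthetic data with a dichotomous target `y` (illustration only).
n <- 400
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, size = 1,
                prob = plogis(0.5 + 1.2 * dat$x1 - 0.8 * dat$x2))

# Data Understanding: inspect the structure and the class balance.
str(dat)
print(prop.table(table(dat$y)))

# Data Preparation: hold out 30% of the rows for evaluation.
test_idx <- sample(seq_len(n), size = round(0.3 * n))
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Modelling: logistic regression as a simple scoring model.
fit <- glm(y ~ x1 + x2, data = train, family = binomial)

# Evaluation: score the hold-out rows and compare with the true class.
test$score <- predict(fit, newdata = test, type = "response")
test$pred  <- as.integer(test$score > 0.5)
confusion  <- table(actual = test$y, predicted = test$pred)
accuracy   <- mean(test$pred == test$y)
print(confusion)
print(accuracy)
```

Placing each phase in its own knitr chunk would let the same script both produce the scoring model and document the steps in the report.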