camclark/inSOLVEncy_2018_GovHack

inSOLVEncy_2018_GovHack

Project Description

Using machine learning to identify which individuals are at risk of non-compliance in personal insolvency, by building a compliance risk model and visualising the results.

Our GovHack Project Page

Watch our Project Summary Video

Bounty Winner: Best use of Gold Coast Data

Data Story

Our project inSOLVEnt takes a multifaceted approach to a multifaceted problem: it creates not only a risk model for non-compliance in personal insolvency, but also visualisations and infographics that address the common factors leading to negative insolvency outcomes.

We used the non-compliance personal insolvency data to first separate cases of non-compliance from cases of compliance. We then enriched this data with other data sources such as Regional Statistics, the ATO GovHack 2018 statistics, and ANZSCO occupation and regional classifications.

Once we had a clean data set, we ran it through TensorFlow to identify a model. We tried neural networks first, but they overfitted and did not generalise in our tests. We decided to simplify by using linear regression, which worked well: out of 250,000 records we misidentified 5, an error rate of 0.002%. When training our model, we first separated all compliance records from all non-compliance records. Each group was then randomly split into 80% training and 20% validation data. As non-compliance events were the minority, this method ensured that the minority class was represented proportionally in both the training and validation subsets.
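The per-class 80/20 split described above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual code: the function name `split_per_class`, the `non_compliant` field, and the synthetic records are all assumptions made for the example.

```python
import random


def split_per_class(records, label_of, train_fraction=0.8, seed=0):
    """Split each class separately so both subsets keep the same class mix.

    `label_of` is a hypothetical callback returning the class of a record.
    """
    rng = random.Random(seed)

    # Group records by class (for example, compliant vs non-compliant).
    by_class = {}
    for record in records:
        by_class.setdefault(label_of(record), []).append(record)

    train, validation = [], []
    for group in by_class.values():
        shuffled = group[:]          # copy so the input is left untouched
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        validation.extend(shuffled[cut:])

    # Mix the classes together again inside each subset.
    rng.shuffle(train)
    rng.shuffle(validation)
    return train, validation


# Synthetic, made-up data: 50 non-compliant (1) and 950 compliant (0) records.
records = [{"id": i, "non_compliant": 1 if i < 50 else 0} for i in range(1000)]
train, validation = split_per_class(records, lambda r: r["non_compliant"])
```

With this synthetic input, the minority class makes up 5% of the training subset and 5% of the validation subset, so neither subset can end up without non-compliance examples by chance.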

We delved further into these results by isolating the Gold Coast data using SA3 data sets. We retrained our model using the same method. Our validation accuracy for the Gold Coast data was 100%. The Gold Coast results were consistent with the national model, reinforcing the robustness of our solution. We trained both models using mean absolute error rather than mean squared error, because mean squared error amplifies outliers.
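To illustrate why mean squared error amplifies outliers, the sketch below compares the two metrics on made-up predictions. The values are purely illustrative and are not taken from the project's data. Both sets of predictions have the same total absolute error, but in one the error is concentrated in a single outlier.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [10.0, 10.0, 10.0, 10.0]
spread = [11.0, 11.0, 11.0, 11.0]    # every prediction is off by 1
outlier = [10.0, 10.0, 10.0, 14.0]   # one prediction is off by 4

mae_spread = mean_absolute_error(y_true, spread)    # 1.0
mae_outlier = mean_absolute_error(y_true, outlier)  # 1.0
mse_spread = mean_squared_error(y_true, spread)     # 1.0
mse_outlier = mean_squared_error(y_true, outlier)   # 4.0
```

Mean absolute error scores both cases identically, while mean squared error penalises the single large miss four times as heavily, so a model trained on it is pulled harder towards outlying records.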

Our model is able to predict non-compliance events with a high degree of accuracy. This risk model can be used by regulatory bodies to target audit and compliance services, and by individuals and corporate entities to self-identify their compliance risk.
