WSDM Cup 2017: Vandalism Detection

Vandalism detection (task 2) - WSDM Cup 2017

Introduction

We are a team of four from the Complex Network Research Group (Murata Laboratory) at the Tokyo Institute of Technology. This repository is our submission to the 2017 WSDM Cup. In summary, the task is to detect vandalism in Wikidata dumps.

Milestones

Preparation: 2016/09/26 - 2016/09/30

Objective

Set up personal computers so that all environments match. (Python 3.5.2, TensorFlow 0.10, scikit-learn 0.17.1, Anaconda virtual env, coding style, etc.)
Literature review. (Paper listing, reading, and discussion)
Competition score metric analysis.
Finalize and present possible approaches.

Daily log

27th: List vandalism detection papers; set up working environments; study Wikidata dumps; analyze WSDM'17 score metrics.
28th: Paper reading; discussion about Random Forest and feature selection; focus on the Random Forest model and its variations.
29th: Run the reference paper's provided code on the lab's machine; study techniques related to RR; study NN techniques that complement RR.
30th: Review week 1.

Baseline model: 2016/10/03 - 2016/10/07

Objective

Preprocess Wikimedia data and study the previous feature extraction code.
Implement a simple random forest model based on [1].
Working baseline model and sketch of neural network model.
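
As a rough illustration of the random forest baseline named in the objectives above, here is a minimal sketch using scikit-learn. It is not the implementation from [1]: the synthetic data, the feature count, the class imbalance, and every hyperparameter are assumptions chosen only to make the example run. The data is split by slicing so the snippet does not depend on a particular scikit-learn splitting helper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for extracted revision features (assumption):
# 20 numeric features, roughly 5% of rows labelled 1 ("vandalism").
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=10,
    weights=[0.95, 0.05],
    random_state=0,
)

# Simple hold-out split by slicing; make_classification shuffles rows.
X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

# Hyperparameters are illustrative, not tuned.
model = RandomForestClassifier(
    n_estimators=100,
    class_weight="balanced",  # compensate for the rare positive class
    random_state=0,
)
model.fit(X_train, y_train)

# Score with the probability of the positive class (label 1).
positive_column = list(model.classes_).index(1)
vandalism_scores = model.predict_proba(X_test)[:, positive_column]

auc = roc_auc_score(y_test, vandalism_scores)
print("ROC AUC on held-out rows:", auc)
```

Looking up the positive-class column through `model.classes_` rather than hard-coding an index keeps the scores tied to the right label if the label encoding changes.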

Daily log

3rd: Features from the baseline model [1] are all hand-picked.
4th: Meeting cancelled.
5th: Meeting cancelled.
6th: Some features are missing compared to the original implementation [1]. With only the 29 available features, the model currently scores 0.02 on ROC, which is extremely low.
7th:
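
A note on the ROC figure from the 6th, assuming it refers to the area under the ROC curve: a value far below 0.5 means the scores rank the classes in the wrong order rather than carrying no signal, so one thing worth checking is whether the scores or labels are inverted. The sketch below uses made-up toy data and a hypothetical helper `roc_auc` to show that flipping the scores turns an AUC of `a` into `1 - a`. It is a diagnostic idea, not a statement about what caused the result above.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = vandalism, 0 = regular edit.
labels = [1, 1, 0, 0, 0, 0]
# Scores that accidentally express the probability of the regular class.
scores = [0.1, 0.3, 0.9, 0.8, 0.2, 0.7]

auc = roc_auc(labels, scores)
flipped_auc = roc_auc(labels, [1.0 - s for s in scores])

print(auc)          # 0.125
print(flipped_auc)  # 0.875
```

If flipping the scores produces a plausible AUC, the likely culprits are the column taken from the classifier's probability output or the label encoding.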
