This repo was created for participating in the Upstage contest. It achieved an AUC of 0.8598 by ensembling CatBoost and LightGBM models trained on tabular data.
Goal :
Binary classification.
Using log data (2009/12–2011/12) from 5,914 users, predict the probability that each customer's total purchases exceed 300 in December 2011.
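For concreteness, the label can be derived roughly as below. This is a minimal sketch; the column names ('order_date', 'customer_id', 'total') and the datetime dtype of 'order_date' are assumptions for illustration, not the repo's actual schema.

import pandas as pd

def generate_label(data: pd.DataFrame, year_month: str, threshold: int = 300) -> pd.DataFrame:
    # Keep only the target month's transactions ('order_date' assumed datetime).
    month = data[data['order_date'].dt.strftime('%Y-%m') == year_month]
    # Sum purchases per customer and compare against the threshold.
    total = month.groupby('customer_id')['total'].sum()
    return (total > threshold).astype(int).rename('label').reset_index()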
Module execution order : inference > feature engineering > feature generation
inference.py
- Parses the command-line arguments and prints the Out-Of-Fold (OOF) validation score, computed with StratifiedKFold from sklearn.model_selection (see the sketch after this list).
feature_generation.py
- Manipulates the pandas DataFrame by adding or removing the required columns.
feature_engineering.py
- Executes the pipeline functions : label generation, feature preprocessing, and feature generation.
- Uses aggregation functions to expand features and merge them into the training data.
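For orientation, the OOF computation in inference.py follows the standard StratifiedKFold pattern. Below is a minimal sketch of one of the helpers, here make_lgb_oof_prediction; the internals shown are illustrative assumptions, not the repo's exact code.

import numpy as np
import lightgbm as lgb
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def make_lgb_oof_prediction(train, y, test, features, model_params, n_splits=5, seed=0):
    y = np.asarray(y)
    oof = np.zeros(len(train))    # out-of-fold predictions on the train set
    pred = np.zeros(len(test))    # fold-averaged predictions on the test set
    fi = np.zeros(len(features))  # fold-averaged feature importances
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for tr_idx, val_idx in skf.split(train[features], y):
        model = lgb.LGBMClassifier(**model_params)
        model.fit(train[features].iloc[tr_idx], y[tr_idx])
        # Predict on the held-out fold; together the folds cover every train row once.
        oof[val_idx] = model.predict_proba(train[features].iloc[val_idx])[:, 1]
        pred += model.predict_proba(test[features])[:, 1] / n_splits
        fi += model.feature_importances_ / n_splits
    print(f"OOF AUC: {roc_auc_score(y, oof):.4f}")
    return oof, pred, fi

The XGBoost and CatBoost helpers presumably follow the same pattern with their respective model classes.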
Install the dependencies from the ./code directory :
pip install -r requirements.txt
catboost==0.24.4
lightgbm==3.1.1
matplotlib==3.1.3
numpy==1.19.5
pandas==1.1.5
scikit-learn==0.24.1
seaborn==0.11.1
xgboost==1.3.3
The argument parser exposes the following options :
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--seed', type=int, default=0, help="random seed (default: 0)")
parser.add_argument('--ym', type=str, default='2011-12', help="target year-month to predict (default: 2011-12)")
parser.add_argument('--engineering', type=str, default='feature_engineering_all', help="choose feature engineering type")
# Note: argparse's type=bool treats any non-empty string (even 'False') as True,
# so the flag is parsed explicitly instead.
parser.add_argument('--ensemble', type=lambda s: s.lower() == 'true', default=True, help="'true' to ensemble models, 'false' for a single model")
args = parser.parse_args()
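For example (assuming inference.py is run from ./code) : python inference.py --seed 0 --ym 2011-12 --engineering feature_engineering_all --ensemble true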
from importlib import import_module

# Dynamically resolve the feature engineering function named by --engineering.
train, test, y, features = getattr(import_module("feature_engineering"), args.engineering)(data, year_month)
if args.ensemble:
oof_xgb, xgb_pred, fi = make_xgb_oof_prediction(train, y, test, features, model_params=xgb_params)
oof_lgb, lgb_pred, fi = make_lgb_oof_prediction(train, y, test, features, model_params=lgb_params)
oof_cat, cat_pred, fi = make_cat_oof_prediction(train, y, test, features, model_params=cat_params)
...
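The elided part blends the three models' predictions. A minimal sketch assuming a simple equal-weight average (the actual weighting used in the repo may differ):

import numpy as np
from sklearn.metrics import roc_auc_score

# Equal-weight average of the three models' OOF and test predictions (weights assumed).
oof_ens = np.mean([oof_xgb, oof_lgb, oof_cat], axis=0)
test_pred = np.mean([xgb_pred, lgb_pred, cat_pred], axis=0)
print(f"Ensemble OOF AUC: {roc_auc_score(y, oof_ens):.4f}")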