
Useful bash commands that were used

Hlib edited this page Jan 25, 2019 · 12 revisions

Find the most logged projects

for f in nn-data/new_framework/en_100_percent/classification/location/10411/train/*.label; do printf '%s %s\n' "$f" "$(grep -c '^[01]$' "$f")"; done | sort -n -k 2

Sample output

...
nn-data/new_framework/en_100_percent/classification/location/10411/train/802_geotools.label 540
nn-data/new_framework/en_100_percent/classification/location/10411/train/306_jdownloader.label 548
nn-data/new_framework/en_100_percent/classification/location/10411/train/186_mule-esb.label 579
nn-data/new_framework/en_100_percent/classification/location/10411/train/192_CONNECT.label 587
nn-data/new_framework/en_100_percent/classification/location/10411/train/531_navajo.label 769
nn-data/new_framework/en_100_percent/classification/location/10411/train/735_hadoop-common.label 966
nn-data/new_framework/en_100_percent/classification/location/10411/train/636_camel.label 1013
nn-data/new_framework/en_100_percent/classification/location/10411/train/640_openflexo.label 1310

Split one project into train/test/valid sets

Create folders

DATASET="camel"
PROJ="636_camel"
mkdir -p "/home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/train/$PROJ"
mkdir -p "/home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/test/$PROJ"
mkdir -p "/home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/valid/$PROJ"
cd "/home/lv71161/hlibbabii/raw_datasets/devanbu/data/en_100_percent/train/$PROJ"
find . -type d | xargs -I{} mkdir -p /home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/train/$PROJ/{} /home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/test/$PROJ/{} /home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/valid/$PROJ/{}

Check the total number of files in the project

find . -type f | wc -l

Copy files

# Shuffle the file list once, then cut it into three consecutive slices
find . -type f | sort -R > ~/files.tmp
head -n 1300 ~/files.tmp | xargs -I{} cp {} /home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/test/$PROJ/{}
head -n 2600 ~/files.tmp | tail -n 1300 | xargs -I{} cp {} /home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/valid/$PROJ/{}
# Remaining files go to train (5685 when the project has 8285 files)
tail -n +2601 ~/files.tmp | xargs -I{} cp {} /home/lv71161/hlibbabii/raw_datasets/devanbu/data/$DATASET/train/$PROJ/{}
rm ~/files.tmp
echo "Done"
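
The split sizes above are written out by hand for one project. A minimal sketch of the same slicing as a reusable function follows; `split_list` and its output file names are hypothetical and not part of the original commands. It only splits the shuffled list into three list files; copying the files is still done with `xargs` as above.

```shell
# Hypothetical helper (not from the original page):
#   split_list LIST TEST_N VALID_N OUT_PREFIX
# Writes OUT_PREFIX.test, OUT_PREFIX.valid and OUT_PREFIX.train.
split_list() {
    local list=$1 test_n=$2 valid_n=$3 prefix=$4
    # First test_n lines
    head -n "$test_n" "$list" > "$prefix.test"
    # Lines test_n+1 .. test_n+valid_n
    awk -v a="$test_n" -v b="$((test_n + valid_n))" 'NR > a && NR <= b' "$list" > "$prefix.valid"
    # Everything after the first test_n+valid_n lines
    tail -n "+$((test_n + valid_n + 1))" "$list" > "$prefix.train"
}
```

For the sizes used on this page it could be called as `split_list ~/files.tmp 1300 1300 ~/split`, and `wc -l ~/split.*` then shows how many files ended up in each set.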

Generate the parameter space

BASE_LRS=( "1e-2" "8e-3" "6e-3" "4e-3" "2e-3" "1e-3" "7.5e-4" "5e-4" "2.5e-4" )
DROPOUTS_MULTIPLIERS=( 0.2 0.6 1 1.4 1.8 2.2 2.6 3 3.5 4 4.5 5 6 7 )
for BASE_LR in "${BASE_LRS[@]}"; do
	for DROPOUTS_MULTIPLIER in "${DROPOUTS_MULTIPLIERS[@]}"; do
		echo "{\"training.lrs.base_lr\": $BASE_LR, \"arch.drop.multiplier\": $DROPOUTS_MULTIPLIER}" > 1.clp.${BASE_LR}_${DROPOUTS_MULTIPLIER}.py;
	done;
done;
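
The loop writes one file per (learning rate, dropout multiplier) pair, so 9 × 14 = 126 files are expected. A quick sanity check could look like the sketch below; `count_configs` is a hypothetical helper, and it assumes the files were generated in the current directory.

```shell
# Hypothetical helper: count generated config files in a directory
# (defaults to the current directory).
count_configs() {
    find "${1:-.}" -maxdepth 1 -type f -name '1.clp.*_*.py' | wc -l
}

BASE_LRS=( "1e-2" "8e-3" "6e-3" "4e-3" "2e-3" "1e-3" "7.5e-4" "5e-4" "2.5e-4" )
DROPOUTS_MULTIPLIERS=( 0.2 0.6 1 1.4 1.8 2.2 2.6 3 3.5 4 4.5 5 6 7 )
# One file per combination: 9 * 14 = 126
EXPECTED=$(( ${#BASE_LRS[@]} * ${#DROPOUTS_MULTIPLIERS[@]} ))

if [ "$(count_configs .)" -eq "$EXPECTED" ]; then
    echo "OK: $EXPECTED config files"
else
    echo "Expected $EXPECTED config files, found $(count_configs .)"
fi
```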

Collect hyperparameter search results

PATTERN="^100"
out=""
for f in $(ls | grep "$PATTERN"); do
	# Pull the two searched parameters out of params.json
	drop=$(tr "," "\n" < "$f/params.json" | grep "drop" | grep "multiplier")
	lr=$(tr "," "\n" < "$f/params.json" | grep "base_lr")
	epoch=$(cat "$f/models/epoch.best")
	loss=$(cat "$f/models/loss.best")
	acc=$(cat "$f/models/acc.best")

	out="$out\n$drop\t$lr\t$epoch\t$loss\t$acc"
done
echo -e "$out"
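
The rows printed above are tab-separated, with accuracy in the fifth column. To see the best runs first, the output could be piped through a small sorting helper. This is a sketch: `rank_by_column` is a hypothetical name, and it assumes the chosen column holds plain decimal numbers.

```shell
# Hypothetical helper: sort tab-separated rows from stdin by a numeric column,
# largest value first. Empty lines are dropped before sorting.
rank_by_column() {
    local col=$1
    grep -v '^$' | sort -t "$(printf '\t')" -k "${col},${col}" -n -r
}
```

For example, `echo -e "$out" | rank_by_column 5 | head -n 10` would list the ten runs with the highest accuracy.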