2023 NYCU BioML

2023 NYCU Biological Machine Learning Final Project

Download project

git clone https://github.com/ConnectionOuOb/NYCU-2023-BioML.git

Install modules

conda install matplotlib xgboost catboost scikit-learn pandas biopython
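After installation, a quick import check can confirm the environment is complete before any experiment script is run. The sketch below is not part of the repository. The helper name `find_missing` is hypothetical, and the list uses the import names these packages expose (`sklearn` for scikit-learn, `Bio` for biopython).

```python
from importlib.util import find_spec

# Import names of the packages installed above
# (scikit-learn is imported as "sklearn", biopython as "Bio").
REQUIRED = ["matplotlib", "xgboost", "catboost", "sklearn", "pandas", "Bio"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

An empty result means every listed package can be located. Any name that is reported can be installed again with conda.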

Generate the non-redundant (NR) set with CD-HIT

find testKmer/ -name "*.fasta" | xargs -I % bash -c 'cd-hit -i % -o testNR/$(basename % .fasta).nr050.fasta -c 0.5 -n 2 -T 0'
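The same batch run can be sketched in Python, which sidesteps shell quoting problems with unusual file names. This is a minimal sketch, not the authors' script. The function name `build_cdhit_commands` is hypothetical, the directory names `testKmer` and `testNR` are taken from the command above, and it assumes `cd-hit` is available on `PATH`.

```python
import subprocess
from pathlib import Path


def build_cdhit_commands(in_dir, out_dir, identity=0.5, word_size=2, threads=0):
    """Build one cd-hit command per FASTA file found (recursively) under in_dir."""
    commands = []
    for fasta in sorted(Path(in_dir).rglob("*.fasta")):
        # Output name mirrors the shell command: <name>.nr050.fasta
        # (the "nr050" suffix is fixed and matches the 0.5 identity default).
        out_path = Path(out_dir) / f"{fasta.stem}.nr050.fasta"
        commands.append([
            "cd-hit",
            "-i", str(fasta),
            "-o", str(out_path),
            "-c", str(identity),   # sequence identity threshold
            "-n", str(word_size),  # word size
            "-T", str(threads),    # 0 = use all available threads
        ])
    return commands


if __name__ == "__main__":
    # cd-hit does not create the output directory, so make it first.
    Path("testNR").mkdir(exist_ok=True)
    for cmd in build_cdhit_commands("testKmer", "testNR"):
        subprocess.run(cmd, check=True)
```

Building the argument lists separately from running them makes it easy to print and inspect the commands before starting a long clustering job.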

Contributors

Connection

Generate Basic & SSE-PSSM feature sets
All ML/DL-related coding
All experiments

Ivern

Generate iFeature feature sets
Some model test cases
Independent test

Brian

Generate customized feature sets
Dataset pre-processing
Independent test

References

Chen TR, Juan SH, Huang YW, Lin YC, Lo WC. A secondary structure-based position-specific scoring matrix applied to the improvement in protein secondary structure prediction. PLoS One. 2021 Jul 28;16(7):e0255076. doi: 10.1371/journal.pone.0255076. PMID: 34320027; PMCID: PMC8318245.
Chen Z, Zhao P, Li F, Leier A, Marquez-Lago TT, Wang Y, Webb GI, Smith AI, Daly RJ, Chou KC, Song J. iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics. 2018 Jul;34(14):2499-2502. doi: 10.1093/bioinformatics/bty140.
Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012 Dec 1;28(23):3150-2. doi: 10.1093/bioinformatics/bts565. Epub 2012 Oct 11. PMID: 23060610; PMCID: PMC3516142.

Folders and files

bin
dataset
lib
stastistics
.gitattributes
.gitignore
Assignment1.py
Assignment3.py
Brian_Experiment2.py
DistSSEPSSM_Client.py
DistSSEPSSM_Server.py
Experiment1_All.py
Experiment1_All_SSEPSSM.py
Experiment1_CAT.py
Experiment1_NN_NoNorm.py
Experiment1_NoNorm.py
Experiment1_Norm.py
Experiment2_Step1.py
Experiment2_Step1_Norm.py
Experiment2_Step2.py
Experiment2_Step2_Norm.py
Experiment3_CAT.py
Experiment3_RF.py
Experiment3_XGB.py
FeatureTest_SSEPSSM.py
GenSSEPSSM.py
GenSSEPSSM_CSV.py
IndependentTest.py
Ivern_IndependentTest.py
Ivern_ModelTest.py
Ivern_TrainingTesting.py
ModelTest.py
ModelTest_NN.py
ModelTest_SSEPSSM_NN.py
ModelTest_Seq2Seq.py
ModelTest_TFM.py
ModelTest_Trans.py
ModelTest_TransF.py
ModelTest_Transformer.py
PrintModel.py
README.md
SaveBestFeature.py
SaveBestFeature_Norm.py
SaveSSEPSSM.py
SaveSSEPSSM_Norm.py
ShowConformation.py
TestHypothesis.py
TrainingTesting.py
ValidateSSEPSSM.py
brian_IT.py
independent_test_answer_1226.csv
ssepssm
testSeq2Seq.py
workflow.md
第06組.csv
第06組_1.csv
第06組_2.csv
