NLP

About the dataset

The AG's news topic classification dataset is constructed by choosing the 4 largest classes from the original corpus. Each class contains 30,000 training samples and 1,900 testing samples, for a total of 120,000 training samples and 7,600 testing samples.

Task 1

We preprocess the data and calculate the probabilities of n-grams.
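As a rough illustration of this step, here is a minimal sketch of n-gram probability estimation using plain maximum-likelihood counts. The helper names (`tokenize`, `ngrams`, `ngram_probabilities`), the `<s>`/`</s>` padding markers, and the two toy sentences are assumptions made for this example; the notebook may preprocess and smooth differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_probabilities(docs, n=2):
    """Maximum-likelihood estimate of P(last word | previous n-1 words)."""
    ngram_counts = Counter()
    context_counts = Counter()
    for doc in docs:
        # Pad with start/end markers so text boundaries are counted too.
        tokens = ["<s>"] * (n - 1) + tokenize(doc) + ["</s>"]
        for gram in ngrams(tokens, n):
            ngram_counts[gram] += 1
            context_counts[gram[:-1]] += 1
    return {gram: count / context_counts[gram[:-1]]
            for gram, count in ngram_counts.items()}


# Toy sentences for illustration only (not taken from the dataset).
docs = ["Stocks rise on earnings", "Stocks fall on oil prices"]
probs = ngram_probabilities(docs, n=2)
print(probs[("stocks", "rise")])  # 0.5
```

Here "stocks" appears twice as a context and is followed by "rise" once, so the estimate is 1/2. Unsmoothed estimates like this assign zero probability to unseen n-grams, so a real model would likely add smoothing.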

Task 2

Finally, we complete the project with two steps:

Feature extraction: apply all 3 algorithms with the classifier and choose the best one according to the model's accuracy.
ML classifier: apply any ML classifier (SVM, NB, DT, RF, etc.) and report evaluation metrics, including the model's accuracy and confusion matrix.
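The two steps above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the README does not name the three feature-extraction algorithms, so bag-of-words counts are used here as a stand-in, and a small hand-written multinomial Naive Bayes (NB is one of the listed options) takes the place of whatever classifier the notebook uses. The class `NaiveBayes`, the helper functions, and the toy texts and labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Split lowercased text on whitespace (assumed minimal preprocessing)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words unseen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Toy data for illustration only (not taken from the dataset).
train_texts = ["stocks fall as oil prices rise",
               "bank profits beat market forecasts",
               "team wins the cup final",
               "striker scores twice in league match"]
train_labels = ["Business", "Business", "Sports", "Sports"]
test_texts = ["oil market prices fall", "team scores in cup match"]
test_labels = ["Business", "Sports"]

model = NaiveBayes().fit(train_texts, train_labels)
y_pred = [model.predict(t) for t in test_texts]
print(accuracy(test_labels, y_pred))  # 1.0
print(confusion_matrix(test_labels, y_pred, ["Business", "Sports"]))  # [[1, 0], [0, 1]]
```

To compare feature-extraction methods as described, the same evaluation would be repeated once per method and the one with the highest accuracy kept.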
