2- Research/Development/Biomedical Improvement
3- Society Discrimination/Negative Impact
4- Patient’s progress/Infected improved life
5- Surge/HIV positive cases
6- HIV negative cases
7- Accident/Death cases
8- Suicide Cases
0- other
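When reading the classifier's output, a small lookup table can turn the numeric labels back into category names. The following is a minimal sketch, assuming the classifier emits the integer labels listed above. `CATEGORY_NAMES` and `label_to_name` are hypothetical names, not part of the repository, and the table covers only labels 0 and 2 to 8.

```python
# Hypothetical lookup table built from the category list above.
# Only labels 0 and 2-8 are covered; extend it with any remaining labels.
CATEGORY_NAMES = {
    2: "Research/Development/Biomedical Improvement",
    3: "Society Discrimination/Negative Impact",
    4: "Patient’s progress/Infected improved life",
    5: "Surge/HIV positive cases",
    6: "HIV negative cases",
    7: "Accident/Death cases",
    8: "Suicide Cases",
    0: "other",
}


def label_to_name(label: int) -> str:
    """Return the category name for a label, or 'unknown' if it is not in the table."""
    return CATEGORY_NAMES.get(label, "unknown")


print(label_to_name(8))
```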
1-Python
2-NLTK
3-Keras
4-Matplotlib (for visualization)
5-spaCy (for extracting place names from the articles)
.
└── Web-Scraper-And-Classifier-For-HIV-Articles-
    ├── Classifier.ipynb
    ├── Data
    │   └── final.xlsx
    ├── hiv_article_dataset_creator.py
    ├── LICENSE
    ├── news_articles_url_scrapper.py
    ├── README.md
    └── visualization
        ├── all.png
        ├── death.png
        ├── matrimony.png
        ├── pie.png
        ├── suicide.png
        └── visualization.py
- First, run `news_articles_url_scrapper.py`. It scrapes all the articles published between two given dates and dumps their URLs into a CSV file named `all_articles.csv`.
- Then run `hiv_article_dataset_creator.py`. It scrapes the HIV articles listed in `all_articles.csv` and creates `hiv_report_data.xlsx`, a file containing the columns Year, Heading and Content for each HIV article.
- Run `Classifier.ipynb` to classify the HIV articles into the categories given above.
- Run `visualization.py` if you want a visualized report on death cases, suicide cases, matrimony-related articles, and the places mentioned in the surge/epidemic category.
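
To illustrate the second step, here is a minimal sketch of how HIV-related entries might be selected from a CSV of scraped articles. It is not the logic of `hiv_article_dataset_creator.py`: the column names `url` and `heading`, the keyword pattern, and the function `filter_hiv_rows` are all assumptions made for this example.

```python
import csv
import io
import re

# Assumed keyword rule; the real script may select articles differently.
HIV_PATTERN = re.compile(r"\b(hiv|aids)\b", re.IGNORECASE)


def filter_hiv_rows(csv_text):
    """Return the rows whose heading mentions HIV or AIDS as a whole word."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if HIV_PATTERN.search(row["heading"])]


# Tiny in-memory stand-in for all_articles.csv (hypothetical columns and URLs).
sample = (
    "url,heading\n"
    "https://example.com/a,New HIV cases reported in district\n"
    "https://example.com/b,Monsoon arrives early\n"
    "https://example.com/c,AIDS awareness camp held\n"
)

hiv_rows = filter_hiv_rows(sample)
print([row["url"] for row in hiv_rows])
```

A plain keyword match like this can pick up unrelated uses of a word (for example "hearing aids"), so the selected rows would still need a manual check or a stricter rule.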