UTD-degree-webscraper

Scrape UTD Website for Courses

https://www.utdallas.edu/wp-sitemap-posts-fact_sheets-1.xml
- Sitemap that lists the fact sheets for 143 degrees
- Store every link; each one sits in a tag="td" class="loc" cell
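
The link-collection step above could be sketched as follows. This is a minimal sketch using only the standard library, and it assumes the sitemap is fetched as raw XML, where each link is a `<loc>` element in the standard sitemap namespace (the td/loc cells are likely what the browser-styled view shows). The function names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://www.utdallas.edu/wp-sitemap-posts-fact_sheets-1.xml"
# Assumption: the sitemap uses the standard sitemaps.org namespace.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_links(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def fetch_sitemap_links(url=SITEMAP_URL):
    """Download the sitemap and return its fact-sheet links."""
    with urllib.request.urlopen(url) as response:
        return extract_sitemap_links(response.read())
```

Keeping the parsing separate from the download makes it possible to check the parser against a saved copy of the sitemap without hitting the site.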
Visit each fact-sheet link and first do the following:
- Find the catalog page: tag="a" href=True
  - a['href'] will return the link string
  - Append the a['href'] string to a text file, one link per line
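
A rough sketch of this step, again with the standard library only (html.parser stands in for the BeautifulSoup-style tag="a" href=True lookup). The "catalog" substring filter and the function names are assumptions for illustration; the real fact sheets may need a different way to pick out the catalog link.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_catalog_links(html_text, marker="catalog"):
    """Return hrefs containing `marker` (assumed to identify catalog pages)."""
    parser = LinkCollector()
    parser.feed(html_text)
    return [href for href in parser.hrefs if marker in href]


def append_links(path, links):
    """Append each link to a text file, one per line."""
    with open(path, "a", encoding="utf-8") as f:
        for link in links:
            f.write(link + "\n")
```

For example, `append_links("URL_DEGREES.txt", find_catalog_links(page_html))` would grow the link file by one page's worth of links; whether URL_DEGREES.txt in this repo is that file is an assumption.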
Once all href links are collected, open each one and scrape it:
- Scrape every element whose class is xind-0 through xind-6
- Scrape the h2 tag with class="xind-1" -> this is the name of the degree
- Write all of this to a text file named after the degree from that h2 tag
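
The scraping step could look roughly like the sketch below. It assumes reasonably well-formed markup (every non-void tag is closed) and treats an xind element nested inside another as part of the outer one. The class and function names are hypothetical, and the standard-library parser is a stand-in for whatever scraping library the scripts actually use.

```python
import re
from html.parser import HTMLParser

XIND_CLASS = re.compile(r"^xind-[0-6]$")
# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "wbr"}


class XindCollector(HTMLParser):
    """Collect the text of elements with class xind-0 .. xind-6."""

    def __init__(self):
        super().__init__()
        self.blocks = []         # text of each matching element, in page order
        self.degree_name = None  # text of the first <h2 class="xind-1">
        self._depth = 0          # > 0 while inside a matching element
        self._is_title = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if any(XIND_CLASS.match(c) for c in classes):
            self._depth = 1
            self._is_title = tag == "h2" and "xind-1" in classes
            self._parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            text = " ".join("".join(self._parts).split())
            if text:
                self.blocks.append(text)
                if self._is_title and self.degree_name is None:
                    self.degree_name = text

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


def scrape_catalog_page(html_text):
    """Return (degree_name, list_of_text_blocks) for one catalog page."""
    parser = XindCollector()
    parser.feed(html_text)
    parser.close()
    return parser.degree_name, parser.blocks


def safe_filename(name):
    """Turn a degree name into a file name without path separators."""
    return re.sub(r"[^\w\- ]+", "_", name).strip() + ".txt"
```

Writing the result is then a matter of opening `safe_filename(name)` and writing the blocks one per line.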

NOTES / DON'T KNOW HOW TO DO:
- How should these text files be stored? Should I just store them as .txt?
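
One possible answer, offered as a sketch and not as the approach this repo settled on: plain .txt works if the files are only read by people, while one JSON file per degree keeps the degree name, source link, and scraped sections separate for later processing. The field names, the "data" output folder, and the function names below are assumptions.

```python
import json
from pathlib import Path


def save_degree_json(name, url, blocks, out_dir="data"):
    """Write one degree's scraped content to <out_dir>/<name>.json."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Replace path separators so the degree name is a valid file name.
    path = out / (name.replace("/", "_") + ".json")
    record = {"degree": name, "catalog_url": url, "sections": blocks}
    path.write_text(json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


def load_degree_json(path):
    """Read a record written by save_degree_json."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```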
