Evaluations Data #6

AdamMcAdamson · 2023-02-24T23:30:48Z

We would like to provide evaluation data as part of our API.

To this effect, we need to:

Extend our database schema to include Evaluation data
Create a Golang model conforming to the schema
Develop a scraper with chromedp to scrape the data available from https://coursebook.utdallas.edu/ues-report/
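
As a rough illustration of the second item, a Go model for evaluation data might look something like the sketch below. Every type and field name here (`Evaluation`, `QuestionResult`, `SectionID`, `Term`, and so on) is an assumption made for illustration; the thread does not define the schema, and the real report layout may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// QuestionResult holds the tallied answers to one survey question.
// Hypothetical shape: the real report layout is not described in this thread.
type QuestionResult struct {
	Prompt    string         `json:"prompt"`
	Responses map[string]int `json:"responses"` // answer label -> count
}

// Evaluation is an assumed model for one section's evaluation report.
type Evaluation struct {
	SectionID string           `json:"section_id"` // assumed link to a section record
	Term      string           `json:"term"`       // example value only
	Questions []QuestionResult `json:"questions"`
}

// ParseEvaluation decodes a JSON document into an Evaluation.
func ParseEvaluation(data []byte) (Evaluation, error) {
	var e Evaluation
	err := json.Unmarshal(data, &e)
	return e, err
}

func main() {
	// Made-up sample document, used only to exercise the struct tags.
	raw := []byte(`{"section_id":"example-section","term":"23S","questions":[{"prompt":"Example question","responses":{"Agree":3,"Disagree":1}}]}`)
	e, err := ParseEvaluation(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(e.SectionID, e.Term, len(e.Questions))
}
```

Keeping responses as a label-to-count map avoids committing to a fixed answer scale before the scraped data has been inspected.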

kneevin · 2023-05-11T21:14:18Z

hey! i'd like to take on this scraper.

AdamMcAdamson · 2023-05-11T22:02:40Z

That sounds good to me, @hochladen?

Also, this issue should probably be moved to https://github.com/UTDNebula/api-tools.

jpahm · 2023-05-11T22:08:53Z

I'm open to it; though I can say that this could also be implemented as a part of the existing coursebook scraper (as well as anything else we may need to pull from coursebook)

I'm not entirely opposed to having this as a separate scraper, but I'd say it's a matter of considering if the separation of tasks would be worth the added clutter.

AdamMcAdamson · 2023-05-11T22:11:49Z

Right, I had forgotten that we can pull eval data from the coursebook scraper with the speedup.

kneevin · 2023-05-12T18:47:15Z

so should i try to refactor the current scraper so it can scrape the eval data as well? or is it ok to have two separate scrapers? i think it'll make more sense to refactor the current scraper; it'll just take me a bit more time to figure it out

jpahm · 2023-05-12T19:06:13Z

Adding it to the current scraper would probably be best, though I still have to push it since it's still WIP and stored on my PC locally at the moment. It'll probably be a day or two before I do that, since I'm currently in the middle of moving back home.

jpahm · 2023-09-18T21:24:18Z

Yeah so it's fair to say "a day or two" was a vast underestimate of how long this would take to add; regardless, it should be added soon as part of the existing scraper now that other priorities have been taken care of.

jpahm · 2024-01-18T04:10:47Z

I've completed the scraper component of this work, but there are some concerns regarding IP ratelimits to be addressed. A data model and associated database changes still need to be completed.

jpahm · 2024-01-23T19:02:59Z

I've completed the scraper component of this work, but there are some concerns regarding IP ratelimits to be addressed. A data model and associated database changes still need to be completed.

Upon further investigation of this, I'm not seeing any immediate great solutions for the IP ratelimit problem -- this problem also occurs with scraping courses, but in a much more manageable fashion. Scraping evals leads to a long IP ratelimit every 30-40 evals or so, which obviously isn't sustainable for scraping en masse.

A solution to this issue that I proposed to @iamwood would be to set up an API endpoint for evals that parses and returns specific evals on-the-fly* rather than parsing them all en masse. I'll discuss this alongside some other things once the semester starts rolling more.

Any thoughts on this issue are welcome!

*Ideally alongside some sort of caching
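
A minimal sketch of the on-the-fly-plus-caching idea, assuming a hypothetical `FetchFunc` that scrapes a single evaluation by ID. None of these names (`EvalCache`, `FetchFunc`, `DefaultTTL`) come from the actual codebase, and the scrape itself is stubbed out:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// DefaultTTL is an arbitrary example lifetime for cached evaluations.
const DefaultTTL = time.Hour

// FetchFunc scrapes one evaluation by ID (hypothetical; stubbed in main).
type FetchFunc func(id string) (string, error)

type entry struct {
	value   string
	expires time.Time
}

// EvalCache serves evaluations on demand and remembers results for ttl.
type EvalCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	fetch FetchFunc
	items map[string]entry
}

func NewEvalCache(ttl time.Duration, fetch FetchFunc) *EvalCache {
	return &EvalCache{ttl: ttl, fetch: fetch, items: make(map[string]entry)}
}

// Get returns a fresh cached value if one exists; otherwise it scrapes
// the evaluation once and stores the result. The lock is held during the
// fetch, so scrapes are serialized rather than fired in parallel.
func (c *EvalCache) Get(id string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.items[id]; ok && time.Now().Before(e.expires) {
		return e.value, nil
	}
	v, err := c.fetch(id)
	if err != nil {
		return "", err // failures are not cached, so a later call retries
	}
	c.items[id] = entry{value: v, expires: time.Now().Add(c.ttl)}
	return v, nil
}

func main() {
	calls := 0
	cache := NewEvalCache(DefaultTTL, func(id string) (string, error) {
		calls++ // stands in for a real scrape
		return "eval:" + id, nil
	})
	cache.Get("example-id")
	cache.Get("example-id")
	fmt.Println("scrapes performed:", calls)
}
```

With this shape, repeated requests for the same evaluation cost one scrape per TTL window, which is the property that matters when each scrape counts against the IP ratelimit.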

democat3457 · 2024-01-26T22:13:50Z

+1 for caching + evals on-the-fly, I think it's a good compromise.

jpahm · 2024-03-12T22:21:36Z

So, after putting together an on-demand scraping endpoint for evals, it seems like we are now being hindered on this front by evals being locked behind captcha verification. I'm not sure if there's any way to circumvent this, but I'm out of ideas for the time being. As such, I'm putting this issue on hold in favor of other tasks.

jpahm self-assigned this Sep 18, 2023

jpahm transferred this issue from UTDNebula/nebula-api Sep 27, 2023

jpahm added the "Type: Feature Request" label (New feature or request) Sep 27, 2023

jpahm added the "L2" label (A task suitable for someone who is comfortable helping with implementing features.) Oct 24, 2023

jpahm added the "L3" label (A task suitable for someone who is comfortable implementing large-scale features/projects.) and removed the "L2" label Nov 9, 2023

jpahm added the "Status: On Hold" label (This has been delayed for a future sprint or release) Mar 12, 2024
