Evaluations Data #6

Open
1 of 3 tasks
AdamMcAdamson opened this issue Feb 24, 2023 · 11 comments
Assignees
Labels
L3 (A task suitable for someone who is comfortable implementing large-scale features/projects.)
Status: On Hold (This has been delayed for a future sprint or release)
Type: Feature Request (New feature or request)

Comments

@AdamMcAdamson
Member

AdamMcAdamson commented Feb 24, 2023

We would like to provide evaluation data as part of our API.

To this effect, we need to:

@kneevin

kneevin commented May 11, 2023

hey! i'd like to take on this scraper.

@AdamMcAdamson
Member Author

That sounds good to me. @hochladen?

Also, this issue should probably be moved to https://github.com/UTDNebula/api-tools.

@jpahm
Contributor

jpahm commented May 11, 2023

I'm open to it; though I can say that this could also be implemented as a part of the existing coursebook scraper (as well as anything else we may need to pull from coursebook)

I'm not entirely opposed to having this as a separate scraper, but I'd say it's a matter of considering if the separation of tasks would be worth the added clutter.

@AdamMcAdamson
Member Author

Right, I had forgotten that we can pull eval data from the coursebook scraper with the speedup.

@kneevin

kneevin commented May 12, 2023

so should i try to refactor the current scraper so it can scrape the eval data as well? or is it ok to have two separate scrapers? i think it'll make more sense to refactor the current scraper; it'll just take me a bit more time to figure it out

@jpahm
Contributor

jpahm commented May 12, 2023

Adding it to the current scraper would probably be best, though I still have to push it since it's still WIP and stored on my PC locally at the moment. It'll probably be a day or two before I do that, since I'm currently in the middle of moving back home.

@jpahm
Contributor

jpahm commented Sep 18, 2023

Yeah so it's fair to say "a day or two" was a vast underestimate of how long this would take to add; regardless, it should be added soon as part of the existing scraper now that other priorities have been taken care of.

@jpahm jpahm self-assigned this Sep 18, 2023
@jpahm jpahm transferred this issue from UTDNebula/nebula-api Sep 27, 2023
@jpahm jpahm added the Type: Feature Request New feature or request label Sep 27, 2023
@jpahm jpahm added the L2 A task suitable for someone who is comfortable helping with implementing features. label Oct 24, 2023
@jpahm jpahm added L3 A task suitable for someone who is comfortable implementing large-scale features/projects. and removed L2 A task suitable for someone who is comfortable helping with implementing features. labels Nov 9, 2023
@jpahm
Contributor

jpahm commented Jan 18, 2024

I've completed the scraper component of this work, but there are some concerns regarding IP ratelimits to be addressed. A data model and associated database changes still need to be completed.

@jpahm
Contributor

jpahm commented Jan 23, 2024

> I've completed the scraper component of this work, but there are some concerns regarding IP ratelimits to be addressed. A data model and associated database changes still need to be completed.

Upon further investigation of this, I'm not seeing any immediate great solutions for the IP ratelimit problem -- this problem also occurs with scraping courses, but in a much more manageable fashion. Scraping evals leads to a long IP ratelimit every 30-40 evals or so, which obviously isn't sustainable for scraping en masse.

One solution, which I proposed to @iamwood, would be to set up an API endpoint for evals that parses and returns specific evals on the fly* rather than parsing them all en masse. I'll discuss this alongside some other things once the semester starts rolling more.

Any thoughts on this issue are welcome!

*Preferably alongside some sort of caching
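
For illustration only, here is a minimal sketch of the on-the-fly idea with caching. Everything in it is an assumption rather than the actual scraper or API code: `EvalCache`, the `fetch` callable (standing in for whatever scrapes and parses a single eval), the `section_id` key, and the one-day TTL are all hypothetical names and values. It shows only that a repeat request for the same eval is served from memory instead of triggering another scrape; it does not deal with the ratelimit itself.

```python
import time
from typing import Callable, Dict, Tuple


class EvalCache:
    """Serve evals on demand, re-scraping only when a cached copy has expired."""

    def __init__(
        self,
        fetch: Callable[[str], dict],            # hypothetical: scrapes and parses one eval
        ttl_seconds: float = 86400.0,            # assumed TTL of one day
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # section_id -> (time the entry was stored, parsed eval)
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, section_id: str) -> dict:
        now = self._clock()
        hit = self._entries.get(section_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                        # fresh cache hit, no scrape
        result = self._fetch(section_id)         # cache miss or expired entry
        self._entries[section_id] = (now, result)
        return result
```

An endpoint handler would call `cache.get(section_id)` once per request, so only the first request for a given eval (or the first after expiry) reaches the upstream site.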

@democat3457
Member

+1 for caching + evals on-the-fly, I think it's a good compromise.

@jpahm
Contributor

jpahm commented Mar 12, 2024

So, after putting together an on-demand scraping endpoint for evals, it seems we are now blocked on this front because evals are locked behind captcha verification. I'm not sure if there's any way to circumvent this, but I'm out of ideas for the time being. As such, I'm putting this issue on hold in favor of prioritizing other tasks.

@jpahm jpahm added the Status: On Hold This has been delayed for a future sprint or release label Mar 12, 2024

4 participants