This repository was archived by the owner on Dec 14, 2023. It is now read-only.
GSoC 2019 ideas #568
Closed
Description
The student application period has just started, so it's about time for us to come up with some GSoC ideas.
We can reuse some from last year:
- Build a tool to do some cool visualizations
- Create a PostgreSQL-based job queue
  - Probably too hard and not going to happen, but it doesn't hurt to leave it there. Or should we just remove it?
- Implement a method to detect subtopics of a topic
  - A lot of students are asking about this idea, but I'm not sure I'd be the best mentor for it because (simply put) I don't know much about the subject. Dongge, our GSoC 2017 student, did implement subtopics using Louvain, but that work is still unmerged to this day.
- Do your own freehand project
I'd also add easier, low-priority tasks from our side-projects, e.g. the Ultimate Sitemap Parser:
- (easy) Add support for RSS / Atom sitemaps
- (easy) Detect the sitemap when it isn't listed in robots.txt (has some relevant discussion and stats too; I wonder whether Alex (@kienli) is a student!)
- (medium-hard) `yield` found links instead of `return`ing them (simple concept but would probably require rethinking and rewriting a lot of stuff)
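To make the `yield` idea a bit more concrete, here is a minimal sketch of the difference. It does not use the Ultimate Sitemap Parser's actual API: the `SITEMAPS` dictionary, the `example.com` URLs, and both function names are hypothetical stand-ins, and treating any entry ending in `.xml` as a sub-sitemap is a simplifying assumption.

```python
from typing import Dict, Iterator, List

# Hypothetical in-memory stand-in for fetched sitemaps: each sitemap URL maps
# to the entries it lists. Entries ending in ".xml" are treated as sub-sitemaps
# (a simplifying assumption for this sketch).
SITEMAPS: Dict[str, List[str]] = {
    "https://example.com/sitemap_index.xml": [
        "https://example.com/sitemap_news.xml",
        "https://example.com/sitemap_pages.xml",
    ],
    "https://example.com/sitemap_news.xml": [
        "https://example.com/news/1",
        "https://example.com/news/2",
    ],
    "https://example.com/sitemap_pages.xml": [
        "https://example.com/about",
    ],
}


def all_links_returned(sitemap_url: str) -> List[str]:
    """Eager version: builds the full list before returning anything."""
    links: List[str] = []
    for entry in SITEMAPS.get(sitemap_url, []):
        if entry.endswith(".xml"):
            links.extend(all_links_returned(entry))
        else:
            links.append(entry)
    return links


def all_links_yielded(sitemap_url: str) -> Iterator[str]:
    """Lazy version: hands each link to the caller as soon as it is found."""
    for entry in SITEMAPS.get(sitemap_url, []):
        if entry.endswith(".xml"):
            # Delegate to the sub-sitemap without materializing its links.
            yield from all_links_yielded(entry)
        else:
            yield entry
```

With the lazy version, a caller can start processing links (or stop early) without holding the whole site's link list in memory, which is presumably the point for very large sitemaps. The rewrite cost comes from every intermediate layer having to pass the generator through instead of collecting into a list.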
`sentence_splitter` Python module:
- Add more supported languages?