GSoC 2019 ideas

The student application period has just started, so it's about time we came up with some GSoC ideas.

We can reuse some from [last year](https://cyber.harvard.edu/gsoc/MediaCloud):

* **Build a tool to do some cool visualizations**
* **Create PostgreSQL-based job queue**
    * Probably too hard and not going to happen, but it doesn't hurt to leave it there. Or should we just remove it?
* **Implement a method to detect subtopics of a topic**
    * A lot of students are asking about this idea, but I'm not sure I'd be the best mentor for it because (simply put) I don't know much about the subject. Dongge, our GSoC 2017 student, did implement subtopic detection using Louvain, but it's still unmerged to this day.
* **Do your own freehand project**

I'd also add easier, low-priority tasks from our side-projects, e.g. the [Ultimate Sitemap Parser](https://github.com/berkmancenter/mediacloud-ultimate_sitemap_parser/issues):

* *(easy)* [Add support for RSS / Atom sitemaps](https://github.com/berkmancenter/mediacloud-ultimate_sitemap_parser/issues/3)
* *(easy)* [Detection of sitemap if it's not present in robots.txt](https://github.com/berkmancenter/mediacloud-ultimate_sitemap_parser/issues/8) (has some relevant discussion and stats too; I wonder whether Alex (@kienli) is a student!)
* *(medium-hard)* [`yield` found links instead of `return`ing them](https://github.com/berkmancenter/mediacloud-ultimate_sitemap_parser/issues/2) (simple concept but would probably require rethinking and rewriting a lot of stuff)
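
To make the `yield` vs. `return` idea a bit more concrete, here's a rough sketch of the difference. It is not the parser's actual code: `SITEMAPS`, `links_as_list()` and `links_as_generator()` are made-up names, and the dictionary only stands in for sitemaps that would really be fetched and parsed.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for already-parsed sitemaps (sitemap URL -> page URLs);
# the real parser would fetch and parse these over HTTP.
SITEMAPS: Dict[str, List[str]] = {
    "https://example.com/sitemap_1.xml": [
        "https://example.com/a",
        "https://example.com/b",
    ],
    "https://example.com/sitemap_2.xml": [
        "https://example.com/c",
    ],
}


def links_as_list(sitemap_urls: Iterable[str]) -> List[str]:
    """Eager version: collect every link in memory, then return them all."""
    links: List[str] = []
    for sitemap_url in sitemap_urls:
        links.extend(SITEMAPS[sitemap_url])
    return links


def links_as_generator(sitemap_urls: Iterable[str]) -> Iterator[str]:
    """Lazy version: hand each link to the caller as soon as it is found."""
    for sitemap_url in sitemap_urls:
        yield from SITEMAPS[sitemap_url]


if __name__ == "__main__":
    # The caller can start processing links before all sitemaps are read.
    for link in links_as_generator(SITEMAPS):
        print(link)
```

Both versions produce the same links in the same order. The lazy one just avoids holding the full list in memory, which presumably matters for sites with huge sitemaps.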

[`sentence_splitter` Python module](https://github.com/berkmancenter/mediacloud-sentence-splitter):

* Add more supported languages?

[`feed_seeker`](https://github.com/mitmedialab/feed_seeker):

* Fix https://github.com/mitmedialab/feed_seeker/issues/2 and https://github.com/mitmedialab/feed_seeker/issues/3?

GSoC 2019 ideas #568

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

GSoC 2019 ideas #568

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions