Improve RSS Feed aggregator #361
Comments
I'd like to avoid an additional dependency for the database. I have an idea for a very minimalistic migration system using Python's built-in sqlite3 module. Did you have something else in mind, or can I start some work in a branch and we'll see where it goes? I'm not married to the idea, so I'm open to suggestions :) On the account of possibly missing post dates:
Personally, I'm happy to use what's built into Python, and I'm fine with raw SQL; my main concern is indeed the migrations. As we only have one deploy, and that's unlikely to change, something minimal could work, but I'd be interested to know what it is.
In practice, we keep one directory (let's call it "migrations") that stores Python or SQL files. Each file has a prefix that indicates its order. Note: anything like this isn't going to have the near-magical Django migration experience; it's pretty bare-bones. Anything else would be overkill, I think. Now we have only one problem: how does the instance know which migration to run next (or which migration it is currently at)? SQLite comes to the rescue: I don't mind writing SQL, and we can just run the migrations on startup every time, making the deployment near effortless.
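The mechanism the comment has in mind for tracking the current migration is not shown here. One possible approach, sketched below as an assumption, is SQLite's `PRAGMA user_version`, an integer stored in the database file itself. The function name `run_migrations` and the file naming scheme (`0001_create_feeds.sql`) are hypothetical, and only `.sql` files are handled.

```python
import sqlite3
from pathlib import Path


def run_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> int:
    """Apply pending .sql migrations in prefix order and return the new version."""
    # user_version is an integer SQLite keeps in the database header (0 by default).
    current = conn.execute("PRAGMA user_version").fetchone()[0]

    # Hypothetical naming scheme: "<number>_<description>.sql".
    pending = []
    for path in migrations_dir.glob("*.sql"):
        version = int(path.name.split("_", 1)[0])
        if version > current:
            pending.append((version, path))

    for version, path in sorted(pending):
        # executescript runs every statement contained in the file.
        conn.executescript(path.read_text())
        # PRAGMA statements do not accept bound parameters; version is an int.
        conn.execute(f"PRAGMA user_version = {version:d}")
        conn.commit()
        current = version

    return current
```

Calling `run_migrations(sqlite3.connect("bot.db"), Path("migrations"))` on every startup would then be a no-op once the database is up to date (the database file name is also only an example).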
Seems reasonable. I would hold off for now until I've done some more refactoring though.
We should use a simple SQLite database to store the registered feeds.
We should have a command for adding a feed to a channel. Something like:
/add_feed <channel> <feed_name> <feed_url>
Not sure if the converter is needed. Maybe optional list of choices? But only if we need it for the currently planned feeds.
When the command is run, it grabs the latest entry of that feed and adds it to the database. We store the current datetime so that, in the future, we only grab posts later than this.
Model something like:
I could be wrong but I don't think there's a reason to store any feed items anywhere.
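As a rough illustration of such a model, here is a minimal sketch using Python's built-in `sqlite3`. The table name `feeds`, the column names other than `time_of_latest_post`, the primary key, and the helper `add_feed` are all assumptions made for the example. It stores one row per feed and no feed items.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per (channel, feed); no feed items are stored.
SCHEMA = """
CREATE TABLE IF NOT EXISTS feeds (
    channel_id          INTEGER NOT NULL,
    feed_name           TEXT    NOT NULL,
    feed_url            TEXT    NOT NULL,
    time_of_latest_post TEXT    NOT NULL,  -- ISO 8601 timestamp in UTC
    PRIMARY KEY (channel_id, feed_url)
)
"""


def add_feed(conn, channel_id, feed_name, feed_url, now=None):
    """Register a feed and record the current time as the polling cutoff."""
    now = now or datetime.now(timezone.utc)
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO feeds (channel_id, feed_name, feed_url, time_of_latest_post) "
        "VALUES (?, ?, ?, ?)",
        (channel_id, feed_name, feed_url, now.isoformat()),
    )
    conn.commit()
```

Storing the timestamp as ISO 8601 text in a single time zone keeps string comparison consistent with chronological order, which is one reason this sketch normalises to UTC.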
Then we poll for posts later than time_of_latest_post, stick them in the channel, and update time_of_latest_post.
It's worth noting that the publication date is optional in RSS. Not sure what we can do about this. Maybe we can do something similar to now and check if the posts are in the channel, but I don't see a good way of not splurging every ancient post into the channel. Maybe we should just ignore stuff without a date.
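The polling step, with the "ignore stuff without a date" option, could be sketched as follows. The `Entry` type and the function `select_new_entries` are hypothetical stand-ins for whatever the feed parser returns. The sketch also assumes all datetimes are timezone-aware so they can be compared.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """Hypothetical stand-in for a parsed feed item."""
    title: str
    published: Optional[datetime]  # None when the feed omits the publication date


def select_new_entries(
    entries: List[Entry], time_of_latest_post: datetime
) -> Tuple[List[Entry], datetime]:
    """Return entries newer than the cutoff (oldest first) and the updated cutoff."""
    # Entries without a publication date are ignored entirely.
    dated = [e for e in entries if e.published is not None]
    fresh = sorted(
        (e for e in dated if e.published > time_of_latest_post),
        key=lambda e: e.published,
    )
    # The cutoff only advances when something new was found.
    new_latest = fresh[-1].published if fresh else time_of_latest_post
    return fresh, new_latest
```

The returned cutoff would be written back as time_of_latest_post after the new entries have been posted to the channel.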