Default mail if nothing new since last newsletter #12
base: main
max: 5,
minimumDate : new Date('2025-02-01'),
nit-pick:
- minimumDate : new Date('2025-02-01'),
+ minDate : new Date('2025-02-01'),
};

async function scrapeDate($: cheerio.CheerioAPI) {
// Extract the latest dazte of publication
- // Extract the latest dazte of publication
+ // Extract the latest date of publication
@@ -15,5 +16,77 @@ export const scrape = async ({
const $ = cheerio.load(await response.text());
const dom = new JSDOM($.html());
const article = new Readability(dom.window.document).parse();
return article?.textContent.substring(0, maxContentSize);
const publicationDate = await scrapeDate($);
thought: I'm afraid this way of getting the article's publication date is unreliable. For example, targeting HTML elements with selectors such as .publication-date or #pub-date is very specific and will only work on a few pages!
How can we be sure that the page contains a publication date, and that we pick the right one if it contains several?
IMO we can find a better way to handle this feature (using the database, for instance).
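To illustrate the two questions above (is there a date at all, and which one wins when there are several), here is a minimal sketch. It is not the PR's code and not the database approach suggested in the comment. It assumes the candidate date strings have already been collected from the page, and the `DateSource` labels, their ranking, and `pickPublicationDate` are all hypothetical names chosen for this example.

```typescript
// Where a candidate date string was found, ordered from most to least trusted.
// This ranking is an assumption for illustration, not an established standard.
type DateSource = "json-ld" | "meta" | "time-element";

interface DateCandidate {
  source: DateSource;
  value: string; // raw string as found in the page
}

const SOURCE_PRIORITY: DateSource[] = ["json-ld", "meta", "time-element"];

// Return the date from the most trusted source that parses,
// or null when no candidate yields a valid date.
function pickPublicationDate(candidates: DateCandidate[]): Date | null {
  for (const source of SOURCE_PRIORITY) {
    for (const candidate of candidates) {
      if (candidate.source !== source) continue;
      const parsed = new Date(candidate.value);
      if (!Number.isNaN(parsed.getTime())) return parsed;
    }
  }
  return null;
}
```

Returning null instead of guessing lets the caller decide what to do with articles whose date is unknown.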
Problem
When nothing new has been published since the last newsletter, the user should receive an email saying there is no news this time.
Solution
The agent checks the publication date of each piece of news and keeps only those published since the last newsletter.
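The filtering step and the default mail could look roughly like the sketch below. The `Article` shape, the function names, and the wording of the default message are assumptions made for this example, and "since" is interpreted here as "on or after `minimumDate`".

```typescript
// Hypothetical shape of a scraped article; field names are assumptions.
interface Article {
  title: string;
  url: string;
  publicationDate: Date | null; // null when no date could be determined
}

// Keep only articles published on or after the reference date.
// Articles with an unknown date are dropped rather than guessed.
function filterRecentArticles(articles: Article[], minimumDate: Date): Article[] {
  return articles.filter(
    (a) =>
      a.publicationDate !== null &&
      a.publicationDate.getTime() >= minimumDate.getTime(),
  );
}

// Build the email body: a default message when nothing is new,
// otherwise one line per recent article.
function buildNewsletterBody(articles: Article[], minimumDate: Date): string {
  const recent = filterRecentArticles(articles, minimumDate);
  if (recent.length === 0) {
    return "There is no news since the last newsletter.";
  }
  return recent.map((a) => `- ${a.title} (${a.url})`).join("\n");
}
```

Dropping articles with an unknown date is a design choice: it avoids resending old news, at the cost of possibly missing an undated recent article.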
How To Test
Go to newsletterFormat.ts and modify the links, interests or minimumDate of the articles.