Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: Expand recurring events until end date is reached #48

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

antosan
Copy link
Contributor

@antosan antosan commented May 27, 2021

Fixes #34

Refactored the parser implementation and cleaned up the parser file - we previously expanded all the events from the start date and had to filter them again afterwards (this led to fewer events being returned sometimes). The parser now receives optional startDate and endDate and returns all events after startDate and before endDate if they are provided.

Recurring finite events (events ending after a specified occurrences count or events with an until date) are expanded up-to the endDate if it is provided or expanded until all upcoming events are exhausted if the endDate is not provided.

Infinite events are expanded up-to the endDate if provided and if not, only 10 upcoming events are expanded.

}
if (!endRange && !recur.isFinite() && occurrencesCount >= MAX_INFINITE_OCCURRENCES_COUNT) {
break;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how the break/continue conditions are at the beginning of the for loop 👍

Personally, I would've reached for a while loop here:

while (next = expand.next()) {
  // do something with next
}

but that's just a minor stylistic preference.

Logic-wise I'm not 100% sure if it's a good idea to include the possibility of an almost infinite loop. The problem here is that the library user of scrapegoat has no way to limit the iteration count if bad data is passed to the parser. Personally I'd prefer the library user to specify the upper iteration bound, like maxExpandCount. If they don't care about this issue, they still can pass maxExpandCount: Infinity. I also think that this upper iteration bound should be applied to all events. It also would be a required parameter which forces you to think about this possibility.

Furthermore I think there's an opportunity for improvement here: we could remove all recurring events that start after the specified end date. That would save us some iterations.

So my preferred logic would be:

  • Remove all events that start after endRange (if specified)
  • Then iterate:
    • If max iteration count is reached: break
    • If date is after end date: break
    • If date is before start date: continue

Maybe you can also add an error if startRange is after endRange.

What do you think? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising the almost infinite loop possibility 👍 The only case where I see an almost infinite loop occurring is when an event is finite but has an end date (or count) that is way too far in the future (for infinite events we are already capping the iterations to 10). Are we seeing this the same way?

I like the idea of the library user specifying the upper iteration bound 👍 If I understand you well, I will include maxExpandCount in the config options provided by the user as a required parameter and pass it to all events:

const config = {
    auth: {
        user: BAIKAL_USER,
        pass: BAIKAL_PASSWORD,
    },
    uri: BAIKAL_URI,
    maxExpandCount: 10,
};

Furthermore I think there's an opportunity for improvement here: we could remove all recurring events that start after the specified end date. That would save us some iterations.

This situation does not exist. When we query CalDav with an end date specified, the response we get only includes recurring events that start (or have an occurrence) between the startRange and endRange.

If they don't care about this issue, they still can pass maxExpandCount: Infinity.

Here is where It gets blurry again: (going with the logic you suggested)

CASE 1: startRange and endRange provided and maxExpandCount: Infinity

  • If the endRange is near the startRange, then all is well. However, if the endRange is way into the future, we have the same possibility of an almost infinite loop, only that the user opted into it - this is OK

CASE 2: startRange provided, endRange not provided and maxExpandCount: Infinity

  • What happens here? This will loop indefinitely... (at least until the year 10000 breaks the JS Date 😄) I do not see a use case of maxExpandCount being set to Infinity, it should be a finite number always.
    What do you think?

Maybe you can also add an error if startRange is after endRange.

Right now we silently change the endRange to null if this case happens but I agree with throwing an error 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising the almost infinite loop possibility 👍 The only case where I see an almost infinite loop occurring is when an event is finite but has an end date (or count) that is way too far in the future (for infinite events we are already capping the iterations to 10). Are we seeing this the same way?

Yes, I was referring to this case. I don't know how common that is. But I know that some tools limit the max iteration count or the end date when creating new re-occurring events.

I think I would scope the config option under events:

const config = {
    auth: {
        user: BAIKAL_USER,
        pass: BAIKAL_PASSWORD,
    },
    uri: BAIKAL_URI,
    events: {
		maxExpandCount: 10,
    }
};

This situation does not exist. When we query CalDav with an end date specified, the response we get only includes recurring events that start (or have an occurrence) between the startRange and endRange.

Ok, I didn't know that. Thanks for clarifying. You could add a comment in the code that explains this.

About the logic behind maxExpandCount:

I think maxExpandCount should be a finite number by default and should be applied to all situations, even when there's a start and an end date. Specifying maxExpandCount gives you the guarantee that scrapegoat won't expand events above a given limit, even when the end date is far in the future. Personally I think using Infinity for "Don't apply any limit" is ok, but I'm also ok with using undefined. In both cases, it needs to be explained in the README. I would use a finite number by default because often developers don't think about these edge cases. But I think the number can be really high, like 5.000 (this would include a reoccurring event every day for 10 years).

Regarding your concern when an endDate is missing: I think it's ok to have the possibility for an infinite loop. Since the default for maxExpandCount is a finite number, you need disable this safety net explicitly. If you do that, you're on your own. Again, I think there should be a warning in the README.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for the clarification. This is now implemented and ready for review.

@@ -152,8 +152,6 @@ function parseEvents(xml) {

etag = stripDoubleQuotes(etag);

let eventData = {};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for cleaning this up :)

lib/Calendar.js Outdated
return isNotRecurring || isSameDayOrAfter;
});
});
.then((xml) => xmlParser.parseEvents(xml, start, end));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove this, you need to make sure that non recurring events are still filtered by start and end date.

Apparently this feature doesn't have any tests. Could you add some? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-recurring events are not a problem here. We only needed this in the previous version for recurring events. When we call byTimeTemplate({ start, end }) the returned events are already filtered out by start and end date. If this is not the case by any chance then that would be a problem with CalDav and not scrapegoat. This is already covered in the current tests.

expect(response[0]).to.have.property("data");
});
});
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, more tests 👍

@antosan antosan requested a review from jhnns July 9, 2021 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Recurring events not expanded to full range
2 participants