Reid Beels edited this page Aug 4, 2011 · 1 revision

What kind of duplication will we need to handle? Thoughts on dealing with event duplication.

**Labels:** Phase-Requirements, duplication

What kinds of event duplication will we need to handle?

  • Giving people tools to sort out the incoming events is important.
  • Being able to set a canonical event to cluster duplicates around.
  • Being able to delete pure duplicate content or mistakes.
      • Having the ability to track what has been deleted, in case a deletion was itself a mistake: some sort of versioning.
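
The canonical-event and deletion-tracking ideas above could be modeled roughly as follows. This is only a sketch under assumptions: the names `Event`, `EventStore`, `duplicate_of`, and `deleted_at` are hypothetical and not taken from any existing code, and a timestamped soft delete stands in for "some sort of versioning".

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Event:
    id: int
    title: str
    duplicate_of: Optional[int] = None     # id of the canonical event, if any
    deleted_at: Optional[datetime] = None  # set instead of removing the record


@dataclass
class EventStore:
    events: dict = field(default_factory=dict)

    def add(self, event: Event) -> None:
        self.events[event.id] = event

    def mark_duplicate(self, duplicate_id: int, canonical_id: int) -> None:
        """Cluster a duplicate around the chosen canonical event."""
        canonical = self.events[canonical_id]
        # Follow existing links so duplicates always point at a true canonical.
        while canonical.duplicate_of is not None:
            canonical = self.events[canonical.duplicate_of]
        if canonical.id == duplicate_id:
            raise ValueError("an event cannot be a duplicate of itself")
        self.events[duplicate_id].duplicate_of = canonical.id

    def soft_delete(self, event_id: int) -> None:
        """Hide an event but keep the record so the deletion can be undone."""
        self.events[event_id].deleted_at = datetime.now(timezone.utc)

    def restore(self, event_id: int) -> None:
        self.events[event_id].deleted_at = None

    def visible(self) -> list:
        """Events to show: not deleted and not merged into another event."""
        return [
            e for e in self.events.values()
            if e.deleted_at is None and e.duplicate_of is None
        ]


# Made-up sample data for illustration only.
store = EventStore()
store.add(Event(1, "Ruby Brigade Meetup"))
store.add(Event(2, "Ruby Brigade meetup (from another calendar)"))
store.add(Event(3, "Test event, please ignore"))

store.mark_duplicate(2, 1)  # event 2 now clusters around event 1
store.soft_delete(3)        # hidden, but still recoverable
visible_ids = [e.id for e in store.visible()]
store.restore(3)            # undo the deletion
restored_ids = [e.id for e in store.visible()]
```

Keeping duplicates and deleted events as flagged records, rather than removing rows, is one way to make both kinds of mistake reversible.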

Thoughts on handling event duplication

Bloom filters (http://www.rubyinside.com/bloom-filters-a-powerful-tool-599.html) can be used to help dedupe: effectively, you can make 'fingerprints' of things. -Anselm Hook
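
A minimal sketch of the fingerprint idea, assuming an event is identified by its title, start time, and venue (that choice of fields, and the names `fingerprint` and `BloomFilter`, are assumptions made for illustration). A Bloom filter can report false positives but never false negatives, so a hit should only flag an event as a possible duplicate for someone to review, not delete it automatically.

```python
import hashlib


def fingerprint(title: str, start: str, venue: str) -> str:
    """Normalize case and whitespace, then join the fields into one key."""
    parts = (title, start, venue)
    return "|".join(" ".join(p.lower().split()) for p in parts)


class BloomFilter:
    """Fixed-size bit array probed at several hash positions per item."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive each position by salting SHA-256 with the probe index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Made-up sample events for illustration only.
seen = BloomFilter()
seen.add(fingerprint("Ruby Brigade Meetup", "2011-08-04T19:00", "Example Hall"))

# The same event scraped from another source, with messier formatting.
incoming = fingerprint("  ruby brigade  MEETUP ", "2011-08-04T19:00", "example hall")
possible_duplicate = incoming in seen
```

Exact fingerprints like this only catch duplicates that normalize to the same key. Events whose titles or times differ slightly between sources would need fuzzier matching.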

Perhaps the problem can be somewhat ameliorated by not scraping events from calendars that are merely secondary sources, that is, calendars that contain only (or almost only) events that appear elsewhere on the web.