Topic size and deduplication

@ddangelov released this 07 Apr 20:09

Topic size is now defined as the number of document vectors that have the topic as their nearest topic vector. Search by topic has been modified to return only documents that have the topic as their nearest topic, which avoids overlapping results from similar topics.
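The definition above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the function names (`nearest_topic`, `topic_sizes`, `documents_for_topic`) are hypothetical, and cosine similarity is assumed as the measure of "nearest".

```python
import numpy as np


def nearest_topic(doc_vectors, topic_vectors):
    """Return, for each document, the index of its nearest topic vector.

    Assumption: "nearest" means highest cosine similarity.
    """
    docs = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    topics = topic_vectors / np.linalg.norm(topic_vectors, axis=1, keepdims=True)
    # Rows are documents, columns are topics.
    return np.argmax(docs @ topics.T, axis=1)


def topic_sizes(doc_vectors, topic_vectors):
    """Count how many documents have each topic as their nearest topic."""
    assignments = nearest_topic(doc_vectors, topic_vectors)
    return np.bincount(assignments, minlength=len(topic_vectors))


def documents_for_topic(doc_vectors, topic_vectors, topic_num):
    """Return indices of documents whose nearest topic is topic_num."""
    assignments = nearest_topic(doc_vectors, topic_vectors)
    return np.flatnonzero(assignments == topic_num)


# Toy data (made up for illustration): five documents, two topics.
doc_vectors = np.array(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8], [0.8, 0.3]]
)
topic_vectors = np.array([[1.0, 0.0], [0.0, 1.0]])

sizes = topic_sizes(doc_vectors, topic_vectors)
topic_1_docs = documents_for_topic(doc_vectors, topic_vectors, 1)
```

Because each document is assigned to exactly one topic, the sizes sum to the number of documents, and no document appears in the search results of two different topics.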

Topic deduplication has been added to make topics more robust.