diff --git a/en/lessons/clustering-visualizing-word-embeddings.md b/en/lessons/clustering-visualizing-word-embeddings.md
index 57db5af341..186336c4af 100644
--- a/en/lessons/clustering-visualizing-word-embeddings.md
+++ b/en/lessons/clustering-visualizing-word-embeddings.md
@@ -26,8 +26,6 @@ doi: 10.46430/phen0111
 
 {% include toc.html %}
 
-# Clustering and Visualising Documents using Word Embeddings
-
 ## Introduction
 
 As corpora are increasingly 'born digital' on hard drives as well as web and email servers, we are moving from being able to select or group documents using keyword or manual searches to needing to be able to automate this task at scale. Moreover, large-ish, unlabelled corpora of thousands or tens-of-thousands of documents are not particularly well-suited to topic modelling or TF/IDF analysis either. Since we don't have a sense of what kinds of groups might exist, what kinds of topics might be covered, or what level of distinctiveness in vocabulary might matter, we need different, more flexible ways to visualise and extract structure from texts.