From 5a68995f3657790fee72274b96c135b6f1115ec1 Mon Sep 17 00:00:00 2001
From: Anisa Hawes <87070441+anisa-hawes@users.noreply.github.com>
Date: Fri, 29 Sep 2023 12:06:40 +0100
Subject: [PATCH] Update clustering-visualizing-word-embeddings.md

Remove duplicate title.
---
 en/lessons/clustering-visualizing-word-embeddings.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/en/lessons/clustering-visualizing-word-embeddings.md b/en/lessons/clustering-visualizing-word-embeddings.md
index 57db5af341..186336c4af 100644
--- a/en/lessons/clustering-visualizing-word-embeddings.md
+++ b/en/lessons/clustering-visualizing-word-embeddings.md
@@ -26,8 +26,6 @@ doi: 10.46430/phen0111
 
 {% include toc.html %}
 
-# Clustering and Visualising Documents using Word Embeddings
-
 ## Introduction
 
 As corpora are increasingly 'born digital' on hard drives as well as web and email servers, we are moving from being able to select or group documents using keyword or manual searches to needing to be able to automate this task at scale. Moreover, large-ish, unlabelled corpora of thousands or tens-of-thousands of documents are not particularly well-suited to topic modelling or TF/IDF analysis either. Since we don't have a sense of what kinds of groups might exist, what kinds of topics might be covered, or what level of distinctiveness in vocabulary might matter, we need different, more flexible ways to visualise and extract structure from texts.