Topic Reduction #1134
econinomista asked this question in Q&A
-
Dear all,

I have a question regarding reducing the number of topics in BERTopic. I have trained a model on Twitter data and the 100 most frequent topics look very good to me. Now I would like to keep those 100 topics and use them to predict the rest of my data. I do not want to use model.reduce_topics, since the topics it produces are too broad. Is there a way to keep just my top 100 topics and run the predictions based on them, and has anyone already done something like this?

Best, and thanks in advance,
Nikola

-

If you want to keep only the 100 most frequent topics and do nothing else with the other topics, including their respective documents, then it might be worthwhile to use manual topic modeling. You would take the documents and labels of the top 100 most frequent topics and create a separate model that only learns to predict those 100 topics. Alternatively, you could merge all the other topics together and treat them as an outlier class; that way, you would not have to create a separate model.
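As a concrete starting point, here is a minimal sketch of the first suggestion that uses a plain scikit-learn classifier as the "separate model" (BERTopic's manual topic modeling feature would be another way to realize it). It assumes you already have the fitted model `topic_model`, the training documents `docs`, the topic assignments `topics` returned by `fit_transform`, and the unseen documents `new_docs`; those names are placeholders, not part of BERTopic itself. The last two lines sketch the second suggestion, where everything outside the top 100 is mapped to the outlier class instead.

```python
# Sketch only: assumes `docs`, `topics` (from topic_model.fit_transform(docs)),
# `topic_model`, and `new_docs` already exist in your script.
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 1. Identify the 100 most frequent topics, ignoring the -1 outlier topic.
counts = Counter(t for t in topics if t != -1)
top_100 = {topic for topic, _ in counts.most_common(100)}

# 2. Keep only the documents assigned to those topics, together with their labels.
train_docs = [doc for doc, t in zip(docs, topics) if t in top_100]
train_labels = [t for t in topics if t in top_100]

# 3. Train a separate model that can only ever predict one of the 100 topics.
#    TF-IDF + logistic regression is just an illustration; any classifier works.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(train_docs, train_labels)
predicted = clf.predict(new_docs)

# Alternative: keep the original BERTopic model and map every prediction
# outside the top 100 to the outlier class (-1) instead.
bertopic_preds, _ = topic_model.transform(new_docs)
bertopic_preds = [t if t in top_100 else -1 for t in bertopic_preds]
```

Which route is preferable depends on whether you want documents that fall outside your 100 topics to be forced into one of them (the classifier) or simply marked as outliers (the mapping).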