Dear Jigsaw team,

If you are concerned about hallucinations in the topic categorization, I have just summarized various ideas we have at polis for reducing them by exploiting compute at test time with multi-sampling and/or semantic entropy: compdemocracy/polis#1880 .
You are already exploiting the stochasticity of the LLM with the rejection/retry scheme when the LLM gives an empty answer, in sensemaking-tools/src/tasks/categorization.ts (lines 54 to 89 at commit 9c86f3b), which retries with:

`Expected all ${uncategorizedCommentsForModel.length} comments to be categorized, but ${uncategorized.length} are not categorized properly. Retrying in ${RETRY_DELAY_MS/1000} seconds...`
Multi-sampling / semantic entropy would go one step further by systematically calling the LLM multiple times, not just upon failure [the two ideas can also be nested]. You would have a for loop running not to `MAX_RETRIES` but to `N_SAMPLES` (say, 5 or 10, ideally parallelized), followed by an aggregation step; see the sketch below.
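Here is a minimal sketch of what that could look like. Everything here is an assumption for illustration, not the actual sensemaking-tools API: `categorizeComments` stands in for the real single-shot categorization call, and the aggregation shown is a simple majority vote.

```ts
const N_SAMPLES = 5;

interface Comment {
  id: string;
  text: string;
}

// Hypothetical single-shot call: returns, for each comment ID, the topic
// the model assigned in this one sample. Stands in for the real
// categorization call in categorization.ts.
declare function categorizeComments(
  comments: Comment[]
): Promise<Map<string, string>>;

async function categorizeWithMultiSampling(
  comments: Comment[]
): Promise<Map<string, string>> {
  // Draw N_SAMPLES independent categorizations in parallel, exploiting
  // the stochasticity of the LLM instead of only retrying on failure.
  const samples = await Promise.all(
    Array.from({ length: N_SAMPLES }, () => categorizeComments(comments))
  );

  // Aggregation step: for each comment, keep the topic assigned most
  // often across the samples (majority vote).
  const result = new Map<string, string>();
  for (const comment of comments) {
    const counts = new Map<string, number>();
    for (const sample of samples) {
      const topic = sample.get(comment.id);
      if (topic !== undefined) {
        counts.set(topic, (counts.get(topic) ?? 0) + 1);
      }
    }
    if (counts.size === 0) continue; // no sample categorized this comment
    const [bestTopic] = [...counts.entries()].reduce((a, b) =>
      b[1] > a[1] ? b : a
    );
    result.set(comment.id, bestTopic);
  }
  return result;
}
```

The vote counts also give a free confidence signal: a comment whose topic wins, say, 5 of 5 samples is much safer than one that wins 2 of 5, and the latter could be flagged or re-sampled.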
Yes, it will take longer, thus a trade-off with #11 :) but hopefully the parallelization will more than compensate. Cost-wise, it multiplies the number of LLM calls over all the comments, and thus the total cost, by `N_SAMPLES`; whether that is an issue depends on the current typical cost, and on whether the topics are already "good enough".
Follow-up: if you're interested in seeing how to use similar test-time semantic entropy for summarization, I've written up how that could work here: compdemocracy/polis#1881 . It's a bit more complicated, but the core idea is the same; see the sketch below.
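For intuition, here is a minimal sketch of the semantic-entropy idea applied to free-text outputs such as summaries. Everything here is an assumption for illustration: `sampleSummary` stands in for whatever generates one summary, and `areSemanticallyEquivalent` stands in for an equivalence judge (e.g., an NLI model or another LLM call); the linked issue discusses the details.

```ts
// Hypothetical generator and judge, standing in for real LLM calls.
declare function sampleSummary(): Promise<string>;
declare function areSemanticallyEquivalent(
  a: string,
  b: string
): Promise<boolean>;

async function semanticEntropy(nSamples: number): Promise<number> {
  const summaries = await Promise.all(
    Array.from({ length: nSamples }, () => sampleSummary())
  );

  // Greedily cluster samples by semantic equivalence: two summaries land
  // in the same cluster if the judge deems them equivalent in meaning.
  const clusters: string[][] = [];
  for (const summary of summaries) {
    let placed = false;
    for (const cluster of clusters) {
      if (await areSemanticallyEquivalent(summary, cluster[0])) {
        cluster.push(summary);
        placed = true;
        break;
      }
    }
    if (!placed) clusters.push([summary]);
  }

  // Entropy over the cluster-size distribution: low entropy means the
  // samples agree on one meaning; high entropy signals likely hallucination.
  return clusters.reduce((h, cluster) => {
    const p = cluster.length / summaries.length;
    return h - p * Math.log2(p);
  }, 0);
}
```

A caller could then return a representative of the largest cluster when the entropy is low, and flag or regenerate when it is high.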
jucor changed the title from *Using multi-sample/semantic-entropy for topic categorization* to *Using multi-sample/semantic-entropy for topic categorization (and for summarization?)* on Jan 22, 2025.