
Speed-up calls to LLM by parallelization of the topic categorization #11

Open
jucor opened this issue Jan 21, 2025 · 1 comment

jucor (Collaborator) commented Jan 21, 2025

Dear Jigsaw team

As discussed by email, it would be really helpful if the library could run faster. The topic learning is already very fast; it is the categorization step that could do with being faster. In our in-person conversations I remember you mentioned that you definitely had this in mind, so I'm just filing it here as a follow-up.

When looking at the categorization code, I suspect you were probably thinking of parallelizing the calls across the mini-batches, i.e. this loop:

```ts
for (
  let i = 0;
  i < comments.length;
  i += this.modelSettings.defaultModel.categorizationBatchSize
) {
  const uncategorizedBatch = comments.slice(
    i,
    i + this.modelSettings.defaultModel.categorizationBatchSize
  );
  const categorizedBatch = await categorizeWithRetry(
    this.modelSettings.defaultModel,
    instructions,
    uncategorizedBatch,
    includeSubtopics,
    topics,
    additionalInstructions
  );
  categorized.push(...categorizedBatch);
}
```

Parallelizing this loop seems like the highest-leverage option: the least amount of work needed for the maximum return.
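
For illustration, here is a minimal sketch of what that could look like, built on the same variables as the loop above (`comments`, `topics`, `instructions`, etc.); everything beyond those is an assumption rather than existing library code:

```ts
// Sketch: start one categorizeWithRetry call per mini-batch, then await them together.
// Uses the same surrounding context as the sequential loop above.
const batchSize = this.modelSettings.defaultModel.categorizationBatchSize;
const batchPromises: Array<ReturnType<typeof categorizeWithRetry>> = [];

for (let i = 0; i < comments.length; i += batchSize) {
  const uncategorizedBatch = comments.slice(i, i + batchSize);
  batchPromises.push(
    categorizeWithRetry(
      this.modelSettings.defaultModel,
      instructions,
      uncategorizedBatch,
      includeSubtopics,
      topics,
      additionalInstructions
    )
  );
}

// All batches are awaited at once; results still come back in batch order.
const categorizedBatches = await Promise.all(batchPromises);
const categorized = categorizedBatches.flat();
```

The obvious caveat is that this fires every request at once, which is exactly where the throttling question below comes in.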

Of course, as you also pointed out, there's the question of whether Vertex will throttle the requests. Does Vertex offer an async caller that automatically respects its throttling limits? That would be neat :)
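
If nothing like that exists, a small client-side cap on in-flight requests would probably do. A minimal sketch, purely as an assumption about how it could be done (nothing here is existing library or Vertex API surface):

```ts
// Sketch: run promise-returning tasks with at most `maxConcurrent` in flight,
// so quota limits can be respected without fully serializing the batches.
async function runWithConcurrencyLimit<T>(
  tasks: Array<() => Promise<T>>,
  maxConcurrent: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unclaimed task and awaits it.
  async function worker(): Promise<void> {
    while (nextIndex < tasks.length) {
      const current = nextIndex++;
      results[current] = await tasks[current]();
    }
  }

  const workerCount = Math.min(maxConcurrent, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each mini-batch call would then be wrapped as a task, e.g. `() => categorizeWithRetry(...)`, with `maxConcurrent` chosen to stay under the Vertex quota.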

Thanks!


dborkan commented Jan 22, 2025

@jucor thanks for sharing this and pointing out the relevant code locations. We're definitely interested in speeding this up, and are looking to implement parallelization, likely after our current sprint tackling hallucinations.

There are Vertex quota limits, so we'll need to do some rate limiting ourselves or find a helper library. We did recently create a helper function resolvePromisesInParallel for summarization; this could be a good starting point for categorization.
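
For reference, a rough sketch of how the categorization batches might plug into such a helper; the actual signature of resolvePromisesInParallel may well differ, so treat the assumed one (an array of promises resolved with bounded parallelism, results in order) as hypothetical:

```ts
// Sketch only: assumes resolvePromisesInParallel takes an array of promises and
// resolves them with a bounded number in flight, returning results in order.
const batchSize = this.modelSettings.defaultModel.categorizationBatchSize;
const batches: (typeof comments)[] = [];
for (let i = 0; i < comments.length; i += batchSize) {
  batches.push(comments.slice(i, i + batchSize));
}

const categorizedBatches = await resolvePromisesInParallel(
  batches.map((uncategorizedBatch) =>
    categorizeWithRetry(
      this.modelSettings.defaultModel,
      instructions,
      uncategorizedBatch,
      includeSubtopics,
      topics,
      additionalInstructions
    )
  )
);
const categorized = categorizedBatches.flat();
```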

Labels: None yet
Projects: None yet
Development: No branches or pull requests

2 participants