Is this a new bug?
I have searched the existing issues, and I could not find an existing issue for this bug.
Current Behavior
The Gunicorn worker gets terminated with signal 9 (out of memory), even though I have 29 GB of RAM available. As soon as I send a request to my server hosted on Render, the worker gets killed. Things work fine locally because my machine has good specs. How much should I scale the server, or is there another way to solve this issue?
Expected Behavior
I expected it to work on the server the same way it works locally.
Steps To Reproduce
Try running the SpladeEncoder on Render or Replit.
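For context, this is roughly what my endpoint does, as a minimal sketch (I'm assuming pinecone-text's SpladeEncoder behind a Flask app served by Gunicorn; the route name and payload shape here are illustrative, not my exact code):

# Minimal repro sketch (assumptions: pinecone-text's SpladeEncoder and a
# Flask app served by Gunicorn; route name and payload shape are illustrative).
from flask import Flask, jsonify, request
from pinecone_text.sparse import SpladeEncoder

app = Flask(__name__)
encoder = SpladeEncoder()  # the SPLADE model is loaded into memory here


@app.route("/encode", methods=["POST"])
def encode():
    texts = request.get_json()["texts"]
    # On Render/Replit the worker is killed with signal 9 during this call.
    return jsonify(encoder.encode_documents(texts))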
It will be hard to debug this without more details. My intuition is that on Render/Replit the model is loaded multiple times in different processes, which exhausts the machine's memory. Check whether there is a parallelism factor, set it to 1, and see if the issue still reproduces.
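For example, something like this (a sketch assuming a standard Gunicorn + PyTorch + Transformers setup; adjust it to your app's entry point):

# Sketch: cap thread parallelism before the model is loaded. Put this at the
# top of the module that creates the encoder, before any torch work starts.
import os

os.environ.setdefault("OMP_NUM_THREADS", "1")             # BLAS/OpenMP threads
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")  # HF tokenizers

import torch

torch.set_num_threads(1)          # intra-op parallelism
torch.set_num_interop_threads(1)  # inter-op parallelism

Then start Gunicorn with a single worker so the model is only loaded once, e.g. gunicorn --workers 1 --threads 2 app:app.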
@beamerboyyyy I had the same issue on Kubernetes, and the root cause was that Torch does not release the memory allocated for the tensors. I fixed it this way, and now my ingestion container runs smoothly with around 900 MB of memory allocated during the whole process, without deadly spikes.
def _encode(self, texts: Union[str, List[str]]) -> Union[SparseVector, List[SparseVector]]:
    """
    Args:
        texts: single or list of texts to encode.

    Returns a list of Splade sparse vectors, one for each input text.
    """
    with torch.no_grad():
        inputs = self.tokenizer(
            texts,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=self.max_seq_length,
        ).to(self.device)
        logits = self.model(**inputs).logits
        del inputs  # Explicitly delete the inputs tensor

        inter = torch.log1p(torch.relu(logits))
        token_max = torch.max(inter, dim=1)
        del inter, logits  # Explicitly delete intermediate tensors

        nz_tokens_i, nz_tokens_j = torch.where(token_max.values > 0)

        output = []
        for i in range(token_max.values.shape[0]):
            nz_tokens = nz_tokens_j[nz_tokens_i == i]
            nz_weights = token_max.values[i, nz_tokens]
            output.append({"indices": nz_tokens.cpu().tolist(), "values": nz_weights.cpu().tolist()})

        del token_max, nz_tokens_i, nz_tokens_j  # Explicitly delete tensors

    return output[0] if isinstance(texts, str) else output
Probably not the best solution, but I hope it helps. If someone has better ideas, please share.
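One more thing that can help keep the peak down is encoding in small chunks, so the padded input batch (and therefore peak memory) stays bounded. A minimal sketch (the helper name and chunk size are illustrative, using the public encode_documents API rather than _encode):

from typing import List


def encode_in_chunks(encoder, texts: List[str], chunk_size: int = 32) -> List[dict]:
    # Encode a large list of texts in small chunks so the padded batch tensor
    # never grows to the size of the whole input at once.
    vectors: List[dict] = []
    for start in range(0, len(texts), chunk_size):
        vectors.extend(encoder.encode_documents(texts[start:start + chunk_size]))
    return vectors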