Context / Scenario
When ingesting an invalid URL, e.g. ImportWebPageAsync("http://malformed_url"), KM moves the document to the poison queue after several attempts:
Microsoft.KernelMemory.Pipeline.Queue.DevTools.SimpleQueues[0] Message '20250124.114916.8130921.4d6c0b1c4b4d41ff84a0cb26ac27abe8' processing failed with exception, max attempts reached, moving to poison queue.
However, the status reported by GetDocumentStatusAsync does not change. My log message still shows:
Document 416A1AABBD2B38AE93197949C710199DC83695E497F514EFA5097173535AE492 null?:False completed:False empty:False remaining steps:extract, partition, gen_embeddings, save_records ready:False
It would be nice to have an additional field failed in DataPipelineStatus, maybe even with a message field explaining why it failed. Since one most likely wants to delete the failed document, it would also be nice to add an optional flag deleteUponFailure to ImportWebPageAsync (and the other Import* methods).
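Until such a field exists, one possible client-side workaround is to poll with a timeout and treat a document that never completes as failed. The following is only a sketch: IngestionWatchdog and WaitForCompletionAsync are hypothetical names, not part of KM, and the helper deliberately takes a delegate so it does not depend on any particular KM version.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of Kernel Memory.
public static class IngestionWatchdog
{
    // Polls isCompleted until it returns true or the timeout elapses.
    // Returns true if the check succeeded in time, false on timeout.
    public static async Task<bool> WaitForCompletionAsync(
        Func<CancellationToken, Task<bool>> isCompleted,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            if (await isCompleted(cancellationToken).ConfigureAwait(false))
            {
                return true;
            }

            if (elapsed.Elapsed >= timeout)
            {
                return false; // caller decides what to do, e.g. delete the document
            }

            await Task.Delay(pollInterval, cancellationToken).ConfigureAwait(false);
        }
    }
}
```

With KM, the delegate could wrap GetDocumentStatusAsync and check the completed flag of the returned DataPipelineStatus; when the helper returns false, the caller could remove the document (assuming a delete call such as DeleteDocumentAsync is available on the client in use). A timeout cannot distinguish a slow pipeline from a poisoned message, so this only approximates the requested failed field.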
What happened?
The status reports that the URL is still being ingested, although the document is already in the poison queue.
Importance
a fix would make my life easier
Platform, Language, Versions
KernelMemory 0.95
kernelmemory/service created 2025-01-20T15:41:17.539712455Z
C# / .NET 9
Relevant log output
A somewhat related observation: if a URL containing a fragment (e.g. https://microsoft.github.io/kernel-memory/quickstart/start-service#check-openapi-swagger) is ingested, KM throws an error. Maybe KM should disregard URL fragments.
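As a caller-side workaround, the fragment can be removed before the URL is passed to ImportWebPageAsync. A minimal sketch using System.Uri follows; UrlNormalizer and StripFragment are hypothetical names introduced here for illustration, and whether KM should do this internally is a separate question.

```csharp
using System;

// Hypothetical helper, not part of Kernel Memory.
public static class UrlNormalizer
{
    // Returns the absolute URL without its "#fragment" part.
    // Scheme, host, path and query string are preserved.
    public static string StripFragment(string url)
    {
        var uri = new Uri(url, UriKind.Absolute);
        return uri.GetLeftPart(UriPartial.Query);
    }
}
```

For the URL above, StripFragment returns https://microsoft.github.io/kernel-memory/quickstart/start-service. The fragment is only meaningful to the browser and is never sent to the server, so dropping it does not change which page is fetched.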