-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LC-395: Address error handling when SolrIndexer sends batch of documents to multiple collections #73
base: main
Are you sure you want to change the base?
Conversation
…ultiple collections and an error occurs from one or more of the targeted collections.
@@ -22,6 +22,7 @@ | |||
import org.apache.solr.common.SolrInputDocument; | |||
import org.apache.solr.common.util.NamedList; | |||
import org.apache.solr.common.util.SimpleOrderedMap; | |||
import org.eclipse.jetty.util.MultiException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
org.eclipse.jetty:jetty-util is a transitive dependency of solrj...
Should this be referenced explicitly in the lucille-core pom, or just leave it as a transitive dependency?
Seems like we might want to move the SolrIndexer to its own module at some point, in which case we'd have to deal with transitive dependencies like this that we're relying on without declaring... Thoughts on a general policy for this kind of situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right, we should declare dependencies that we directly use explicitly and not rely on transitive dependencies from a 3rd party. That said I actually think we can just have our own exception for this (which might be refactored into a batch response / batchlet given other work).
doc2.setField("foo", "baz"); | ||
doc2.setField("collection", "this-collection-does-not-exist"); | ||
|
||
// Send a second document to the test collection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// send a second document to a different collection that doesn't exist
doc3.setField("foo", "bar"); | ||
doc3.setField("collection", "this-collection-also-does-not-exist"); | ||
|
||
Throwable exception = assertThrows(SolrException.class, () -> indexer.sendToIndex(List.of(doc1, doc2))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how come we are sending doc1 and doc2 but NOT doc3 here?
|
||
//run the same test again with docs targeting both collections that do not exist with doc1 last. | ||
|
||
exception = assertThrows(MultiException.class, () -> indexer.sendToIndex(List.of(doc3, doc2, doc1))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would there be any added testing value in putting doc1 in the middle?
do we have a ticket for this PR? |
@@ -110,6 +111,10 @@ public void closeConnection() { | |||
|
|||
@Override | |||
protected void sendToIndex(List<Document> documents) throws Exception { | |||
//when sending updates/deletes to separate collections we do not want an error sending to a particular collection to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super-minor, but whitespace after the // would be in keeping with the convention
sendAddUpdateBatch( | ||
collection, solrDocRequestsByCollection.get(collection).getAddUpdateDocs()); | ||
sendDeletionBatch(collection, solrDocRequestsByCollection.get(collection)); | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, this PR has a bigger consequence than just making sure that failures against one collection don't prevent us from sending to different collections...
it also means that if we have an certain doc ID and there's an update followed by a delete, we might get a failure from the update but still go on to try the delete. Likewise, if there's a delete followed by an update, we might get a failure from the delete, but still go on to try the update. Thinking about this, maybe this new behavior is fine. The important thing is that we never swap an update and a delete, and that's not happening here. But, I wanted to leave this comment just to bring attention to the change -- is this something you considered and also concluded was OK? A code comment recording our conclusion might be helpful for the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not consider this until you mentioned it. However, I do think we are doing the "least worse" thing here. I think we have to consider this situation in any retry behavior. We need to make sure that any indexer retry behavior that we implement does not mess with the order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
I created LC-395 for this issue because there was not a ticket. @kyle-kmw could you please add LC-395 to the PR title? Thanks. |
The SolrIndexer when configured to use a collection field will send documents to multiple collections as determined by the collection field on each document. If one of the collections is down, or doesn't exist, or some other error occurs we do not want this to prevent the other documents in the batch to the collections that they are routed to. This PR defers the throwing of any exceptions from calling sendAddUpdateBatch or sendDeletionBatch by collecting them in a MultiException and throwing the error(s) collectively at the end of batch processing.
This PR does not address false reporting of failures. When an error occurs while processing a batch it is still the case that every document in the batch will be marked as a failure.