Hi, I would like to know whether there is a way to retrieve the RM3 model estimated from a given list of documents.
As far as I know, Pyserini only uses RM3 to rerank documents that were first ranked by BM25, and the output is the reranked list of documents.
However, given a list of documents, I would like to get the RM3 model based on these documents. That is, as an output I would like to have something like a dictionary of terms and their probabilities in the created model.
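To make the desired output concrete, here is a minimal sketch in plain Python of the kind of term-to-probability dictionary I mean. It follows the textbook RM3 formulation (a score-weighted mixture of document language models, interpolated with the query model) and is not Anserini's actual code, which may differ in details such as filtering and smoothing. The function name `estimate_rm3` and the parameters `fb_terms` and `original_query_weight` are hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict


def estimate_rm3(query_terms, scored_docs, fb_terms=10, original_query_weight=0.5):
    """Return a dict mapping term -> probability.

    query_terms: list of query tokens.
    scored_docs: list of (tokens, score) pairs, with positive retrieval scores.
    All names and defaults here are illustrative assumptions, not Anserini's API.
    """
    total_score = sum(score for _, score in scored_docs)
    if total_score <= 0:
        raise ValueError("document scores must sum to a positive value")

    # Relevance model: sum over documents of P(term | doc) * normalized doc score.
    relevance = defaultdict(float)
    for tokens, score in scored_docs:
        counts = Counter(tokens)
        doc_len = sum(counts.values())
        if doc_len == 0:
            continue  # skip empty documents
        for term, tf in counts.items():
            relevance[term] += (tf / doc_len) * (score / total_score)

    # Keep the highest-weighted feedback terms and renormalize them to sum to 1.
    top = sorted(relevance.items(), key=lambda kv: kv[1], reverse=True)[:fb_terms]
    norm = sum(weight for _, weight in top)
    if norm == 0:
        raise ValueError("no feedback terms could be estimated")
    feedback = {term: weight / norm for term, weight in top}

    # Maximum-likelihood model of the original query.
    q_counts = Counter(query_terms)
    q_len = sum(q_counts.values())
    query_model = {term: count / q_len for term, count in q_counts.items()}

    # RM3: linear interpolation of the query model and the feedback model.
    alpha = original_query_weight
    return {
        term: alpha * query_model.get(term, 0.0) + (1 - alpha) * feedback.get(term, 0.0)
        for term in set(query_model) | set(feedback)
    }
```

Calling `estimate_rm3(["a"], [(["a", "b", "a"], 2.0), (["b", "c"], 1.0)])` would return a dictionary over the terms `a`, `b`, and `c` whose values sum to 1.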
I've noticed that in Anserini's implementation there exists a private function that returns a model, based on scored documents (exactly what I need), whose signature is:
private FeatureVector estimateRelevanceModel(ScoredDocuments docs, IndexReader reader, boolean tweetsearch, boolean useRf);
The problem is that it's written in Java and I don't know how to use it from Python. I'm aware of pyserini.pyclass.autoclass, though I'm not sure whether it can be used here and, if so, how.
If it helps, my goal is to use this model with another index that I've created over the documents' passages.
I'll be glad to have any kind of help,
Eyal