Constructing and retrieving RM3 #1667

eyalelani · 2023-10-03T10:56:51Z

Hi, I would like to know whether it is possible to retrieve the RM3 model itself, estimated from a given list of documents.
As far as I know, Pyserini uses RM3 only to rerank documents that were first ranked by BM25, and the output is just the reranked list of documents.

However, given a list of documents, I would like to obtain the RM3 model estimated from them. That is, the output should be something like a dictionary mapping terms to their probabilities in the estimated model.
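
To make the desired output concrete, here is a minimal sketch in plain Python of a term-to-probability relevance model built from a list of scored documents. It is a simplified, textbook-style RM1/RM3 estimate and not Anserini's implementation: the function names, the `num_terms` and `original_query_weight` parameters, and the input format (pre-tokenized documents paired with retrieval scores) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_relevance_model(docs, num_terms=10):
    """Estimate a simplified RM1 model.

    docs: list of (tokens, score) pairs, where tokens is a list of
          analyzed terms and score is a non-negative retrieval score.
          (Hypothetical input format, not a Pyserini data structure.)
    Returns a dict {term: probability} over the top `num_terms` terms.
    """
    total_score = sum(score for _, score in docs)
    if total_score <= 0:
        raise ValueError("document scores must sum to a positive value")

    model = defaultdict(float)
    for tokens, score in docs:
        if not tokens:
            continue  # an empty document contributes nothing
        doc_len = len(tokens)
        doc_weight = score / total_score  # normalized score as P(d | q)
        for term, tf in Counter(tokens).items():
            # P(t | d) is the maximum-likelihood estimate tf / |d|
            model[term] += doc_weight * tf / doc_len

    # Keep the highest-weighted terms and renormalize to sum to 1.
    top = sorted(model.items(), key=lambda kv: (-kv[1], kv[0]))[:num_terms]
    norm = sum(p for _, p in top)
    return {term: p / norm for term, p in top}


def interpolate_with_query(query_tokens, relevance_model,
                           original_query_weight=0.5):
    """Mix the query's own term distribution with the RM1 model (RM3)."""
    q_counts = Counter(query_tokens)
    q_len = len(query_tokens)
    terms = set(q_counts) | set(relevance_model)
    return {
        t: original_query_weight * q_counts[t] / q_len
        + (1.0 - original_query_weight) * relevance_model.get(t, 0.0)
        for t in terms
    }


if __name__ == "__main__":
    scored_docs = [(["a", "b", "a"], 2.0), (["b", "c"], 1.0)]
    rm1 = estimate_relevance_model(scored_docs)
    rm3 = interpolate_with_query(["a"], rm1)
    print(rm1)
    print(rm3)
```

In this sketch the tokens would have to come from the index or an analyzer; how to obtain them, and how closely the weighting matches what Anserini computes internally, is exactly the open part of the question.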

I've noticed that Anserini's implementation has a private method that returns a model estimated from scored documents (exactly what I need). Its signature is:
private FeatureVector estimateRelevanceModel(ScoredDocuments docs, IndexReader reader, boolean tweetsearch, boolean useRf);
The problem is that it is written in Java, and I don't know how to call it from Python. I'm aware of pyserini.pyclass.autoclass, but I'm not sure whether it can be used here and, if so, how.

In case it helps, my goal is to use this model with another index that I've created over the documents' passages.

I'd appreciate any help,
Eyal
