Why logistic regression is equivalent to Bradley-Terry model? #3505

Open
VityaVitalich opened this issue Aug 30, 2024 · 2 comments

Comments

@VityaVitalich

Dear maintainers,

Thank you for your valuable arena. I am currently researching how LLMs are evaluated and got stuck on a question about the Bradley-Terry (BT) model.
According to multiple sources (as well as your paper), BT ratings are obtained by maximizing the BT likelihood. In the code, however, a logistic regression is fitted on a kind of "one-hot" matrix, where model_a gets +1 and model_b gets -1, and the target is 1 if model_a wins and 0 if model_b wins. Let's neglect controlling for answer length for simplicity; I still cannot understand why this is equivalent to the BT model.

Could you please explain this or point me to sources where I could find the derivation?

import math

import numpy as np
import pandas as pd


def compute_elo_mle_with_tie(
    df, SCALE=400, BASE=10, INIT_RATING=1000, sample_weight=None
):
    from sklearn.linear_model import LogisticRegression

    ptbl_a_win = pd.pivot_table(
        df[df["winner"] == "model_a"],
        index="model_a",
        columns="model_b",
        aggfunc="size",
        fill_value=0,
    )
    ptbl_tie = pd.pivot_table(
        df[df["winner"].isin(["tie", "tie (bothbad)"])],
        index="model_a",
        columns="model_b",
        aggfunc="size",
        fill_value=0,
    )
    ptbl_tie = ptbl_tie + ptbl_tie.T
    ptbl_b_win = pd.pivot_table(
        df[df["winner"] == "model_b"],
        index="model_a",
        columns="model_b",
        aggfunc="size",
        fill_value=0,
    )
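    # Each win is counted twice and each tie once per side,
    # so a tie acts as half a win for both models.
    ptbl_win = ptbl_a_win * 2 + ptbl_b_win.T * 2 + ptbl_tie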

    models = pd.Series(np.arange(len(ptbl_win.index)), index=ptbl_win.index)

    p = len(models)
    X = np.zeros([p * (p - 1) * 2, p])
    Y = np.zeros(p * (p - 1) * 2)

    cur_row = 0
    sample_weights = []
    for m_a in ptbl_win.index:
        for m_b in ptbl_win.columns:
            if m_a == m_b:
                continue
            # if nan skip
            if math.isnan(ptbl_win.loc[m_a, m_b]) or math.isnan(ptbl_win.loc[m_b, m_a]):
                continue
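            # Row for "m_a beat m_b": its dot product with the coefficients
            # is log(BASE) * (coef[m_a] - coef[m_b]).
            X[cur_row, models[m_a]] = +math.log(BASE)
            X[cur_row, models[m_b]] = -math.log(BASE)
            Y[cur_row] = 1.0
            sample_weights.append(ptbl_win.loc[m_a, m_b])

            # Same row with label 0, weighted by how often m_b beat m_a.
            X[cur_row + 1, models[m_a]] = math.log(BASE)
            X[cur_row + 1, models[m_b]] = -math.log(BASE)
            Y[cur_row + 1] = 0.0
            sample_weights.append(ptbl_win.loc[m_b, m_a])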
            cur_row += 2
    X = X[:cur_row]
    Y = Y[:cur_row]

    lr = LogisticRegression(fit_intercept=False, penalty=None)
    lr.fit(X, Y, sample_weight=sample_weights)
    elo_scores = SCALE * lr.coef_[0] + INIT_RATING
    if "mixtral-8x7b-instruct-v0.1" in models.index:
        elo_scores += 1114 - elo_scores[models["mixtral-8x7b-instruct-v0.1"]]
    return pd.Series(elo_scores, index=models.index).sort_values(ascending=False)
@acylam

acylam commented Sep 2, 2024

I asked a related question earlier about the anchor model and am still waiting for a response: #3377

@cthorrez
Contributor

@VityaVitalich I wrote a blog post which includes an explanation of this. The idea is that if you apply an exponential reparameterization to the Bradley-Terry strength parameters, the win probabilities can be expressed as the sigmoid of the difference in ratings. If you then construct the X matrix so that each row has only two non-zero entries, a 1 and a -1 at the two competitors' indices, the dot product of that row with the parameter vector (the ratings) is simply the difference between the two selected ratings.
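
For readers who want the algebra spelled out, here is a short sketch of that argument. The symbols are my own notation and are not taken from the FastChat code: p_i are the BT strengths, \beta_i their logs, \sigma the logistic sigmoid, and x_k the design-matrix row for match k.

```latex
% Bradley-Terry with positive strengths p_i:
P(i \text{ beats } j) = \frac{p_i}{p_i + p_j}

% Reparameterize p_i = e^{\beta_i}:
P(i \text{ beats } j) = \frac{e^{\beta_i}}{e^{\beta_i} + e^{\beta_j}}
                      = \frac{1}{1 + e^{-(\beta_i - \beta_j)}}
                      = \sigma(\beta_i - \beta_j)

% Row x with +1 at index i, -1 at index j, and zeros elsewhere:
x^\top \beta = \beta_i - \beta_j

% BT log-likelihood over matches k, with y_k = 1 if the "+1" model won and 0 otherwise:
\ell(\beta) = \sum_k \left[ y_k \log \sigma(x_k^\top \beta)
            + (1 - y_k) \log\left(1 - \sigma(x_k^\top \beta)\right) \right]
```

The last line is exactly the log-likelihood of a logistic regression without an intercept, so maximizing either one gives the same \beta (up to a common additive shift). In the code quoted above the entries are ±log(BASE) instead of ±1, so the fitted coefficients r_i satisfy P(i beats j) = 1 / (1 + BASE^{-(r_i - r_j)}), and with R_i = SCALE * r_i + INIT_RATING this becomes the familiar Elo form 1 / (1 + 10^{-(R_i - R_j)/400}) for the default arguments.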

https://www.claytonthorrez.com/blog/posts/fast_llm_ratings/
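
As a numerical sanity check of the equivalence, here is a minimal sketch that fits the same toy data two ways and compares the results. Everything in it is an assumption for illustration: the win counts are made up, and `fit_bt_mm` and `fit_logistic_newton` are hypothetical helpers, not FastChat functions. The logistic regression is fitted with a plain Newton iteration in NumPy rather than with scikit-learn, and model 0 is pinned to zero to remove the shift ambiguity.

```python
import numpy as np

# Hypothetical toy data: wins[i, j] = number of times model i beat model j.
wins = np.array([[0, 7, 9],
                 [3, 0, 6],
                 [1, 4, 0]], dtype=float)


def fit_bt_mm(wins, n_iter=2000):
    """Bradley-Terry MLE via the classic minorization-maximization update."""
    games = wins + wins.T                      # games[i, j] = matches between i and j
    p = np.ones(wins.shape[0])
    for _ in range(n_iter):
        denom = (games / (p[:, None] + p[None, :])).sum(axis=1)
        p = wins.sum(axis=1) / denom
        p /= p.sum()                           # strengths are only defined up to scale
    return p


def fit_logistic_newton(wins, n_iter=25):
    """Intercept-free logistic regression on +1/-1 rows, fitted by Newton's method."""
    n = wins.shape[0]
    rows, y, w = [], [], []
    for i in range(n):
        for j in range(i + 1, n):
            x = np.zeros(n)
            x[i], x[j] = 1.0, -1.0             # x @ beta == beta[i] - beta[j]
            rows += [x, x]
            y += [1.0, 0.0]                    # label 1: i won, label 0: j won
            w += [wins[i, j], wins[j, i]]      # outcome counts as sample weights
    X = np.array(rows)[:, 1:]                  # drop column 0, i.e. fix beta[0] = 0
    y, w = np.array(y), np.array(w)
    beta = np.zeros(n - 1)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - mu))
        hess = X.T @ (X * (w * mu * (1.0 - mu))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return np.concatenate([[0.0], beta])


p = fit_bt_mm(wins)
beta = fit_logistic_newton(wins)

# Compare log-strengths and coefficients after removing the arbitrary shift.
centered_log_p = np.log(p) - np.log(p).mean()
centered_beta = beta - beta.mean()
print(np.allclose(centered_log_p, centered_beta, atol=1e-6))

# Map the coefficients to an Elo-like scale (SCALE=400, BASE=10, INIT_RATING=1000).
elo = 400.0 * centered_beta / np.log(10.0) + 1000.0
print(elo)
```

If the two fits agree, the centered log-strengths from the BT fit and the centered logistic-regression coefficients match to numerical precision, and the BT probability p[i] / (p[i] + p[j]) equals 1 / (1 + 10 ** ((elo[j] - elo[i]) / 400)).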
