Wrong TM score? Order matters? #6

ekiefl · 2022-02-02T04:21:06Z

Hello, looks like some well-written and organized code in this project--thanks for making it.

I'm noticing different results depending on the order in which my filepaths are passed to TMscoring. Usually the results are symmetric, but in a batch of around 300 unique comparisons, about 30 exhibit this behavior. Here is one such example. I've attached the files as *.txt so GitHub allows me to upload them, but you should rename them to *.pdb to replicate this example.

s1.txt
s2.txt

import tmscoring
print(tmscoring.__version__)

aln = tmscoring.TMscoring('s1.pdb', 's2.pdb')
aln.optimise()
TM_score = aln.tmscore(**aln.get_current_values())
print(TM_score)

aln = tmscoring.TMscoring('s2.pdb', 's1.pdb') # reverse order
aln.optimise()
TM_score = aln.tmscore(**aln.get_current_values())
print(TM_score)

This yields

>>> 0.3
>>> 0.038127712872195546
>>> 0.8497114209938352

When I pass these files to https://zhanggroup.org/TM-score/ the result is 0.84971.... In every case where the TM score is non-symmetric, the web server yields the larger result, which by eye, appears to be the correct choice.

s1.pdb differs from s2.pdb in that it was generated from MODELLER, which does not have hydrogen atoms, whereas s2.pdb is generated from AlphaFold, which does. Take from that what you will :\

Dapid · 2023-03-03T12:01:13Z

Sorry for the slow reply. Unfortunately I don't have much time to work on this anymore.

It is puzzling indeed. Optimising for RMSD instead does yield a symmetric result. Maybe iminuit is having a hiccup there?

Hydrogens shouldn't matter, since we are just considering the alpha carbons.
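
As a quick sanity check on that, here is a minimal sketch that pulls the alpha-carbon coordinates out of a PDB file using the fixed-column ATOM record layout. It is a standalone illustration, not the parser tmscoring itself uses, and `ca_coordinates` is a hypothetical helper name. Hydrogen records are skipped simply because their atom name is not CA.

```python
from typing import Iterable, List, Tuple

Coord = Tuple[float, float, float]


def ca_coordinates(pdb_lines: Iterable[str]) -> List[Coord]:
    """Return (x, y, z) for every alpha carbon found in ATOM records."""
    coords: List[Coord] = []
    for line in pdb_lines:
        # PDB is fixed-width: columns 1-6 hold the record type and
        # columns 13-16 hold the atom name. HETATM records (e.g. calcium
        # ions, also named "CA") are excluded by the ATOM check.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])  # columns 31-38
            y = float(line[38:46])  # columns 39-46
            z = float(line[46:54])  # columns 47-54
            coords.append((x, y, z))
    return coords
```

Calling it on both files, for example `len(ca_coordinates(open("s1.pdb")))`, and comparing the counts would show whether the two structures expose the same number of alpha carbons regardless of hydrogens.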

ardhe-qb · 2023-11-10T14:57:04Z

Hi @Dapid,

Have there been any developments in this repo? Is there another (possibly better maintained) Python library that allows us to calculate the TM score?

ekiefl · 2023-11-10T21:09:55Z

@ardhe-qb I haven't run this code, but based on my original bug report, I do suspect that calculating TM in both directions and taking the max yields the correct result. So you could install tmscoring and then somewhere in your project create the following wrapper:

import tmscoring
from pathlib import Path
from typing import Union

Pathish = Union[Path, str]

def get_tmscore(path1: Pathish, path2: Pathish) -> float:
    aln1 = tmscoring.TMscoring(path1, path2)
    aln2 = tmscoring.TMscoring(path2, path1)

    aln1.optimise()
    aln2.optimise()
    
    return max(
        aln1.tmscore(**aln1.get_current_values()),
        aln2.tmscore(**aln2.get_current_values()),
    )

Dapid · 2023-11-10T21:17:48Z

> Hi @Dapid,
>
> Have there been any developments in this repo? Is there another (possibly better maintained) Python library that allows us to calculate the TM score?

Sorry, I am not in the field anymore, so I don't have the time to work on the code. I am also not aware of any other implementation (that is why I created this one).

If someone wants to modernise it, I could review it, or even pass it on. The codebase is fairly short.

Dapid · 2023-11-10T21:19:14Z

> @ardhe-qb I haven't run this code, but based on my original bug report, I do suspect that calculating TM in both directions and taking the max yields the correct result. So you could install tmscoring and then somewhere in your project create the following wrapper:

Interesting. Is that because the normalisation is different, or is the optimiser not converging to the same point?
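
For reference, the normalisation side of that question can be sketched from the published TM-score definition, TM = (1/L) * sum over aligned residues of 1 / (1 + (d_i/d0)^2), with d0 = 1.24 * (L - 15)^(1/3) - 1.8. The helpers `d0` and `tm_score` below are illustrative and written from that formula. They are not functions from tmscoring, and they assume the per-residue distances after superposition are already known.

```python
from typing import Sequence


def d0(l_norm: int) -> float:
    """Distance scale from the TM-score definition (assumes l_norm > 15)."""
    if l_norm <= 15:
        raise ValueError("this sketch only covers chains longer than 15 residues")
    return 1.24 * (l_norm - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances: Sequence[float], l_norm: int) -> float:
    """TM-score for given aligned-pair distances, normalised by l_norm."""
    scale = d0(l_norm)
    # Each aligned pair contributes between 0 and 1; the sum is divided
    # by the normalisation length, not by the number of aligned pairs.
    return sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances) / l_norm
```

With the same distances, changing `l_norm` rescales the score roughly in proportion to the length ratio. If the two chains are of similar length, normalisation alone would probably not explain a gap as large as 0.038 versus 0.85.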

ekiefl · 2023-11-10T21:31:53Z

>> @ardhe-qb I haven't run this code, but based on my original bug report, I do suspect that calculating TM in both directions and taking the max yields the correct result. So you could install tmscoring and then somewhere in your project create the following wrapper:
>
> Interesting. Is that because the normalisation is different, or is the optimiser not converging to the same point?

I'm not sure. I never looked into the code, or even how TM score works, it was just a paragraph in my thesis. Based on my experimentation in Feb 2022 that led me to file this bug report, I seemed to come to the conclusion that the max matches https://zhanggroup.org/TM-score/

@ardhe-qb If you want to go ahead with the above hack, I would recommend first catching mismatches and then testing against https://zhanggroup.org/TM-score/ to verify the max score matches:

import tmscoring
from pathlib import Path
from typing import Union

Pathish = Union[Path, str]

def get_tmscore(path1: Pathish, path2: Pathish) -> float:
    aln1 = tmscoring.TMscoring(path1, path2)
    aln2 = tmscoring.TMscoring(path2, path1)

    aln1.optimise()
    aln2.optimise()

    # No trailing commas here: they would turn each score into a 1-tuple.
    score1 = aln1.tmscore(**aln1.get_current_values())
    score2 = aln2.tmscore(**aln2.get_current_values())

    if score1 != score2:
        print(f"Mismatch between {path1} (TMScore: {score1}) and {path2} (TMScore: {score2})")
    
    return max(score1, score2)
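
To run that check over a whole batch, something like the sketch below could collect the asymmetric pairs for later comparison against the web server. It assumes you supply the scoring function yourself (for instance a thin wrapper around tmscoring). `find_asymmetric_pairs`, `score_fn`, and the tolerance value are hypothetical names and choices, and a tolerance is used because exact float equality would flag harmless rounding differences.

```python
import math
from typing import Callable, Iterable, List, Tuple

Mismatch = Tuple[str, str, float, float]


def find_asymmetric_pairs(
    pairs: Iterable[Tuple[str, str]],
    score_fn: Callable[[str, str], float],
    rel_tol: float = 1e-3,
) -> List[Mismatch]:
    """Return (a, b, forward, backward) for pairs whose scores depend on order."""
    mismatches: List[Mismatch] = []
    for a, b in pairs:
        forward = score_fn(a, b)   # score with a passed first
        backward = score_fn(b, a)  # score with the order reversed
        if not math.isclose(forward, backward, rel_tol=rel_tol):
            mismatches.append((a, b, forward, backward))
    return mismatches
```

Each returned tuple names the two files and both scores, so the larger value can be checked by hand against https://zhanggroup.org/TM-score/.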

Dapid added the bug label Mar 3, 2023
