data input rescore #69

Open
ClaraVanM opened this issue Mar 6, 2024 · 7 comments

@ClaraVanM

I get this error when trying to run prank rescore: "Dataset must contain 'protein' and 'prediction' columns!"
What is the format that the input needs to have?

@rdk
Owner

rdk commented Mar 6, 2024

You can find example dataset files in test_data, in particular fpocket3.ds and concavity.ds. There is one column (protein) for the original structure file and another (prediction) for a file with pocket predictions computed by some other algorithm.

Predictions of which binding site prediction tool are you trying to rescore? Fpocket, Concavity or something else entirely?
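
For illustration, a two-column rescore dataset of the kind described above could be generated with a short script. This is a minimal sketch, not P2Rank code: the `PARAM.PREDICTION_METHOD` line and the `HEADER:` line syntax are assumptions that should be compared against fpocket3.ds in test_data, and `write_rescore_dataset` is a hypothetical helper name.

```python
from pathlib import Path


def write_rescore_dataset(pairs, out_path, method="fpocket"):
    """Write a two-column dataset file: prediction file, then protein file.

    pairs: iterable of (prediction_path, protein_path) tuples.
    The PARAM and HEADER line syntax is an assumption; compare it with
    the example datasets shipped in test_data before relying on it.
    """
    lines = [
        f"PARAM.PREDICTION_METHOD={method}",  # assumed parameter name
        "",
        "HEADER: prediction protein",  # assumed header syntax
        "",
    ]
    for prediction, protein in pairs:
        # One row per structure: prediction output first, then the protein.
        lines.append(f"{prediction}  {protein}")
    text = "\n".join(lines) + "\n"
    Path(out_path).write_text(text)
    return text
```

Each row pairs the file produced by the external predictor with the structure it was computed from, so the column order in the rows has to match the order given in the header line.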

@ClaraVanM
Author

ClaraVanM commented Mar 7, 2024

fpocket and ghecom, thank you for the fast response.

@rdk
Owner

rdk commented Mar 7, 2024

fpocket should work, although I haven't tested it with 4.x versions. Are you using the latest version of fpocket?
If you try it, please let me know whether it works. If there is any problem with rescoring fpocket 4.x predictions, I will implement a fix.

Support for rescoring ghecom was never implemented. However, I was thinking about implementing a universal input format for rescoring: a simple CSV file with pocket centers. Would that help you?

@rdk added the question label Mar 7, 2024
@ClaraVanM
Author

ClaraVanM commented Mar 7, 2024

I used fpocket 4.1.4 and I get this error:

P2Rank 2.5.0-dev.3
[INFO] Console - P2Rank 2.5.0-dev.3

[INFO] Console -
[INFO] Main - loading default config from [/vsc-hard-mounts/leuven-data/351/vsc35111/thesis/P2Rank/p2rank/distro/config/default.groovy]
[INFO] Main - loading default config from [/vsc-hard-mounts/leuven-data/351/vsc35111/thesis/P2Rank/p2rank/distro/config/default_rescore.groovy]
[INFO] Main - looking for dataset in dataset_base_dir [/vsc-hard-mounts/leuven-data/351/vsc35111/thesis/P2Rank/p2rank/fpocket_prank.ds]...
[INFO] Dataset - loading dataset [/vsc-hard-mounts/leuven-data/351/vsc35111/thesis/P2Rank/p2rank/fpocket_prank.ds]
[INFO] Futils - deleting /vsc-hard-mounts/leuven-data/351/vsc35111/thesis/P2Rank/p2rank/distro/test_output/rescore_fpocket_prank/run.log
rescoring pockets on proteins from dataset [fpocket_prank.ds]
[INFO] Console - rescoring pockets on proteins from dataset [fpocket_prank.ds]
[INFO] RescorePocketsRoutine - outdir: /vsc-hard-mounts/leuven-data/351/vsc35111/thesis/P2Rank/p2rank/distro/test_output/rescore_fpocket_prank
[INFO] FeatureSetup - enabledFeatures: [chem, volsite, protrusion, bfactor, atom_table]
[INFO] Dataset - processing dataset [fpocket_prank.ds] using 0 threads
[INFO] Dataset -

processing [1a28_deposited_refined_prot_out.pdb] (1/1)

processing [1a28_deposited_refined_prot_out.pdb] (1/1)
[INFO] Console - processing [1a28_deposited_refined_prot_out.pdb] (1/1)
[INFO] Protein - loading protein [/data/leuven/351/vsc35111/thesis/proteins/1a28_deposited_refined_prot.pdb]
[INFO] PdbUtils - loading file [/data/leuven/351/vsc35111/thesis/proteins/1a28_deposited_refined_prot.pdb]
[INFO] Struct - groups in chain A: 251
[INFO] Struct - groups in chain B: 249
[INFO] Struct - groups in chain A: 251
[INFO] Struct - 251 groups in chain A
[INFO] Struct - groups in chain B: 249
[INFO] Struct - 249 groups in chain B
[INFO] Protein - structure atoms: 4216
[INFO] Protein - protein atoms: 3972
[INFO] Protein - loading ligands
[INFO] Ligands - loading 0 ligands
[INFO] Ligands - loading 0 ligands
[INFO] Ligands - Loaded 0 relevant ligands: []
[ERROR] Dataset - error processing dataset item [1a28_deposited_refined_prot_out.pdb]
java.lang.NullPointerException: null
at cz.siret.prank.domain.loaders.DatasetItemLoader.loadPredictionPair(DatasetItemLoader.groovy:62) ~[p2rank.jar:?]
at cz.siret.prank.domain.Dataset$Item.loadPredictionPair(Dataset.groovy:841) ~[p2rank.jar:?]
at cz.siret.prank.domain.Dataset$Item.getPredictionPair(Dataset.groovy:823) ~[p2rank.jar:?]
at cz.siret.prank.program.routines.predict.RescorePocketsRoutine$_execute_closure1.doCall(RescorePocketsRoutine.groovy:63) ~[p2rank.jar:?]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343) ~[groovy-4.0.18.jar:4.0.18]
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328) ~[groovy-4.0.18.jar:4.0.18]
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:279) ~[groovy-4.0.18.jar:4.0.18]
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007) ~[groovy-4.0.18.jar:4.0.18]
at groovy.lang.Closure.call(Closure.java:433) ~[groovy-4.0.18.jar:4.0.18]
at groovy.lang.Closure.call(Closure.java:422) ~[groovy-4.0.18.jar:4.0.18]
at cz.siret.prank.domain.Dataset$1.processItem(Dataset.groovy:145) ~[p2rank.jar:?]
at cz.siret.prank.domain.Dataset.processssItem(Dataset.groovy:228) [p2rank.jar:?]
at cz.siret.prank.domain.Dataset.access$0(Dataset.groovy) [p2rank.jar:?]
at cz.siret.prank.domain.Dataset$2.call(Dataset.groovy:192) [p2rank.jar:?]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.base/java.lang.Thread.run(Thread.java:829) [?:?]
rescoring finished in 0 hours 0 minutes 3.441 seconds

Yes, I think that would be useful! But then I would need to calculate the pocket centers from the ghecom output, as they are not given there.

@rdk
Owner

rdk commented Mar 7, 2024

Looks like there is a bug that needs to be fixed. Could you attach the fpocket output files that are giving you the error, at least for 1a28?

> But then I would need to calculate pocket centers from the ghecom output, as they are not given in the output.

Yes, exactly.
How are you generating the ghecom predictions: using the web server or locally?
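
Computing pocket centers for such a CSV file could look roughly like the sketch below. It assumes the pocket points (for example, probe or atom coordinates) have already been extracted from the predictor's output and grouped by pocket; parsing ghecom output is not shown. The center is taken as the plain centroid, and the column names are hypothetical, since the universal input format is only a proposal at this point.

```python
import csv
import io


def pocket_centers(points_by_pocket):
    """Return {pocket_id: (x, y, z)} as the mean of each pocket's points."""
    centers = {}
    for pocket_id, points in points_by_pocket.items():
        n = len(points)
        if n == 0:
            continue  # skip empty pockets rather than divide by zero
        centers[pocket_id] = tuple(
            sum(p[i] for p in points) / n for i in range(3)
        )
    return centers


def centers_to_csv(centers):
    """Serialize centers as CSV text (column names are hypothetical)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["pocket", "center_x", "center_y", "center_z"])
    for pocket_id in sorted(centers):
        x, y, z = centers[pocket_id]
        writer.writerow([pocket_id, f"{x:.3f}", f"{y:.3f}", f"{z:.3f}"])
    return buf.getvalue()
```

A centroid is the simplest choice of center; a different definition (for instance, a center weighted by some per-point score) would only change `pocket_centers`.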

@ClaraVanM
Author

1a28_deposited_refined_prot_out_pdb.txt
I changed the file name, since GitHub would not let me upload the .pdb version.
I am running it locally.

@rdk
Owner

rdk commented Jun 26, 2024

@ClaraVanM sorry for the delay.

If it is still relevant, you can check whether release 2.4.2 (https://github.com/rdk/p2rank/releases/tag/2.4.2) works in your case. It contains updated support for loading fpocket predictions. Also, please check that your dataset file has the right format (example: https://github.com/rdk/p2rank/blob/develop/distro/test_data/fpocket.ds).
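
A quick pre-flight check of a dataset file, before running the rescore, might look like the following. This is a sketch under stated assumptions, not the validation P2Rank itself performs: it assumes a `HEADER:` line that names the columns, that lines starting with `#` or `PARAM.` are not data rows, and that relative paths are resolved against the directory of the dataset file. `check_rescore_dataset` is a hypothetical helper.

```python
from pathlib import Path

REQUIRED_COLUMNS = {"protein", "prediction"}


def check_rescore_dataset(ds_path):
    """Return a list of problems found in a dataset file (empty if none)."""
    ds_path = Path(ds_path)
    problems = []
    columns = None
    for raw in ds_path.read_text().splitlines():
        line = raw.strip()
        # Assumed non-data lines: blanks, comments, parameter settings.
        if not line or line.startswith("#") or line.startswith("PARAM."):
            continue
        if line.startswith("HEADER:"):
            columns = line[len("HEADER:"):].split()
            missing = REQUIRED_COLUMNS - set(columns)
            if missing:
                problems.append(f"header lacks columns: {sorted(missing)}")
            continue
        if columns is None:
            problems.append(f"data row before any HEADER line: {line}")
            continue
        fields = line.split()
        if len(fields) != len(columns):
            problems.append(
                f"expected {len(columns)} fields, got {len(fields)}: {line}"
            )
            continue
        for name, value in zip(columns, fields):
            # Relative paths are assumed to be relative to the dataset file.
            if name in REQUIRED_COLUMNS and not (ds_path.parent / value).is_file():
                problems.append(f"{name} file not found: {value}")
    if columns is None:
        problems.append("no HEADER line found")
    return problems
```

An empty result only means the file passes these simple checks; whether the prediction files themselves can be loaded still depends on the fpocket version and the P2Rank release.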
