Commit 10a5795

add docs for testing a pre-trained model
1 parent e1265ae commit 10a5795

1 file changed: +65 -0 lines changed

docs/getstarted.md

Lines changed: 65 additions & 0 deletions
@@ -412,3 +412,68 @@ fig.update_layout(
    title='Loss vs epochs'
)
```

## Testing new data

To test new PDB files with a pre-trained model, first process them and save the results to HDF5 files. Suppose the model was trained on `ProteinProteinInterfaceResidueQuery` queries mapped to graphs:

```python
from deeprank2.query import QueryCollection, ProteinProteinInterfaceResidueQuery

queries = QueryCollection()

# Append data points
queries.add(ProteinProteinInterfaceResidueQuery(
    pdb_path = "<new_pdb_file1.pdb>",
    chain_id1 = "A",
    chain_id2 = "B"
))
queries.add(ProteinProteinInterfaceResidueQuery(
    pdb_path = "<new_pdb_file2.pdb>",
    chain_id1 = "A",
    chain_id2 = "B"
))

hdf5_paths = queries.process(
    "<output_folder>/<prefix_for_outputs>",
    feature_modules = 'all')
```
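
When many new structures need to be tested, adding each query by hand becomes tedious. The following is a minimal sketch, not part of DeepRank2, of a helper that gathers the PDB paths from a folder using only the standard library. The function name `collect_pdb_paths` and the folder placeholder are assumptions made for illustration, and the sketch assumes every file uses the same chain identifiers.

```python
from pathlib import Path


def collect_pdb_paths(folder: str) -> list[str]:
    """Return the sorted paths of all .pdb files directly inside `folder`.

    Hypothetical helper for illustration; it does not search subfolders.
    """
    return sorted(str(path) for path in Path(folder).glob("*.pdb"))


# Possible usage with the query collection defined above
# (assumes chains "A" and "B" are correct for every file):
#
# for pdb_path in collect_pdb_paths("<folder_with_new_pdbs>"):
#     queries.add(ProteinProteinInterfaceResidueQuery(
#         pdb_path = pdb_path,
#         chain_id1 = "A",
#         chain_id2 = "B"
#     ))
```

Sorting the paths keeps the order of the data points reproducible between runs, which makes it easier to match results back to input files.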

Next, define the `GraphDataset` instance that represents the testing set. There is no need to set the dataset's parameters, since they are inherited from the information saved in the pre-trained model.

```python
from deeprank2.dataset import GraphDataset

dataset_test = GraphDataset(
    hdf5_path = "<output_folder>/<prefix_for_outputs>",
    train = False,
    train_data = "<pretrained_model_path>"
)
```

Finally, define the `Trainer` instance and test the new data:

```python
from deeprank2.trainer import Trainer
from deeprank2.neuralnets.gnn.naive_gnn import NaiveNetwork
from deeprank2.utils.exporters import HDF5OutputExporter

trainer = Trainer(
    NaiveNetwork,
    dataset_test = dataset_test,
    pretrained_model = "<pretrained_model_path>",
    output_exporters = [HDF5OutputExporter("<output_folder_path>")]
)

trainer.test()
```

The results can then be read into a pandas `DataFrame` and inspected:

```python
import os
import pandas as pd

output = pd.read_hdf(os.path.join("<output_folder_path>", "output_exporter.hdf5"), key="testing")
output.head()
```
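
Beyond `head()`, it is often useful to rank the tested entries by their predicted score. The sketch below is illustrative only: the helper `rank_entries` is hypothetical, and the column names `entry` and `output` are assumptions that should be checked against `output.columns` for the actual file. It also assumes the score column holds one scalar value per row. A small made-up DataFrame stands in for the one returned by `pd.read_hdf`.

```python
import pandas as pd


def rank_entries(output: pd.DataFrame, score_col: str = "output", n: int = 5) -> pd.DataFrame:
    """Return the `n` rows with the highest value in `score_col`.

    Hypothetical helper; `score_col` must name a column of scalar scores.
    """
    if score_col not in output.columns:
        raise KeyError(
            f"column {score_col!r} not found; available: {list(output.columns)}"
        )
    return output.sort_values(score_col, ascending=False).head(n)


# Stand-in for the DataFrame read from the exporter file;
# the column names and values are invented for illustration.
example = pd.DataFrame({
    "entry": ["model_a", "model_b", "model_c"],
    "output": [0.12, 0.87, 0.45],
})

top = rank_entries(example, n=2)
print(top)
```

Checking for the column up front gives a readable error that lists the available columns if the assumed name does not match the file.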
