RuntimeError when using calc.pandas #106

alsalehf · 2023-03-20T12:55:25Z

description

I get stuck in a loop when using calc.pandas, and it ends in a runtime error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase

Below is the code I'm using for testing:
minimal reproduction code
from rdkit import Chem
from mordred import Calculator, descriptors
#import pandas as pd
import unicodedata

components = ["CCO"]

def s2d(smiles_list):
    final_list = [unicodedata.normalize("NFKD", ls) for ls in smiles_list]

    mols = [Chem.MolFromSmiles(smi) for smi in final_list]
    calc = Calculator(descriptors, ignore_3D=True)
    df3 = calc.pandas(mols)
    return df3

s2d(components)

df = s2d(components)
print(df)
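
For context, the RuntimeError text quoted above is the message Python's multiprocessing module raises when a script tries to start worker processes while it is itself being re-imported by a child process, which is how the spawn start method (the default on Windows) launches workers. The usual remedy is to keep top-level calls behind an `if __name__ == "__main__":` guard. A minimal standard-library sketch of that pattern follows; `square` and `compute_all` are hypothetical stand-ins, not mordred functions, and whether the guard alone is enough for calc.pandas is an assumption that has not been tested here.

```python
import multiprocessing as mp


def square(x):
    """Toy stand-in for per-molecule work (hypothetical, not a mordred call)."""
    return x * x


def compute_all(values, nproc=1):
    """Apply square to every value, serially if nproc == 1, else in a pool."""
    if nproc == 1:
        # Serial path: no worker processes are started at all.
        return [square(v) for v in values]
    with mp.Pool(processes=nproc) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Guarded entry point: a worker that re-imports this file under the
    # spawn start method does not run this block, so it never tries to
    # start a pool of its own during bootstrapping.
    print(compute_all([1, 2, 3], nproc=2))
```

Applied to the reproduction, this would mean moving the two s2d(components) calls and the print(df) line under the guard.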


environment

I'm running the code on a Windows 10 machine in a venv environment.


conda or pip

pip.

python version

Python 3.10.4

library version


pip

Package         Version
--------------- --------
mordred         1.2.0
networkx        2.8.8
numpy           1.24.2
pandas          1.5.3
Pillow          9.4.0
pip             22.0.4
python-dateutil 2.8.2
pytz            2022.7.1
rdkit           2022.9.5
setuptools      58.1.0
six             1.16.0

pip show rdkit
Name: rdkit
Version: 2022.9.5
Summary: A collection of chemoinformatics and machine-learning software written in C++ and Python
Home-page: https://github.com/kuelumbus/rdkit-pypi
Author: Christopher Kuenneth
Author-email: [email protected]
License: BSD-3-Clause
Location: c:\users\admin\chemslenv\lib\site-packages
Requires: numpy, Pillow
Required-by:


ismorphism · 2023-03-25T15:40:57Z

@alsalehf Hello! That's a very important question, but the developers don't seem to have had time to answer it. My suggestion is to drop all the "bad" descriptors: they may not work for you because of problems with your molecules' stereochemistry or something else. This idea is based on my own experience:

from mordred import (
    Calculator, descriptors, get_descriptors_from_module,
    AdjacencyMatrix, Autocorrelation, BaryszMatrix, CPSA, DetourMatrix,
    DistanceMatrix, EState, GeometricalIndex, GravitationalIndex, MoRSE,
    MolecularDistanceEdge, MomentOfInertia, PBF, TopologicalCharge,
)

descs = get_descriptors_from_module(descriptors, submodule=True)

# exclude every descriptor defined in one of these modules
excluded = {
    m.__name__ for m in (
        AdjacencyMatrix, Autocorrelation, BaryszMatrix, CPSA, DetourMatrix,
        DistanceMatrix, EState, GeometricalIndex, GravitationalIndex, MoRSE,
        MolecularDistanceEdge, MomentOfInertia, PBF, TopologicalCharge,
    )
}
descs = [d for d in descs if d.__module__ not in excluded]

# mols: the list of RDKit molecules built in the original report
calc = Calculator(descs)
calc.pandas(mols)

CHAOHSUTW · 2025-01-16T15:15:58Z

@alsalehf Hi! I ran into the same error while using calc.pandas(mols). I tried running the calculator on each mol separately, collecting each result as a dict, and then building the dataframe from those. That worked in the end. See if this helps.

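import pandas as pd
from mordred import Calculator, descriptors
from rdkit import Chem

smiles = ["CCO"]  # example input, taken from the original report

calc = Calculator(descriptors, ignore_3D=True)
mols = [Chem.MolFromSmiles(smi) for smi in smiles]
results = []
for mol in mols:
    desc = calc(mol)  # Calculate descriptors for one molecule
    results.append(dict(desc))  # Convert result to a dictionary

df = pd.DataFrame(results)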

JacksonBurns · 2025-01-16T15:18:09Z

This issue is resolved in our community-maintained fork, mordred-community.
