I get stuck in a loop when using calc.pandas, which results in a runtime error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase
Below is the code I'm using for testing:
minimal reproduction code
from rdkit import Chem
from mordred import Calculator, descriptors
#import pandas as pd
import unicodedata
components = ["CCO"]
def s2d(smiles_list):
    final_list = [unicodedata.normalize("NFKD", ls) for ls in smiles_list]
    mols = [Chem.MolFromSmiles(smi) for smi in final_list]
    calc = Calculator(descriptors, ignore_3D=True)
    df3 = calc.pandas(mols)
    return df3
s2d(components)
df = s2d(components)
print(df)
environment
I'm running the code on a Windows 10 machine in a venv environment.
pip show rdkit
Name: rdkit
Version: 2022.9.5
Summary: A collection of chemoinformatics and machine-learning software written in C++ and Python
Home-page: https://github.com/kuelumbus/rdkit-pypi
Author: Christopher Kuenneth
Author-email: [email protected]
License: BSD-3-Clause
Location: c:\users\admin\chemslenv\lib\site-packages
Requires: numpy, Pillow
Required-by:
@alsalehf Hello! That's a very important question, but the developers have not had time to answer it. One workaround is to drop all the "bad" descriptors: they may not work for you because of problems with your molecules' stereochemistry or something else. This is my suggestion, based on my experience:
from mordred import Calculator, PBF, MomentOfInertia, TopologicalCharge, MolecularDistanceEdge, MoRSE, GravitationalIndex, GeometricalIndex, EState, DistanceMatrix, DetourMatrix, CPSA, BaryszMatrix, Autocorrelation, AdjacencyMatrix, descriptors, get_descriptors_from_module
descs = get_descriptors_from_module(descriptors, submodule=True)
# exclude some from descs
descs = filter(lambda d: ((d.__module__ != AdjacencyMatrix.__name__) and
                          (d.__module__ != Autocorrelation.__name__) and
                          (d.__module__ != DetourMatrix.__name__) and
                          (d.__module__ != BaryszMatrix.__name__) and
                          (d.__module__ != CPSA.__name__) and
                          (d.__module__ != DistanceMatrix.__name__) and
                          (d.__module__ != EState.__name__) and
                          (d.__module__ != GeometricalIndex.__name__) and
                          (d.__module__ != GravitationalIndex.__name__) and
                          (d.__module__ != MoRSE.__name__) and
                          (d.__module__ != MolecularDistanceEdge.__name__) and
                          (d.__module__ != MomentOfInertia.__name__) and
                          (d.__module__ != PBF.__name__) and
                          (d.__module__ != TopologicalCharge.__name__)), descs)
calc = Calculator(descs)
calc.pandas(mols)
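A separate thing that may be worth checking: the RuntimeError quoted in the report is the message Python's multiprocessing raises when a script starts worker processes at import time. On Windows each worker re-imports the main script, so any top-level call that spawns workers gets triggered again inside the worker. Below is a minimal stdlib-only sketch of the usual guard pattern. It assumes calc.pandas starts worker processes by default, which I have not verified against the mordred source; square and run_all are hypothetical names used only for illustration.

```python
import multiprocessing as mp


def square(x):
    # Worker function; defined at module top level so child processes can import it.
    return x * x


def run_all(values, nproc=None):
    # Serial path: no child processes are started at all.
    if nproc == 1:
        return [square(v) for v in values]
    # Parallel path: on Windows every worker re-imports this module on startup.
    with mp.Pool(processes=nproc) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only the parent process runs this branch; workers that re-import
    # the module skip it, so no new pool is started during their bootstrap.
    print(run_all([1, 2, 3]))
```

Applied to the reproduction script, this would mean moving the s2d(components) calls and the print under such a guard. calc.pandas also appears to take an nproc argument, so passing nproc=1 might sidestep the worker processes entirely, but I have not tested that here.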
conda or pip
pip.
python version
Python 3.10.4
library version
Package Version
mordred 1.2.0
networkx 2.8.8
numpy 1.24.2
pandas 1.5.3
Pillow 9.4.0
pip 22.0.4
python-dateutil 2.8.2
pytz 2022.7.1
rdkit 2022.9.5
setuptools 58.1.0
six 1.16.0