"Could not convert string 'Y845' to numeric" error when using the "calculate.KinaseActivity()" function. #52

Open
liuxiaoxian opened this issue Aug 22, 2024 · 4 comments

Comments

@liuxiaoxian

I’ve installed KSTAR on my laptop, and the KSTAR environment has been configured.

I used the recommended example data (Chylek, 2014) to test KSTAR following the standard tutorial. I was able to complete the 'Map Datasets to KinPred' part, but I encountered a bug related to pandas when trying to run the 'Predict Kinase Activities' part. Any suggestions are appreciated!

When I run this step:
kinact = calculate.KinaseActivity(experiment, activity_log,data_columns = data_columns, phospho_type='Y')

I get this error:
TypeError: Could not convert string 'Y845' to numeric

'Y845' appears to be a value from the "mod_sites" column in the example data table.

The entire error message is:

Traceback (most recent call last):
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/groupby.py", line 1870, in _agg_py_fallback
res_values = self.grouper.agg_series(ser, alt, preserve_dtype=True)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/ops.py", line 850, in agg_series
result = self._aggregate_series_pure_python(obj, func)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/ops.py", line 871, in _aggregate_series_pure_python
res = func(group)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/groupby.py", line 2376, in <lambda>
alt=lambda x: Series(x).mean(numeric_only=numeric_only),
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/series.py", line 6226, in mean
return NDFrame.mean(self, axis, skipna, numeric_only, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/generic.py", line 11969, in mean
return self._stat_function(
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/generic.py", line 11926, in _stat_function
return self._reduce(
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/series.py", line 6134, in _reduce
return op(delegate, skipna=skipna, **kwds)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/nanops.py", line 147, in f
result = alt(values, axis=axis, skipna=skipna, **kwds)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/nanops.py", line 404, in new_func
result = func(values, axis=axis, skipna=skipna, mask=mask, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/nanops.py", line 720, in nanmean
the_sum = _ensure_numeric(the_sum)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/nanops.py", line 1693, in _ensure_numeric
raise TypeError(f"Could not convert string '{x}' to numeric")
TypeError: Could not convert string 'Y845' to numeric

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "", line 1, in
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/kstar/calculate.py", line 123, in __init__
self.set_data_columns(data_columns = data_columns)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/kstar/calculate.py", line 164, in set_data_columns
self.check_data_columns()
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/kstar/calculate.py", line 132, in check_data_columns
evidence = self.evidence.groupby([config.KSTAR_ACCESSION, config.KSTAR_SITE]).agg(self.aggregate).reset_index()
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/generic.py", line 1442, in aggregate
result = op.agg()
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/apply.py", line 172, in agg
return self.apply_str()
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/apply.py", line 580, in apply_str
return self._apply_str(obj, func, *self.args, **self.kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/apply.py", line 663, in _apply_str
return f(*args, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/groupby.py", line 2374, in mean
result = self._cython_agg_general(
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/groupby.py", line 1925, in _cython_agg_general
new_mgr = data.grouped_reduce(array_func)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 1428, in grouped_reduce
applied = sb.apply(func)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 366, in apply
result = func(self.values, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/groupby.py", line 1922, in array_func
result = self._agg_py_fallback(how, values, ndim=data.ndim, alt=alt)
File "/Users/4474390/Documents/CICPT4617/kstar/lib/python3.9/site-packages/pandas/core/groupby/groupby.py", line 1874, in _agg_py_fallback
raise type(err)(msg) from err
TypeError: agg function failed [how->mean,dtype->object]
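
The final message (how->mean, dtype->object) suggests that a mean was attempted on a column holding text. Below is a minimal sketch of that situation, not KSTAR's actual code: the table and column names are hypothetical, and the exact exception text depends on the pandas version (older pandas releases may silently drop the text column instead of raising).

```python
import pandas as pd

# Toy stand-in for a mapped experiment table; the column names are
# hypothetical and only mimic the shape of the example data.
evidence = pd.DataFrame({
    "accession": ["P00533", "P00533", "P04626"],
    "site": ["Y845", "Y845", "Y1248"],
    "mod_sites": ["Y845", "Y845", "Y1248"],  # text, not a measurement
    "data:time:0": [1.0, 3.0, 5.0],          # numeric measurement
})

group_cols = ["accession", "site"]
data_columns = ["data:time:0"]


def mean_of_data_columns(df, group_cols, data_columns):
    """Average only the named data columns within each group."""
    return df.groupby(group_cols)[data_columns].agg("mean").reset_index()


try:
    # Aggregating every remaining column also touches the text column.
    evidence.groupby(group_cols).agg("mean")
except TypeError as err:
    print("mean over all columns failed:", err)

# Restricting the aggregation to the data columns avoids the text column.
result = mean_of_data_columns(evidence, group_cols, data_columns)
print(result)
```

Selecting the data columns before aggregating keeps text columns such as "mod_sites" out of the mean entirely, which is why a string like 'Y845' would no longer be reached.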

===========================
Packages in the environment (output of pip freeze, saved as requirement.txt):

appnope==0.1.4
asttokens==2.4.1
biopython==1.83
certifi==2024.2.2
charset-normalizer==3.3.2
comm==0.2.2
contourpy==1.2.1
cycler==0.12.1
debugpy==1.8.1
decorator==5.1.1
exceptiongroup==1.2.0
executing==2.0.1
fonttools==4.51.0
idna==3.7
importlib_metadata==7.1.0
importlib_resources==6.4.0
ipykernel==6.29.4
ipython==8.18.1
jedi==0.19.1
jupyter_client==8.6.1
jupyter_core==5.7.2
kiwisolver==1.4.5
kstar==0.5.0
matplotlib==3.8.4
matplotlib-inline==0.1.6
nest-asyncio==1.6.0
numpy==1.26.4
packaging==24.0
pandas==2.1.0
parso==0.8.4
patsy==0.5.6
pexpect==4.9.0
pillow==10.3.0
platformdirs==4.2.0
prompt-toolkit==3.0.43
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
Pygments==2.17.2
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
pyzmq==25.1.2
requests==2.31.0
scipy==1.11.0
seaborn==0.13.2
six==1.16.0
stack-data==0.6.3
statsmodels==0.13.0
tornado==6.4
traitlets==5.14.2
typing_extensions==4.11.0
tzdata==2024.1
urllib3==2.2.1
wcwidth==0.2.13
zipp==3.18.1

@liuxiaoxian
Author

example.txt

The example data I used.

@srcrowl
Contributor

srcrowl commented Aug 22, 2024

In the original KSTAR v0.5 (which looks like the version you are using), there were some bugs triggered by newer versions of pandas (2.x). Those should be fixed in KSTAR v0.5.3. That should be the default version when installing, but if it is not, you can force the correct version with pip install kstar==0.5.3 or conda install -c naeglelab kstar==0.5.3. This should fix the error you are seeing; if it does not, let us know and we'll work through what else could be causing it. We will also make sure to remove these old, buggy versions in the future.
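
To confirm which versions an environment actually picked up, something like the following could be run inside it. This is only a sketch using the standard library: the helper names installed_version and as_tuple are made up here, and the comparison assumes plain "X.Y.Z" style version strings.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def as_tuple(version_string):
    """Turn an 'X.Y.Z' string into a tuple of ints (leading digits only)."""
    parts = []
    for part in version_string.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, part))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


for name in ("kstar", "pandas"):
    print(name, installed_version(name))

kstar_version = installed_version("kstar")
if kstar_version is not None and as_tuple(kstar_version) < (0, 5, 3):
    print("kstar is older than 0.5.3; consider upgrading")
```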

@liuxiaoxian
Author

liuxiaoxian commented Aug 23, 2024

Thank you for the reply!

I created a new environment with Python 3.10 and kstar 0.5.3. The previous error when processing "Y845" is gone, but a new error occurred. This time the value seems to come from a "data:time(sec):" column.

TypeError: Could not convert string '0.04' to numeric

The entire error message is:

kinact = calculate.KinaseActivity(experiment, activity_log,data_columns = data_columns, phospho_type='Y')

Traceback (most recent call last):
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 1874, in _agg_py_fallback
res_values = self.grouper.agg_series(ser, alt, preserve_dtype=True)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/ops.py", line 849, in agg_series
result = self._aggregate_series_pure_python(obj, func)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/ops.py", line 877, in _aggregate_series_pure_python
res = func(group)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 2380, in <lambda>
alt=lambda x: Series(x).mean(numeric_only=numeric_only),
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/series.py", line 6225, in mean
return NDFrame.mean(self, axis, skipna, numeric_only, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/generic.py", line 11992, in mean
return self._stat_function(
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/generic.py", line 11949, in _stat_function
return self._reduce(
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/series.py", line 6133, in _reduce
return op(delegate, skipna=skipna, **kwds)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/nanops.py", line 147, in f
result = alt(values, axis=axis, skipna=skipna, **kwds)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/nanops.py", line 404, in new_func
result = func(values, axis=axis, skipna=skipna, mask=mask, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/nanops.py", line 720, in nanmean
the_sum = _ensure_numeric(the_sum)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/nanops.py", line 1693, in _ensure_numeric
raise TypeError(f"Could not convert string '{x}' to numeric")
TypeError: Could not convert string '0.04' to numeric

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "", line 1, in
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/kstar/calculate.py", line 122, in __init__
self.set_data_columns(data_columns = data_columns)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/kstar/calculate.py", line 175, in set_data_columns
self.check_data_columns()
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/kstar/calculate.py", line 131, in check_data_columns
evidence = self.evidence.groupby([config.KSTAR_ACCESSION, config.KSTAR_SITE])[self.data_columns].agg(self.aggregate).reset_index()
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/generic.py", line 1445, in aggregate
result = op.agg()
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/apply.py", line 172, in agg
return self.apply_str()
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/apply.py", line 586, in apply_str
return self._apply_str(obj, func, *self.args, **self.kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/apply.py", line 669, in _apply_str
return f(*args, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 2378, in mean
result = self._cython_agg_general(
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 1929, in _cython_agg_general
new_mgr = data.grouped_reduce(array_func)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 1428, in grouped_reduce
applied = sb.apply(func)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 366, in apply
result = func(self.values, **kwargs)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 1926, in array_func
result = self._agg_py_fallback(how, values, ndim=data.ndim, alt=alt)
File "/Users/4474390/Documents/CICPT4617/kstar_0.5.3/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 1878, in _agg_py_fallback
raise type(err)(msg) from err
TypeError: agg function failed [how->mean,dtype->object]

=======================

Modules installed in this environment:

biopython==1.81
certifi==2024.7.4
charset-normalizer==3.3.2
contourpy==1.2.1
cycler==0.12.1
fonttools==4.53.1
idna==3.7
kiwisolver==1.4.5
kstar==0.5.3
matplotlib==3.8.4
numpy==1.26.4
packaging==24.1
pandas==2.1.4
pillow==10.4.0
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
requests==2.31.0
scipy==1.11.4
seaborn==0.13.2
six==1.16.0
tzdata==2024.1
urllib3==2.2.2

@srcrowl
Contributor

srcrowl commented Aug 28, 2024

KSTAR uses the pandas groupby function to combine similar sites, but it expects the data columns to contain numeric values, and I'm assuming that is the cause of the error. I recommend forcing the data columns to numeric with experiment[data_cols] = experiment[data_cols].astype(float). I would expect them to be numeric when loading, but for some reason they are not, so checking for any values that may be causing issues outside of the context of KSTAR should help.
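
One way to do that check with plain pandas is sketched below. It assumes experiment is a pandas DataFrame and data_cols is the list of data column names; the toy table, its values, and the helper find_non_numeric are invented for illustration and are not part of KSTAR.

```python
import pandas as pd

# Toy table; the column names and values are made up for illustration.
experiment = pd.DataFrame({
    "mod_sites": ["Y845", "Y1068", "Y1173"],
    "data:time:0": ["0.04", "1.2", "n/a"],  # read in as text
    "data:time:30": [0.5, 0.7, 0.9],
})
data_cols = ["data:time:0", "data:time:30"]


def find_non_numeric(df, columns):
    """Map each column to its non-missing values that cannot be parsed as numbers."""
    problems = {}
    for col in columns:
        parsed = pd.to_numeric(df[col], errors="coerce")
        bad = df.loc[parsed.isna() & df[col].notna(), col]
        if not bad.empty:
            problems[col] = bad.tolist()
    return problems


# Text columns show up with a non-float dtype here.
print(experiment[data_cols].dtypes)

problems = find_non_numeric(experiment, data_cols)
print(problems)

# Coerce to float; unparseable entries become NaN instead of raising.
cleaned = experiment.copy()
cleaned[data_cols] = cleaned[data_cols].apply(pd.to_numeric, errors="coerce")
print(cleaned[data_cols].dtypes)
```

Note that astype(float) raises as soon as it meets an entry it cannot parse, which is itself a useful signal; pd.to_numeric with errors="coerce" is the more forgiving variant, turning such entries into NaN so they can be inspected first.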
