Code snippet in README throws an error #14

Open
yamsgithub opened this issue Mar 12, 2018 · 5 comments

@yamsgithub

I am running the code from the README but it throws an error:


from dsbox.datapreprocessing.profiler import Profiler
import pandas as pd

profiler = Profiler()
data = pd.read_csv('test.csv', dtype=object)
jsonResult = profiler.produce(inputs=data)

ImportError Traceback (most recent call last)
in ()
----> 1 from dsbox.datapreprocessing.profiler import Profiler
2 import pandas as pd
3
4 profiler = Profiler()
5 data = pd.read_csv('/Users/yamuna/D3M/data/185_baseball/185_baseball_dataset/tables/learningData.csv', dtype=object)

ImportError: cannot import name Profiler

@kyao
Contributor

kyao commented Mar 12, 2018

It looks like Python cannot find the Profiler.

We need to update our readme file.

To install our profiler, please clone the master branch. Then, cd to your local repo directory, and do:

pip install -e .

Also, for sample usage you can take a look at:

https://github.com/usc-isi-i2/dsbox-profiling/blob/master/ta1-pipeline.py
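After the editable install, a quick way to confirm that the import resolves is sketched below. It uses only the standard library; the helper name `can_import` is made up for illustration and is not part of dsbox.

```python
import importlib


def can_import(module_name, attr=None):
    """Return True if module_name imports and, optionally, exposes attr."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a missing parent package.
        return False
    # "cannot import name X" means the module loaded but lacks the name.
    return attr is None or hasattr(module, attr)


if __name__ == "__main__":
    # Prints True only once dsbox is installed and exposes Profiler.
    print(can_import("dsbox.datapreprocessing.profiler", "Profiler"))
```

If this prints False after `pip install -e .`, the install probably went into a different Python environment than the one running the notebook.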

@yamsgithub
Author

I am running ta1-pipeline.py but getting the following error. Where are the config files?

FileNotFoundError Traceback (most recent call last)
in ()
19
20 # Load the json configuration file
---> 21 with open("ta1-pipeline-config.json", 'r') as inputFile:
22 jsonCall = json.load(inputFile)
23 inputFile.close()

FileNotFoundError: [Errno 2] No such file or directory: 'ta1-pipeline-config.json'

@kyao
Contributor

kyao commented Mar 12, 2018

NIST provides that file when we submit our pipelines for testing. It looks like this:

{
  "train_data": "/path-to-data/seed_datasets_current/38_sick/TRAIN",
  "test_data": "/path-to-data/seed_datasets_current/38_sick/TEST",
  "output_folder": "."
}
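For running locally without the NIST harness, the file can be generated by hand. The sketch below writes a config with the same three keys shown above. The function name `write_pipeline_config` is hypothetical, and it assumes the dataset directory contains `TRAIN` and `TEST` subfolders, as in the example paths.

```python
import json
from pathlib import Path


def write_pipeline_config(dataset_root, output_folder=".",
                          config_path="ta1-pipeline-config.json"):
    """Write a config with the three keys shown in this thread."""
    root = Path(dataset_root)
    config = {
        # Assumption: the dataset root holds TRAIN and TEST subfolders.
        "train_data": str(root / "TRAIN"),
        "test_data": str(root / "TEST"),
        "output_folder": output_folder,
    }
    Path(config_path).write_text(json.dumps(config, indent=2))
    return config
```

Calling it with your local dataset directory from the folder that holds ta1-pipeline.py puts the file where the script's `open("ta1-pipeline-config.json", 'r')` call looks for it.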

@yamsgithub
Author

Thanks. I was able to run it. I am using the following config.json:
{
  "train_data": "/Users/yamuna/D3M/data/196_autoMpg/TRAIN",
  "test_data": "/Users/yamuna/D3M/data/196_autoMpg/TEST",
  "output_folder": "."
}

But now I get this error:

python ta1-pipeline.py > profiling_output.log
Traceback (most recent call last):
  File "ta1-pipeline.py", line 57, in <module>
    ds2.metadata.pretty_print()
  File "/Users/yamuna/D3M/ta2/src/d3m-metadata/d3m_metadata/metadata.py", line 646, in pretty_print
    self.pretty_print(selector + [element], handle=handle, _level=_level + 1)
  File "/Users/yamuna/D3M/ta2/src/d3m-metadata/d3m_metadata/metadata.py", line 635, in pretty_print
    self.pretty_print(selector + [ALL_ELEMENTS], handle=handle, _level=_level + 1)
  File "/Users/yamuna/D3M/ta2/src/d3m-metadata/d3m_metadata/metadata.py", line 646, in pretty_print
    self.pretty_print(selector + [element], handle=handle, _level=_level + 1)
  File "/Users/yamuna/D3M/ta2/src/d3m-metadata/d3m_metadata/metadata.py", line 625, in pretty_print
    for line in json.dumps(query(selector=selector), indent=1, cls=MetadataJsonEncoder).splitlines():
  File "/Users/yamuna/anaconda3/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/Users/yamuna/anaconda3/lib/python3.6/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/Users/yamuna/anaconda3/lib/python3.6/json/encoder.py", line 438, in _iterencode
    yield from _iterencode(o, _current_indent_level)
  File "/Users/yamuna/anaconda3/lib/python3.6/json/encoder.py", line 430, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/Users/yamuna/anaconda3/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/Users/yamuna/anaconda3/lib/python3.6/json/encoder.py", line 437, in _iterencode
    o = _default(o)
  File "/Users/yamuna/D3M/ta2/src/d3m-metadata/d3m_metadata/metadata.py", line 182, in default
    return super().default(o)
  File "/Users/yamuna/anaconda3/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'int64' is not JSON serializable

@kyao
Contributor

kyao commented Mar 12, 2018

We submitted a fix for this bug. Are you using an older version of d3m metadata? Try the version tagged v2018.1.26.
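For context on the TypeError: the standard `json` module only serializes built-in Python types, and a NumPy `int64` is not a subclass of `int` on Python 3. The sketch below shows one general way to handle this in a custom encoder. It is not the fix that went into d3m metadata, and the class name `NativeNumberEncoder` is hypothetical. It relies on NumPy registering its integer scalar types with `numbers.Integral`, so the encoder does not need to import NumPy.

```python
import json
import numbers


class NativeNumberEncoder(json.JSONEncoder):
    """JSON encoder that converts number-like objects to built-in types."""

    def default(self, o):
        # default() is called only for objects json cannot serialize itself.
        # NumPy integer scalars such as int64 are registered as Integral.
        if isinstance(o, numbers.Integral):
            return int(o)
        if isinstance(o, numbers.Real):
            return float(o)
        # Anything else still raises the usual TypeError.
        return super().default(o)
```

With NumPy installed, passing `cls=NativeNumberEncoder` to `json.dumps` should let a value like `numpy.int64(3)` serialize as a plain integer. Upgrading to the tagged release is still the simpler route.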
