Initial commit of NHANES 2011 data #1

kshedden · 2015-03-22T17:15:35Z

Not sure what you have in mind in terms of data format, meta data, etc. Let me know and I will revise the PR as needed.

josef-pkt · 2015-03-22T18:15:05Z

@vincentarelbundock
I think when we build this out we will use something like
https://github.com/vincentarelbundock/Rdatasets
which would put nhanes one level lower into a csv folder

In general:
There was discussion on the nipy mailing list about making installable Python dataset packages. That makes sense if users will want most of the available data or if the packages don't get too large, but not so much if we want to use just a few datasets, as in Rdatasets.
I didn't pay a lot of attention to the details of dataset packages and meta information. For now the Rdatasets pattern plus our datasets inside statsmodels seems to be enough.
It's possible to rethink this in the future if someone is interested. I saw that there are also related dataset packages for Julia (one of them a translation of Vincent's Rdatasets), which will have similar installation and license/copyright questions as we do.

josef-pkt · 2015-03-22T18:22:05Z

On a specific question:
Is the NHANES .gz file an archive with a single csv file or does it have a collection of csv files?
What's the advantage of using an archive instead of a plain csv file?

I'm fine either way, but AFAIK we would have to write py2/py3-compatible helper functions to get the data from an archive file. (The statespace notebooks are doing that, and it was what prompted me to look into creating smdatasets.)

kshedden · 2015-03-22T18:51:19Z

There's just a single file in there, in csv format. It's only compressed to save space/bandwidth.

I don't feel strongly about this.
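
For reference, a minimal sketch of the kind of helper discussed above, assuming a gzip file that holds a single csv with a header row. The function name `read_gzipped_csv` and the file name in the usage comment are hypothetical, not part of this PR, and the sketch is Python 3 only, so it does not address the py2/py3 compatibility point.

```python
import csv
import gzip


def read_gzipped_csv(path, encoding="utf-8"):
    """Return (header, rows) from a gzip file containing a single csv.

    Hypothetical helper for illustration; Python 3 only.
    """
    # Text mode ("rt") decompresses and decodes in one step;
    # newline="" lets the csv module handle line endings itself.
    with gzip.open(path, "rt", encoding=encoding, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)   # first line holds the column names
        rows = list(reader)     # remaining lines, each a list of strings
    return header, rows


# Usage (hypothetical file name):
# header, rows = read_gzipped_csv("nhanes.csv.gz")
```

With a plain csv file the same code works with the built-in `open` in place of `gzip.open`, so the archive mainly costs one extra import.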

kshedden added 2 commits March 22, 2015 13:11

Initial commit of NHANES 2011 data

ce72700

Revise description

885d085
