How to reproduce the data if jason_read() is not working for everyone #9

sunnymh · 2013-10-19T18:10:06Z

It seems like read_json() only works for a few people. If that is the case, how do we reproduce the data in class on Tuesday? People who can't use read_json() won't be able to reproduce the code from groups that use that function.

kqdtran · 2013-10-19T18:38:49Z

Idk of any elegant method, but there's a workaround in #3

sunnymh · 2013-10-19T18:57:52Z

That's a workaround for people who can't use read_json(), but I was wondering how we will check the code of people who do use read_json() on Tuesday.

teresita · 2013-10-19T20:48:48Z

@sunnymh if you're getting errors, it could be a misspelling of json? (there's no 'a')

aculich · 2013-10-21T05:40:17Z

@teresita Thanks for picking up on the spelling error here. Definitely no 'a' in JSON, so that could be the problem. @sunnymh is this working for you now?

sunnymh · 2013-10-21T05:52:31Z

@aculich That's not actually my question. As in #3, people are getting `ValueError: arrays must all be same length` when using read_json(), and I got the same error. So I used json.load() as suggested in #3, and I think a lot of people are using json.load() as well. My concern is that people like me may not be able to run other people's code that uses read_json(). Sorry about the misspelling.
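For readers following along, here is a minimal sketch of the json.load() route. It assumes the feed is GeoJSON with a top-level `features` list whose items carry `properties` and `geometry.coordinates`. The sample document, its values, and the helper name `flatten_features` are made up for illustration and are not taken from #3.

```python
import io
import json

# Tiny stand-in for the earthquake feed (hypothetical values, for illustration only).
SAMPLE_GEOJSON = """{
  "type": "FeatureCollection",
  "metadata": {"title": "sample feed", "count": 2},
  "features": [
    {"type": "Feature", "id": "ex0001",
     "properties": {"mag": 1.2, "place": "somewhere A"},
     "geometry": {"type": "Point", "coordinates": [-122.5, 37.8, 5.0]}},
    {"type": "Feature", "id": "ex0002",
     "properties": {"mag": 4.7, "place": "somewhere B"},
     "geometry": {"type": "Point", "coordinates": [142.1, 38.3, 30.0]}}
  ]
}"""


def flatten_features(fp):
    """Load GeoJSON from a file-like object and return one flat dict per feature."""
    doc = json.load(fp)
    rows = []
    for feature in doc.get("features", []):
        # Start from the properties, then add the id and the coordinates.
        row = dict(feature.get("properties", {}))
        row["id"] = feature.get("id")
        lon, lat, depth = feature["geometry"]["coordinates"][:3]
        row["longitude"], row["latitude"], row["depth"] = lon, lat, depth
        rows.append(row)
    return rows


rows = flatten_features(io.StringIO(SAMPLE_GEOJSON))
for row in rows:
    print(row["id"], row["mag"], row["latitude"], row["longitude"])
```

Because every row is a flat dict with the same keys, passing `rows` to `pandas.DataFrame(rows)` should work even on a pandas version that lacks read_json().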

aculich · 2013-10-21T05:57:28Z

The Steps to Curate Data: Issue #8 contains most of the answer to this problem. An alternative acceptable method would be to use the CSV version of the new data which is available here:

http://earthquake.usgs.gov/earthquakes/feed/v1.0/csv.php
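A rough sketch of the CSV route, using only the standard library so it does not depend on any particular pandas version. The column names (`time`, `latitude`, `longitude`, `depth`, `mag`, `place`) and the sample rows are assumptions for illustration; check them against the header of the file you actually download.

```python
import csv
import io

# Hypothetical sample in the assumed column layout (values are made up).
SAMPLE_CSV = """time,latitude,longitude,depth,mag,place
2013-10-19T00:00:00.000Z,37.8,-122.5,5.0,1.2,"somewhere A"
2013-10-19T01:30:00.000Z,38.3,142.1,30.0,4.7,"somewhere B"
"""

NUMERIC_COLUMNS = ("latitude", "longitude", "depth", "mag")


def read_quakes(fp):
    """Read earthquake rows from a CSV file-like object, converting numeric columns."""
    quakes = []
    for record in csv.DictReader(fp):
        for column in NUMERIC_COLUMNS:
            # Empty cells become None instead of raising on float("").
            record[column] = float(record[column]) if record[column] else None
        quakes.append(record)
    return quakes


quakes = read_quakes(io.StringIO(SAMPLE_CSV))
for quake in quakes:
    print(quake["time"], quake["mag"], quake["place"])
```

With a downloaded file, the same function can be called on `open("your_file.csv", newline="")`. If pandas is available, `pandas.read_csv` on that file is the shorter equivalent.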

sunnymh · 2013-10-21T06:08:50Z

@aculich #8 uses json.load() as well, which doesn't require us to install the latest version of Pandas to run read_json(). So on Tuesday, people who use json.load() might not be able to run read_json() if some other groups happen to use it. Is that going to be a problem?

aculich · 2013-10-21T06:56:13Z

What you describe here is an interesting conundrum, and it illustrates why we are using a virtual machine. The code that uses read_json() needs pandas upgraded to version 0.12, but that might impact other code the person has on their machine if it relies on an earlier version of pandas. In general the pandas code is probably forwards-compatible, but you can't be sure. So as long as you provide instructions in your version of the README.md file, you should be able to get other people to upgrade their version of pandas to run your code. If we were not using a virtual machine, this could cause real problems that lead to dependency hell, in which multiple conflicting versions of packages need to be installed. We will discuss this in class, along with how we can use VMs as a strategy to handle this problem. Whichever strategy you've chosen will likely work okay for Tuesday's code review, but in practice it is important to be mindful of the implications.
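One way to make the README instructions harder to miss is a small version guard at the top of a script. This is only a sketch: the helper names `parse_version` and `supports_read_json` are hypothetical, and the 0.12 threshold is the version mentioned in this thread.

```python
import re

# Minimum pandas version that provides read_json(), per the discussion above.
MIN_VERSION = (0, 12)


def parse_version(version):
    """Return the (major, minor) part of a version string such as '0.12.0' as a tuple of ints."""
    parts = []
    for piece in version.split(".")[:2]:
        match = re.match(r"\d+", piece)  # tolerate suffixes such as '12rc1'
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def supports_read_json(version):
    """True if the given pandas version string is at least MIN_VERSION."""
    return parse_version(version) >= MIN_VERSION


for candidate in ("0.11.0", "0.12.0"):
    print(candidate, supports_read_json(candidate))
```

In a real script the argument would be `pandas.__version__`, and a failed check could print the upgrade instructions from the README or fall back to the json.load() approach.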
