Decision on handling JSON fields in download #61
Comments
Just on the record:
Others:
Agree, me too.
Easy thing to do. Just let it be documented, somewhere. Finally:
I just had to work around this exact issue again, so it's fresh in my mind. I agree with @dmpe on all points, and would add a little more. Details about
I found a situation where you would not get all the columns in JSON that you would in CSV (if you skipped the "nested" fields as I was proposing). The
This is for discussion purposes...
The program development team is leaning toward using JSON as the default download format. This decision is based on JSON being a faster and more reliable download method, which was confirmed with the Socrata development team.
However, JSON and CSV differ in an important way. JSON files have three fields that are not available in the CSV and not seen in the web interface:
location.latitude
location.longitude
location.needs_geocoding
The first two items are parsimonious breakouts of the concatenated location column that is generated by Socrata. However, these columns may be highly redundant, since it's common practice to upload "latitude" and "longitude" columns, which are then used to create the "location" column.
If we keep location.latitude and location.longitude, we should convert them to numbers so they serve their practical function.
The location.needs_geocoding field is less useful. It's an internal flag Socrata uses to handle geocoding for display on their maps. This is pertinent when the location column is not lat/long but an address field. I don't see much reason to keep this column, except that keeping it is the easier thing to do.
Something to keep in mind is the consistency of what one sees in R versus the web browser. A valid SoDA endpoint (e.g., example.com/resource/four-four.csv or example.com/resource/four-four.json) can be viewed in the browser. In our case, the download always chooses JSON, so the columns can be inconsistent with what the CSV shows. We need to balance this in the discussion.
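To make the conversion idea concrete, here is a minimal sketch of flattening the nested location object and casting latitude and longitude to numbers. It is written in Python purely for illustration and is not the package's implementation. The sample record, the `id` field, and the helper name `flatten_location` are hypothetical, and the nested key names simply follow the field names used in this issue.

```python
import json

# Hypothetical sample record; real SoDA responses may be shaped differently.
SAMPLE = """
[{"id": "1",
  "location": {"latitude": "41.88", "longitude": "-87.63",
               "needs_geocoding": false}}]
"""


def flatten_location(row):
    """Return a copy of row with the nested location expanded to dotted keys.

    Latitude and longitude arrive as strings in this sample, so they are
    cast to float; any other nested key is copied through unchanged.
    """
    flat = {k: v for k, v in row.items() if k != "location"}
    nested = row.get("location") or {}
    for key, value in nested.items():
        if key in ("latitude", "longitude") and value is not None:
            value = float(value)
        flat["location." + key] = value
    return flat


rows = json.loads(SAMPLE)
flat_rows = [flatten_location(r) for r in rows]
```

With the numeric cast in place, the two coordinate columns can be used directly for distance calculations or plotting instead of being re-parsed by every caller.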
Open for discussion on the best way to handle these three columns: keep them, drop (some of) them, or use trickier logic to align them to the request (e.g., drop them for CSV requests, keep them otherwise).
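The three options above can be compared side by side with a small sketch. This is Python for illustration only; the policy names ("keep", "drop_flag", "align"), the function `apply_policy`, and the sample row are all hypothetical and do not correspond to any existing argument in the package.

```python
# The three JSON-only fields named in this issue.
JSON_ONLY_FIELDS = (
    "location.latitude",
    "location.longitude",
    "location.needs_geocoding",
)


def apply_policy(row, policy, requested_format="json"):
    """Return a copy of row with JSON-only fields filtered by policy.

    keep      -> leave every field in place
    drop_flag -> drop only location.needs_geocoding
    align     -> drop all three fields when the caller asked for CSV,
                 keep them otherwise
    """
    if policy == "keep":
        drop = ()
    elif policy == "drop_flag":
        drop = ("location.needs_geocoding",)
    elif policy == "align":
        drop = JSON_ONLY_FIELDS if requested_format == "csv" else ()
    else:
        raise ValueError("unknown policy: " + repr(policy))
    return {k: v for k, v in row.items() if k not in drop}


# Hypothetical flattened row used to exercise each policy.
row = {
    "id": "1",
    "location.latitude": 41.88,
    "location.longitude": -87.63,
    "location.needs_geocoding": False,
}
kept = apply_policy(row, "keep")
no_flag = apply_policy(row, "drop_flag")
aligned_csv = apply_policy(row, "align", requested_format="csv")
aligned_json = apply_policy(row, "align", requested_format="json")
```

The "align" option is the only one that needs to know which format the user requested, which is the extra complexity the "trickier logic" wording refers to.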
/cc @dmpe @geneorama