
Add module for exporting the data back out to csv #4

Open · wants to merge 1 commit into master
Conversation

spencerrecneps
Contributor

I've added a module to export the data back out to csv. It's not quite complete but I think I'll need some help with the final piece. I've got it looping through each GTFS table and creating the appropriate .txt file, but I can't figure out how to get the table headers and rows written to the csv file. It seems like it should be easy to do and I've been reading through the SQLAlchemy docs, but I'm just not getting it. Do you mind taking a look?
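For reference, one way to get the headers and rows out is to read the column names from each mapped class's `__table__.columns` and pull the values with `getattr`. This is a minimal self-contained sketch using the modern SQLAlchemy API, not this repo's actual module; the `Stop` class here is a stand-in for one of the real GTFS entities:

```python
import csv
from sqlalchemy import Column, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Stop(Base):  # stand-in for one of the gtfs_entities classes
    __tablename__ = 'stops'
    stop_id = Column(String, primary_key=True)
    stop_name = Column(String)
    stop_lat = Column(String)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Stop(stop_id='S1', stop_name='Main St', stop_lat='40.0'))
    session.commit()

    # __table__.columns gives the header names in declaration order;
    # getattr pulls the matching field off each row object.
    columns = [c.name for c in Stop.__table__.columns]
    with open('stops.txt', 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        for obj in session.query(Stop):
            writer.writerow([getattr(obj, c) for c in columns])
```

The same loop would run once per GTFS table, writing each table's `.txt` file.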

You can ignore the pull request for now. I can re-send once I have it all working properly.

P.S. I'm pretty new to all of this so if there's a better way to have you review what I'm doing, please suggest.

@spencerrecneps
Contributor Author

Also, please disregard the version change. I was playing around with things on my local copy and forgot to revert that line.

@jarondl
Owner

jarondl commented Feb 2, 2014

That's a good start, but there is indeed no point in merging until it is more complete.
The issues we'll need to solve are:

  • A csv might omit optional columns, which end up as NULL in the db, so the exporter currently cannot tell the difference between a missing column and a missing field. Are we okay with that?
  • Some fields are "validated", or in fact converted, on input to the database; for example, a boolean is stored in the csv as 0 or 1. We need to figure out a way to revert this conversion on export. This is vital and needs to be done anyway, and it may require changing something more basic in the architecture.

Let's try to work on those issues, and keep the conversation here.
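To illustrate the second point: for every field converted on the way in, the exporter needs the inverse conversion on the way out. A minimal sketch with a hypothetical converter table (these names are invented for illustration, not from this codebase):

```python
from datetime import date

# input conversions applied when loading a GTFS feed (illustrative)
to_db = {
    'wheelchair_accessible': lambda s: bool(int(s)),  # '1' -> True
    'start_date': lambda s: date(int(s[:4]), int(s[4:6]), int(s[6:8])),
}

# the inverse conversions the exporter would need
from_db = {
    'wheelchair_accessible': lambda b: str(int(b)),   # True -> '1'
    'start_date': lambda d: d.strftime('%Y%m%d'),
}

# each pair must round-trip: csv -> db -> csv gives back the original text
value = to_db['start_date']('20140202')
assert from_db['start_date'](value) == '20140202'
```

The hard part is not the conversion itself but making sure every converted field has a registered inverse, which is why it may touch the architecture.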

@spencerrecneps
Contributor Author

The only columns that could be overlooked are optional columns according to the spec, right? In other words, if our model defines a column as nullable, it might be left out. I think that should be OK as long as our definitions of nullable columns correspond to the optional columns in the spec.

@jarondl
Owner

jarondl commented Feb 4, 2014

The point was that we cannot tell the difference between a missing column and a column full of nulls.
I guess we could just output empty columns.

Yeah, optional in specs <=> nullable in our model.

@spencerrecneps
Contributor Author

I'm OK with outputting empty columns.

As for the field validation, what are the challenges? I don't imagine the mechanics of converting between, for example, binary and 0/1 is too difficult. Is the real issue determining the best way to operationalize the conversion? For instance, do we store that information within the relevant class in gtfs_entities, or do we create a new type_conversion class to handle it?
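One way to sketch the first option (all names here are hypothetical, not existing code): pair each input conversion with its inverse in a small descriptor, keep a per-class mapping on each entity in gtfs_entities, and have the exporter just look it up:

```python
class Field:
    """Pairs the input conversion with its inverse for one GTFS column."""
    def __init__(self, to_db, from_db):
        self.to_db = to_db
        self.from_db = from_db

# shared converter instances, reused across entity classes
BOOL = Field(to_db=lambda s: s == '1', from_db=lambda b: '1' if b else '0')
TEXT = Field(to_db=str, from_db=str)

class Trip:  # stand-in for a gtfs_entities class
    fields = {'trip_id': TEXT, 'wheelchair_accessible': BOOL}

def export_value(entity_cls, column, db_value):
    # the exporter never needs type-specific logic of its own
    return entity_cls.fields[column].from_db(db_value)
```

A separate type_conversion module would hold the same `Field` objects but keep the column-to-converter mapping outside the entity classes; either way the exporter stays generic.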

Again, sorry for my ignorance - I'm still learning how to think correctly about these kinds of problems.

@vingerha
Contributor

Just reading this and I'd like to understand the use case, but since this is 10 years old I guess there is none anymore?

3 participants