Dataframe class name #17

Open
datapythonista opened this issue Jun 23, 2020 · 4 comments
Comments

@datapythonista
Member

datapythonista commented Jun 23, 2020

I think there is consensus (correct me if I'm wrong) on having a 2-D structure where (at least) the columns are labelled, and where a whole column shares a type. More specific discussions about this structure can be had in #2.

In this issue, I'd like to discuss how we should name the class representing this structure. We've been using dataframe for the concept so far, and it's how the class is named in pandas, vaex, Modin, R and others. But in #14 (comment) it was proposed that we consider other names. I list here the options proposed in that comment and a couple more. I propose that people write their username next to their preferred option, and use the comments to expand on why if needed.

Also, I think we should decide on capitalization. I guess these are the only options (using dataframe as an example, but applied to the preferred option from the above list):

@datapythonista
Member Author

datapythonista commented Jun 25, 2020

I personally think that in Python there is some consistency in using lowercase names for types and data structures: int, str, list, datetime, dict, tuple, array,... and it feels like dataframe belongs to that group more than to general classes.

But if @devin-petersohn is ok with DataFrame, and there are no more opinions, I'll open a PR in the RFC for DataFrame (since it's the one being used so far).
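
To make the lowercase observation above concrete, here is a small sketch that inspects the names of the types listed in the comment. The CapWords counter-examples (OrderedDict, Decimal, Fraction) are not from the thread; they are added here only as an assumption-free illustration that the standard library is not fully uniform.

```python
import array
import collections
import datetime
import decimal
import fractions

# Types named in the comment above; all are builtins or standard-library types.
lowercase_types = [int, str, list, dict, tuple, datetime.datetime, array.array]

# Standard-library classes that use CapWords instead
# (picked for illustration; not mentioned in the thread).
capwords_types = [collections.OrderedDict, decimal.Decimal, fractions.Fraction]


def is_lowercase_name(cls):
    """Return True if the class name contains no uppercase letters."""
    return cls.__name__ == cls.__name__.lower()


for cls in lowercase_types + capwords_types:
    print(f"{cls.__name__:12} lowercase={is_lowercase_name(cls)}")
```

So both conventions already coexist in Python itself, which is part of why the choice between dataframe and DataFrame is a judgment call rather than a rule.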

@devin-petersohn
Member

I am only okay with DataFrame if Array (capitalization) is how we are spelling the array protocol. As long as we are consistent between the two, I am okay.

@TomAugspurger

TomAugspurger commented Jun 25, 2020 via email

@rgommers
Member

+1 to matching what the array / Array API standard does.

We didn't specify a name there - there's just "an array object". Reason: not needed (there's no way to call <array>.__init__), and existing libraries will not rename their existing array, Array, NDArray, Tensor objects anyway.

Status quo is DataFrame.
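
A rough sketch of why the class name is not needed on the array side: consuming code can obtain the library's functions from the object itself through the `__array_namespace__` method defined by the array API standard, without ever referring to the class. The `Tensor` and `NDArray` classes and the `total` helper below are hypothetical toys made up for this illustration, and the namespace they return implements only a simplified `sum` that returns a plain Python number rather than an array.

```python
from types import SimpleNamespace


class Tensor:
    """Hypothetical array class; its name is arbitrary."""

    def __init__(self, values):
        self.values = list(values)

    def __array_namespace__(self, *, api_version=None):
        # Toy namespace with a single, simplified function.
        return SimpleNamespace(sum=lambda x: sum(x.values))


class NDArray(Tensor):
    """Second hypothetical class with a different name, same behaviour."""


def total(x):
    # The consumer never mentions Tensor or NDArray by name.
    xp = x.__array_namespace__()
    return xp.sum(x)


print(total(Tensor([1, 2, 3])))
print(total(NDArray([4, 5])))
```

Whether a dataframe standard could rely on the same pattern is a separate question; this only illustrates the array-side reasoning quoted above.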
