BUG: Fix from_records() column reorder issue, if columns!=None use passed param (#59717) #59809

Merged
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -624,6 +624,7 @@ I/O
- Bug in :meth:`DataFrame.to_stata` when writing :class:`DataFrame` and ``byteorder=`big```. (:issue:`58969`)
- Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
- Bug in :meth:`HDFStore.get` was failing to save data of dtype datetime64[s] correctly (:issue:`59004`)
- Bug in :meth:`DataFrame.from_records` where the ``columns`` parameter was not reordering and filtering the columns when the data is a NumPy structured array (:issue:`59717`)
- Bug in :meth:`read_csv` causing segmentation fault when ``encoding_errors`` is not a string. (:issue:`59059`)
- Bug in :meth:`read_csv` raising ``TypeError`` when ``index_col`` is specified and ``na_values`` is a dict containing the key ``None``. (:issue:`57547`)
- Bug in :meth:`read_csv` raising ``TypeError`` when ``nrows`` and ``iterator`` are specified without specifying a ``chunksize``. (:issue:`59079`)
3 changes: 2 additions & 1 deletion pandas/core/internals/construction.py
@@ -750,7 +750,8 @@ def to_arrays(

     elif isinstance(data, np.ndarray) and data.dtype.names is not None:
         # e.g. recarray
-        columns = Index(list(data.dtype.names))
+        if columns is None:
+            columns = Index(list(data.dtype.names))
         arrays = [data[k] for k in columns]
         return arrays, columns
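
The effect of this hunk can be sketched outside pandas. The snippet below is a minimal illustration using only NumPy, not the pandas implementation: `select_fields` is a hypothetical helper that mirrors the patched branch, assuming `columns` is either `None` or a list of field names present in the structured array.

```python
import numpy as np


def select_fields(data, columns=None):
    """Hypothetical stand-in for the patched structured-array branch.

    Only falls back to the dtype's field names when no columns are passed,
    so a caller-supplied list controls both order and selection.
    """
    if data.dtype.names is None:
        raise TypeError("expected a structured array or recarray")
    if columns is None:
        columns = list(data.dtype.names)
    # Indexing a structured array by field name returns that field's values.
    arrays = [data[k] for k in columns]
    return arrays, list(columns)


data = np.array(
    [("John", 25, "New York", 50000), ("Jane", 30, "San Francisco", 75000)],
    dtype=[("name", "U10"), ("age", "i4"), ("city", "U15"), ("salary", "i4")],
)

# Passed columns are honored: "age" is dropped and "salary" precedes "city".
arrays, cols = select_fields(data, ["name", "salary", "city"])

# Without columns, every field is returned in dtype order.
all_arrays, all_cols = select_fields(data)
```

Before the change, the equivalent of `columns` was overwritten unconditionally with the dtype's field names, so the caller's order and selection were ignored at this step.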

34 changes: 34 additions & 0 deletions pandas/tests/frame/methods/test_to_arrays.py
@@ -0,0 +1,34 @@
import numpy as np
from numpy import array

import pandas._testing as tm
from pandas.core.indexes.api import ensure_index
from pandas.core.internals.construction import to_arrays


def test_to_arrays():
    # GH 59717
    data = np.array(
        [
            ("John", 25, "New York", 50000),
            ("Jane", 30, "San Francisco", 75000),
            ("Bob", 35, "Chicago", 65000),
            ("Alice", 28, "Los Angeles", 60000),
        ],
        dtype=[("name", "U10"), ("age", "i4"), ("city", "U15"), ("salary", "i4")],
    )

    columns = ["name", "salary", "city"]
    indexed_columns = ensure_index(columns)

    actual_arrays, actual_cols = to_arrays(data, indexed_columns)
    expected_arrays = [
        array(["John", "Jane", "Bob", "Alice"], dtype="<U10"),
        array([50000, 75000, 65000, 60000], dtype="int32"),
        array(["New York", "San Francisco", "Chicago", "Los Angeles"], dtype="<U15"),
    ]

    for actual, expected in zip(actual_arrays, expected_arrays):
        tm.assert_numpy_array_equal(actual, expected)

    assert actual_cols.equals(indexed_columns)