Use python range instead of numpy arange #12613

Closed

adolph commented Nov 21, 2023

In the Expressions section of the Basics documentation, following along with the example code raises a TypeError. Replacing NumPy's arange with Python's built-in range fixes it.

Error: TypeError: Series constructor called with unsupported type 'ndarray' for the values parameter

Below are the Python and package versions, the output of the current code (with the error), and the output of the code from this pull request (without the error).

Versions of Python and packages

$ python --version
Python 3.11.5
$ pip freeze | grep -i '\(polars\|numpy\)'
numpy==1.26.2
polars==0.19.15
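
The shell commands above report what is installed in the shell's environment, which may differ from what the notebook kernel sees. A minimal sketch for reading the same versions from inside the kernel, using only the standard library's importlib.metadata, could look like this. The helper name installed_versions is illustrative and not part of this pull request.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the environment running this code.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["polars", "numpy"]).items():
        print(f"{name}=={ver}")
```

Running this in a notebook cell and comparing it with the pip freeze output is one way to confirm that both refer to the same environment.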

Current code with error:

import polars as pl
import numpy as np
from datetime import datetime

df2 = pl.DataFrame(
    {
        "x": np.arange(0, 8),
        "y": ["A", "A", "A", "B", "B", "C", "X", "X"],
    }
)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[83], line 5
      2 import numpy as np
      3 from datetime import datetime
----> 5 df2 = pl.DataFrame(
      6     {
      7         "x": np.arange(0, 8),
      8         "y": ["A", "A", "A", "B", "B", "C", "X", "X"],
      9     }
     10 )

File ~/polars/polars-user-guide/python/.venv/lib/python3.11/site-packages/polars/dataframe/frame.py:360, in DataFrame.__init__(self, data, schema, schema_overrides, orient, infer_schema_length, nan_to_null)
    355     self._df = dict_to_pydf(
    356         {}, schema=schema, schema_overrides=schema_overrides
    357     )
    359 elif isinstance(data, dict):
--> 360     self._df = dict_to_pydf(
    361         data,
    362         schema=schema,
    363         schema_overrides=schema_overrides,
    364         nan_to_null=nan_to_null,
    365     )
    367 elif isinstance(data, (list, tuple, Sequence)):
    368     self._df = sequence_to_pydf(
    369         data,
    370         schema=schema,
   (...)
    373         infer_schema_length=infer_schema_length,
    374     )

File ~/polars/polars-user-guide/python/.venv/lib/python3.11/site-packages/polars/utils/_construction.py:907, in dict_to_pydf(data, schema, schema_overrides, nan_to_null)
    898     data_series = [
    899         pl.Series(
    900             name, [], dtype=schema_overrides.get(name), nan_to_null=nan_to_null
    901         )._s
    902         for name in column_names
    903     ]
    904 else:
    905     data_series = [
    906         s._s
--> 907         for s in _expand_dict_scalars(
    908             data, schema_overrides=schema_overrides, nan_to_null=nan_to_null
    909         ).values()
    910     ]
    912 data_series = _handle_columns_arg(data_series, columns=column_names, from_dict=True)
    913 pydf = PyDataFrame(data_series)

File ~/polars/polars-user-guide/python/.venv/lib/python3.11/site-packages/polars/utils/_construction.py:815, in _expand_dict_scalars(data, schema_overrides, order, nan_to_null)
    812     updated_data[name] = s
    814 elif arrlen(val) is not None or _is_generator(val):
--> 815     updated_data[name] = pl.Series(
    816         name=name, values=val, dtype=dtype, nan_to_null=nan_to_null
    817     )
    818 elif val is None or isinstance(  # type: ignore[redundant-expr]
    819     val, (int, float, str, bool, date, datetime, time, timedelta)
    820 ):
    821     updated_data[name] = pl.Series(
    822         name=name, values=[val], dtype=dtype
    823     ).extend_constant(val, array_len - 1)

File ~/polars/polars-user-guide/python/.venv/lib/python3.11/site-packages/polars/series/series.py:324, in Series.__init__(self, name, values, dtype, strict, nan_to_null, dtype_if_empty)
    316     self._s = iterable_to_pyseries(
    317         name,
    318         values,
   (...)
    321         strict=strict,
    322     )
    323 else:
--> 324     raise TypeError(
    325         f"Series constructor called with unsupported type {type(values).__name__!r}"
    326         " for the `values` parameter"
    327     )

TypeError: Series constructor called with unsupported type 'ndarray' for the `values` parameter

Updated code without error:

import polars as pl
import numpy as np
from datetime import datetime

df2 = pl.DataFrame(
    {
        "x": range(0, 8),
        "y": ["A", "A", "A", "B", "B", "C", "X", "X"],
    }
)
df2

shape: (8, 2)

x    y
i64  str
0    "A"
1    "A"
2    "A"
3    "B"
4    "B"
5    "C"
6    "X"
7    "X"

adolph commented Nov 21, 2023

I would like to withdraw this PR. I'm not able to reproduce the error in a fresh notebook.
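
For anyone who wants to repeat that check, a small script run in a fresh interpreter can build the frame both ways and compare the x column. This is only a sketch and not part of the pull request: compare_constructions is a hypothetical helper name, and it returns None when polars or numpy is not installed. If the TypeError from the original report does occur, it will propagate from the np.arange construction.

```python
def compare_constructions():
    """Build the example frame from np.arange and from range, then compare "x".

    Returns True if both columns hold the same values, or None if polars or
    numpy is not installed in the current environment.
    """
    try:
        import numpy as np
        import polars as pl
    except ImportError:
        return None

    y = ["A", "A", "A", "B", "B", "C", "X", "X"]
    # The documentation's version: a NumPy array as the column values.
    from_arange = pl.DataFrame({"x": np.arange(0, 8), "y": y})
    # This pull request's version: Python's built-in range.
    from_range = pl.DataFrame({"x": range(0, 8), "y": y})
    # Compare as plain Python lists so integer width differences do not matter.
    return from_arange["x"].to_list() == from_range["x"].to_list()


if __name__ == "__main__":
    print(compare_constructions())
```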
