TypeError in chapter 3 with python 3.7 #4

Open
rrcook opened this issue Jul 6, 2019 · 1 comment

rrcook commented Jul 6, 2019

macOS 10.14 Mojave, Python 3.7.
The cell

#cat_to_num(data['Sex'])
features = prepare_data(data_train)
features[:5]

gives:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-1224e570c6ab> in <module>
      1 #cat_to_num(data['Sex'])
----> 2 features = prepare_data(data_train)
      3 features[:5]

<ipython-input-5-07c5dd9cc834> in prepare_data(data)
     16 
     17     # Adding Embarked categorical value
---> 18     features = features.join( cat_to_num(data['Embarked']) )
     19 
     20     return features

<ipython-input-4-a227bdfcf37c> in cat_to_num(data)
      2 # Changed to automatically add column names
      3 def cat_to_num(data):
----> 4     categories = unique(data)
      5     features = {}
      6     for cat in categories:

/usr/local/lib/python3.7/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
    262     ar = np.asanyarray(ar)
    263     if axis is None:
--> 264         ret = _unique1d(ar, return_index, return_inverse, return_counts)
    265         return _unpack_tuple(ret)
    266 

/usr/local/lib/python3.7/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
    310         aux = ar[perm]
    311     else:
--> 312         ar.sort()
    313         aux = ar
    314     mask = np.empty(aux.shape, dtype=np.bool_)

TypeError: '<' not supported between instances of 'float' and 'str'

rrcook commented Jul 6, 2019

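I've found that the problem is two null values in the Embarked column of data/titanic.csv. pandas reads them as NaN, which is a float, so the sort inside unique() ends up comparing a float with a str, and Python 3 raises the TypeError shown in the traceback. You can either fill in the two values in the CSV or add a fillna in the code, such as: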

    # Adding Embarked categorical value
    features = features.join( cat_to_num(data['Embarked'].fillna('Q')) )
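
For anyone who wants to check this without the full notebook, here is a minimal sketch of the failure and two possible workarounds. The `embarked` Series is a hypothetical stand-in for `data['Embarked']`, not the real column, and the `"Unknown"` placeholder is my own assumption rather than anything from the book's code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data['Embarked']; the real column comes from data/titanic.csv.
embarked = pd.Series(["S", "C", np.nan, "Q", "S", np.nan], name="Embarked")

# Count the missing entries before deciding how to handle them.
n_missing = int(embarked.isnull().sum())
print("missing values:", n_missing)

# unique() sorts its input; sorting a mix of str and float (NaN) is what fails.
try:
    np.unique(embarked)
except TypeError as exc:
    print("unique() on the raw column fails:", exc)

# Option 1 (the fix suggested in this issue): fold the missing rows into an existing port.
filled_q = np.unique(embarked.fillna("Q"))

# Option 2 (assumption): keep the missing rows visible as their own category.
filled_unknown = np.unique(embarked.fillna("Unknown"))

print(filled_q)
print(filled_unknown)
```

Note that `fillna('Q')` silently assigns those two passengers to an existing category, while a separate placeholder adds one extra feature column from `cat_to_num`. Which is preferable depends on how the features are used later.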
