Little exercise with endless possibilities...

sacses/pandas-apply-lambda

Pandas Apply Lambda

The ultimate tool when dealing with a pandas DataFrame!!!

Using the apply function with lambda

The general syntax is:

df.apply(lambda x: func(x['col1'], x['col2']), axis=1)
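As a minimal sketch of that syntax, the snippet below applies a stand-in `func` to a toy DataFrame. The data and the body of `func` are made up for illustration; only the `col1`/`col2` placeholder names come from the line above.

```python
import pandas as pd

# Toy data; col1/col2 mirror the placeholder names used in the syntax line.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 20, 30]})


def func(a, b):
    """Hypothetical stand-in for any row-wise logic."""
    return a + b


# axis=1 hands each row to the lambda as a Series,
# so x["col1"] and x["col2"] are single values, not whole columns.
df["result"] = df.apply(lambda x: func(x["col1"], x["col2"]), axis=1)
print(df)
```

With `axis=0` (the default) the lambda would receive whole columns instead of rows, which is why `axis=1` matters whenever the logic combines several columns of the same row.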

This lets you apply pretty much any row-wise logic, I promise!!!

📁 Data

To perform the challenges you will use the dataset /data/imput/IMDB-Movie-Data.csv

🐼 Challenge 1. Warm up

We want to create bins of movies according to the number of votes they've received. To do that, we will create a new column named 'bin' which will tag every movie as follows:

  • From 0 to 1000 ==> 1
  • From 1000 to 10000 ==> 2
  • From 10000 to 100000 ==> 3
  • From 100000 to 1000000 ==> 4
  • More than 1000000 ==> 5
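One possible way to approach the binning is sketched below on made-up rows. It assumes the vote counts live in a column named `Votes`, and it treats each upper bound as inclusive (so exactly 1000 votes falls in bin 1), since the list above leaves the boundaries open; `votes_to_bin` is a hypothetical helper name.

```python
import pandas as pd

# Made-up rows; "Votes" is an assumed column name.
df = pd.DataFrame({
    "Title": ["A", "B", "C", "D", "E"],
    "Votes": [500, 1_000, 50_000, 750_000, 1_800_000],
})


def votes_to_bin(votes):
    """Map a vote count to a bin, treating upper bounds as inclusive (an assumption)."""
    if votes <= 1_000:
        return 1
    if votes <= 10_000:
        return 2
    if votes <= 100_000:
        return 3
    if votes <= 1_000_000:
        return 4
    return 5


# Only one column is involved, so Series.apply is enough (no axis needed).
df["bin"] = df["Votes"].apply(lambda v: votes_to_bin(v))
print(df)
```

For plain numeric binning like this, `pandas.cut` is a vectorized alternative; the `apply` version is shown because it is the subject of the exercise.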

🐼 🐼 Challenge 2. Using axis concept

We want to know the revenue per minute of every movie.
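A sketch of the row-wise division follows, using made-up rows. The column names `Revenue (Millions)` and `Runtime (Minutes)` are assumptions about the CSV header and may need adjusting; the new column name is hypothetical.

```python
import pandas as pd

# Made-up rows; both column names are assumptions about the dataset.
df = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Revenue (Millions)": [120.0, 45.0, None],  # None becomes NaN
    "Runtime (Minutes)": [100, 90, 120],
})

# axis=1 is required here because the result combines two columns of the same row.
df["revenue_per_minute"] = df.apply(
    lambda x: x["Revenue (Millions)"] / x["Runtime (Minutes)"], axis=1
)
print(df)
```

Rows with a missing revenue simply produce NaN, which keeps them easy to filter out afterwards.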

🐼 🐼 🐼 Challenge 3. Using the lambda

We want to create a new ranking where we add 1 point if the genre is thriller but subtract 1 point if the genre is comedy.
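The sketch below shows one reading of this challenge on made-up rows. It assumes the base score is a `Rating` column and that `Genre` holds comma-separated genre names; `adjusted_rating` and `new_ranking` are hypothetical names.

```python
import pandas as pd

# Made-up rows; "Rating" and a comma-separated "Genre" column are assumptions.
df = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Genre": ["Crime,Thriller", "Comedy", "Drama", "Comedy,Thriller"],
    "Rating": [7.0, 6.5, 8.0, 5.0],
})


def adjusted_rating(rating, genre):
    """Add 1 point for thriller and subtract 1 for comedy (case-insensitive)."""
    genres = [g.strip().lower() for g in genre.split(",")]
    bonus = int("thriller" in genres) - int("comedy" in genres)
    return rating + bonus


df["new_ranking"] = df.apply(
    lambda x: adjusted_rating(x["Rating"], x["Genre"]), axis=1
)
print(df)
```

A movie tagged with both genres ends up unchanged under this reading, because the bonus and the penalty cancel out.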

🐼 🐼 🐼 🐼 Challenge 4. Now the real stuff

We want to know whether the sum of the ASCII values of every character in the movie title, divided by the number of votes, yields a prime number... remember, prime numbers are integers.
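A possible sketch of this check is shown below on made-up rows. It assumes columns named `Title` and `Votes`, treats a non-integer quotient as "not prime", and uses two hypothetical helpers, `is_prime` and `title_votes_prime`.

```python
import pandas as pd

# Made-up rows chosen so that the quotients are easy to verify by hand.
df = pd.DataFrame({
    "Title": ["A", "AB", "B", "C"],
    "Votes": [13, 1, 11, 2],
})


def is_prime(n):
    """Trial division; fine for small integers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def title_votes_prime(title, votes):
    """True if sum(ord(c)) / votes is an integer and that integer is prime."""
    votes = int(votes)
    if votes <= 0:
        return False
    # ord() returns the ASCII value for ASCII characters
    # (and the Unicode code point for anything else).
    quotient, remainder = divmod(sum(ord(c) for c in title), votes)
    return remainder == 0 and is_prime(quotient)


df["is_prime"] = df.apply(
    lambda x: title_votes_prime(x["Title"], x["Votes"]), axis=1
)
print(df)
```

Using `divmod` keeps the arithmetic in integers, which avoids floating-point rounding when deciding whether the quotient is whole.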

🐼 🐼 🐼 🐼 🐼 Challenge 5. And finally some fantasy

Feel free to propose your own ranking based on aggregations of at least 3 columns of the dataset.

🐼 🐼 🐼 🐼 🐼 🐼 Challenge 6. Freaky bonus

We want to know which movies might have hidden patterns in their description. One way to find out is to look for movies where the sum of all numeric values in the SHA256 hash of the description string falls between their revenue and their number of votes.
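The sketch below takes one interpretation of this challenge: "numeric values" is read as the decimal digits in the hexadecimal SHA256 digest, and "between" is read as inclusive with the two bounds in either order. The column names `Description`, `Revenue (Millions)` and `Votes` are assumptions, the rows are made up, and both helper names are hypothetical.

```python
import hashlib

import pandas as pd

# Made-up rows; the column names are assumptions about the dataset.
df = pd.DataFrame({
    "Description": ["A heist goes wrong.", "Two friends travel."],
    "Revenue (Millions)": [0.0, 600.0],
    "Votes": [1000, 1000],
})


def hash_digit_sum(text):
    """Sum the decimal digits (0-9) found in the hex SHA256 digest of text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return sum(int(ch) for ch in digest if ch.isdigit())


def has_hidden_pattern(description, revenue, votes):
    """True if the digit sum lies between revenue and votes (either order, inclusive)."""
    low, high = min(revenue, votes), max(revenue, votes)
    return low <= hash_digit_sum(description) <= high


df["hidden_pattern"] = df.apply(
    lambda x: has_hidden_pattern(
        x["Description"], x["Revenue (Millions)"], x["Votes"]
    ),
    axis=1,
)
print(df)
```

A hex digest has 64 characters, so the digit sum can never exceed 9 × 64 = 576; any movie whose revenue and votes are both above that value can never match under this interpretation.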

