Pandas Apply Lambda

The ultimate tool when dealing with a pandas DataFrame!!!


Using the apply function with lambda

The general syntax is:

df.apply(lambda x: func(x['col1'], x['col2']), axis=1)

This will allow you to create pretty much any logic, I promise!!!
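As a minimal sketch of that syntax on toy data: `func` here is a hypothetical helper that simply adds its two arguments, and the DataFrame is made up for illustration. With `axis=1`, pandas passes each row to the lambda as a Series, so `x['col1']` is a single value.

```python
import pandas as pd

# Toy data; the column names 'col1' and 'col2' mirror the general syntax.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 20, 30]})


def func(a, b):
    """Hypothetical helper: any function of the two values would work here."""
    return a + b


# axis=1 means "row by row": x is one row, indexed by column name.
df["result"] = df.apply(lambda x: func(x["col1"], x["col2"]), axis=1)

print(df["result"].tolist())  # [11, 22, 33]
```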

📁 Data

To perform the challenges you will use the dataset /data/input/IMDB-Movie-Data.csv

🐼 Challenge 1. Warm up

We want to bin movies by the number of votes they have received. To do so, we will create a new column named 'bin' that tags every movie as follows:

  • From 0 to 999 ==> 1
  • From 1000 to 9999 ==> 2
  • From 10000 to 99999 ==> 3
  • From 100000 to 999999 ==> 4
  • 1000000 or more ==> 5
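One possible way to express these bins, sketched on toy data rather than the real file. The column name `Votes` is an assumption about the dataset, `vote_bin` is a hypothetical helper, and the sketch assumes that a count of exactly 1000000 belongs to bin 5.

```python
import pandas as pd

# Toy stand-in for the dataset; the column name 'Votes' is an assumption.
movies = pd.DataFrame({"Votes": [61, 1000, 54321, 999999, 1791916]})


def vote_bin(votes):
    """Hypothetical helper: map a vote count to its bin number."""
    if votes < 1_000:
        return 1
    if votes < 10_000:
        return 2
    if votes < 100_000:
        return 3
    if votes < 1_000_000:
        return 4
    return 5  # assumption: exactly 1000000 also lands here


# Only one column is needed, so apply runs on the Series and needs no axis.
movies["bin"] = movies["Votes"].apply(lambda v: vote_bin(v))

print(movies["bin"].tolist())  # [1, 2, 3, 4, 5]
```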

🐼 🐼 Challenge 2. Using axis concept

We want to know the revenue per minute of runtime for every movie.
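A rough sketch of the row-wise version, on made-up numbers. The column names `Revenue (Millions)` and `Runtime (Minutes)` are assumptions about the CSV headers; adjust them to whatever the file actually contains.

```python
import pandas as pd

# Toy rows; both column names are assumptions about the real dataset.
movies = pd.DataFrame({
    "Title": ["A", "B"],
    "Runtime (Minutes)": [120, 100],
    "Revenue (Millions)": [300.0, 50.0],
})

# axis=1 hands each row to the lambda, so two columns can be combined.
movies["revenue_per_minute"] = movies.apply(
    lambda row: row["Revenue (Millions)"] / row["Runtime (Minutes)"],
    axis=1,
)

print(movies["revenue_per_minute"].tolist())  # [2.5, 0.5]
```

If a revenue value is missing (NaN), the division yields NaN for that row, which may or may not be what you want.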

🐼 🐼 🐼 Challenge 3. Using the lambda

We want to create a new ranking where we add 1 point if the genre is thriller but subtract 1 point if the genre is comedy.
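The challenge leaves the starting score open, so the sketch below makes several assumptions: the base score is a `Rating` column, `Genre` holds a comma-separated string of genres, and a movie tagged as both thriller and comedy nets out to zero. `adjusted_rating` is a hypothetical helper and the data is invented.

```python
import pandas as pd

# Toy rows; 'Genre' as a comma-separated string and 'Rating' as the
# base score are both assumptions.
movies = pd.DataFrame({
    "Genre": ["Action,Thriller", "Comedy", "Drama", "Comedy,Thriller"],
    "Rating": [7.0, 6.5, 8.0, 5.0],
})


def adjusted_rating(genre, rating):
    """Hypothetical helper: +1 for thriller, -1 for comedy."""
    genres = [g.strip().lower() for g in genre.split(",")]
    # Booleans act as 0/1, so this is +1, -1, or 0 when both are present.
    bonus = ("thriller" in genres) - ("comedy" in genres)
    return rating + bonus


movies["new_ranking"] = movies.apply(
    lambda row: adjusted_rating(row["Genre"], row["Rating"]), axis=1
)

print(movies["new_ranking"].tolist())  # [8.0, 5.5, 8.0, 5.0]
```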

🐼 🐼 🐼 🐼 Challenge 4. Now the real stuff

We want to know whether the sum of the ASCII values of every character in the movie title, divided by the number of votes, yields a prime number... remember, prime numbers are integers.
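One way to approach it, sketched with invented titles and vote counts. The column names `Title` and `Votes` are assumptions, both helpers are hypothetical, and `ord` is used for the character values (it matches ASCII for plain ASCII titles). Because primes are integers, the sketch first checks that the division is exact.

```python
import pandas as pd


def is_prime(n):
    """Trial-division primality test for a non-negative integer."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def title_ratio_is_prime(title, votes):
    """Hypothetical helper: is sum(ord(char)) / votes an integer prime?"""
    total = sum(ord(ch) for ch in title)
    if votes <= 0 or total % votes != 0:
        return False  # not an integer, so it cannot be prime
    return is_prime(total // votes)


# Toy rows: 'Up' sums to 85 + 112 = 197, 'AA' sums to 65 + 65 = 130.
movies = pd.DataFrame({
    "Title": ["Up", "Up", "AA", "AA"],
    "Votes": [1, 2, 10, 13],
})

movies["is_prime"] = movies.apply(
    lambda row: title_ratio_is_prime(row["Title"], row["Votes"]), axis=1
)

print(movies["is_prime"].tolist())  # [True, False, True, False]
```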

🐼 🐼 🐼 🐼 🐼 Challenge 5. And finally some fantasy

Feel free to propose your own ranking based on aggregations of at least 3 columns of the dataset.

🐼 🐼 🐼 🐼 🐼 🐼 Challenge 6. Freaky bonus

We want to know which movies might have hidden patterns in their description. One way to find out is to look for movies where the sum of all numeric values in the SHA256 hash of the description string is between their revenue and their number of votes.
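The wording allows more than one reading, so this sketch fixes a few assumptions: "numeric values" means the decimal digits 0 to 9 in the hexadecimal SHA256 digest, "between" is inclusive and works whichever of revenue or votes is larger, and the column names `Description`, `Revenue (Millions)` and `Votes` match the dataset. The helpers and the rows are hypothetical.

```python
import hashlib

import pandas as pd


def hash_digit_sum(text):
    """Sum the decimal digits found in the SHA256 hex digest of text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return sum(int(ch) for ch in digest if ch.isdigit())


def is_between(value, a, b):
    """Inclusive range check that does not care which bound is larger."""
    return min(a, b) <= value <= max(a, b)


# Toy rows. A digest has 64 hex characters, so the digit sum is at most
# 64 * 9 = 576: the first row always matches, the second never does.
movies = pd.DataFrame({
    "Description": ["A thief steals secrets.", "Two friends travel."],
    "Revenue (Millions)": [0.0, 1000.0],
    "Votes": [10000, 2000],
})

movies["hidden_pattern"] = movies.apply(
    lambda row: is_between(
        hash_digit_sum(row["Description"]),
        row["Revenue (Millions)"],
        row["Votes"],
    ),
    axis=1,
)

print(movies["hidden_pattern"].tolist())  # [True, False]
```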