- Download the latest version of Python (3.7) and install it on your system
- Download Anaconda Distribution for Python 3.7 and install it on your system
- Open "Anaconda Prompt" and run "jupyter notebook" to start Jupyter Notebook
Importing the Pandas and Matplotlib Libraries
import pandas as pd
import matplotlib.pyplot as plt
Reading the Dataset into a DataFrame
filename = 'data.csv'
data_raw = pd.read_csv(filename)
Print the imported data
data_raw
Get the first 10 rows
data_raw.head(10)
| | Unnamed: 0 | ID | Name | Age | Photo | Nationality | Flag | Overall | Potential | Club | ... | Composure | Marking | StandingTackle | SlidingTackle | GKDiving | GKHandling | GKKicking | GKPositioning | GKReflexes | Release Clause |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 158023 | L. Messi | 31 | https://cdn.sofifa.org/players/4/19/158023.png | Argentina | https://cdn.sofifa.org/flags/52.png | 94 | 94 | FC Barcelona | ... | 96.0 | 33.0 | 28.0 | 26.0 | 6.0 | 11.0 | 15.0 | 14.0 | 8.0 | €226.5M |
1 | 1 | 20801 | Cristiano Ronaldo | 33 | https://cdn.sofifa.org/players/4/19/20801.png | Portugal | https://cdn.sofifa.org/flags/38.png | 94 | 94 | Juventus | ... | 95.0 | 28.0 | 31.0 | 23.0 | 7.0 | 11.0 | 15.0 | 14.0 | 11.0 | €127.1M |
2 | 2 | 190871 | Neymar Jr | 26 | https://cdn.sofifa.org/players/4/19/190871.png | Brazil | https://cdn.sofifa.org/flags/54.png | 92 | 93 | Paris Saint-Germain | ... | 94.0 | 27.0 | 24.0 | 33.0 | 9.0 | 9.0 | 15.0 | 15.0 | 11.0 | €228.1M |
3 | 3 | 193080 | De Gea | 27 | https://cdn.sofifa.org/players/4/19/193080.png | Spain | https://cdn.sofifa.org/flags/45.png | 91 | 93 | Manchester United | ... | 68.0 | 15.0 | 21.0 | 13.0 | 90.0 | 85.0 | 87.0 | 88.0 | 94.0 | €138.6M |
4 | 4 | 192985 | K. De Bruyne | 27 | https://cdn.sofifa.org/players/4/19/192985.png | Belgium | https://cdn.sofifa.org/flags/7.png | 91 | 92 | Manchester City | ... | 88.0 | 68.0 | 58.0 | 51.0 | 15.0 | 13.0 | 5.0 | 10.0 | 13.0 | €196.4M |
5 | 5 | 183277 | E. Hazard | 27 | https://cdn.sofifa.org/players/4/19/183277.png | Belgium | https://cdn.sofifa.org/flags/7.png | 91 | 91 | Chelsea | ... | 91.0 | 34.0 | 27.0 | 22.0 | 11.0 | 12.0 | 6.0 | 8.0 | 8.0 | €172.1M |
6 | 6 | 177003 | L. Modrić | 32 | https://cdn.sofifa.org/players/4/19/177003.png | Croatia | https://cdn.sofifa.org/flags/10.png | 91 | 91 | Real Madrid | ... | 84.0 | 60.0 | 76.0 | 73.0 | 13.0 | 9.0 | 7.0 | 14.0 | 9.0 | €137.4M |
7 | 7 | 176580 | L. Suárez | 31 | https://cdn.sofifa.org/players/4/19/176580.png | Uruguay | https://cdn.sofifa.org/flags/60.png | 91 | 91 | FC Barcelona | ... | 85.0 | 62.0 | 45.0 | 38.0 | 27.0 | 25.0 | 31.0 | 33.0 | 37.0 | €164M |
8 | 8 | 155862 | Sergio Ramos | 32 | https://cdn.sofifa.org/players/4/19/155862.png | Spain | https://cdn.sofifa.org/flags/45.png | 91 | 91 | Real Madrid | ... | 82.0 | 87.0 | 92.0 | 91.0 | 11.0 | 8.0 | 9.0 | 7.0 | 11.0 | €104.6M |
9 | 9 | 200389 | J. Oblak | 25 | https://cdn.sofifa.org/players/4/19/200389.png | Slovenia | https://cdn.sofifa.org/flags/44.png | 90 | 93 | Atlético Madrid | ... | 70.0 | 27.0 | 12.0 | 18.0 | 86.0 | 92.0 | 78.0 | 88.0 | 89.0 | €144.5M |
10 rows × 89 columns
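With 89 columns, the preview above elides the middle of the table ("..."). The following is a minimal sketch of two ways to see the columns you care about. It assumes a small hypothetical DataFrame named `sample`, built from a few values shown in the table above, in place of `data_raw`.

```python
import pandas as pd

# Hypothetical two-row stand-in for data_raw, using values from the preview above.
sample = pd.DataFrame({
    "Name": ["L. Messi", "De Gea"],
    "Age": [31, 27],
    "Overall": [94, 91],
    "Club": ["FC Barcelona", "Manchester United"],
    "GKReflexes": [8.0, 94.0],
})

# Option 1: select only the columns of interest, then take the first rows.
preview = sample[["Name", "Overall", "Club"]].head(10)
print(preview)

# Option 2: lift the column display limit for a single print call.
with pd.option_context("display.max_columns", None):
    print(sample.head(10))
```

On the real dataset the same calls would be made on `data_raw` instead of `sample`.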
List the column names
data_raw.columns
Index(['Unnamed: 0', 'ID', 'Name', 'Age', 'Photo', 'Nationality', 'Flag', 'Overall', 'Potential', 'Club', 'Club Logo', 'Value', 'Wage', 'Special', 'Preferred Foot', 'International Reputation', 'Weak Foot', 'Skill Moves', 'Work Rate', 'Body Type', 'Real Face', 'Position', 'Jersey Number', 'Joined', 'Loaned From', 'Contract Valid Until', 'Height', 'Weight', 'LS', 'ST', 'RS', 'LW', 'LF', 'CF', 'RF', 'RW', 'LAM', 'CAM', 'RAM', 'LM', 'LCM', 'CM', 'RCM', 'RM', 'LWB', 'LDM', 'CDM', 'RDM', 'RWB', 'LB', 'LCB', 'CB', 'RCB', 'RB', 'Crossing', 'Finishing', 'HeadingAccuracy', 'ShortPassing', 'Volleys', 'Dribbling', 'Curve', 'FKAccuracy', 'LongPassing', 'BallControl', 'Acceleration', 'SprintSpeed', 'Agility', 'Reactions', 'Balance', 'ShotPower', 'Jumping', 'Stamina', 'Strength', 'LongShots', 'Aggression', 'Interceptions', 'Positioning', 'Vision', 'Penalties', 'Composure', 'Marking', 'StandingTackle', 'SlidingTackle', 'GKDiving', 'GKHandling', 'GKKicking', 'GKPositioning', 'GKReflexes', 'Release Clause'], dtype='object')
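The first column, `Unnamed: 0`, most likely holds the row index that was written out when the CSV was saved (its values match the row numbers in the preview). If so, it can be read back as the index rather than as a data column. The sketch below assumes a hypothetical three-row CSV held in a string in place of `data.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv: the header starts with an empty name,
# which is what a saved row index looks like in a CSV file.
csv_text = """,ID,Name,Age
0,158023,L. Messi,31
1,20801,Cristiano Ronaldo,33
2,190871,Neymar Jr,26
"""

# Default read: the unnamed first column becomes a regular column.
with_extra = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses the first column as the row index instead.
without_extra = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_extra.columns))
print(list(without_extra.columns))
```

Whether to do this on the real file depends on whether `Unnamed: 0` is needed later; the outputs shown in this README come from the default read.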
Basic information about the columns (data type, non-null count, etc.)
data_raw.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18207 entries, 0 to 18206
Data columns (total 89 columns):
Unnamed: 0    18207 non-null int64
ID    18207 non-null int64
Name    18207 non-null object
Age    18207 non-null int64
Photo    18207 non-null object
Nationality    18207 non-null object
Flag    18207 non-null object
Overall    18207 non-null int64
Potential    18207 non-null int64
Club    17966 non-null object
Club Logo    18207 non-null object
Value    18207 non-null object
Wage    18207 non-null object
Special    18207 non-null int64
Preferred Foot    18159 non-null object
International Reputation    18159 non-null float64
Weak Foot    18159 non-null float64
Skill Moves    18159 non-null float64
Work Rate    18159 non-null object
Body Type    18159 non-null object
Real Face    18159 non-null object
Position    18147 non-null object
Jersey Number    18147 non-null float64
Joined    16654 non-null object
Loaned From    1264 non-null object
Contract Valid Until    17918 non-null object
Height    18159 non-null object
Weight    18159 non-null object
LS    16122 non-null object
ST    16122 non-null object
RS    16122 non-null object
LW    16122 non-null object
LF    16122 non-null object
CF    16122 non-null object
RF    16122 non-null object
RW    16122 non-null object
LAM    16122 non-null object
CAM    16122 non-null object
RAM    16122 non-null object
LM    16122 non-null object
LCM    16122 non-null object
CM    16122 non-null object
RCM    16122 non-null object
RM    16122 non-null object
LWB    16122 non-null object
LDM    16122 non-null object
CDM    16122 non-null object
RDM    16122 non-null object
RWB    16122 non-null object
LB    16122 non-null object
LCB    16122 non-null object
CB    16122 non-null object
RCB    16122 non-null object
RB    16122 non-null object
Crossing    18159 non-null float64
Finishing    18159 non-null float64
HeadingAccuracy    18159 non-null float64
ShortPassing    18159 non-null float64
Volleys    18159 non-null float64
Dribbling    18159 non-null float64
Curve    18159 non-null float64
FKAccuracy    18159 non-null float64
LongPassing    18159 non-null float64
BallControl    18159 non-null float64
Acceleration    18159 non-null float64
SprintSpeed    18159 non-null float64
Agility    18159 non-null float64
Reactions    18159 non-null float64
Balance    18159 non-null float64
ShotPower    18159 non-null float64
Jumping    18159 non-null float64
Stamina    18159 non-null float64
Strength    18159 non-null float64
LongShots    18159 non-null float64
Aggression    18159 non-null float64
Interceptions    18159 non-null float64
Positioning    18159 non-null float64
Vision    18159 non-null float64
Penalties    18159 non-null float64
Composure    18159 non-null float64
Marking    18159 non-null float64
StandingTackle    18159 non-null float64
SlidingTackle    18159 non-null float64
GKDiving    18159 non-null float64
GKHandling    18159 non-null float64
GKKicking    18159 non-null float64
GKPositioning    18159 non-null float64
GKReflexes    18159 non-null float64
Release Clause    16643 non-null object
dtypes: float64(38), int64(6), object(45)
memory usage: 12.4+ MB
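The summary above reports 45 non-numeric (`object`) columns next to the numeric ones. A quick way to split the two groups is `select_dtypes`. This is a minimal sketch on a hypothetical four-column DataFrame named `sample`, using values from the preview table; it is not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for data_raw with one column of each kind seen above.
sample = pd.DataFrame({
    "Name": ["L. Messi", "De Gea"],
    "Age": [31, 27],
    "Composure": [96.0, 68.0],
    "Release Clause": ["€226.5M", "€138.6M"],  # text, not a number
})

# How many columns there are of each dtype (the last line of info()).
dtype_counts = sample.dtypes.value_counts()
print(dtype_counts)

# Column names grouped by whether pandas treats them as numeric.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
text_cols = sample.select_dtypes(exclude="number").columns.tolist()
print(numeric_cols)
print(text_cols)
```

Columns such as `Release Clause` land in the non-numeric group because their values carry a currency symbol and a unit suffix, so they would need parsing before any arithmetic.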
Check the shape of the DataFrame
data_raw.shape
(18207, 89)
Check if there are null values in the DataFrame
data_raw.isnull().any().any()  # True means at least one value in the DataFrame is null
True
Find which columns contain null values
data_raw.isnull().any()
Unnamed: 0    False
ID    False
Name    False
Age    False
Photo    False
Nationality    False
Flag    False
Overall    False
Potential    False
Club    True
Club Logo    False
Value    False
Wage    False
Special    False
Preferred Foot    True
International Reputation    True
Weak Foot    True
Skill Moves    True
Work Rate    True
Body Type    True
Real Face    True
Position    True
Jersey Number    True
Joined    True
Loaned From    True
Contract Valid Until    True
Height    True
Weight    True
LS    True
ST    True
...
Dribbling    True
Curve    True
FKAccuracy    True
LongPassing    True
BallControl    True
Acceleration    True
SprintSpeed    True
Agility    True
Reactions    True
Balance    True
ShotPower    True
Jumping    True
Stamina    True
Strength    True
LongShots    True
Aggression    True
Interceptions    True
Positioning    True
Vision    True
Penalties    True
Composure    True
Marking    True
StandingTackle    True
SlidingTackle    True
GKDiving    True
GKHandling    True
GKKicking    True
GKPositioning    True
GKReflexes    True
Release Clause    True
Length: 89, dtype: bool
Get the total number of null values
data_raw.isnull().sum().sum()
76984
Get the number of null values for each of the columns
data_raw.isnull().sum()
Unnamed: 0    0
ID    0
Name    0
Age    0
Photo    0
Nationality    0
Flag    0
Overall    0
Potential    0
Club    241
Club Logo    0
Value    0
Wage    0
Special    0
Preferred Foot    48
International Reputation    48
Weak Foot    48
Skill Moves    48
Work Rate    48
Body Type    48
Real Face    48
Position    60
Jersey Number    60
Joined    1553
Loaned From    16943
Contract Valid Until    289
Height    48
Weight    48
LS    2085
ST    2085
...
Dribbling    48
Curve    48
FKAccuracy    48
LongPassing    48
BallControl    48
Acceleration    48
SprintSpeed    48
Agility    48
Reactions    48
Balance    48
ShotPower    48
Jumping    48
Stamina    48
Strength    48
LongShots    48
Aggression    48
Interceptions    48
Positioning    48
Vision    48
Penalties    48
Composure    48
Marking    48
StandingTackle    48
SlidingTackle    48
GKDiving    48
GKHandling    48
GKKicking    48
GKPositioning    48
GKReflexes    48
Release Clause    1564
Length: 89, dtype: int64
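The per-column counts are easier to act on when they are filtered to the affected columns, sorted, and expressed as a share of all rows (for example, `Loaned From` is null in 16943 of 18207 rows, about 93%). The following is a minimal sketch of that summary, assuming a hypothetical four-row DataFrame named `sample` with made-up values in place of `data_raw`.

```python
import pandas as pd

# Hypothetical stand-in for data_raw; the values are placeholders.
sample = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Club": ["X", None, "Y", "Z"],
    "Loaned From": [None, None, None, "W"],
})

# Null count per column, keeping only columns that have nulls, largest first.
null_counts = sample.isnull().sum()
null_counts = null_counts[null_counts > 0].sort_values(ascending=False)

# The same counts as a percentage of all rows.
null_share = (null_counts / len(sample) * 100).round(1)

summary = pd.DataFrame({"nulls": null_counts, "percent": null_share})
print(summary)
```

A summary like this can help decide which columns to drop and which to fill, but the thresholds are a judgment call for the analysis at hand.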
Fork it (https://github.com/qualityjacks/Fifa19_Insights/fork)
Create your feature branch
git checkout -b feature
Commit your changes
git commit -m 'some-text'
Push to the branch
git push origin feature
Create a new Pull Request