AtividadeS11 #20
base: main
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
import pandas as pd | ||
|
||
df = pd.read_csv('../../../material/mais_ouvidas_2024.csv') | ||
|
||
print(df.head()) | ||
print(df.columns) | ||
print(df.dtypes) | ||
|
||
#- Fix the 'Spotify Streams' column so that it contains only numeric values. | ||
df['Spotify Streams'] = df['Spotify Streams'].str.replace(',', '').fillna('0').astype(int) | ||
print(df['Spotify Streams']) | ||
|
||
#- Convert the 'Release Date' column to datetime format. | ||
print(df.dtypes['Release Date']) | ||
df['Release Date'] = pd.to_datetime(df['Release Date'], format='mixed') | ||
print(df.dtypes['Release Date']) | ||
|
||
#- Convert the columns 'YouTube Views', 'TikTok Views', 'Pandora Streams', 'Spotify Streams', 'TikTok Likes', 'Shazam Counts', 'Soundcloud Streams' to integer type. | ||
print(df.dtypes) | ||
|
||
col_convert = ['YouTube Views', 'TikTok Views', 'Pandora Streams', 'TikTok Likes', 'Shazam Counts', 'Soundcloud Streams'] | ||
|
||
original_dtypes = df[col_convert].dtypes | ||
print(original_dtypes) | ||
print() | ||
|
||
for col in col_convert: | ||
    df[col] = df[col].str.replace(',', '') # Remove the commas | ||
    df[col] = df[col].fillna('0') # Replace NaN with '0' | ||
    df[col] = df[col].astype(int) # Convert to int | ||
|
||
tipos = df[col_convert].dtypes | ||
print(tipos) | ||
|
||
#- Create a new column called 'Streaming Popularity' that is the mean of the popularity on the platforms 'Spotify Popularity', 'YouTube Views', 'TikTok Likes', and 'Shazam Counts'. (remember that means and other mathematical operations can only be computed on numeric types) | ||
|
||
df['Streaming Popularity'] = (df['Spotify Popularity'] + df['YouTube Views'] + df['TikTok Likes'] + df['Shazam Counts']) /4 | ||
Review comment: I would suggest using the […]
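The review comment above is cut off in the source, so the intended function is not known. One possible reading, offered only as an assumption, is the row-wise `DataFrame.mean`. The sketch below uses hypothetical sample values with the column names from the script, and shows that `mean(axis=1)` skips missing values by default, whereas adding the columns with `+` would yield NaN for any row with a gap.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the script under review.
sample = pd.DataFrame({
    'Spotify Popularity': [90.0, 70.0, None],
    'YouTube Views': [1000, 2000, 3000],
    'TikTok Likes': [10, 20, 30],
    'Shazam Counts': [5, 15, 25],
})

popularity_cols = ['Spotify Popularity', 'YouTube Views', 'TikTok Likes', 'Shazam Counts']

# mean(axis=1) averages across the listed columns for each row.
# NaN values are skipped by default (skipna=True), so the third row
# is averaged over its three available values.
sample['Streaming Popularity'] = sample[popularity_cols].mean(axis=1)
print(sample['Streaming Popularity'])
```

Whether skipping missing values is the desired behaviour is a decision for the author; passing `skipna=False` keeps the NaN instead.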
||
print(df['Streaming Popularity']) | ||
|
||
#- Create a 'Total Streams' column by summing the values of 'Spotify Streams', 'YouTube Views', 'TikTok Views', 'Pandora Streams', and 'Soundcloud Streams'. | ||
|
||
df['Total Streams'] = (df['Spotify Streams'] + df['YouTube Views'] + df['TikTok Views'] + df['Pandora Streams'] + df['Soundcloud Streams']) | ||
Review comment: I would suggest using the […]
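This review comment is also truncated in the source. Assuming it points at a built-in aggregation rather than chained `+` operators, a minimal sketch with `DataFrame.sum(axis=1)` could look like the following; the sample values are hypothetical and only the column names come from the script.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the script under review.
sample = pd.DataFrame({
    'Spotify Streams': [100, 200],
    'YouTube Views': [10, 20],
    'TikTok Views': [1, 2],
    'Pandora Streams': [5, 5],
    'Soundcloud Streams': [0, 3],
})

stream_cols = ['Spotify Streams', 'YouTube Views', 'TikTok Views',
               'Pandora Streams', 'Soundcloud Streams']

# sum(axis=1) adds the listed columns row by row; keeping the names in a
# list makes it easy to add or drop a platform later.
sample['Total Streams'] = sample[stream_cols].sum(axis=1)
print(sample['Total Streams'])
```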
||
print(df['Total Streams']) | ||
|
||
#- Keep only the tracks where the Spotify popularity ('Spotify Popularity') is greater than 80 and that have more than 1 million total streams ('Total Streams'). | ||
|
||
filtered_df = df[(df['Spotify Popularity'] > 80) & (df['Total Streams'] > 1_000_000)] | ||
print(filtered_df) | ||
|
||
#- Save the resulting DataFrame to a new JSON file called 'faixas_filtradas.json'. | ||
filtered_df.to_csv('./faixas_filtradas.csv', index=False) | ||
Review comment: I believe this should be saved as JSON, so the function to use is […]
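The function name is missing from the review comment in the source. pandas does provide `DataFrame.to_json`, which would match the task statement, so the sketch below assumes that is what was meant. The `Track` column and its values are hypothetical, and the file is written to a temporary directory so the example has no side effects.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for filtered_df; 'Track' is an illustrative column name.
filtered_sample = pd.DataFrame({
    'Track': ['A', 'B'],
    'Total Streams': [1_500_000, 2_500_000],
})

out_path = os.path.join(tempfile.mkdtemp(), 'faixas_filtradas.json')

# orient='records' writes one JSON object per row;
# force_ascii=False keeps accented characters readable in the file.
filtered_sample.to_json(out_path, orient='records', force_ascii=False, indent=2)

# Read the file back to confirm it is valid JSON.
with open(out_path, encoding='utf-8') as fh:
    records = json.load(fh)
print(records)
```

For a DataFrame that contains a datetime column such as 'Release Date', `to_json` writes epoch timestamps by default; passing `date_format='iso'` is one way to get readable dates instead.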
||
|
||
|
||
|
Review comment: Was this the only column that should have been converted? Why did you decide to replace the null values with 0, and why convert to int rather than float?
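To make the trade-off raised in this comment concrete, here is a minimal sketch comparing three ways of handling missing values when parsing comma-formatted counts. The sample values are hypothetical, and the choice among the three is left to the author; this is not presented as the expected answer.

```python
import pandas as pd

# Hypothetical raw values in the same comma-separated style as the CSV columns.
raw = pd.Series(['1,234', None, '5,678,901'], dtype=object, name='Spotify Streams')

# Strip the thousands separators, then parse; errors='coerce' turns
# anything unparseable into NaN instead of raising.
cleaned = pd.to_numeric(raw.str.replace(',', '', regex=False), errors='coerce')

# Option 1: keep the parsed values with NaN marking the missing entry.
as_float = cleaned

# Option 2: nullable integer dtype, which keeps whole numbers and a missing marker.
as_nullable_int = cleaned.astype('Int64')

# Option 3: the approach used in the PR, where missing values become 0.
as_zero_filled = cleaned.fillna(0).astype('int64')

# Zero-filling changes aggregate statistics, because the missing row
# now counts as a real observation of 0.
print(as_nullable_int.mean(), as_zero_filled.mean())
```

With options 1 and 2 the missing entry is ignored by `mean()`, while option 3 pulls the average down, which is one reason a reviewer might question filling with 0.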