reverse colorbar #65
I can get the plot to work, but I think it's best to change the code so that we put in a placeholder for the date. I think the dashboard can handle the animation instead of Plotly doing it directly. See the code comment for the changes I made to make the iteration process faster.
If you change your code to this, it should at least plot faster...
analysis/db/us_map/choroplethMap.py
Outdated
# plt.show()
# color_map = plt.cm.get_cmap('viridis')
# reversed_viridis = color_map.reversed()

fig = px.choropleth(molten_df,
I created a dataframe that was just the subset for one particular date, and then used that subsetted dataframe to plot the figure:
plot_data = molten_df[molten_df.date_iso == '2020-02-01']
fig = px.choropleth(plot_data,
                    geojson=counties,
                    locations=plot_data.fips_str,
                    color='value',
                    # animation_frame='date',
                    hover_data=['State', 'value'],
                    color_continuous_scale='viridis_r',
                    range_color=(0, 300),
                    scope="usa",
                    title='Confirmed cases',
                    labels={'value': 'confirmed cases'}
                    )
analysis/db/us_map/choroplethMap.py
Outdated
@@ -34,7 +34,10 @@
molten_df['date_iso'] = pd.to_datetime(molten_df['date'], format="%m/%d/%y")  # change date to ISO8601 standard format

fips = molten_df['fips_str'].tolist()
Because of the changes below, you don't need this line anymore, since you're passing the column of values directly into the plotting function.
confirmed_df = pd.read_csv('https://github.com/CSSEGISandData/COVID-19/raw/master/csse_covid_19_data/'
                           'csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')
loc_df = pd.read_excel(here('./data/db/original/maps/State_FIPS.xlsx'))
pop_df = pd.read_excel(here('./data/db/original/maps/PopulationEstimates.xls'))  # population dataset for 2019
where did this dataset come from?
analysis/db/us_map/choroplethMap.py
Outdated
                           'csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')
loc_df = pd.read_excel(here('./data/db/original/maps/State_FIPS.xlsx'))
pop_df = pd.read_excel(here('./data/db/original/maps/PopulationEstimates.xls'))  # population dataset for 2019
should provide a download link to where you got these datasets from.
analysis/db/us_map/choroplethMap.py
Outdated
molten_pop_df = pd.merge(molten_df, pop_df, on='fips_str')  # add population per county
grouped_by = molten_pop_df.groupby(['fips_str', 'date_iso', 'Admin2', 'POP_ESTIMATE_2019'])['value'].sum().reset_index()
grouped_by['value'] = grouped_by['value']/grouped_by['POP_ESTIMATE_2019']  # get per capita value
don't overwrite the original 'value' column. you should make a new column (in this case something like 'total_per_cap') that is assigned the per capita value
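A minimal sketch of what this suggestion could look like. The toy dataframe below is a made-up stand-in for `grouped_by`, and `total_per_cap` is just the column name proposed in the comment above, not something that exists in the PR yet:

```python
import pandas as pd

# Hypothetical stand-in for grouped_by; the real frame comes from the merge with pop_df.
grouped_by = pd.DataFrame({
    "fips_str": ["01001", "01003"],
    "value": [50, 200],
    "POP_ESTIMATE_2019": [50_000, 200_000],
})

# Keep the raw count in 'value' and store the per capita figure in its own column.
grouped_by["total_per_cap"] = grouped_by["value"] / grouped_by["POP_ESTIMATE_2019"]
```

With both columns present, the raw count and the per capita figure can be plotted from the same dataframe.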
analysis/db/us_map/choroplethMap.py
Outdated
color_continuous_scale="Viridis",
range_color=(0, 300),
color_continuous_scale='viridis_r',
range_color=(0, 500),
why did you choose 500? can we set this to something like max(per_cap) and use a variable instead of hard-coding a value?
Yeah, you're right, and I'm working on that. I was thinking of using the third quartile (the 75th percentile) there, because the max value, which is for New York, is much higher than for the other states, and that skews the coloring. I tried to use a quartile function, but range_color didn't accept my input. The same goes for the per capita case, but there a different state shows the highest number of cases, which is very strange, so I assume I might be doing the calculations wrong.
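One possible way to get a quartile-based upper bound, sketched on made-up numbers. I don't know why `range_color` rejected the earlier attempt; one guess is that a Series or array was passed instead of a single number, so this sketch casts the result of `Series.quantile` to a plain float. The data and the `upper` variable are illustrative only:

```python
import pandas as pd

# Hypothetical values with one large outlier, standing in for the New York case.
plot_data = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 1000.0]})

# Series.quantile with a single q returns a scalar; cast it to a plain float
# so that range_color receives an ordinary (min, max) pair of numbers.
upper = float(plot_data["value"].quantile(0.75))
range_color = (0, upper)
```

The resulting tuple could then be passed as `range_color=range_color` in the `px.choropleth` call, so counties above the third quartile all take the color at the end of the scale.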
'''
# ax = sns.lineplot(x="date_iso", y="value", hue='Province_State', data=grouped_counts)  # show cases per state monthly
# ax = sns.stripplot(x="date_iso", y="value", hue='Province_State', data=grouped_counts)
# ax = sns.violinplot(x='date_iso', y='value', hue='Province_State', data=grouped_counts, palette="Set2", split=True,
#                     scale="count", inner="quartile")
# ax = sns.countplot(x="date_iso", hue='Province_State', data=grouped_counts)  # works better if there are certain dates
# plt.tight_layout()
# plt.show()
'''
why did you comment these out? we could also add general values into the dashboard too
analysis/db/us_map/choroplethMap.py
Outdated
# animation_frame='date',
hover_data=['Admin2', 'value', 'POP_ESTIMATE_2019'],
color_continuous_scale='viridis_r',
range_color=(0, plot_data['value'].max()),
when you use the new column variable name make sure you change this as well.
analysis/db/us_map/choroplethMap.py
Outdated
'''
Files should end with a newline.
Also, it might be worth having both the raw count and the per capita count, with a toggle between the maps.
Since the only real difference between the plotting code is which column you're plotting, we can make a function that takes a dataframe and a plotting column as input and returns the plot.
We would be able to use the function to return both plots, which we would feed into the dashboard.
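A rough sketch of such a function, under a few assumptions: the names `choropleth_kwargs` and `make_choropleth` are hypothetical, the dataframe is assumed to have a `fips_str` column, and `geojson` is assumed to be the same `counties` object used in the existing plotting code. The settings are split into their own helper so they can be checked without rendering a figure:

```python
import pandas as pd


def choropleth_kwargs(df, column, label):
    """Build the px.choropleth settings that depend on the plotted column."""
    return dict(
        locations="fips_str",
        color=column,
        color_continuous_scale="viridis_r",
        range_color=(0, float(df[column].max())),
        scope="usa",
        labels={column: label},
    )


def make_choropleth(df, column, label, geojson):
    """Return a choropleth figure for one column of df."""
    import plotly.express as px  # imported here so the helper above works without plotly

    return px.choropleth(df, geojson=geojson, **choropleth_kwargs(df, column, label))


# Hypothetical data with both a raw count and a per capita column.
demo_df = pd.DataFrame({
    "fips_str": ["01001", "01003"],
    "value": [50, 200],
    "total_per_cap": [0.001, 0.002],
})
raw_settings = choropleth_kwargs(demo_df, "value", "confirmed cases")
per_cap_settings = choropleth_kwargs(demo_df, "total_per_cap", "cases per capita")
```

Calling `make_choropleth(df, 'value', 'confirmed cases', counties)` and `make_choropleth(df, 'total_per_cap', 'cases per capita', counties)` would then give the two figures for the dashboard to toggle between.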
# TODO: See if rate is changing, counts over time (a 14 day sliding window count)
# Choropleth map with time slider and hover text
# TODO: Try to merge PopulationEstimates.xls to confirmed_df and remove State_FIPS.xlsx

confirmed_df = pd.read_csv('https://github.com/CSSEGISandData/COVID-19/raw/master/csse_covid_19_data/'
                           'csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')
loc_df = pd.read_excel(here('./data/db/original/maps/State_FIPS.xlsx'))
link to where you got data from
No description provided.