Commit 1f03cc9

committed
Added some hints
1 parent e8220e6 commit 1f03cc9

File tree

1 file changed

+67
-0
lines changed

source/lessons/L7/exercise-7-hints.rst

+67
@@ -46,3 +46,70 @@ are represented in the data. Consider following example:
4646
data.dtypes
4747
4848
Great, now we have the data in ``datetime`` format!
49+
50+
Creating an empty DataFrame with a datetime index
51+
-------------------------------------------------
52+
53+
For Problem 2 in this exercise you are asked to calculate seasonal average temperatures for each year in our data file.
54+
The easiest way to do this is to create an empty DataFrame to store the seasonal temperatures, with one temperature for each year and season.
55+
Thus, the DataFrame should have columns for each season and the date as an index.
56+
To do this, we first need to create a variable to store the dates for the index, and then create the DataFrame using that index.
57+
Let's consider an example for my world, where there are two seasons: ``coldSeason`` and ``warmSeason``.
58+
For each season, I want to list the number of times I wore a jacket, with data from the past 4 years.
59+
I can start by making a variable with 1 date for each of the past 4 years using the Pandas ``pd.date_range()`` function.
60+
61+
.. ipython:: python
62+
63+
    timeIndex = pd.date_range('2014', '2017', freq='AS')
64+
    print(timeIndex)
65+
66+
As you can see, we now have a variable ``timeIndex`` in the Pandas datetime format with dates for January 1 of the past 4 years.
67+
The starting and ending years are clear, and ``freq='AS'`` sets the frequency of dates between the listed starting and ending times.
68+
In this case, ``AS`` refers to annual values (one per year) at the start of the year; recent Pandas releases prefer the equivalent alias ``YS``.
69+
70+
With the ``timeIndex`` variable, we can now create our empty DataFrame to store the seasonal jacket numbers using the Pandas ``pd.DataFrame()`` function.
71+
72+
.. ipython:: python
73+
74+
    seasonData = pd.DataFrame(index=timeIndex, columns=['coldSeason', 'warmSeason'])
75+
    print(seasonData)
76+
77+
Now we have an empty DataFrame that I can fill in with the number of times I needed a jacket in each season, using the date index!
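
As a minimal, self-contained sketch of the steps above (not the only way to do it), the example below builds the index, creates the empty DataFrame, and fills in a single cell.
The jacket count ``42`` is a made-up value for illustration, and the ``YS`` alias is used on the assumption that it is accepted by your Pandas version as the year-start frequency.

```python
import pandas as pd

# One date (January 1) for each year from 2014 to 2017.
# 'YS' means "year start"; it is used here in place of 'AS'.
timeIndex = pd.date_range('2014', '2017', freq='YS')

# Empty DataFrame: every cell starts out as NaN (missing).
seasonData = pd.DataFrame(index=timeIndex, columns=['coldSeason', 'warmSeason'])

# Fill in one cell using its exact date label (42 is a hypothetical value).
seasonData.loc[pd.Timestamp('2015-01-01'), 'coldSeason'] = 42

print(seasonData)
```

All other cells stay empty until you assign values to them in the same way.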
78+
79+
Slicing up the seasons
80+
----------------------
81+
82+
The other main task in Problem 2 is to sort values from the different months into seasonal average values.
83+
There are several ways in which this can be done, but one nice way to do it is using a ``for`` loop to loop over each year of data you consider and then fill in the seasonal values for that year.
84+
For each year, you want to identify the slice of dates that corresponds to each season, calculate its mean, and then store that result in the corresponding location in the new DataFrame created in the previous hint.
85+
For the ``for`` loop itself, it may be easiest to start with the second full year of data (1953), since we do not have temperatures for December of 1951.
86+
If you loop over the years from 1953-2016, you can then easily calculate the seasonal average temperatures for each season.
87+
For the winter, you can use ``year - 1`` to find the temperature for December, assuming ``year`` is your variable for the current year in your ``for`` loop.
88+
89+
In `this week's lesson <https://geo-python.github.io/2017/lessons/L7/pandas-plotting.html#selecting-data-based-on-time-in-pandas>`__ we saw how to select a range of dates, but we did not cover how to take the mean value of the slice and store it.
90+
Because a slice of a DataFrame is still a DataFrame object, we can select a column from it and simply use the ``.mean()`` method to calculate the mean of that slice.
91+
92+
.. code:: python
93+
94+
    meanValue = dataFrame['2016-12':'2017-02']['TEMP'].mean()
95+
96+
This would assign the mean value for the ``TEMP`` field between December 2016 and February 2017 to the variable ``meanValue``.
97+
In terms of storing the output value, we can use the ``DataFrame.loc[]`` indexer (note the square brackets, as it is not called like a function).
98+
For example:
99+
100+
.. code:: python
101+
102+
    dataFrame.loc[year, 'coldSeason'] = 5
103+
104+
This would store the value ``5`` in the column ``coldSeason`` at index ``year`` of ``dataFrame``.
105+
That's a tricky sentence, but hopefully the idea is clear :).
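
To tie these pieces together, here is a rough sketch of the loop described in this hint, under a few assumptions.
The temperatures are synthetic (simply 0, 1, 2, ... for each month), the column names ``Winter`` and ``Summer`` and the variable names are hypothetical, and only three years are used to keep the output short.
The sketch uses ``pd.Timestamp`` to build the index label for each year, which is one possible choice rather than the required approach.

```python
import numpy as np
import pandas as pd

# Synthetic monthly temperatures (hypothetical values) for 1952-1955.
dates = pd.date_range('1952-01-01', '1955-12-01', freq='MS')
data = pd.DataFrame({'TEMP': np.arange(len(dates), dtype=float)}, index=dates)

# Start from 1953 so that December of the previous year is available.
years = range(1953, 1956)

# Empty DataFrame with one row (January 1) per year.
timeIndex = pd.to_datetime([f'{year}-01-01' for year in years])
seasonalData = pd.DataFrame(index=timeIndex, columns=['Winter', 'Summer'], dtype=float)

for year in years:
    # Winter: December of the previous year through February of this year.
    winterMean = data.loc[f'{year - 1}-12':f'{year}-02', 'TEMP'].mean()
    # Summer: June through August of this year.
    summerMean = data.loc[f'{year}-06':f'{year}-08', 'TEMP'].mean()

    # Store the results in the row for this year.
    label = pd.Timestamp(year=year, month=1, day=1)
    seasonalData.loc[label, 'Winter'] = winterMean
    seasonalData.loc[label, 'Summer'] = summerMean

print(seasonalData)
```

With real data you would replace the synthetic ``data`` DataFrame with your own and extend the loop to cover all of the seasons and years you need.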
106+
107+
Labels and legends
108+
------------------
109+
110+
In the plot for Problem 2 you're asked to include a line legend for each subplot.
111+
To do this, you need to do two things:
112+
113+
1. You need to add a ``label`` value when you create the plot using the ``plt.plot()`` function.
114+
   This is as easy as adding a parameter that says ``label='some text'`` when you call ``plt.plot()``.
115+
2. You'll need to display the line legend, which can be done by calling ``plt.legend()`` for each subplot.
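
The two steps above might look like the following sketch for a figure with two subplots.
The jacket counts are made-up values, and the ``Agg`` backend is selected only so that the example runs without a display; you can leave that line out when working interactively.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt

# Hypothetical jacket counts for each year.
years = [2014, 2015, 2016, 2017]
coldSeason = [30, 42, 35, 38]
warmSeason = [2, 5, 1, 3]

# First subplot: give the line a label, then show the legend.
plt.subplot(2, 1, 1)
plt.plot(years, coldSeason, label='Cold season')
plt.legend()

# Second subplot: same two steps.
plt.subplot(2, 1, 2)
plt.plot(years, warmSeason, label='Warm season')
plt.legend()
```

Note that ``plt.legend()`` is called once per subplot, right after the lines for that subplot have been plotted.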
