Anomaly detection on office temperature: Numenta Anomaly Benchmark (NAB) data.
- The dataset contains two columns: timestamp and temperature value.
- The timestamps are hourly, running from 2013-07-04 to 2014-05-28.
- The dataset had no null values, but a few hours were missing. Rows for the missing hours were added and their values forward-filled (261 rows).
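The gap-filling step described above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the column names `timestamp` and `value` and the helper name `fill_missing_hours` are assumptions, and the toy data stands in for the real file.

```python
import pandas as pd

def fill_missing_hours(df: pd.DataFrame) -> pd.DataFrame:
    """Insert rows for missing hourly timestamps and forward-fill their values.

    Assumes columns named 'timestamp' and 'value' (hypothetical names)
    and no duplicate timestamps.
    """
    df = df.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.set_index("timestamp").sort_index()

    # Build the complete hourly index between the first and last reading.
    full_index = pd.date_range(df.index.min(), df.index.max(),
                               freq=pd.Timedelta(hours=1))

    # Reindexing creates NaN rows for missing hours; ffill copies the
    # last known reading into them.
    filled = df.reindex(full_index).ffill()
    filled.index.name = "timestamp"
    return filled.reset_index()

# Toy data with 02:00 and 03:00 missing.
toy = pd.DataFrame({
    "timestamp": ["2013-07-04 00:00:00", "2013-07-04 01:00:00",
                  "2013-07-04 04:00:00"],
    "value": [70.0, 71.0, 69.5],
})
filled = fill_missing_hours(toy)
print(filled)
```

Forward filling keeps the series on a regular hourly grid, which the 24-hour lag feature relies on, at the cost of flat segments wherever readings were missing.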
- Understanding the distribution of the temperature values.
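One simple way to look at the distribution is to combine summary statistics with histogram bin counts. The sketch below assumes the temperatures are in a pandas Series. The helper name `summarize_distribution`, the bin count, and the toy values are illustrative assumptions, not taken from the notebook.

```python
import numpy as np
import pandas as pd

def summarize_distribution(values: pd.Series, bins: int = 10) -> dict:
    """Return summary statistics and histogram counts for a series."""
    stats = values.describe()  # count, mean, std, min, quartiles, max
    counts, edges = np.histogram(values.to_numpy(), bins=bins)
    return {"stats": stats, "counts": counts, "edges": edges}

# Toy readings with one unusually high value.
toy_values = pd.Series([68.0, 69.5, 70.0, 70.5, 71.0, 71.5, 72.0, 85.0])
summary = summarize_distribution(toy_values, bins=4)
print(summary["stats"])
print(summary["counts"])
```

Sparse bins far from the bulk of the data are a first hint of where anomalous readings may sit.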
- New features extracted from the datetime and temperature value columns:
a. Hour, day, month, weekday, quarter.
b. Weekend flag, set if the day of the week is Saturday or Sunday.
c. Working hours flag (assuming working hours from 8am to 8pm).
d. Temperature lag (24-hour lag variable).
e. Change in lag (difference between the current temperature and its 24-hour lag).
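The features listed above could be derived roughly as follows. This is a sketch under assumptions: the column names (`timestamp`, `value`, and all new feature names) are hypothetical, the data is assumed to be sorted and gap-free at hourly resolution so that a shift of 24 rows equals 24 hours, and "8am to 8pm" is interpreted as hours 8 through 19.

```python
import pandas as pd

def add_features(df: pd.DataFrame, lag_hours: int = 24) -> pd.DataFrame:
    """Add calendar, weekend, working-hours, and lag features."""
    out = df.copy()
    ts = pd.to_datetime(out["timestamp"])

    # a. Calendar features.
    out["hour"] = ts.dt.hour
    out["day"] = ts.dt.day
    out["month"] = ts.dt.month
    out["weekday"] = ts.dt.dayofweek  # Monday=0 ... Sunday=6
    out["quarter"] = ts.dt.quarter

    # b. Weekend flag: Saturday (5) or Sunday (6).
    out["weekend"] = (out["weekday"] >= 5).astype(int)

    # c. Working hours flag: 08:00 up to, but not including, 20:00
    #    (an assumed reading of "8am to 8pm").
    out["working_hours"] = ts.dt.hour.between(8, 19).astype(int)

    # d. Temperature 24 rows earlier (24 hours on a gap-free hourly grid).
    out["temp_lag"] = out["value"].shift(lag_hours)

    # e. Difference between the current reading and its lag.
    out["lag_change"] = out["value"] - out["temp_lag"]
    return out

# Toy data: 48 hourly readings starting Friday 2013-07-05.
toy = pd.DataFrame({
    "timestamp": pd.date_range("2013-07-05", periods=48,
                               freq=pd.Timedelta(hours=1)),
    "value": [float(i) for i in range(48)],
})
features = add_features(toy)
print(features.tail())
```

The first 24 rows have no lag value, so they either need to be dropped or handled explicitly before modelling.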
- Examining weekend temperature trends.
a. On most weekends the temperature drops continuously, as shown in the graph below.
b. Possible inference: the office is in a cold region, where the temperature keeps dropping when the heating is off.
- Examining working hours temperature trends.
a. During most working hours the temperature keeps rising, as shown in the graph below.
b. Possible inference: the office is in a cold region, where the temperature is maintained by heating during working hours.
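The weekend and working-hours trends can also be checked numerically by averaging the hour-over-hour temperature change within each group. The sketch below runs on synthetic data built to rise during weekday working hours and fall otherwise, so its numbers say nothing about the real dataset. The column names and the helper `mean_hourly_change` are assumptions.

```python
import pandas as pd

def mean_hourly_change(df: pd.DataFrame) -> pd.Series:
    """Mean hour-over-hour change, grouped by weekend and working-hours flags."""
    out = df.copy()
    ts = pd.to_datetime(out["timestamp"])
    out["weekend"] = (ts.dt.dayofweek >= 5).astype(int)
    out["working_hours"] = ts.dt.hour.between(8, 19).astype(int)
    out["hourly_change"] = out["value"].diff()  # first row is NaN
    return out.groupby(["weekend", "working_hours"])["hourly_change"].mean()

# Synthetic series: Friday 2013-07-05 and Saturday 2013-07-06.
stamps = pd.date_range("2013-07-05", periods=48, freq=pd.Timedelta(hours=1))
steps = [1.0 if (t.dayofweek < 5 and 8 <= t.hour <= 19) else -0.5
         for t in stamps]
toy = pd.DataFrame({
    "timestamp": stamps,
    "value": 70.0 + pd.Series(steps).cumsum(),
})
trend = mean_hourly_change(toy)
print(trend)
```

A positive mean for weekday working hours and a negative mean for weekends would be consistent with the heating inference above, though the plots remain the better check for whether the change is continuous.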
For more details, see the Analysis Report.pdf file and the Jupyter Notebook. Do let me know if you try out something new.