---
title: "Data_Source"
author: "Bocong Zhao"
date: "14/10/2020"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Data_Source
**The data come from TidyTuesday:**
**https://github.com/rfordatascience/tidytuesday/tree/master/data/2020/2020-01-07**
```{r}
# Get the Data
# rainfall <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-07/rainfall.csv')
# temperature <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-07/temperature.csv')
# IF YOU USE THIS DATA PLEASE BE CAUTIOUS WITH INTERPRETATION
# nasa_fire <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-07/MODIS_C6_Australia_and_New_Zealand_7d.csv')
# For JSON File of fires
# url <- "http://www.rfs.nsw.gov.au/feeds/majorIncidents.json"
# aus_fires <- sf::st_read(url)
# Or read in with tidytuesdayR package (https://github.com/thebioengineer/tidytuesdayR)
# Either ISO-8601 date or year/week works!
# Install via devtools::install_github("thebioengineer/tidytuesdayR")
# tuesdata <- tidytuesdayR::tt_load('2020-01-07')
tuesdata <- tidytuesdayR::tt_load(2020, week = 2)
rainfall <- tuesdata$rainfall
temperature <- tuesdata$temperature
AusNew <- tuesdata$MODIS_C6_Australia_and_New_Zealand_7d
# Save local copies of the raw data
write.csv(AusNew, "AusNew.csv")
write.csv(rainfall, "rainfall.csv")
write.csv(temperature, "temperature.csv")
```
## Briefly view the data
**Reduce the size of the data (i.e., drop unnecessary variables and, optionally, filter years).**
### (1) Rainfall
```{r}
library(visdat)
library(dplyr)
# Drop station-level columns and rows with missing values
# (the 2019-only filter below is disabled, so all years are kept)
rainfall_tidy <- rainfall %>%
  select(-station_code, -period, -station_name) %>%
  #filter(year >= 2019 & year < 2020) %>%
  na.omit()
# Build a Date column from the separate year, month and day columns
rainfall_tidy$date <- as.Date(with(rainfall_tidy, paste(year, month, day, sep = "-")), "%Y-%m-%d")
rainfall_tidy <- rainfall_tidy %>%
  select(-year) %>%
  rename(longitude = long, latitude = lat)
# Reorder the columns
rainfall_tidy <- rainfall_tidy[, c("date", "month", "day", "city_name", "rainfall", "quality", "latitude", "longitude")]
#rainfall_tidy %>%
#  group_by(station_name) %>%
#  count()
# Number of observations per city
rainfall_tidy %>%
  group_by(city_name) %>%
  count()
write.csv(rainfall_tidy,"rainfall_tidy.csv")
```
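The date construction and renaming steps above can be checked on a few made-up rows. This is only a minimal sketch: the `toy` data frame and its values are invented for illustration and merely mimic the `year`, `month`, `day`, `lat` and `long` columns of the rainfall data.

```{r}
library(dplyr)

# Hypothetical rows that mimic the rainfall columns (values are made up)
toy <- data.frame(
  year = c("2019", "2019"),
  month = c("01", "12"),
  day = c("05", "31"),
  rainfall = c(1.2, NA),
  lat = c(-31.96, -31.96),
  long = c(115.79, 115.79)
)

toy_tidy <- toy %>%
  na.omit() %>%  # drops the row with the missing rainfall value
  mutate(date = as.Date(paste(year, month, day, sep = "-"))) %>%
  select(-year) %>%
  rename(longitude = long, latitude = lat)

toy_tidy
```

Building the date inside `mutate()` keeps the whole transformation in one pipeline; it should be equivalent to the `with()` and `$date` assignment used in the chunk above.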
### (2) Temperature
```{r}
# Split the date into separate year, month and day columns
temp_tidy <- temperature %>%
  dplyr::mutate(year = lubridate::year(date),
                month = lubridate::month(date),
                day = lubridate::day(date))
#filter(date >= as.Date("2019-01-01") & date <= as.Date("2019-12-31")) %>%
#select(-site_name)
#temp_tidy<- setNames(temp_tidy, replace(names(temp_tidy), names(temp_tidy) == 'Brisbane', 'BRISBANE'))
# Number of observations per city
temp_tidy %>%
  group_by(city_name) %>%
  count()
write.csv(temp_tidy,"temp_tidy.csv")
```
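To see what the date decomposition above produces, here is a small sketch on invented data. The `toy_temp` data frame and its values are hypothetical and only mimic the `city_name`, `date` and `temperature` columns.

```{r}
library(dplyr)

# Hypothetical rows that mimic the temperature columns (values are made up)
toy_temp <- data.frame(
  city_name = c("PERTH", "PERTH", "SYDNEY"),
  date = as.Date(c("2019-01-05", "2019-12-31", "2019-06-15")),
  temperature = c(30.1, 28.4, 17.0)
)

# Extract the year, month and day components of each date
toy_temp_tidy <- toy_temp %>%
  mutate(year = lubridate::year(date),
         month = lubridate::month(date),
         day = lubridate::day(date))

# Number of rows per city
toy_temp_tidy %>% count(city_name)
```

Here `count(city_name)` gives the same per-city counts as `group_by(city_name) %>% count()`, but returns an ungrouped result.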