---
title: "data_analysis_foreclosures"
output: html_document
date: "2023-11-10"
data: https://opendata.maryland.gov/Housing/Maryland-Notices-of-Intent-to-Foreclose-by-Zip-Cod/ftsr-vapt
---
#Memo details: "Be sure to describe your most important and newsworthy findings to me as if you were pitching a story about them, including some discussion of what your reporting plans would entail - what would you need/want to do to make this a better story?"
# Strongest story idea: One of our most important news findings was that following the end of the foreclosure moratorium in Maryland due to the pandemic, the highest percent changes of intent to foreclose notices were mostly concentrated around southern and northeastern Maryland, including Baltimore to some extent. Plus, in less than three years, NOIs have jumped back to pre-pandemic levels. In 2022 alone, the percent change in total NOIs was nearly 75%. We would need more detailed pre-pandemic data to compare, however, which we would need to request via MPIA, to see what zip codes returned to pre-pandemic levels sooner than others.
#When we imported Census data, we found that there was a weak relationship between a zip code’s median income and the number of NOIs per capita. The higher the median income, the fewer NOIs per capita the zip code was likely to have. However, the zip codes with lower median incomes were more likely to have a wider variety in the number of NOIs per capita than those with higher median incomes.
#The next step in reporting this story would be getting data from before the start of the moratorium and from during it, in case some notices slipped through. That way we could check whether the percent change over a comparable span of time is typical. We would also want data on how many of these NOIs became actual foreclosures by zip code, which wasn't readily available. With this data, we would explore more of the demographics of these zip codes, focusing on the ones with the more notable percent changes.
#We would look at what percentage of NOI turned into actual foreclosures and plot it on a map to compare zip codes. We would also look at the population and median income of these zip codes again to start off. It would also be interesting to look at the age distribution in each zip code as well as race/ethnicity.
#Some of the zip codes we would keep an eye on are 20772, which had the highest number of NOIs overall and hit its highest monthly total in August 2023 at 154 notices. We would also keep an eye on 21056 (highest income), 20874 (close to $100,000), 20721 (close to $15,000) and 21542 (lowest income).
#From this list, we would try to obtain the records of the actual foreclosures and see if we can contact the former owners and ask about their experience. What are the most common reasons they lost their homes, and do they match up with what the data tells us? Did the moratorium end too soon for them?
#Another interesting point was the overall spikes in intent to foreclose notices in April, August and December 2022. What happened here? That would most likely be a question for the Office of Financial Regulation. Does this happen every year? The data on the website only goes back to Nov. 2019, so it's difficult to see what occurred pre-pandemic. What did surprise us was that there were substantial troughs throughout the year. We expected a steady rise, but February and May showed some of the lower totals.
#The cons of our data include that the dates were organized in an odd way, so we had to go through the process of transposing it for some questions. It's also aggregated, which made it difficult to find a dataset to join to properly. It also contains only one full year; the other two years included are partial. As for pros, it was interesting data to look at, and once we got a feel for it, we could tell pretty easily what we could and couldn't do.
#There were a few times we had to check our analysis. We had to write code to double-check the sum of the total number of NOIs by date, and double-check the variable we pulled from the Census since the median income looked off at first.
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
#install.packages("zoo")
```
```{r}
library(tidyverse)
library(dplyr)
library(janitor)
library(zoo)
library(ggplot2)
library(tidycensus)
library(lubridate)
```
# Dataset: Maryland Notices of Intent to Foreclose (July 2021 - Oct. 2023)
```{r}
# Read in and clean the data
foreclosures <- read_csv("Maryland_Notices_of_Intent_to_Foreclose.csv") |>
clean_names() |>
rename(february_2022 = february_2020)
foreclosures
```
# Cleaning
In cleaning the data, we recognized that there was an error in the original data so we contacted the creator of the data to confirm that February 2020 was supposed to be marked February 2022. We made sure to rename the column after reading in the data and using clean_names() to take the spaces out and make the column names uniform. There wasn’t any other obvious cleaning needed from what we can tell. Here's our quick map: https://datawrapper.dwcdn.net/M2Foe/2/
```{r}
# Here we wanted to try and make a quick map to see what we were looking at other than just a table of numbers.
foreclosures_sum <- foreclosures |>
group_by(zip) |>
summarise(foreclosures_sum = sum(across(contains("_")))) |>
arrange(desc(foreclosures_sum))
foreclosures_sum
write_csv(foreclosures_sum, "foreclosure_sum.csv")
```
We decided to go question by question to get a better understanding of the limitations present and explore some strategies to overcome those limitations and answer the questions that we posed.
# Q1
From July 2021 to Oct. 2023 (following the end of the pause on foreclosure notices), which zip code has had the most notices recorded?
**This question is pretty much addressed above by adding the number of notices across the data set, grouping the results by zip code, and arranging them in descending order. The zip code 20772 in Prince George's County (including Upper Marlboro) had the highest number over the course of this timespan. See this map for a quick visualization: https://datawrapper.dwcdn.net/M2Foe/1/**
# Findings: We found that the zip code 20772, which is in PG County, has the highest number of intent to foreclose notices. We're currently trying to figure out how that compares to other zip codes. Transposing the data was the largest hurdle here and for the rest of the questions, as we had to figure out how to format different aspects.
```{r}
foreclosures_sum |>
arrange(desc(foreclosures_sum))
```
#Q2
How have foreclosure notices in this zip code changed over time? How does this compare to other zip codes?
**Since the dataset is already ordered chronologically, simply filtering for the zip code that had the most notices will give us an idea of how many notices were given in that zip code month-by-month. We might need to contact the data set owner again to see if he has the pre-pandemic data.**
#Findings: The number of intent to foreclose notices dramatically rose in this zip code. We decided to compare it to a zip code with a similar number of NOIs and a similar median income (21206), as well as a few other zip codes at different median income levels, and then charted the number of foreclosure notices over time. We saw that across the zip codes, there were consistent peaks in April 2022, Aug. 2022 and May 2023. In Aug. 2023, our chosen zip code (20772) and the other similar zip code (21206) peaked while the others slightly rose or fell.
#As for the overall picture, the zip code 20772 had a percent change of 681.25%. That's a lot, but nowhere near the top in terms of which zip code had the highest percent change. When we filtered for only zip codes in Prince George's County, it had the 11th highest percent change of the 27 PG County zip codes with a calculated percent change. We actually found one zip code, 20712, that saw a negative percent change -- it went from two notices in July 2021 to one notice in Oct. 2023. However, there were stretches in between where it had higher totals; one month it had eight notices. This also doesn't account for differences in population between zip codes.
#The zip code with the highest percent change -- 20746 -- jumped from 1 foreclosure notice to 40 by October 2023. In comparison, 20772 jumped from 16 notices to 125.
```{r}
# We transposed the data set to tackle this question. We had a lot of questions for ChatGPT https://chat.openai.com/share/a3d690f6-baec-4fc7-beb2-4844daf93de6 since we ran into an issue getting R to keep the first column (zip) out of the reshape.
# Here we made another attempt at transposing, formatting the dates with lubridate instead of the zoo package. We kept getting a character column instead of a date, but we eventually handled the odd month_year column names by parsing them and flooring each date to the first of the month. (An alternative reshape using pivot_longer() is sketched after this chunk.)
transposed_redo_2 <- foreclosures |>
gather(date, value, -1) |>
spread(key = zip, value = value) |>
mutate(
date= mdy(date),
date=floor_date(date, "month")
) |>
arrange(date)
transposed_redo_2
#Now we're just cleaning a few things up. This dataframe is specific to the zip code 20772 for each month from July 2021 to October 2023, so we renamed the 20772 column "foreclosures" and kept only the date and foreclosures columns before arranging by date. Since the date column is now an actual date, arranging by it puts the months in chronological order.
top_foreclosure_zip <- transposed_redo_2 |>
rename(foreclosures = "20772") |>
select(date, foreclosures) |>
arrange(date)
top_foreclosure_zip
write_csv(top_foreclosure_zip, "top_foreclosure_zip.csv")
```
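As an aside, here is a lighter-weight reshape we could have used instead of the transpose above. This is only a sketch, not the code we actually ran for our findings: it assumes the cleaned wide table `foreclosures` from the top of the notebook, and the `foreclosures_long` and `notices` names are our own labels. pivot_longer() keeps one row per zip code and month, and lubridate's my() parses the month_year column names once the underscore is swapped for a space.

```{r}
# Sketch only: long format instead of a wide transpose.
foreclosures_long <- foreclosures |>
  pivot_longer(cols = -zip, names_to = "date", values_to = "notices") |>
  mutate(date = my(str_replace(date, "_", " "))) |>   # "july_2021" -> 2021-07-01
  arrange(zip, date)
foreclosures_long
```

A long table like this would also make later steps (monthly totals, joins by date) a matter of group_by() and summarise() rather than row sums over a wide table.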
```{r}
#Here we decided to compare the 20772 zip code to a few others based on where they fell with median income levels and foreclosure numbers (see datawrapper chart for details).
foreclosures_zip <- transposed_redo_2 |>
select(date, "20772", "21056", "20874", "20721", "21542") |>
arrange(date)
foreclosures_zip
write_csv(foreclosures_zip, "foreclosures_zip.csv")
#21056 highest income
#20874 close to $100,000
#20721 close to $15,000
#21542 lowest income
#20772 largest notices
```
```{r}
# Time to chart it!
#datawrapper: https://datawrapper.dwcdn.net/vVXFh/1/
top_foreclosure_zip |>
ggplot() +
geom_line(aes(x=date, y=foreclosures)) +
labs(
title="Foreclosure notices rose drastically in this PG County zip code",
x = "month",
y = "foreclosures notices issued",
caption = "Source: Office of Financial Regulation") +
theme(
axis.text.x = element_text(angle = 45, hjust=1)) +
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y")
```
```{r}
#Arranging by date so the full zip-by-month table reads chronologically
foreclosures_zip_all <- transposed_redo_2 |>
arrange(date)
foreclosures_zip_all
write_csv(foreclosures_zip_all, "foreclosures_zip_all.csv")
```
```{r}
# I want to make a dataframe that has the total number of notices for each month. Here I'm just dropping the original date column from transposed_redo_2 so the row sums in the next chunk only cover the zip code columns.
transposed_redo_arranged <- transposed_redo_2 |>
#[, c(ncol(transposed_redo_2), 1:(ncol(transposed_redo_2)-1))] |>
select(-date)
transposed_redo_arranged
```
```{r}
#Okay, now that we've cleaned that up a bit, let's sum each row. Because the date column was already dropped above, transposed_redo_arranged holds only zip code columns, so the mutate below adds a column that is the row sum across every zip code (no date sneaking into the calculation).
col_sum <- transposed_redo_arranged |>
mutate(sum_col = rowSums(transposed_redo_arranged))   # row sum across every zip code column
#This next line creates a new dataframe where we take our fresh one ^ and pull the new column (sum_col) from the end to the front.
sum_arranged <- col_sum[, c(ncol(col_sum), 1:(ncol(col_sum)-1))]
sum_arranged
```
```{r}
# Here I'm double checking that the first row's sum is correct (summing every zip code column in row one)
rowSums(transposed_redo_arranged[1,])
```
```{r}
#I want to look at the percent change of notices from July 2021 and October 2023 since the moratorium on residential foreclosures in Maryland expired June 30, 2021 and foreclosures resumed July 1, 2021. This is for all the zip codes in Maryland. Farther down, we'll compare just PG County zip codes.
# Here is the datawrapper map: https://datawrapper.dwcdn.net/mkGLb/2/
perc_change_foreclosures <- foreclosures |>
mutate(perc_change = ifelse(july_2021 != 0, (october_2023 - july_2021) / july_2021 * 100, NA)) |>
select(zip, perc_change) |>
arrange(desc(perc_change))
perc_change_foreclosures
write_csv(perc_change_foreclosures, "perc_change_foreclosures.csv")
```
```{r}
#Now for PG County
#Here is the Datawrapper map: https://datawrapper.dwcdn.net/Ujoax/1/
perc_change_pg_foreclosures <- foreclosures |>
filter(zip %in% c ("20772", "20607", "20608", "20613", "20623", "20703", "20704", "20705", "20706", "20707", "20708", "20710", "20712", "20715", "20716", "20717", "20718", "20719", "20720", "20721", "20722", "20725", "20735", "20737", "20740", "20741", "20742", "20743", "20744", "20745", "20746", "20747", "20748", "20750", "20757", "20762", "20769", "20770", "20771", "20781", "20782", "20783", "20784", "20785", "20787", "20788")) |>
mutate(perc_change = ifelse(july_2021 != 0, (october_2023 - july_2021) / july_2021 * 100, NA)) |>
select(zip, perc_change) |>
arrange(desc(perc_change))
perc_change_pg_foreclosures
write_csv(perc_change_pg_foreclosures, "perc_change_pg_foreclosures.csv")
```
#Q3
Using census data, what are the demographics (income) of this zip code? How does this compare to other zip codes and their demographics? Is there a relationship between a zip code's median income and the number of foreclosures per capita?
#Findings: There is a very weak relationship between median income and notices per capita -- however, there is noticeably more variation in foreclosure notice rates in poorer zip codes compared to wealthier ones. The zip code 20772 falls around the middle of the median income range, yet has had the highest number of foreclosure notices of any zip code. When we mapped these zip codes, we found a large cluster of zip codes with a high rate of foreclosure notices per capita near the border of Charles and PG counties. There were also some noticeable zip codes in Dorchester and Wicomico counties.
```{r}
#Here we use the Census ZCTA data to pull the population for each zip code. We then rename geoid, estimate and moe to zip, population and moe_pop.
acs5_variables <- load_variables(2021, "acs5", cache = TRUE)
acs5_variables
mdzip <- get_acs(geography="zcta", variables = "B02001_001", year=2021)
mdzip <- mdzip |>
clean_names() |>
#separate(name, c('county', 'state'), sep=",") |>
rename(zip=geoid, population=estimate, moe_pop=moe) |>
group_by(zip) |>
arrange(desc(population))
mdzip
#Here we are now joining the foreclosures_sum dataframe (the one that shows the total number of foreclosures for each zip code) with the mdzip data frame (which we made above). We then selected only the zip, foreclosure_sum, population and moe_pop columns.
foreclosure_state_pop <- left_join(foreclosures_sum, mdzip, join_by(zip))
foreclosure_state_pop <- foreclosure_state_pop |> select(zip, foreclosures_sum, population, moe_pop)
foreclosure_state_pop
#Here we're pulling the income from the census
md_income <- get_acs(geography="zcta", variables = "B19013_001", year=2021)
md_income <- md_income |>
clean_names() |>
rename(zip=geoid, income=estimate, moe_inc=moe) |>
group_by(zip) |>
arrange(desc(income))
md_income
foreclosure_pop_income <- left_join(foreclosure_state_pop, md_income, join_by(zip)) |>
select(zip, foreclosures_sum, population, moe_pop, income, moe_inc) |>
arrange(desc(income))
foreclosure_pop_income
```
```{r}
#We wanted to first plot each zip code's median income against its total number of intent to foreclose notices.
ggplot(foreclosure_pop_income, aes(x=income, y=foreclosures_sum)) +
geom_point()+
labs(
title="Zip codes with higher median incomes had less variety in NOI totals",
x = "Median Income",
y = "Sum of NOI",
caption = "Source: Office of Financial Regulation") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_text(check_overlap = TRUE, size = 2, aes(label = zip), hjust = 0.5, vjust = -0.5)
```
```{r}
#How spread out is our data? It looks like most zip codes fall under $150,000 in median income, which is why we decided to use the Kendall method for the correlation test.
ggplot(foreclosure_pop_income, aes(x = income)) +
  geom_histogram(binwidth = 5000, fill = "skyblue", color = "black") +  # $5,000-wide income bins
labs(title = "Distribution of median income",
x = "Median Income",
y = "Frequency")
```
```{r}
#Here we are charting the relationship between a zip code's median income and the number of intent to foreclose notices.
foreclosure_pop_income |>
ggplot() +
geom_point(aes(x=income,y=foreclosures_sum)) +
geom_smooth(aes(x=income,y=foreclosures_sum), method="lm")+
labs(title = "Relationship between median income and sum of NOIs",
x = "Median Income",
y = "Sum of NOIs")
```
```{r}
# The Kendall Method shows that there isn't a strong relationship between these two.
cor.test(foreclosure_pop_income$income, foreclosure_pop_income$foreclosures_sum, method = "kendall")
```
```{r}
per_capita <- foreclosure_pop_income |>
mutate(percapita = (foreclosures_sum/population)*10000) |>
arrange(desc(percapita))
per_capita
```
```{r}
#Okay, now to plot median income against the number of intent to foreclose notices per 10,000 residents for each zip code.
ggplot(per_capita, aes(x=income, y=percapita)) +
geom_point()+
labs(
title="A weak relationship exists between NOIs per capita and median income",
x = "Median Income",
y = "NOIs Per Capita",
caption = "source: Source: Office of Financial Regulation") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_text(check_overlap = TRUE, size = 2, aes(label = zip), hjust = 0.5, vjust = -0.5)
```
```{r}
per_capita |>
ggplot() +
geom_point(aes(x=income,y=percapita)) +
geom_smooth(aes(x=income,y=percapita), method="lm")+
labs(
title="A weak relationship exists between NOIs per capita and median income",
x = "Median Income",
y = "NOIs Per Capita",
caption = "source: Source: Office of Financial Regulation")
```
```{r}
#There is a very weak relationship between median income and notices per capita -- however, there is noticeably more variation in foreclosure notice rates in poorer zip codes compared to wealthier ones. (A rough attempt to put a number on that variation gap follows this chunk.)
cor.test(per_capita$income, per_capita$percapita, method = "kendall")
```
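To put a rough number on that variation gap, here is a minimal sketch we did not run for the story: it splits the zip codes at the median of median income and compares the spread of per-capita rates on each side. The cutoff and the income_group label are our own choices, not anything in the source data.

```{r}
# Sketch: compare the spread of NOIs per capita in lower- vs. higher-income zip codes.
per_capita |>
  ungroup() |>
  filter(is.finite(income), is.finite(percapita)) |>
  mutate(income_group = if_else(income < median(income),
                                "below median income",
                                "at or above median income")) |>
  group_by(income_group) |>
  summarise(
    zips = n(),
    sd_percapita = sd(percapita),
    iqr_percapita = IQR(percapita)
  )
```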
Q4. How do the unemployment rates across PG County compare to the intent to foreclose notices for each month?
#Findings: It seems like there is no relationship. As unemployment rates decrease following July 2021, intent to foreclose notices continue to rise. (A quick correlation check is sketched below, after the join chunk.)
# New Join Data
```{r}
#Here we're loading the unemployment rate data, cleaning it and joining it to the transposed_redo_2 dataframe by the date column. We then summed across the zip codes specific to PG County.
#Cleaning
pg_unemployment_rate <- read_csv("pg_unemployment_rate.csv") |>
clean_names() |>
rename(pg_unemployment=mdprin5urn)
pg_unemployment_rate
#Here we are trying to isolate just the years 2021, 2022 and 2023
multiple_years <- c(2021, 2022, 2023)
pg_unemployment_rate <- pg_unemployment_rate |>
filter(as.numeric(str_extract(date, "\\d{4}")) %in% multiple_years)
pg_unemployment_rate
#We got "NA" for some rows before we realized the original data didn't include those months, so we took them out.
pg_unemployment_notices <- left_join(pg_unemployment_rate, transposed_redo_2, join_by(date)) |>
filter(date!="2021-01-01" & date!="2021-02-01" & date!="2021-03-01" & date!="2021-04-01" & date!="2021-05-01"& date!="2021-06-01") |>
group_by(date, pg_unemployment) |>
summarize(across(c("20607", "20608", "20613", "20623", "20703", "20704", "20705", "20706", "20707", "20708", "20710", "20712", "20715", "20716", "20717", "20718", "20719", "20720", "20721", "20722", "20725", "20735", "20737", "20740", "20741", "20742", "20743", "20744", "20745", "20746", "20747", "20748", "20750", "20757", "20762", "20769", "20770", "20771", "20781", "20782", "20783", "20784", "20785", "20787", "20788", "20772"), sum)) |>
mutate(pg_notices = rowSums(across(c("20607", "20608", "20613", "20623", "20703", "20704", "20705", "20706", "20707", "20708", "20710", "20712", "20715", "20716", "20717", "20718", "20719", "20720", "20721", "20722", "20725", "20735", "20737", "20740", "20741", "20742", "20743", "20744", "20745", "20746", "20747", "20748", "20750", "20757", "20762", "20769", "20770", "20771", "20781", "20782", "20783", "20784", "20785", "20787", "20788", "20772")))) |>
select(date, pg_unemployment, pg_notices)
pg_unemployment_notices
# These are the PG county zip codes that didn't exist in the original data: "20731", "20790", "20791", "20799", "20753", "20749", "20768", "20709", "20726", "20738", "20797", "20697", "20752", "20775", "20792"
write_csv(pg_unemployment_notices, "pg_unemployment_notices.csv")
```
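As a quick, hedged check on the "no relationship" takeaway, we could correlate the monthly unemployment rate with the monthly notice totals in the joined table. This is only a sketch on a couple of dozen monthly observations, so we would not lean on it for the story.

```{r}
# Sketch: correlation between PG County's monthly unemployment rate and its monthly NOI total.
# Kendall again, since the sample of months is small.
cor.test(pg_unemployment_notices$pg_unemployment,
         pg_unemployment_notices$pg_notices,
         method = "kendall")
```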
# We charted this as well. The foreclosure notice counts were on too different a scale to share a chart with the unemployment rates, unfortunately, so we put them in separate charts. (We thought of comparing the month-over-month percent change of both series but couldn't figure out the code for it in the time we had; one possible approach is sketched after the chunk below.)
Prince George's County NOI totals: https://datawrapper.dwcdn.net/gh9wi/1/
Prince George's Unemployment rates: https://datawrapper.dwcdn.net/nqCrK/1/
```{r}
#Here we started setting up the month-over-month percent change comparison, but only got as far as arranging by date.
perc_change_pg_unemployment_notices <- pg_unemployment_notices |>
arrange(date)
perc_change_pg_unemployment_notices
```
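Here is one possible way to finish that comparison, as a sketch only: it assumes a lag()-based month-over-month change is what we want, and the notices_pct_change and unemployment_pct_change column names are ours.

```{r}
# Sketch: month-over-month percent change for both series, so they can share one chart scale.
pg_unemployment_notices |>
  ungroup() |>
  arrange(date) |>
  mutate(
    notices_pct_change = (pg_notices - lag(pg_notices)) / lag(pg_notices) * 100,
    unemployment_pct_change = (pg_unemployment - lag(pg_unemployment)) / lag(pg_unemployment) * 100
  ) |>
  select(date, notices_pct_change, unemployment_pct_change)
```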
Q5. What are the differences in foreclosure notices by season (warm months vs cold months)? The only full year we have is 2022, so let's look at that year to start.
#From the chart we created, we can see there were dramatic spikes in NOI notices in April, August and December. The year as a whole saw a percent change of 73%. At a quick glance at annual state data, January 2019 saw a similar rise from November up until January 2020. We would love to get data going farther back to see if this trend continues (we didn't pull that data for the other questions because it isn't broken up by zip code; however, it might be possible to request that info via MPIA). This question morphed a little from "Do colder months have more notices than warmer months?" to "What did the progression of foreclosures look like over the course of 2022 for Maryland as a whole?" It looks like there was some ebb in NOI totals during the summer until August hit, and numbers were relatively low at the start of the year. (A rough warm-vs.-cold comparison is sketched after the 2022 chunk below.)
```{r}
#Bringing this back down so we can see the code.
col_sum <- transposed_redo_arranged |>
mutate(sum_col = rowSums(transposed_redo_arranged)) |>   # row sum across every zip code column
select(sum_col, everything())
col_sum
#sum_col already sits at the front this time thanks to the select() above, so we can use col_sum as-is for the arranged version.
sum_arranged <- col_sum
sum_arranged
```
```{r}
#Taking the col_sum code and adding the date column from foreclosures_zip_all
foreclosures_sum_date <- col_sum |>
clean_names() |>
mutate(date = foreclosures_zip_all$date) |>
select(date, everything())
foreclosures_sum_date
#clean_names() prefixed the numeric zip columns with "x", so strip that back out of the column names
names(foreclosures_sum_date) <- str_replace(names(foreclosures_sum_date), "x", "")
foreclosures_sum_date
```
```{r}
#Here we are selecting only the 2022 months
foreclosure_notices_2022 <- foreclosures_sum_date |>
mutate(year = (year(date))) |>
filter(year == 2022) |>
select(!year)
foreclosure_notices_2022
```
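Circling back to the original wording of Q5 (warm vs. cold months), here is a minimal sketch of how we might have split 2022. The cutoff treating May through October as "warm" is our own assumption, not something in the data.

```{r}
# Sketch: total 2022 notices in warm months (May-Oct) vs. cold months (Jan-Apr, Nov-Dec).
foreclosure_notices_2022 |>
  mutate(season = if_else(month(date) %in% 5:10, "warm (May-Oct)", "cold (Jan-Apr, Nov-Dec)")) |>
  group_by(season) |>
  summarise(total_notices = sum(sum_col), months = n())
```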
```{r}
# Time to chart. We had to consult Chat GPT on making sure the x-axis shows all the months involved and in combining the dataframes above. https://chat.openai.com/c/49571382-6e25-4419-acd4-11f967423976
foreclosure_notices_2022 |>
ggplot() +
geom_line(aes(x= date, y=sum_col)) +
labs(
title="Intent to foreclose notices across Maryland in 2022",
x = "month",
y = "intent to foreclose notices issued",
caption = "Source: Office of Financial Regulation") +
theme(
axis.text.x = element_text(angle = 45, hjust=1)) +
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y")
```
```{r}
#I want to make a bar chart to see how it compares. Here we can see the peaks in August, December and April once more.
foreclosure_notices_2022 |>
ggplot() +
geom_bar(aes(x=date, weight=sum_col), fill="lightblue", color="lightblue") +
coord_flip() +
theme(
plot.title = element_text(margin = margin(b = 5))
) +
labs(
title = "Intent to foreclose notices across Maryland in 2022",
x = "Date",
y = "Intent to foreclose notices issued",
caption = "source: Office of Financial Regulation")
```