Merge pull request #5 from nrosed/main
pipeline QC markdown analysis added
nrosed authored Feb 3, 2021
2 parents a0f15ec + 0bc5de3 commit 912b366
Showing 51 changed files with 872 additions and 43 deletions.
Binary file added .RData
Binary file not shown.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -7,3 +7,4 @@
.DS_Store
.Rhistory
.Rproj.user
*.Rproj
@@ -0,0 +1,31 @@
---
title: "location_analysis"
author: "Natalie Davidson"
date: "1/22/2021"
output: github_document
---

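```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

# library() stops with an error if a package is missing; require() only warns
library(data.table)
library(here)
library(ggplot2)
library(caret)

# Build paths from the project root so the document knits from any directory
proj_dir = here()
source(file.path(proj_dir, "analysis_scripts", "analysis_utils.R"))
source(file.path(proj_dir, "utils", "plotting_utils.R"))
```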

## Nature News Location Bias

This document is a working analysis of the quotes extracted from Nature News content to see whether there are differences in which locations are represented.
The data we will be working with are the following:

1) `./data/benchmark_data/benchmark_quote_table_raw.tsv` is the output of scraping a randomly selected set of 10 articles from 2010, 2015, or 2020 (`./nature_news_scraper/run_scrape_benchmark.sh`) and then running them through coreNLP with additional processing (`./process_scraped_data/run_process_target_year.sh`)
2) `./data/scraped_data/quote_table_raw_20*.tsv` are the yearly outputs; each file is the result of scraping all articles from one year between 2001 and 2020 (`./nature_news_scraper/run_scrape_benchmark.sh`) and then running them through coreNLP with additional processing (`./process_scraped_data/run_process_target_year.sh`)
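
As a rough illustration of how the yearly files in item 2 could be loaded together, here is a minimal sketch in base R. The helper `read_quote_tables` and the `file_year` column are hypothetical names introduced for this example; they are not part of `analysis_utils.R`, and the sketch assumes that every yearly file shares the same columns. It makes no assumption about what those columns are.

```r
# Hypothetical helper (not part of analysis_utils.R): read every per-year
# quote table in a directory and stack them into one data frame.
read_quote_tables <- function(data_dir) {
    # Match files named like quote_table_raw_2015.tsv
    name_pattern <- "^quote_table_raw_(20[0-9]{2})\\.tsv$"
    files <- list.files(data_dir, pattern = name_pattern, full.names = TRUE)
    if (length(files) == 0) {
        stop("No quote_table_raw_20*.tsv files found in ", data_dir)
    }

    tables <- lapply(files, function(f) {
        # quote = "" keeps quotation marks inside the text as literal characters
        df <- read.delim(f, stringsAsFactors = FALSE, quote = "")
        # Recover the year from the file name (column name is an assumption)
        df$file_year <- as.integer(sub(name_pattern, "\\1", basename(f)))
        df
    })

    # Assumes all yearly files have identical columns
    do.call(rbind, tables)
}

# Example call, left commented out because it needs the scraped data on disk:
# quote_df <- read_quote_tables(file.path(here::here(), "data", "scraped_data"))
```

Since the setup chunk already loads `data.table`, `fread()` combined with `rbindlist()` would be a natural alternative for larger files; base R is used here only so the sketch runs without extra packages.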


**All analyses shown below depend on the functions defined in `/analysis_scripts/analysis_utils.R`**
