Merge pull request #412 from worldbank/la-word-search

Add marias suggestions
worldbank · Feb 26, 2020 · 8ee1277
2 parents f30a0e7 + c783462
Showing 1 changed file with 8 additions and 6 deletions.
diff --git a/chapters/data-analysis.tex b/chapters/data-analysis.tex
@@ -90,7 +90,8 @@ \subsection{Organizing your folder structure}
 
 \subsection{Breaking down tasks}
 
-We divide the process of transforming raw datasets to analysis-ready datasets to research results into four steps:
+We divide the process of transforming raw datasets to research outputs into 
+four steps:
 de-identification, data cleaning, variable construction, and data analysis.
 Though they are frequently implemented concurrently,
 creating separate scripts and datasets prevents mistakes.
@@ -206,7 +207,8 @@ \section{De-identifying research data}
 as you can always go back and remove variables from the list of variables to be dropped,
 but you can not go back in time and drop a PII variable that was leaked
 because it was incorrectly kept.
-Examples include respondent names and phone numbers, enumerator names, tax payer numbers, and addresses.
+Examples include respondent names and phone numbers, enumerator names, taxpayer 
+numbers, and addresses.
 For each confidential variable that is needed in the analysis, ask yourself:
 \textit{can I encode or otherwise construct a variable that masks the confidential component, and
 then drop this variable?}
@@ -362,7 +364,7 @@ \subsection{Documenting data cleaning}
 or that you intend to release as part of a replication package or data publication.
 
 Another important component of data cleaning documentation are the results of data exploration.
-As clean your dataset, take the time to explore the variables in it.
+As you clean your dataset, take the time to explore the variables in it.
 Use tabulations, summary statistics, histograms and density plots to understand the structure of data,
 and look for potentially problematic patterns such as outliers,
 missing values and distributions that may be caused by data entry errors.
@@ -380,9 +382,9 @@ \section{Constructing analysis datasets}
 as planned during research design\index{Research design},
 and using the pre-analysis plan as a guide.\index{Pre-analysis plan}
 During this process, the data points will typically be reshaped and aggregated
-so that level of the dataset goes from the unit of observation
-(one item in the bundle) in the survey to the unit of analysis (the household).\sidenote{
-	\url{https://dimewiki.worldbank.org/Unit\_of\_Observation}}
+so that level of the dataset goes from the unit of observation in the survey 
+to the unit of analysis.\sidenote{\url{
+https://dimewiki.worldbank.org/Unit\_of\_Observation}}
 
 
 A constructed dataset is built to answer an analysis question.