-
Notifications
You must be signed in to change notification settings - Fork 63
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
49 changed files
with
974 additions
and
156 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,3 +4,4 @@ | |
.Ruserdata | ||
/_site/ | ||
/.quarto/ | ||
.DS_Store |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
--- | ||
title: "R/SAS Chi-Squared Comparision" | ||
--- | ||
|
||
Chi-Squared test is a hypothesis test for independent contingency tables, dependent on rows and column totals. The test assumes: | ||
|
||
- observations are independent of each other | ||
|
||
- all values are 1 or more and at least 80% of the cells are greater than 5. | ||
|
||
- data should be categorical | ||
|
||
The Chi-Squared statistic is found by: | ||
|
||
$$ | ||
\chi^2=\frac{\sum(O-E)^2}{E} | ||
$$ | ||
|
||
Where O is the observed and E is the expected.\ | ||
For an r x c table (where r is the number of rows and c the number of columns), the Chi-squared distribution's degrees of freedom is (r-1)\*(c-1). The resultant statistic with correct degrees of freedom follows this distribution when its expected values are aligned with the assumptions of the test, under the null hypothesis. The resultant p value informs the magnitude of disagreement with the null hypothesis and not the magnitude of association | ||
|
||
For this example we will use data about cough symptoms and history of bronchitis. | ||
|
||
```{r} | ||
bronch <- matrix(c(26, 247, 44, 1002), ncol = 2) | ||
row.names(bronch) <- c("cough", "no cough") | ||
colnames(bronch) <- c("bronchitis", "no bronchitis") | ||
``` | ||
|
||
To a chi-squared test in R you will use the following code. | ||
|
||
```{r} | ||
chisq.test(bronch) | ||
``` | ||
|
||
To run a chi-squared test in SAS you used the following code. | ||
|
||
```{SAS} | ||
#| eval: false | ||
proc freq data=proj1.bronchitis; | ||
tables Cough*Bronchitis / chisq; | ||
run; | ||
``` | ||
|
||
The result in the "Chi-Square" section of the results table in SAS will not match R, in this case it will be 12.1804 with a p-value of 0.0005. This is because by default R does a Yate's continuity adjustment. To change this set `correct` to false. | ||
|
||
```{r} | ||
chisq.test(bronch, correct = FALSE) | ||
``` | ||
Alternatively, SAS also produces the correct chi-square value by default. It is the "Continuity Adj. Chi-Square" value in the results table. |
Oops, something went wrong.