Merge pull request #116 from PSIAIMS/Regression
Regression
Showing 2 changed files with 77 additions and 19 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,15 +1,36 @@ | ||
--- | ||
title: "Linear Regression" | ||
output: html_document | ||
date: "2023-04-22" | ||
date: last-modified | ||
date-format: D MMMM, YYYY | ||
--- | ||
|
||
To demonstrate linear regression, we examine a dataset on the relationship between height and weight in a group of 237 teenage boys and girls. The dataset is available at `../data/htwt.csv` and is imported into the workspace. | ||
|
||
```{r} | ||
### Descriptive Statistics | ||
|
||
The first step is to obtain simple descriptive statistics for the numeric variables in the htwt data and one-way frequencies for the categorical variables, using the `summary()` function. (For text columns that `read.csv()` imports as character vectors, `summary()` reports only length and class; converting them to factors or calling `table()` gives the frequencies.) There are 237 participants, aged 13.9 to 25 years. This is a cross-sectional study, with one observation per participant. We can use this data set to examine how participants' height relates to their age and sex. | ||
|
||
```{r setup, include=TRUE} | ||
knitr::opts_chunk$set(echo = TRUE) | ||
# Import the height/weight data and summarise every column | ||
htwt <- read.csv("../data/htwt.csv") | ||
summary(htwt) | ||
``` | ||
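In R 4.0 and later, `read.csv()` keeps text columns as character vectors by default, so `summary()` does not count their levels. The following is a minimal sketch of two ways to get one-way frequencies. The data frame `toy` and all its values are made up for illustration (they are not the htwt data). Only the column names `SEX`, `AGE` and `HEIGHT` come from the code in this document, and the level `"m"` is an assumption.

```r
# Hypothetical toy data; column names follow the document, values are made up.
toy <- data.frame(
  SEX = c("f", "m", "f", "m", "f"),
  AGE = c(14.2, 15.1, 16.8, 13.9, 18.0),
  HEIGHT = c(60.1, 63.5, 62.0, 58.7, 64.3),
  stringsAsFactors = FALSE
)

# Option 1: one-way frequency table for the categorical variable
sex_counts <- table(toy$SEX)
print(sex_counts)

# Option 2: convert to a factor so that summary() reports counts per level
toy$SEX <- factor(toy$SEX)
sex_summary <- summary(toy$SEX)
print(sex_summary)
```

Either approach could be applied to `htwt$SEX` after import, assuming that column holds the sex codes as text.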
|
||
To model how the relationship between age and height differs for females, we first create a flag variable identifying females and an interaction variable equal to the product of age and that flag. | ||
|
||
```{r} | ||
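# 1 for females, 0 otherwise | ||
htwt$female <- ifelse(htwt$SEX == "f", 1, 0) | ||
# Interaction term: equals AGE for females and 0 for everyone else | ||
htwt$fem_age <- htwt$AGE * htwt$female | ||
head(htwt) | ||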
``` | ||
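As an aside, R's formula syntax can build the same interaction term without a hand-made column. The sketch below uses simulated data (not the htwt data; the generating coefficients are arbitrary assumptions) and checks that both ways of specifying the model give the same estimates.

```r
# Simulated data for illustration only (not the htwt data)
set.seed(1)
n <- 40
toy <- data.frame(
  SEX = rep(c("f", "m"), each = n / 2),
  AGE = runif(n, min = 14, max = 19)
)
toy$female <- ifelse(toy$SEX == "f", 1, 0)
toy$fem_age <- toy$AGE * toy$female
# Arbitrary "true" coefficients, chosen only to generate example heights
toy$HEIGHT <- 30 + 12 * toy$female + 2 * toy$AGE - 0.8 * toy$fem_age + rnorm(n)

# Model with the hand-built interaction column
fit_manual <- lm(HEIGHT ~ female + AGE + fem_age, data = toy)

# Same model using the formula interaction operator
fit_formula <- lm(HEIGHT ~ female + AGE + female:AGE, data = toy)

# Compare the coefficient estimates (names differ, values should not)
same_fit <- all.equal(unname(coef(fit_manual)), unname(coef(fit_formula)))
print(same_fit)
```

The explicit `fem_age` column mirrors the approach taken in this document. The `female:AGE` form is an alternative and does not replace it.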
### Regression Analysis | ||
Next, we fit a regression model relating height to sex, age, and the interaction variable created above. The `subset` argument of `lm()` restricts the analysis to participants aged 19 years or younger. `summary()` reports the estimate, standard error and test for each parameter in the model; 95% confidence intervals for the parameters can be obtained with `confint()`. The model that we are fitting is ***height = b0 + b1 x female + b2 x age + b3 x fem_age + e*** | ||
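```{r regression, include=TRUE} | ||
# Fit on participants aged 19 or younger; `subset` is named explicitly | ||
regression <- lm(HEIGHT ~ female + AGE + fem_age, data = htwt, subset = AGE <= 19) | ||
summary(regression) | ||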
``` | ||
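`summary()` does not print confidence intervals. One way to get them is `confint()`, sketched here on simulated data. The data frame `toy` and its generating coefficients are assumptions for illustration only (this is not the htwt data), and the age range simply mimics the one described in the text.

```r
# Simulated data for illustration only (not the htwt data)
set.seed(2)
n <- 60
toy <- data.frame(
  female = rep(c(0, 1), times = n / 2),
  AGE = runif(n, min = 14, max = 25)
)
toy$fem_age <- toy$AGE * toy$female
toy$HEIGHT <- 30 + 12 * toy$female + 2 * toy$AGE - 0.8 * toy$fem_age + rnorm(n)

# Fit only on rows with AGE <= 19 via the subset argument
fit <- lm(HEIGHT ~ female + AGE + fem_age, data = toy, subset = AGE <= 19)

# 95% confidence intervals for each coefficient (one row per parameter)
ci <- confint(fit, level = 0.95)
print(ci)

# Number of observations actually used in the fit
print(nobs(fit))
```

Applied to the model in this document, the corresponding call would be `confint(regression, level = 0.95)`.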
|
||
From the coefficients table, the parameter estimates are b0 = 28.8828, b1 = 13.6123, b2 = 2.0313 and b3 = -0.9294. | ||
|
||
The resulting regression model for height, age and gender based on the available data is ***height = 28.8828 + 13.6123 x female + 2.0313 x age - 0.9294 x fem_age*** |
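Because `female` is 0 or 1 and `fem_age` is age times `female`, the fitted equation describes two straight lines: height = 28.8828 + 2.0313 x age for males, and height = (28.8828 + 13.6123) + (2.0313 - 0.9294) x age = 42.4951 + 1.1019 x age for females. The short sketch below reproduces that arithmetic from the reported coefficients. The helper `predict_height()` is a hypothetical name introduced here for illustration, and age 16 is an arbitrary example within the fitted range.

```r
# Coefficients as reported in the text above
b0 <- 28.8828
b1 <- 13.6123
b2 <- 2.0313
b3 <- -0.9294

# Hypothetical helper: predicted height from the fitted equation
predict_height <- function(age, female) {
  b0 + b1 * female + b2 * age + b3 * (age * female)
}

# Implied intercept and slope for each group
male_line <- c(intercept = b0, slope = b2)
female_line <- c(intercept = b0 + b1, slope = b2 + b3)
print(male_line)
print(female_line)

# Example predictions at age 16 for each group
print(predict_height(16, female = 0))
print(predict_height(16, female = 1))
```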