GCD Course Project

by: Mario Gamboa

Introduction

The current project uses data collected from the accelerometers of the Samsung Galaxy S smartphone. This assignment uses data from the UC Irvine Machine Learning Repository, a popular repository for machine learning datasets. In particular, it uses the "Human Activity Recognition Using Smartphones Data Set". The data source can be found at:

  • Dataset: Samsung Galaxy S accelerometer data [60MB]

  • Description: Accelerometer measurements from Samsung Galaxy S smartphones collected from 30 volunteers within an age bracket of 19-48 years. Subjects performed six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING). The captured data includes 3-axial linear acceleration and 3-axial angular velocity recorded at a constant rate of 50 Hz. Additional information can be obtained here.

Scripts and files for this solution

The following scripts are included with this solution:

  • README.md: This file. General explanation of files in the solution and how the scripts work.

  • CodeBook.md: Description of the transformation and cleanup process used to convert the original datasets into a tidy dataset. It also includes a general description of the variables in the resulting dataset.

  • run_analysis.R: Script that performs the retrieval, clean-up, transformation, merging and renaming of the data in order to create the resulting tidy datasets (a rough outline of the flow is sketched after this list).

  • data: Directory into which the original .zip dataset is downloaded and where the resulting tidy datasets are created. tidySet1.csv corresponds to the initial dataset containing only the variables related to the 'mean' and 'std' columns of the original set. tidySet2.csv corresponds to the final dataset, which contains the average of each of those variables grouped by Subject and Activity.
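As a rough orientation, the overall flow of run_analysis.R resembles the sketch below. This is an illustrative outline only; the file paths, column-selection pattern and helper function are assumptions based on the standard layout of the UCI HAR dataset, not the exact code in the script (see run_analysis.R and CodeBook.md for the authoritative steps).

```r
# Illustrative sketch only -- the actual logic lives in run_analysis.R.
# Assumes the UCI HAR dataset has been unzipped into ./data/UCI HAR Dataset.
library(dplyr)

base <- "data/UCI HAR Dataset"

# Feature names and activity label lookups
features   <- read.table(file.path(base, "features.txt"), col.names = c("index", "name"))
activities <- read.table(file.path(base, "activity_labels.txt"), col.names = c("id", "Activity"))

# Hypothetical helper: load one partition (train or test) and attach Subject/Activity
load_set <- function(part) {
  x <- read.table(file.path(base, part, paste0("X_", part, ".txt")), col.names = features$name)
  y <- read.table(file.path(base, part, paste0("y_", part, ".txt")), col.names = "ActivityId")
  s <- read.table(file.path(base, part, paste0("subject_", part, ".txt")), col.names = "Subject")
  cbind(s, Activity = activities$Activity[y$ActivityId], x)
}

# Merge train and test, then keep only the mean() and std() measurements
merged <- rbind(load_set("train"), load_set("test"))
tidy1  <- merged[, c(1, 2, grep("mean\\.|std\\.", names(merged)))]
write.csv(tidy1, "data/tidySet1.csv", row.names = FALSE)

# Second tidy set: average of each variable per Subject and Activity
tidy2 <- tidy1 %>%
  group_by(Subject, Activity) %>%
  summarise(across(everything(), mean), .groups = "drop")
write.csv(tidy2, "data/tidySet2.csv", row.names = FALSE)
```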

Additional Notes

A few additional considerations:

  • The run_analysis.R script downloads the original dataset and unzips it if it is not found in the specified /data directory (a minimal sketch of this guard follows this list). The file is large, so please exercise caution when running the script.

  • Both the original dataset [60MB] and the first of the two resulting datasets [10MB] are rather large, so make sure you have enough disk space and RAM to run the script without problems.

  • The code in the run_analysis.R script is documented to make it easier to read and follow. In addition to the CodeBook.md file, it is recommended to review the code in that script for further details.
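The download-and-unzip guard mentioned above typically looks something like the following minimal sketch. The URL, file names and directory layout shown here are assumptions for illustration; the actual values are set inside run_analysis.R.

```r
# Illustrative sketch of the "download only if missing" guard.
# URL and paths are assumptions; see run_analysis.R for the actual values.
zip_path <- "data/dataset.zip"
data_dir <- "data/UCI HAR Dataset"
url      <- "https://archive.ics.uci.edu/ml/machine-learning-databases/00240/UCI%20HAR%20Dataset.zip"

if (!dir.exists("data")) dir.create("data")
if (!file.exists(zip_path)) {
  download.file(url, destfile = zip_path, mode = "wb")  # ~60 MB download
}
if (!dir.exists(data_dir)) {
  unzip(zip_path, exdir = "data")
}
```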
