GitHub - snersnass/GettingCleaningData: Coursera Getting and Cleaning Data submission repository

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
CodeBook.txt		CodeBook.txt
README.txt		README.txt
run_analysis.R		run_analysis.R

Repository files navigation

## Run Analysis notes

# rawFeatures is the 561 column headings for test and train X data files

# There were 30 volunteers identified by numbers 1 to 30

# rawTestSubj and rawTrainSubj identify which volunteer number was for each row in test and train data files

# rawTestY and rawTrainY identify what the volunteer was doing for each row in test and train data files

# There were 6 activities link the number used in the TestY files, to a proper description of the activity
#	These activities are in RawActLabels

# marker please note I changed the order of requirements so I could understand it better
#    if you want to test on a mac, please change which pathPrefix is commented out and should work

## 1. Merge the data sets

# Test X + Train X gives 10299 rows x 561 colums
# Test Y + Train Y gives 10299 activity rows
# Test Subject + Train Subject gives 10299 subject rows

# 2. Extract only measurements of mean and standard deviation
#    Use grep to get just the matching rows
# 		Column 1 of the result is the vector of selected rows
#		Column 2 of the result is the vector of required names
#	 Apply the names

# 3. Add the volunteer and activity as the first two columns

#		These come into the first two columns and both are numbers at this stage

# 4. Change the activity to a name

#		Use RawActLabels to find the activity name that relates to the description
#		Use mutate to swap out this column to labels instead of numbers

# 5. Create a tidy data set with average of each variantl for each activity and each subject