Here are some hints and ideas on how to evaluate the gathered data.

Software

One option is to use RStudio: first install R (https://cran.rstudio.com/), then install RStudio (https://posit.co/download/rstudio-desktop/).

All of the scripts below are written for RStudio, but jamovi (https://www.jamovi.org/), for example, can also be used. It provides most statistical tests in a simple GUI, but is less versatile for plotting.

Demographics

If you gathered demographic data, e.g., with SoSciSurvey, you should report who participated: the number of participants (n = ...), the gender counts, and at least the age with mean (M) and standard deviation (SD).

SoSciSurvey data can be easily loaded into RStudio with the script provided alongside the data on SoSciSurvey.

For all of the following examples you obviously have to adapt the names of the data frames (here, e.g., soscisurveyData) and columns (here, e.g., D001).

This is a script for counting occurrences:

add_descriptive_statistics <- function(data, name){
  print(name)
  print(table(data))
}

#can be used, e.g., for:
add_descriptive_statistics(soscisurveyData$D001, "Gender")
add_descriptive_statistics(soscisurveyData$D004, "VR Frequency")
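
Alongside the raw counts it is often helpful to report percentages as well. The following is only a minimal sketch: the vector `gender` and its values are made up for illustration and stand in for a column such as soscisurveyData$D001.

```r
# Hypothetical example answers; replace with your own column, e.g. soscisurveyData$D001
gender <- c("female", "male", "female", "diverse", "male", "female")

counts <- table(gender)                            # absolute counts per answer
percentages <- round(100 * prop.table(counts), 1)  # relative frequencies in percent

print(counts)
print(percentages)
```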

Here is a script for quantitative statistics

# appends one row (Mean, SD, Min, Max) for a numeric vector to demographics_stats
add_quantitative_statistics <- function(data, name, demographics_stats){
  old_names = rownames(demographics_stats)
  demographics_stats = rbind(demographics_stats, data.frame("Mean"=mean(data), 
                                                            "SD"=sd(data), 
                                                            "Min"=min(data), 
                                                            "Max"=max(data)))
  # label the newly added row with the given name
  rownames(demographics_stats) = append(old_names, name)
  demographics_stats
}

#and applied to age data from a SoSciSurvey data frame
soscisurveyData$D002_01 <- as.numeric(soscisurveyData$D002_01)
demographics_stats = data.frame()
demographics_stats = add_quantitative_statistics(soscisurveyData$D002_01, "Age", demographics_stats)
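
The values can then be turned into the sentence you report in your text. This is a sketch under assumptions: the vector `age` is invented example data, and missing answers (which as.numeric can produce as NA) are simply skipped here, which may or may not be what you want for your study.

```r
# Hypothetical ages; replace with your own numeric column, e.g. soscisurveyData$D002_01
age <- c(23, 27, NA, 31, 25, 24)

n_valid <- sum(!is.na(age))  # number of participants who gave a usable answer

# na.rm = TRUE ignores missing answers instead of returning NA
report <- sprintf("n = %d valid answers, age: M = %.2f, SD = %.2f",
                  n_valid, mean(age, na.rm = TRUE), sd(age, na.rm = TRUE))
print(report)
```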

Data Preparation

The data gathered by the study framework should already be in an easily digestible format and can be loaded with:

ActData <- read.csv(file = 'Phase_Act.csv')

If you want to exclude, e.g., individual participants, this snippet (which also includes some further potential cleanup) can be helpful:

# maybe participant 20 dropped out during the study
excludedParticipants = c('20')
#excludedParticipants = c('20', '7', '11' ) #11 and 7 also did not understand task 2 correctly, exclude?

#sometimes data is misclassified as numerical or simple character, so tell R: this is a factor!
ActData$TurnTaking <- as.factor(ActData$TurnTaking)
ActData$ParticipantID <- as.factor(ActData$ParticipantID)

#remove excluded participants
library(dplyr)
ActData <- filter(ActData, ! ParticipantID %in% excludedParticipants)
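
After excluding participants it is worth verifying that the remaining data is still complete, since repeated-measures tests expect every participant to have data for every condition. The following base-R sketch uses a small made-up data frame (`ActDataExample` is hypothetical and only mimics the column names used above) and also drops the factor levels that become unused after filtering.

```r
# Hypothetical stand-in for ActData: 3 participants x 2 conditions
ActDataExample <- data.frame(
  ParticipantID = factor(rep(c("7", "11", "20"), each = 2)),
  TurnTaking    = factor(rep(c("None", "Full"), times = 3))
)
excludedParticipants <- c("20")

# Base-R equivalent of the dplyr filter above
cleaned <- ActDataExample[!ActDataExample$ParticipantID %in% excludedParticipants, ]
cleaned <- droplevels(cleaned)  # drop the now-unused factor level "20"

# Rows per participant and condition; every cell should hold the expected count
print(table(cleaned$ParticipantID, cleaned$TurnTaking))
```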

Hypothesis Testing

Normally you want to use the gathered data to show that a factor you evaluated has a significant effect on the outcome.

For this I recommend the following article on how to apply repeated-measures ANOVAs in R: https://www.datanovia.com/en/lessons/repeated-measures-anova-in-r/
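
The linked article relies on the rstatix package. As a dependency-free sketch of the same kind of analysis, base R's aov() can fit a one-way repeated-measures ANOVA. The data below is simulated, and the column names are only borrowed from the examples on this page, so treat it as an illustration rather than a recipe for your data.

```r
set.seed(42)  # reproducible simulated data
n <- 12
conditions <- c("None", "Gaze", "Full")

# Long format: one row per participant and condition
simulated <- data.frame(
  ParticipantID = factor(rep(seq_len(n), times = length(conditions))),
  TurnTaking    = factor(rep(conditions, each = n), levels = conditions),
  GapTimes      = rnorm(n * length(conditions),
                        mean = rep(c(0.9, 0.8, 0.6), each = n), sd = 0.15)
)

# Error(ParticipantID / TurnTaking) declares TurnTaking as a within-subject factor
rm_anova <- aov(GapTimes ~ TurnTaking + Error(ParticipantID / TurnTaking),
                data = simulated)
print(summary(rm_anova))
```

Note that aov() does not test or correct for violations of sphericity, which is one reason to prefer the approach from the article for real analyses.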

In general, before performing a parametric test (like an ANOVA or a t-test) you should check that your data is actually normally distributed and fulfills the test's other requirements (see the website above); otherwise, a non-parametric test should be used. Textbooks make an exception for large samples (Central Limit Theorem), where 30 is often cited as a large enough number. Otherwise, check this graph for which tests might be useful (taken from the second book under Further Reading):
[Figure: overview of which statistical test to use]
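
The normality check and the fallback to a non-parametric test described above can be sketched in base R as follows. The data is simulated and deliberately skewed, and the column names are assumptions borrowed from the other examples on this page. The Shapiro-Wilk test is only one of several ways to check normality, and the Friedman test is the usual non-parametric counterpart of a one-way repeated-measures ANOVA.

```r
set.seed(1)  # reproducible simulated data
n <- 12
conditions <- c("None", "Gaze", "Full")

simulated <- data.frame(
  ParticipantID = factor(rep(seq_len(n), times = length(conditions))),
  TurnTaking    = factor(rep(conditions, each = n), levels = conditions),
  GapTimes      = rexp(n * length(conditions), rate = 1.5)  # deliberately skewed
)

# Shapiro-Wilk test per condition; small p-values suggest a deviation from normality
normality_p <- tapply(simulated$GapTimes, simulated$TurnTaking,
                      function(x) shapiro.test(x)$p.value)
print(normality_p)

# Friedman test: each participant is a block, each condition a group
friedman_result <- friedman.test(GapTimes ~ TurnTaking | ParticipantID,
                                 data = simulated)
print(friedman_result)
```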

Plotting your data

ggpubr is a very versatile tool for generating nice-looking plots in R (see, e.g., https://rpkgs.datanovia.com/ggpubr/).

Here is some example code for creating highly configurable bar plots:

library(dplyr)    # recode_factor, group_by, %>%
library(rstatix)  # get_summary_stats
library(ggpubr)   # also attaches ggplot2

library(showtext)
font_families()   # lists the font families available for plotting

# rename the factor levels for display and fix their order
ActDataGaps$TurnTaking <- recode_factor(ActDataGaps$TurnTaking, None = "None", 
InhaleOnly = "Breath", GestureOnly = "Gesture", GazeOnly = "Gaze", Full = "Full")

# mean and standard error of GapTimes per condition
ActGaps_table <- ActDataGaps %>% 
  group_by(TurnTaking) %>% 
  get_summary_stats(GapTimes, type = "mean_se")

# bar plot in ms with standard-error bars and manually set significance marks
plot <- ggplot(ActGaps_table, aes(x=TurnTaking, fill=TurnTaking, y = mean*1000)) + 
  geom_bar(stat = "identity", show.legend = FALSE) +
  geom_errorbar(aes(ymin=1000*(mean-se), ymax=1000*(mean+se)), width=.3) +
  geom_signif(comparisons = list(c("Breath", "Gesture"),c("Breath", "Full")), annotations = c("*","**"), y_position = c(1000, 1100)) +
  xlab("Turn-Taking Cues") + ylab("Gap Length [ms]") +
  ylim(0, 1200) + 
  theme_light() +
  theme( text=element_text(size=8, family="serif"), 
         axis.text.x = element_text(size = 5),
         axis.text.y = element_text(size = 5, angle = 45)) + 
  scale_fill_manual(values = c("#868686", "#0073c2", "#efc000", "#cd534c", "#7aa6dc"))
print(plot)
# the target folder "plots" has to exist already
ggsave("plots/Act-Gaps.pdf", plot = plot, width = 4.235, height = 6, units = "cm")

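This script creates the following graph, which can be included directly in a LaTeX document with a column width of 4.235 cm:
[Figure: bar plot of the mean gap length in ms per turn-taking cue, with error bars and significance marks]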
