text analysis for breast pump user data
kanarinka committed Dec 15, 2014
1 parent 9c9edd6 commit ec69e7b
Showing 8 changed files with 513 additions and 0 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
.Rproj.user
.Rhistory
.RData

*.Rproj
1 change: 1 addition & 0 deletions textanalysis/bigrams.csv

Large diffs are not rendered by default.

Binary file added textanalysis/frequencyByCategory-IWouldLove.png
Binary file added textanalysis/frequencyByCategory.png
20 changes: 20 additions & 0 deletions textanalysis/grouptrigrams.R
@@ -0,0 +1,20 @@
library(plyr)
library(ggplot2)
df = read.csv('trigrams_categories.csv')
#filter those without categories
df=df[df$category!="",]

#sum categories
df = ddply(df, 'category', function(x) c(freq=sum(x$freq), separate_trigrams=nrow(x)))
df=df[df$category!="",]

# Possibly remove "IWOULDLOVE"
#df=df[df$category!="IWOULDLOVE",]

#order by frequency (assign the result; indexing with order() does not sort in place)
df = df[order(df$freq), ]

#plot to verify we did it right
df$category <- reorder(df$category, -df$freq)
p <- ggplot(df, aes(x=category, y=freq, fill=category))
p + geom_bar(stat="identity") + ylab("Frequency of Mention") + xlab("") + theme(axis.text=element_text(angle=0)) + coord_flip()