R is a very powerful and flexible statistics package and programming language.
This repository contains a number of 'howto' files that aim to provide an introduction to R and some of its possibilities.
You can install R and RStudio with the following links:
Some other great sites for learning R are:
- OpenIntro statistics with a number of good statistics 'labs' in R
- Quick-R with explanations and sample code for a wide array of applications
- Advanced R Programming for (much) more information on what is really going on.
(see also this short overview of useful R functions)
- Getting started
- Data preparation
- Data analysis
- Advanced modeling
For textual data, we have also developed two R packages to communicate with the AmCAT text analysis framework and to deal with corpus analysis and topic models. We also wrote two relevant howtos:
- Corpus Analysis: Term Document Matrices, frequency analysis, and topic modeling (source)
- Clause Analysis: Using grammatical analysis for semantic network analysis (source)
Below are some additional handouts that do not depend on AmCAT, based on a Dutch data set:
- Corpus Analysis: Term Document Matrices (source)
- LDA topic modeling (source)
- Lemmatization (source)
- Machine Learning with RTextTools (source)
(The 'semantic network analysis' demo above also ends with a simple network analysis.)