-
Notifications
You must be signed in to change notification settings - Fork 17
/
index.Rmd
271 lines (199 loc) · 13.2 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
---
output: github_document
---
## RMINC
RMINC is a package for the R statistical environment designed to read and write MINC volumes as well as perform optimized statistical routines at every voxel of a set of files. Supports getting
and writing volumes, running voxel-wise linear models, correlations, correcting for multiple comparisons with False Discovery Rate control, and more.
## Installation
The easiest way to get up and running is by using devtools:
```{r, eval = FALSE}
devtools::install_github("Mouse-Imaging-Centre/RMINC"
, upgrade_dependencies = FALSE)
```
or use the most recent development version
```{r, eval = FALSE}
devtools::install_github("Mouse-Imaging-Centre/RMINC@develop"
, upgrade_dependencies = FALSE)
```
see the [INSTALL](https://github.com/Mouse-Imaging-Centre/RMINC/blob/master/INSTALL) file
for more details. Particularly you will need the c library libminc, available through
the minc-toolkit.
## Getting Help
The best place for function specific help is the manual, which is best accessed through `?`/`help` interfaces from within R, or via the [reference tab](/RMINC/reference) of this page. If you notice any bugs, or anything that can be improved in RMINC, we're
pretty responsive on our [issues page](https://github.com/Mouse-Imaging-Centre/RMINC).
## Tutorials
- [Voxel-wise Statistics](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/VBMstats.pdf) (somewhat out of date).
- [How RMINC Parallel Works](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/RMINC_Parallelism.html):
- [RMINC 2D Visualization](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/visualizationTutorial.html)
- [Visualizing 3D Objects with RMINC](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/RMINC_rgl.html)
- [Analyzing Anatomy with Hierarchical Atlases](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/hierarchiesTutorial.html)
## Walk Throughs
These are quite old, so buyer beware. The tutorials are much more up to date. I highly recommend
[Analyzing Anatomy with Hierarchical Atlases](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/hierarchiesTutorial.html) instead of the ROI based analysis, and [How RMINC Parallel Works](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/RMINC_Parallelism.html)
instead of the parallel examples.
### ROI based anatomical analyses
This section presents a simplified ROI based analysis of neuroanatomy.
Before you start it is import to make sure:
1. the label file you use accurately segments out your final non-linear model
otherwise your analysis will be meaningless
2. you use the Jacobian determinants that capture the absolute differences
(in our case the *log-determinant-scaled* )
3. you use little/no blurring, for instance for files with a 56 micron resolution use 100 micron blur (in our case the *log-determinant-scaled-fwhm0.1* )
To analyze individual structures in the brain, you will need two things: a set of registered images and an atlas that segments the final average of your registration pipeline into classified labels. This is known as a classified atlas. (Information about how you can align a segmented set of labels to your data set can be found here: Atlas to Atlas Segmentation) The main idea is the following: Through the registration pipeline we have information about the change in volume of the voxels for each of the individual mouse brains. This is captured in the Jacobian determinant files. Using the classified atlas, we can integrate the information captured in the Jacobian determinants to find out the volume of the structures for each of the individual input files.
The Jacobian determinant fields are fairly noisy if you use them unprocessed. For this reason we blur them before we do analysis on them. The amount of blurring you want to use depends on what kind of changes you want to capture. When looking at neuroanatomy it is important to use only a small amount of blurring, because you want to avoid smoothing out information around borders of structures. The mouse brains we generally look at have a resolution of 56 micron, and we do structural analysis on Jacobian determinants that have been blurred using a 100 micron kernel.
A second thing to keep in mind is that the registration pipeline produces two types of Jacobian determinants. One that reflects relative changes (overall scaling or brain size differences have been taking out) and absolute changes which still contain the brain size differences. When analyzing neuroanatomical structures you want to use the determinants that reflect the absolute changes. Currently the naming convention for those files is as follows:
FILENAME_BASE-log-determinant-scaled-fwhm_KERNEL_.mnc
Here is an example of how to do the analysis in R
```{r, eval = FALSE}
library(RMINC)
# There is a variable that was loaded into R containing the genotype and the location of the Jacobian determinants called filenames, it contains the following:
filenames
```
---
```
scaled_jacobians genotype
1 01_wt-log-determinant-scaled-fwhm0.1.mnc wt
2 02_wt-log-determinant-scaled-fwhm0.1.mnc wt
3 03_wt-log-determinant-scaled-fwhm0.1.mnc wt
4 04_wt-log-determinant-scaled-fwhm0.1.mnc wt
5 05_mut-log-determinant-scaled-fwhm0.1.mnc mut
6 06_mut-log-determinant-scaled-fwhm0.1.mnc mut
7 07_mut-log-determinant-scaled-fwhm0.1.mnc mut
8 08_mut-log-determinant-scaled-fwhm0.1.mnc mut
```
---
```{r, eval = FALSE}
# The first thing to do now is to get the volume information of all the structures, the atlas with the labels is called resampled_atlas.mnc
volumes <- anatGetAll(filenames$scaled_jacobians, "resampled_atlas.mnc")
# This atlas contains information about both left and right structures, but in this case we will combine the information of left and right structures and only look at combined volumes:
volumes_combined <- anatCombineStructures(volumes)
# Run a linear model to get information about f-statistics and t-statistics on the structures. The first argument is the formula to be used, in this case we want to look at differences between the genotype, the second argument is the data which is stored in the variable filenames, and lastly the actual volume information
anatLm(~ genotype, filenames, volumes_combined)
# The output will be a list of all the structures with a number of columns for the values of the f-statistics and t-statistics. Next to find out which of these changes survive multiple comparison corrections:
anatFDR(anatLm(~ genotype, filenames, volumes_combined))
```
---
```
N: 20 P: 2
Beginning vertex loop: 62 3
Done with vertex loop
Computing FDR threshold for all columns
Computing threshold for F-statistic
Computing threshold for (Intercept)
Computing threshold for genotypewt
Multidimensional MINC volume
Columns: F-statistic (Intercept) genotypewt
NULL
Degrees of Freedom: c(1, 18) 18 18
FDR Thresholds:
F-statistic (Intercept) genotypewt
0.01 NaN 25.36429 NaN
0.05 20.80325 25.36429 6.172145
0.1 17.94643 25.36429 4.542356
0.15 14.99036 25.36429 3.871739
0.2 14.99036 25.36429 3.871739
There were 11 warnings (use warnings() to see them)
```
Which will give a table indicating what is significant at which FDR thresholds. In this case, all structures that have a t-statistic of at least 6.172145 or at most -6.172145 are significant at a 5% FDR threshold, all structures that have a t-statistic of at least 4.542356 or at most -4.542356 are significant at a 10% FDR threshold, etc. There are no structures significantly different at a 1% FDR threshold as indicated by the NaN.
Analyzing differences in neuroanatomy with multiple sets of labels
In addition to the procedure described above (which uses a single atlas and jacobian determinants for each individual brain) we can generate a unique set of labels for each individual brain. This can be done using MAGeT. Once you have a set of labels for each brain, analysis can proceed as described above, but with one major difference: the anatGetAll command must be called with a different set of arguments. Let's look at this function more closely:
```{r, eval = FALSE}
anatGetAll
function (filenames, atlas, method = "jacobians", defs = "/projects/mice/jlerch/cortex-label/c57_brain_atlas_labels.csv",
dropLabels = FALSE, side = "both")
```
The default values for method, defs, dropLabels and side are set for users at MICe who do 56 micron ex-vivo registrations. In these cases, the anatGetAll call is as shown above:
```{r, eval = FALSE}
volumes <- anatGetAll(filenames$scaled_jacobians, "resampled_atlas.mnc")
```
For labels generated using MAGeT, you would need the following:
- filenames: Instead of the scaled_jacobian files, you would instead include the labels for each brain generated by MAGeT.
- atlas: This argument is NOT USED when each file has its own set of labels. Unfortunately, you still need to specify something or the function will fail. An example of what to specify is the labels for the first file – filenames$labels[1]
- method: method = "labels" must be specified
- defs: This is critical if the label definitions for the files you are looking at differ from the standard set of 62, described in Dorr, et. al. This will also need to be specified for users who wish use the set described in Dorr et.al, but are not using the machines at MICe.
- dropLabels and side can continue to use the defaults.
So, putting it all together, here is what your anatGetAll call and following analysis would look like:
```{r, eval = FALSE}
#anatGetAll call, with slightly different arguments
volumes <- anatGetAll(filenames$labels, filenames$labels[1], method="labels", defs="brain_label_mappings.csv")
#combining structures, anatLm, anatFDR proceed as above:
volumes_combined <- anatCombineStructures(volumes, defs="brain_label_mappings.csv")
anatLm(~ genotype, filenames, volumes_combined)
anatFDR(anatLm(~ genotype, filenames, volumes_combined))
```
### Doing analyses in parallel
These instructions are largely superseded by the [parallel tutorial](https://rawgit.com/Mouse-Imaging-Centre/RMINC/master/inst/documentation/RMINC_Parallelism.html).
pMincApply, just as mincApply, can be used to run any function on all voxels in your input files. The difference with mincApply, is that pMincApply can be parallelized (hence the p). You can use snowfall to run it locally using multiple cores on your machine, or sge to submit jobs to a batch system. Here is a short example of how to use pMincApply to run a function on your input files that is not implemented by any of the standard MINC functions.
```{r, eval = FALSE}
# load the RMINC library
library(RMINC)
# load the mapping of your input files and in this case genotype
# the aim of this example is to test the variance of the Jacobian determinants between wild types and mutants
gf <- read.csv("filenames_and_genotype.csv")
# Figure out how your function works on your data. For example test
# it on a single voxel:
voxel <- mincGetVoxel(gf$jacobians, 0,0,0)
# Run the function you want to use
# the "car" library provides "leveneTest"
library(car)
leveneTest(voxel, gf$Strain, center=mean)
```
---
```
Levene's Test for Homogeneity of Variance (center = mean)
Df F value Pr(>F)
group 2 1.5514 0.2188
74
```
---
```{r, eval = FALSE}
# in this example we will extract the F value, and the
# p value which can be done as so:
leveneTest(voxel, gf$Strain, center=mean)[1,2]
leveneTest(voxel, gf$Strain, center=mean)[1,3]
# also it's smart to encapsulate the return value by
# as.numeric to make sure it's returned as a number
# write a function that will be passed on to pMincApply. This
# function will directly refer to variables in the gf object
# this function will return the F-value and p value from the Levene's Test
# which comes from row 1, column 2 (as a numeric value) and column 3
leveneTestForRMINC <- function(x, gf) {
fout <- as.numeric(leveneTest(x, gf$Strain, center=mean)[1,2])
pout <- as.numeric(leveneTest(x, gf$Strain, center=mean)[1,3])
return(c(fout, pout))
}
```
### Local Parallel
Run parallel on your local machine.
it can be useful to set the environment variable
MINC_CACHE_MB=1 or some other small number to save RAM
```{r, eval = FALSE}
outLeveneTest <-
pMincApply(gf$jacobians
, leveneTestForRMINC
, gf = gf
, mask="mask.mnc"
, local = TRUE
, batches = 4
, global=c("gf","leveneTestForRMINC")
, packages="car")
# write your stats out to file
mincWriteVolume(outLeveneTest, "f_values.mnc", 1)
mincWriteVolume(outLeveneTest, "p_values.mnc", 2)
```
### Cluster Parallel
Instead of running the pMincApply in parallel on your machine
you can also use a cluster. Clusters are configured and
jobs are managed by the excellent `batchtools` package.
See the parallel tutorial above for more details.
```{r, eval = FALSE}
outLeveneTest <-
pMincApply(gf$jacobians
, leveneTestForRMINC
, gf = gf
, mask="mask.mnc"
, local = FALSE
, batches = 4
, global=c("gf","leveneTestForRMINC")
, packages="car")
```