Creating degenerate annotations from volume based parcellations
One limitation of CVU is that it inherently assumes a surface-based parcellation. In fMRI data, however, volume-based atlases are often used. Doing analysis in the volume is generally undesirable, and surface-based analyses are superior where available; still, CVU can be adapted to this type of data with some work.
The UCLA multimodal connectivity database (UMCD) is a very cool thing. It is a database where the processed adjacency matrices for a variety of multi-modal connectomics studies are available for anyone to download. The database can produce embedded network visualizations using mayavi, although cvu's visualizations have more features.
Like cvu, UMCD represents its connectomes using an NxN connectivity matrix. However, instead of providing surface-based parcellations, UMCD's datasets were created using volume-based parcellations. The parcellation is stored in the database as a text file describing the ROI names in order (basically an ordering file as in cvu, except not necessarily compliant with cvu's rules about handling hemispheres), and another text file containing an Nx3 matrix describing the ROI centers in XYZ coordinates.
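For concreteness, these three files can be loaded with a few lines of NumPy. This is a sketch; `load_umcd_study` is a hypothetical helper name (not part of UMCD or cvu), and the assertions just verify that the files agree on the number of regions N.

```python
import numpy as np

def load_umcd_study(matrix_file, names_file, centers_file):
    """Load a UMCD study: NxN adjacency matrix, ROI names, Nx3 centers.

    (load_umcd_study is a hypothetical helper, not part of UMCD or cvu.)
    """
    adj = np.loadtxt(matrix_file)
    centers = np.loadtxt(centers_file)
    with open(names_file) as fd:
        names = [ln.strip() for ln in fd if ln.strip()]
    # all three files must agree on the number of regions N
    assert adj.shape == (len(names), len(names))
    assert centers.shape == (len(names), 3)
    return adj, names, centers
```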
CVU was originally designed to show surface parcellations, which makes the visualization of connectivity inside a glass brain easier. While it might be possible to allow parcellations from volume files directly in such a visualization, in order to correctly portray node locations the volumes must be registered to the same anatomical coordinate space as the surface. There is no way for CVU to ensure that this is true without requiring the user to provide a registration matrix between the anatomical space and the surface space.
The recommended workflow is instead to create a "degenerate" annotation: a surface parcellation that places each volume-based ROI at its best-fitting surface coordinates. The resulting annotation is mostly blank, assigning only individual vertices to regions at the best-fit locations.
In more detail, the steps required are as follows:

1. Extract the ROI centroids from the volumetric parcellation. (This step is not described further in this tutorial.)
2. For each node, create a label file with exactly one vertex: the surface vertex at minimum distance from the ROI centroid.
3. Create an annotation based on these labels.
4. Create an ordering file based on this parcellation. This is the trickiest step, as cvu has a number of rules about parsing ordering files.
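The nearest-vertex search in step 2 can be sketched with NumPy. `closest_vertex` is an illustrative helper name; it vectorizes the distance computation that the full script below performs in a loop, and assumes the centroid is in the same coordinate space as the surface.

```python
import numpy as np

def closest_vertex(surf_coords, centroid):
    """Index of the surface vertex nearest to a volumetric ROI centroid.

    surf_coords: (V, 3) vertex positions, e.g. mne.read_surface(...)[0]
    centroid:    (3,) ROI center, assumed to be in the same coordinate
                 space as the surface
    """
    dists = np.linalg.norm(surf_coords - np.asarray(centroid), axis=1)
    return int(np.argmin(dists))
```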
A short Python script that performs steps 2-4 is provided below. (The script depends on MNE-Python.)
This tutorial will (loosely) walk through the steps for a sample study: the NKI Rockland study from the UMCD, which you can download here.
Extract this study somewhere and then create a local subjects directory. The directory should look like this:

somewhere/
↳ NKI_fc_avg_connectivity_matrix_file.txt
↳ NKI_fc_avg_region_names_abbrev_file.txt
↳ NKI_fc_avg_region_names_full_file.txt
↳ NKI_fc_avg_region_xyz_centers_file.txt
↳ <lots of other study-specific files that are not currently of interest> ...
↳ fsavg5/ (you should mkdir this subdirectory)
  ↳ label/
  ↳ surf/
    ↳ lh.pial
    ↳ rh.pial
Symlink the surfaces lh.pial and rh.pial to the surfaces provided with cvu at path/to/cvu/cvu/fsavg5/surf/*. You will put the annotation files (called *h.nki.annot) in the label directory.
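The directory setup can be scripted as follows; this assumes you run it from the study directory ("somewhere/" above) and that path/to/cvu points at your own cvu checkout.

```shell
# run from the study directory ("somewhere/" above);
# replace path/to/cvu with the location of your cvu checkout
mkdir -p fsavg5/label fsavg5/surf
ln -s path/to/cvu/cvu/fsavg5/surf/lh.pial fsavg5/surf/lh.pial
ln -s path/to/cvu/cvu/fsavg5/surf/rh.pial fsavg5/surf/rh.pial
```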
Here is the script which does the bulk of the work.
#!/usr/bin/env python
import numpy as np
import mne

names_file = 'NKI_fc_avg_region_names_full_file.txt'
centers_file = 'NKI_fc_avg_region_xyz_centers_file.txt'
surface = 'pial'
new_names_file = 'nki_region_names_fix.txt'
annot_name = 'nki'

# read the ROI names and the Nx3 centroid coordinates
names = []
with open(names_file, 'r') as fd:
    for ln in fd:
        names.append(ln.strip())
centers = np.loadtxt(centers_file)

# read_surface returns (vertex coordinates, triangles); only the former is needed
lhsurf = mne.read_surface('lh.%s' % surface)
rhsurf = mne.read_surface('rh.%s' % surface)

labs_lh = []
labs_rh = []
region_nums = {}
with open(new_names_file, 'w') as fd:
    for i, (c, n) in enumerate(zip(centers, names)):
        # find the correct number for this label, so that repeated
        # names (e.g. two thalamus ROIs) get unique suffixes
        if n in region_nums:
            nr = region_nums[n] + 1
        else:
            nr = 1
        region_nums[n] = nr
        if n[0] == 'L':
            hemi = 'lh'
            surf = lhsurf[0]
        elif n[0] == 'R':
            hemi = 'rh'
            surf = rhsurf[0]
        else:
            # no hemisphere prefix; mark this entry for deletion
            fd.write('delete\n')
            continue
        # create a unique entry in the new names file
        name = ('%s_%i' % (n, nr)).strip().replace(' ', '_').lower()
        fd.write('%s_%s\n' % (hemi, name))
        # find the closest vertex, breaking exact ties uniformly at random
        closest_vertex = -1
        dist = np.inf
        nr_collisions = 2
        for j, v in enumerate(surf):
            vert_dist = np.linalg.norm(v - c)
            if vert_dist < dist:
                dist = vert_dist
                closest_vertex = j
                nr_collisions = 2
            elif vert_dist == dist:
                if np.random.random() < 1. / nr_collisions:
                    closest_vertex = j
                nr_collisions += 1
        # create a label with exactly 1 vertex at the closest vertex
        lab = mne.Label(vertices=(closest_vertex,), pos=c.reshape(1, 3),
                        hemi=hemi, name=name)
        if hemi == 'lh':
            labs_lh.append(lab)
        else:
            labs_rh.append(lab)

# write the annotations (note: in newer MNE-Python versions,
# parc_from_labels was renamed to mne.write_labels_to_annot)
mne.parc_from_labels(labs_lh, None,
                     annot_fname='fsavg5/label/lh.%s.annot' % annot_name,
                     overwrite=True)
mne.parc_from_labels(labs_rh, None,
                     annot_fname='fsavg5/label/rh.%s.annot' % annot_name,
                     overwrite=True)
Note that there is a little bit of "magic" in this script to handle ordering files. Specifically, it assumes that every ROI in the noncompliant ordering file starts with "L","l","R", or "r". This is true of the sample nki data but obviously isn't true of every dataset in general.
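The name-mangling rule the script applies can be isolated as a small function. `cvu_ordering_entry` is an illustrative name of my own; unlike the script above, it also accepts lowercase "l"/"r" prefixes, matching the assumption as stated.

```python
def cvu_ordering_entry(raw_name, occurrence):
    """Turn one UMCD region name into a cvu-style ordering entry.

    raw_name:   name from the UMCD names file, assumed to start with
                'L'/'l' or 'R'/'r' (hemisphere prefix)
    occurrence: 1-based count of appearances of this exact name so far
    Returns 'delete' for names with no hemisphere prefix.
    (cvu_ordering_entry is an illustrative helper, not part of cvu.)
    """
    first = raw_name[:1].upper()
    if first == 'L':
        hemi = 'lh'
    elif first == 'R':
        hemi = 'rh'
    else:
        return 'delete'
    # uniquify, replace spaces, lowercase -- the same mangling as the script
    name = ('%s_%i' % (raw_name, occurrence)).strip().replace(' ', '_').lower()
    return '%s_%s' % (hemi, name)
```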
This annotation includes some subcortical ROIs, which are displayed on the nearest point on the surface instead of in the volume. Most of the subcortical ROIs could alternately be omitted from the surface parcellation, in which case they would show up as normal subcortical ROIs. There are two problems with that approach:

- These ROIs would have to be manually hardcoded in the above script so that they are not added to the surface parcellation.
- Any subcortical ROIs with multiple entries (e.g., in the NKI Rockland study the thalamus is divided into thalamus_1 and thalamus_2) would not be supported this way, because they differ from the hardcoded subcortical ROIs that cvu allows. More precisely, cvu locates a subcortical region called thalamus by examining the segmentation, but it does not and cannot know where the division between thalamus_1 and thalamus_2 lies.
Here is a visualization of a degenerate annotation, using the fMRI data from the NKI rockland study averaged across all participants.
Notice a few things:
- Subcortical ROIs are shown on the nearest point on the surface.
- Showing scalar values by coloring patches of the surface will not work, because the annotation only has a single vertex assigned to each label.
- The ordering chosen for this visualization -- which is the order of items in the matrix -- is really terrible. I'm not sure why this odd ordering was chosen; it is probably the result of a random hash function that happens not to affect UMCD's visualizations.
Shown below are two improved orderings:
In the top plot, the items in the matrix are alphabetized. Perhaps counterintuitively, the alphabetic strategy yields a much, much better ordering than a random one. The reason is that ROIs have a much higher than random chance of having high connectivity with alphabetically nearby ROIs, especially when there are many ROI names of the type superior_frontal_6, superior_frontal_7, etc.
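The alphabetized ordering is easy to reproduce. This sketch (`alphabetize` is my own name for it) sorts the names and permutes the adjacency matrix rows and columns to match:

```python
import numpy as np

def alphabetize(names, adj):
    """Sort ROI names alphabetically and permute the NxN adjacency
    matrix with the same order so rows/columns still match the names.
    (alphabetize is an illustrative helper, not part of cvu or UMCD.)"""
    order = np.argsort(names)
    return [names[i] for i in order], adj[np.ix_(order, order)]
```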
Below that is an even better ordering, made by manually (and subjectively) adjusting the order so that the ROIs roughly follow a semicircular pattern beginning at the frontal pole, wrapping around the parietal and occipital cortices, and ending at the temporal pole. This ordering is clearly better than the alphabetic one (the matrix shows greater clustering, and there is less background noise in the circle because more short-range connections are clustered together). However, creating an alphabetical ordering is easy and creating an anatomically principled ordering is not: it took me 30-45 minutes of manual examination and hand-editing of the ordering file. So, for many purposes, the alphabetical ordering is probably good enough.