
Creating degenerate annotations from volume based parcellations

aestrivex edited this page Oct 2, 2014 · 8 revisions

One limitation of CVU is that it inherently assumes a surface-based parcellation. fMRI data, however, are often analyzed with volume-based atlases. Surface-based analyses are generally superior to analyses done in the volume, but CVU can be adapted to volume-based data with some work.

The UCLA multimodal connectivity database (UMCD) is a very cool thing. It is a database where the processed adjacency matrices for a variety of multi-modal connectomics studies are available for anyone to download. The database can produce embedded network visualizations using mayavi, although cvu's visualizations have more features.

Like cvu, UMCD represents its connectomes using an NxN connectivity matrix. However, instead of providing surface-based parcellations, UMCD's datasets were created using volume-based parcellations. The parcellation is stored in the database as a text file listing the ROI names in order (basically an ordering file as in cvu, except not necessarily compliant with cvu's rules about handling hemispheres), and another text file containing an Nx3 matrix of the ROI centers in XYZ coordinates.
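Before going further it can help to confirm that the three files agree with each other. The following is a minimal sketch, assuming the matrix and centers files are whitespace-delimited numeric text that `np.loadtxt` can read and that the names file has one ROI name per line; `load_umcd_study` is a hypothetical helper, not part of cvu or UMCD.

```python
import numpy as np

def load_umcd_study(matrix_file, names_file, centers_file):
    """Load the connectivity matrix, ROI names and ROI centers, and check
    that their sizes are consistent (N names, NxN matrix, Nx3 centers)."""
    # ndmin=2 keeps a 2-D shape even if a file has a single row
    matrix = np.loadtxt(matrix_file, ndmin=2)
    centers = np.loadtxt(centers_file, ndmin=2)

    # one ROI name per line; names may contain spaces, so keep whole lines
    with open(names_file, 'r') as fd:
        names = [ln.strip() for ln in fd if ln.strip()]

    n = len(names)
    if matrix.shape != (n, n):
        raise ValueError('expected a %ix%i matrix, got %s' % (n, n, matrix.shape))
    if centers.shape != (n, 3):
        raise ValueError('expected %ix3 centers, got %s' % (n, centers.shape))
    return matrix, names, centers
```

For the sample study used later in this tutorial, the arguments would be the `NKI_fc_avg_connectivity_matrix_file.txt`, `NKI_fc_avg_region_names_full_file.txt` and `NKI_fc_avg_region_xyz_centers_file.txt` files.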

CVU was originally designed to show surface parcellations, which makes the visualization of connectivity inside a glass brain easier. While it might be possible to allow parcellations from volume files directly in such a visualization, in order to correctly portray node locations the volumes must be registered to the same anatomical coordinate space as the surface. There is no way for CVU to ensure that this is true without requiring the user to provide a registration matrix between the anatomical space and the surface space.
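If a registration matrix is available, applying it to the ROI centers is a small computation. The sketch below assumes the registration is a 4x4 homogeneous affine that maps volume coordinates to surface coordinates; `apply_affine` is a hypothetical helper and the matrix itself must come from your own registration step.

```python
import numpy as np

def apply_affine(affine, points):
    """Map an Nx3 array of XYZ points through a 4x4 homogeneous affine."""
    affine = np.asarray(affine, dtype=float)
    points = np.atleast_2d(np.asarray(points, dtype=float))
    # the upper-left 3x3 block rotates/scales, the last column translates
    return np.dot(points, affine[:3, :3].T) + affine[:3, 3]
```

With the identity matrix the points come back unchanged, which is a quick way to check that the row/column convention matches your data.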

The recommended workflow is instead to create a "degenerate" annotation: a parcellation that places each volume-based ROI at its corresponding surface coordinates. The resulting annotation is mostly blank, but assigns a single vertex to each region at the best-fit location.

In more detail, the steps required are as follows:

    1. The ROI centroids must be extracted from the volumetric parcellation. (This step is not described further in this tutorial.)
    2. If the volume-based parcellation is not in the anatomical coordinate space, the centroids must be registered to the anatomical space. (This step is not described further in this tutorial, but see [1] [2].)
    3. For each node, create a label file with exactly one vertex: the vertex on the surface at the minimum distance from the ROI centroid.
    4. Create an annotation based on these labels.
    5. Create an ordering file based on this parcellation. This is the trickiest step, as cvu has a number of rules about parsing ordering files.
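The nearest-vertex search for each node is the core computation. As a minimal sketch, assuming the surface vertices are available as a Vx3 NumPy array in the same coordinate space as the centroid, it can be done in a single vectorized pass; `closest_vertex` is a hypothetical helper name.

```python
import numpy as np

def closest_vertex(vertices, centroid):
    """Return (index, distance) of the surface vertex nearest to centroid."""
    vertices = np.asarray(vertices, dtype=float)
    centroid = np.asarray(centroid, dtype=float)
    # Euclidean distance from every vertex to the centroid
    dists = np.sqrt(((vertices - centroid) ** 2).sum(axis=1))
    # np.argmin returns the first minimum, so exact ties go to the lowest index
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])
```

Unlike a search that breaks exact ties at random, this version is deterministic. Exact ties between floating-point distances should be rare, so the difference is unlikely to matter in practice.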

A short Python script is provided below to do steps 3-5: it creates the labels, the annotation, and the ordering file. (This script depends on MNE-Python.)

This tutorial will (loosely) walk through the steps to do this for a sample study. The study used is the NKI Rockland study from the UMCD, which you can download here.

Extract this study somewhere and then create a local subjects directory. The directory should look like this:

somewhere/
↳ NKI_fc_avg_connectivity_matrix_file.txt
↳ NKI_fc_avg_region_names_abbrev_file.txt
↳ NKI_fc_avg_region_names_full_file.txt
↳ NKI_fc_avg_region_xyz_centers_file.txt
↳ <lots of other study-specific files that are not currently of interest> ...
↳ fsavg5 (you should mkdir this subdirectory)
  ↳ label/
  ↳ surf/
    ↳ lh.pial
    ↳ rh.pial

Symlink the surfaces lh.pial and rh.pial to the surfaces provided with cvu at path/to/cvu/cvu/fsavg5/surf/*. You will put annotation files (called *h.nki.annot) in the label directory.

Here is the script which does the bulk of the work.

#!/usr/bin/env python

import numpy as np
import mne

names_file='NKI_fc_avg_region_names_full_file.txt'
centers_file='NKI_fc_avg_region_xyz_centers_file.txt'
surface='pial'

new_names_file='nki_region_names_fix.txt'
annot_name='nki'


names=[]

with open(names_file,'r') as fd:
	for ln in fd:
		names.append(ln.strip())

centers=np.loadtxt(centers_file)

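#read the surfaces from the local subjects directory created above
#(mne.read_surface returns a tuple of vertex coordinates and faces)
lhsurf=mne.read_surface('fsavg5/surf/lh.%s'%surface)
rhsurf=mne.read_surface('fsavg5/surf/rh.%s'%surface)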

labs_lh=[]
labs_rh=[]
region_nums={}

with open(new_names_file,'w') as fd:
	#find correct number for this label
	for i,(c,n) in enumerate(zip(centers,names)):
		if n in region_nums:
			nr=region_nums[n]+1
		else:
			nr=1
		region_nums[n]=nr

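		#hemisphere is inferred from the first letter of the ROI name (either case)
		if n[0] in 'Ll':
			hemi='lh'
			#name=n[5:]
			surf=lhsurf[0]
		elif n[0] in 'Rr':
			hemi='rh'
			#name=n[6:]
			surf=rhsurf[0]
		else:
			#no hemisphere prefix: write a 'delete' entry and skip this ROI
			fd.write('delete\n')
			continue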

		#create unique entry in new names file
		name=('%s_%i'%(n,nr)).strip().replace(' ','_').lower()
		fd.write('%s_%s\n'%(hemi,name))

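		#find closest vertex, breaking exact ties uniformly at random
		closest_vertex=-1
		dist=np.inf
		nr_ties=1
		for j,v in enumerate(surf):
			vert_dist=np.linalg.norm(v-c)
			if vert_dist<dist:
				#strictly closer: take this vertex and reset the tie count
				dist=vert_dist
				closest_vertex=j
				nr_ties=1
			elif vert_dist==dist:
				#exact tie: the k-th tied vertex replaces the choice with probability 1/k
				nr_ties+=1
				if np.random.random() < 1./nr_ties:
					closest_vertex=j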

		#create label file with 1 vertex at closest vertex
		lab = mne.Label(vertices=(closest_vertex,),pos=c.reshape(1,3),hemi=hemi,
			name=name)
		if hemi=='lh':
			labs_lh.append(lab)
		else:
			labs_rh.append(lab)

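#write one annotation file per hemisphere
#(later MNE-Python releases name this function mne.write_labels_to_annot)
mne.parc_from_labels(labs_lh, None, annot_fname='fsavg5/label/lh.%s.annot'%annot_name,
	overwrite=True)
mne.parc_from_labels(labs_rh, None, annot_fname='fsavg5/label/rh.%s.annot'%annot_name,
	overwrite=True)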

Note that there is a little bit of "magic" in this script to handle ordering files. Specifically, it assumes that the name of every ROI in the noncompliant ordering file starts with "L", "l", "R", or "r". This is true of the sample NKI data but is not true of every dataset.
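For datasets whose ROI names carry no hemisphere prefix, one possible alternative is to infer the hemisphere from the ROI center instead. The sketch below assumes the centers are in a RAS-oriented space (such as MNI coordinates) where negative x is left and positive x is right; that assumption should be checked for each dataset. `hemi_from_center` is a hypothetical helper, not something cvu provides.

```python
def hemi_from_center(center):
    """Guess 'lh' or 'rh' from the sign of the x coordinate.

    Assumes RAS orientation (x < 0 is left). Returns None for a center
    exactly on the midline, which cannot be assigned this way."""
    x = float(center[0])
    if x < 0:
        return 'lh'
    if x > 0:
        return 'rh'
    return None
```

Note that this behaves differently from the name-based rule: a midline structure that the script would mark as `delete` gets assigned to whichever side its center happens to fall on.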

This annotation includes some subcortical ROIs, which are displayed at the nearest point on the surface instead of in the volume. Most of the subcortical ROIs could alternatively be omitted from the parcellation, in which case they would show up as normal subcortical ROIs. There are two problems with doing so:

  1. These ROIs would have to be manually hardcoded in the above script to not be added to the surface parcellation.
  2. Any subcortical ROIs with multiple entries (e.g., in the NKI Rockland study the thalamus is divided into thalamus_1 and thalamus_2) would not be supported this way, because they differ from the hardcoded subcortical ROIs that cvu allows. (More precisely, cvu decides where to put a subcortical region called thalamus by examining the segmentation, and it does not and cannot know from the segmentation where the division between thalamus_1 and thalamus_2 lies.)

Here is a visualization of a degenerate annotation, using the fMRI data from the NKI rockland study averaged across all participants.

Notice a few things:

  • Subcortical ROIs are shown on the nearest point on the surface.
  • Showing scalar values by coloring patches of the surface will not work, because the annotation only has a single vertex assigned to each label.
  • The ordering chosen for this visualization -- which is the order of items in the matrix -- is really terrible. I'm not sure why this odd ordering was chosen -- it is probably the result of a random hash function that happens to not affect UMCD's visualizations.

Shown below are two improved orderings:

In the top plot, the items in the matrix are alphabetized. Perhaps counterintuitively, the alphabetic strategy yields a much, much better ordering than a random ordering. The reason for this is that ROIs have a much higher than random chance of having high connectivity with alphabetically nearby ROIs, especially if there are many ROI names of the type superior_frontal_6, superior_frontal_7, etc.
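Producing an alphabetical ordering takes only a few lines. This is a minimal sketch, assuming the ROI names are in a list and the connectivity matrix is an NxN NumPy array in the same order; `alphabetize` is a hypothetical helper, and whether the matrix itself needs to be permuted depends on how your tool consumes the ordering.

```python
import numpy as np

def alphabetize(matrix, names):
    """Return (matrix, names) reordered so the ROI names are alphabetical."""
    matrix = np.asarray(matrix)
    # indices that sort the names, ignoring case
    order = sorted(range(len(names)), key=lambda i: names[i].lower())
    sorted_names = [names[i] for i in order]
    # apply the same permutation to rows and columns
    sorted_matrix = matrix[np.ix_(order, order)]
    return sorted_matrix, sorted_names
```

A plain string sort places `superior_frontal_10` before `superior_frontal_2`. Neighbouring subdivisions of a region still end up adjacent to each other, which is what matters for the clustering effect described above.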

Below that is an even better ordering, which was made by manually (and subjectively) adjusting the order so that the ROIs roughly follow a semicircular pattern beginning at the frontal pole, wrapping around the parietal and occipital cortices, and ending at the temporal pole. This ordering is clearly better than the alphabetical one (the matrix shows greater clustering, and there is less background noise in the circle because more short-range connections are clustered together). However, creating an alphabetical ordering is easy and creating an anatomically principled ordering is not. It took me between 30 and 45 minutes to create this ordering by examining the ROIs and manually editing the ordering file. So, for many purposes, the alphabetical ordering is probably good enough.