Voxel importance in classifier ensembles based on sign consistency patterns

Stand-By Time

Thursday, June 29, 2017: 12:45 PM  - 2:45 PM 

Submission No:

3893 

Submission Type:

Abstract Submission 

On Display:

Wednesday, June 28 & Thursday, June 29 

Authors:

Jussi Tohka1, Vanessa Gomez-Verdejo2, Emilio Parrado-Hernandez2

Institutions:

1University of Eastern Finland, Kuopio, Finland, 2Universidad Carlos III de Madrid, Leganes, Spain

First Author:

Jussi Tohka
University of Eastern Finland
Kuopio, Finland

Introduction:

An important problem in using voxel-based supervised classification algorithms for brain imaging applications is that the dimensionality of the data (the number of voxels in the images of a single subject) far exceeds the number of training subjects available. This has led to a number of papers studying feature or voxel selection within brain imaging. However, in addition to selecting a set of important voxels, it is also of interest to rank the voxels and study their relative importance to the classification. This problem, termed voxel importance determination, has received significantly less attention in brain imaging.

Methods:

We introduce and study a new variable importance measure based on the sign consistency of the weights in an ensemble of linear support vector machines (SVMs). The advantages of this method include: 1) it provides an importance score for every voxel, 2) correlated voxels receive similar importance scores, and 3) the method scales well to cases with tens of thousands of voxels. In this method, which we call sign consistency bagging (SCB; Gomez-Verdejo 2016, Parrado-Hernandez 2014), we train an ensemble of 20000 SVMs, each on only a part of the available subjects. Assuming the voxel values are positive, the main idea is to use sign consistency, i.e., the number of times a voxel weight is positive (or negative) across the ensemble, to define the importance of a voxel. We then further refine the voxel importances using ideas from conformal analysis. In this refinement, we supplement the training set with test data with randomly generated labels, iterate with different random labels, and obtain refined importances (SCBconf) by averaging the SCB importance scores. This transductive refinement filters out voxels that obtain spuriously high importance after analyzing the ensemble.
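
The procedure described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: the Pegasos-style solver `train_linear_svm` stands in for any linear SVM trainer, the score |2 * (fraction of positive weights) - 1| is one plausible way to turn sign counts into an importance in [0, 1], the ensemble has 25 members instead of 20000, and the toy data generator and all function names are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=20, rng=None):
    """Pegasos-style sub-gradient descent (a stand-in for any linear SVM solver).
    The bias is learned as a constant extra feature and dropped on return."""
    rng = rng or random.Random(0)
    Xb = [row + [1.0] for row in X]
    w = [0.0] * len(Xb[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(Xb)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, Xb[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularization shrinkage
            if margin < 1.0:                           # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xb[i])]
    return w[:-1]

def sign_consistency_bagging(X, y, n_svms=25, subsample=0.5, seed=0):
    """Train each SVM on a random part of the subjects and score every voxel
    by how consistently its weight keeps the same sign (assumed formula)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    n_pos = [0] * d
    for _ in range(n_svms):
        idx = rng.sample(range(n), int(subsample * n))
        w = train_linear_svm([X[i] for i in idx], [y[i] for i in idx], rng=rng)
        for j, wj in enumerate(w):
            if wj > 0:
                n_pos[j] += 1
    return [abs(2.0 * c / n_svms - 1.0) for c in n_pos]

def scb_conformal(X_train, y_train, X_test, n_rounds=3, seed=0):
    """Add test subjects with random labels, repeat, and average the SCB scores."""
    rng = random.Random(seed)
    total = [0.0] * len(X_train[0])
    for r in range(n_rounds):
        random_labels = [rng.choice([-1, 1]) for _ in X_test]
        scores = sign_consistency_bagging(
            X_train + X_test, y_train + random_labels, seed=seed + r + 1)
        total = [a + b for a, b in zip(total, scores)]
    return [a / n_rounds for a in total]

def make_toy_subjects(n, rng):
    """Hypothetical data: voxels 0 and 1 differ between groups, voxels 2-5 are noise."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        hi, lo = (0.8, 0.2) if label == 1 else (0.2, 0.8)
        row = [hi + rng.uniform(-0.05, 0.05), lo + rng.uniform(-0.05, 0.05)]
        row += [rng.uniform(0.3, 0.7) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y

data_rng = random.Random(1)
X_train, y_train = make_toy_subjects(40, data_rng)
X_test, _ = make_toy_subjects(20, data_rng)
scb = sign_consistency_bagging(X_train, y_train)
scb_conf = scb_conformal(X_train, y_train, X_test)
print("SCB:    ", [round(v, 2) for v in scb])
print("SCBconf:", [round(v, 2) for v in scb_conf])
```

In this toy setting the two group-dependent voxels should keep the same weight sign in nearly every ensemble member, while the noise voxels can still score high by chance because the subsamples overlap. That chance effect is what the random-label refinement is meant to dampen.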

We demonstrate the method by considering normal control (NC) versus mild cognitive impairment (MCI) classification using the ADNI structural MRI data (www.adni-info.org). The T1-weighted MRIs (404 MCIs and 231 NCs) were processed into representations of the gray matter density with VBM8 software (http://www.neuro.uni-jena.de/vbm/download/) using a pipeline described in (Moradi 2015), with 4 x 4 x 4 mm voxel size and 29562 voxels.

Results:

Figure 1 shows the voxel importances, scaled to lie in the unit interval, using the proposed methods (SCB and SCBconf), the standard t-test, and elastic-net penalized logistic regression (ENET, Friedman 2010) with different parameter values. The SCB methods produced dense voxel importances, with a number of voxels receiving a normalized importance score greater than 0.2 (a heuristically selected threshold). The t-test also gave an importance score to every voxel, and for a number of voxels this score exceeded the FDR-corrected threshold at q = 0.05. In contrast, as expected, both ENET models produced only a sparse pattern of voxels with non-zero weights. This is a clear disadvantage if a voxel ranking is desired. The pattern of the highest importances was more similar between the SCBs and ENETs than between the SCBs and the t-test.
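
To make the dense-versus-sparse contrast concrete, here is a small Python sketch. The abstract does not say how the scores were scaled, so min-max scaling is an assumption, and the two score vectors are invented purely for illustration; only the 0.2 threshold comes from the text.

```python
def scale_to_unit_interval(scores):
    """Min-max scale raw importance scores to [0, 1] (assumed scaling)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def voxels_above(scores, threshold=0.2):
    """Indices of voxels whose scaled importance exceeds the threshold."""
    return [j for j, s in enumerate(scores) if s > threshold]

# Hypothetical raw scores for five voxels.
dense = scale_to_unit_interval([3.0, 0.5, 1.2, 0.1, 2.4])   # every voxel scored
sparse = scale_to_unit_interval([0.0, 0.0, 0.9, 0.0, 0.0])  # mostly zero weights

print(voxels_above(dense))
print(voxels_above(sparse))
# Number of distinct score levels: a sparse map leaves most voxels tied.
print(len(set(dense)), len(set(sparse)))
```

With a sparse map, all zero-weight voxels share one score, so they cannot be ordered relative to each other. This is the ranking disadvantage noted above.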

To demonstrate the utility of the transductive refinement step, we applied a split-half resampling approach akin to (Tohka 2016) and studied the reproducibility of voxel importances when the training set was completely changed. The average absolute importance difference was 0.32 for SCB and 0.15 for SCBconf, which demonstrates the added reproducibility of the SCBconf importances.
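
The reproducibility measure can be sketched as follows, assuming a plain random split into two disjoint halves and any function that maps (subjects, labels) to per-voxel importances. The helper names, the number of repeats, and the stand-in `toy_importance` (absolute difference of group means) are hypothetical and are not the procedure of (Tohka 2016).

```python
import random

def mean_abs_difference(a, b):
    """Average absolute difference between two importance maps."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def split_half_reproducibility(X, y, importance_fn, n_repeats=10, seed=0):
    """Compute importances on two disjoint halves of the subjects and
    average their absolute difference over several random splits."""
    rng = random.Random(seed)
    n = len(X)
    diffs = []
    for _ in range(n_repeats):
        order = list(range(n))
        rng.shuffle(order)
        half_a, half_b = order[: n // 2], order[n // 2:]
        imp_a = importance_fn([X[i] for i in half_a], [y[i] for i in half_a])
        imp_b = importance_fn([X[i] for i in half_b], [y[i] for i in half_b])
        diffs.append(mean_abs_difference(imp_a, imp_b))
    return sum(diffs) / len(diffs)

def toy_importance(X, y):
    """Stand-in importance: absolute difference of the two group means per voxel."""
    d = len(X[0])
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    mean = lambda rows, j: sum(r[j] for r in rows) / max(len(rows), 1)
    return [abs(mean(pos, j) - mean(neg, j)) for j in range(d)]

# Hypothetical noise-free data: voxel 0 separates the groups, voxel 1 does not.
X = [[0.8, 0.5] if i % 2 == 0 else [0.2, 0.5] for i in range(20)]
y = [1 if i % 2 == 0 else -1 for i in range(20)]
score = split_half_reproducibility(X, y, toy_importance)
print(score)
```

Lower values mean that the importance map changes less when the training subjects are replaced, which is the sense in which 0.15 is better than 0.32.
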

Figure 1. Supporting image: imp_ohbm2017_smallfont.PNG

Conclusions:

We have introduced and evaluated a new voxel importance measure based on sign consistency in classifier ensembles. The experiments demonstrated that the approach was able to produce robust voxel importance estimates that were different from those provided by massively univariate hypothesis testing and elastic-net penalized logistic regression. While the ideas of random subsampling and random relabeling are widely used for variable importance and selection, the idea of sign consistency is novel and much less exploited.

Imaging Methods:

Anatomical MRI 2

Modeling and Analysis Methods:

Classification and Predictive Modeling 1

Poster Session:

Poster Session - Thursday

Keywords:

Data analysis
Machine Learning

1|2 Indicates the priority used for review

Please indicate below if your study was a "resting state" or "task-activation" study.

Other

Healthy subjects only or patients (note that patient studies may also involve healthy subjects):

Patients

Internal Review Board (IRB) or Animal Use and Care Committee (AUCC) Approval. Please indicate approval below. Please note: Failure to have IRB or AUCC approval, if applicable will lead to automatic rejection of abstract.

Not applicable

Please indicate which methods were used in your research:

Structural MRI

For human MRI, what field strength scanner do you use?

1.5T

Which processing packages did you use for your study?

SPM
Other, please list: VBM8

Provide references in author date format

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22.

Gomez-Verdejo, V., Parrado-Hernandez, E., & Tohka, J. (2016). Voxel importance in classifier ensembles based on sign consistency patterns: Application to sMRI. IEEE International Workshop on Pattern Recognition in Neuroimaging.

Moradi, E., Pepe, A., Gaser, C., Huttunen, H., & Tohka, J. (2015). Machine learning framework for early MRI-based Alzheimer's conversion prediction in MCI subjects. NeuroImage, 104, 398-412.

Parrado-Hernandez, E., Gomez-Verdejo, V., Martinez-Ramon, M., Shawe-Taylor, J., Alonso, P., Pujol, J., Menchon, J. M., Cardoner, N., & Soriano-Mas, C. (2014). Discovering brain regions relevant to obsessive–compulsive disorder identification through bagging and transduction. Medical Image Analysis, 18(3), 435-448.

Tohka, J., Moradi, E., & Huttunen, H. (2016). Comparison of feature selection techniques in machine learning for anatomical brain MRI in dementia. Neuroinformatics, 14(3), 279-296.