Unsupervised Discovery of Emphysema Subtypes in a Large Clinical Cohort

P. Binder, N. Batmanghelich, R. J. Estepar, P. Golland. Unsupervised Discovery of Emphysema Subtypes in a Large Clinical Cohort. 7th International Workshop on Machine Learning in Medical Imaging (MLMI), held in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), LNCS, pp. 180-187, 2016.

Emphysema is one of the hallmarks of Chronic Obstructive Pulmonary Disease (COPD), a devastating lung disease often caused by smoking. Emphysema appears on Computed Tomography (CT) scans as a variety of textures that correlate with disease subtypes. The disease subtypes and textures have been linked to physiological indicators and prognosis, although neither is well characterized clinically. Most previous computational approaches to modeling emphysema imaging data have focused on supervised classification of lung textures in patches of CT scans. In this work, we describe a generative model that jointly captures the heterogeneity of disease subtypes and of the patient population. We also describe a corresponding inference algorithm that simultaneously discovers disease subtypes and population structure in an unsupervised manner. This approach enables us to create image-based descriptors of emphysema beyond those that can be identified through manual labeling of currently defined phenotypes. By applying the resulting algorithm to a large data set, we identify groups of patients and disease subtypes that correlate with distinct physiological indicators.