Multicenter validation of a deep learning detection algorithm for focal cortical dysplasia

Ravnoor Singh Gill, Hyo-Min Lee, Benoit Caldairou, Seok-Jun Hong, Carmen Barba, Francesco Deleo, Ludovico D'Incerti, Vanessa Cristina Mendes Coelho, Matteo Lenge, Mira Semmelroch, Dewi Victoria Schrader, Fabrice Bartolomei, Maxime Guye, Andreas Schulze-Bonhage, Horst Urbach, Kyoo Ho Cho, Fernando Cendes, Renzo Guerrini, Graeme Jackson, R. Edward Hogan, Neda Bernasconi, Andrea Bernasconi. Neurology. 2021 Oct 19;97(16):e1571-82


BACKGROUND & OBJECTIVES: To test the hypothesis that a multicenter-validated deep learning algorithm detects MRI-negative focal cortical dysplasia (FCD).

METHODS: We used clinically acquired 3-dimensional (3D) T1-weighted and 3D fluid-attenuated inversion recovery MRI of 148 patients (median age 23 years [range 2–55 years]; 47% female) with histologically verified FCD at 9 centers to train a deep convolutional neural network (CNN) classifier. Images were initially deemed MRI-negative in 51% of patients, in whom intracranial EEG determined the focus. For risk stratification, the CNN incorporated Bayesian uncertainty estimation as a measure of confidence. To evaluate performance, detection maps were compared to expert manual FCD labels. Sensitivity was tested in an independent cohort of 23 cases with FCD (13 ± 10 years). Specificity was tested by applying the algorithm to 42 healthy controls and 89 disease controls with temporal lobe epilepsy.
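
As an illustration of how uncertainty estimates can serve as a confidence measure for risk stratification, the following minimal sketch ranks candidate clusters using repeated stochastic predictions. It assumes a Monte Carlo sampling scheme (for example, several forward passes with dropout active), which the abstract does not specify; the function name, cluster identifiers, and probability values are hypothetical and are not taken from the study.

```python
from statistics import mean, pvariance


def rank_clusters(mc_probs):
    """Rank candidate clusters by confidence.

    mc_probs maps a (hypothetical) cluster id to a list of lesion
    probabilities obtained from repeated stochastic forward passes.
    Returns (cluster_id, mean_probability, variance) tuples, ordered by
    highest mean probability first, with ties broken by lower variance.
    """
    summary = [
        (cluster_id, mean(samples), pvariance(samples))
        for cluster_id, samples in mc_probs.items()
    ]
    summary.sort(key=lambda row: (-row[1], row[2]))
    return summary


# Toy values for illustration only (not study data).
toy_probs = {
    "c1": [0.9, 0.8, 1.0],
    "c2": [0.2, 0.9, 0.4],
    "c3": [0.6, 0.6, 0.6],
}
ranked = rank_clusters(toy_probs)
for cluster_id, mean_prob, variance in ranked:
    print(cluster_id, round(mean_prob, 3), round(variance, 4))
```

In this sketch the sample mean stands in for the predicted lesion probability and the sample variance for its uncertainty; other summaries (such as predictive entropy) would be equally plausible choices.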

RESULTS: Overall sensitivity was 93% (137 of 148 FCD detected) using a leave-one-site-out cross-validation, with an average of 6 false positives per patient. Sensitivity in MRI-negative FCD was 85%. In 73% of patients, the FCD was among the clusters with the highest confidence; in half, it ranked the highest. Sensitivity in the independent cohort was 83% (19 of 23; average of 5 false positives per patient). Specificity was 89% in healthy and disease controls.
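
The leave-one-site-out scheme and the sensitivity figures above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the helper names, patient identifiers, and site labels are hypothetical, and only the counts 137 of 148 and 19 of 23 come from the reported results.

```python
from collections import defaultdict


def leave_one_site_out(site_by_patient):
    """Yield (held_out_site, train_ids, test_ids), one fold per site.

    Every patient from the held-out site goes to the test set; patients
    from all remaining sites form the training set.
    """
    by_site = defaultdict(list)
    for patient_id, site in site_by_patient.items():
        by_site[site].append(patient_id)
    for site in sorted(by_site):
        test_ids = sorted(by_site[site])
        train_ids = sorted(
            pid for other, ids in by_site.items() if other != site for pid in ids
        )
        yield site, train_ids, test_ids


def sensitivity(detected, total):
    """Fraction of lesions detected among all verified lesions."""
    return detected / total


# Hypothetical toy assignment of patients to sites.
toy_sites = {"p1": "A", "p2": "A", "p3": "B", "p4": "C"}
folds = list(leave_one_site_out(toy_sites))

# Counts reported in the results.
overall = sensitivity(137, 148)
independent = sensitivity(19, 23)
print(f"{overall:.1%}", f"{independent:.1%}")
```

Holding out an entire site per fold means the model is always evaluated on scanners and acquisition protocols it did not see during training, which is what makes the scheme a test of cross-site generalizability.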

DISCUSSION: This first multicenter-validated deep learning detection algorithm yields the highest sensitivity to date in MRI-negative FCD. By pairing predictions with risk stratification, this classifier may assist clinicians in adjusting hypotheses relative to other tests, increasing diagnostic confidence. Moreover, generalizability across age and MRI hardware makes this approach ideal for presurgical evaluation of MRI-negative epilepsy.

CLASSIFICATION OF EVIDENCE: This study provides Class III evidence that deep learning on multimodal MRI accurately identifies FCD in patients with epilepsy initially diagnosed as MRI negative.