Benchmark on Deep Learning Algorithms
For the segmentation challenge, iSeg-2017
Based on evaluations of the whole brain, small ROIs, and gyral curves, we observe that none of the 8 top-ranked methods achieved statistically significantly better performance than all of the other methods.
- All methods directly apply well-established models (e.g., U-Nets) to the challenge, without considering any prior knowledge of infant brain images. In other words, the deep neural networks generalize well, in that the same architecture can also be applied to liver, lung, kidney, or prostate segmentation, but they are unaware of the specifics of the subjects under study, e.g., that human cortical thickness lies within a certain range.
- All methods ignore the fact that the tissue contrast between CSF and GM is much higher than that between GM and WM. It might therefore be reasonable to first identify CSF in the infant brain images, reconstruct the outer cortical surface from it, and use that surface as guidance for estimating the inner cortical surface, since cortical thickness lies within a certain range. Preliminary work on 6-month-old infants at risk of autism demonstrates the effectiveness of this kind of strategy [1,2].
- Augmentation is important. Among the 13 testing subjects, all methods consistently performed poorly on the 2nd and 10th subjects, whose images were acquired with motion artifacts or a different scan pose. Models that are robust to motion and scan pose are therefore highly desirable, since motion is inevitable and these kinds of scan variation are normal during image acquisition. A possible solution is to augment the training images with rotations by different angles, flipping, and simulated motion artifacts.
- All 8 top-ranked methods randomly selected samples (2D/3D patches) from the training images using moving windows, without evaluating the importance of each sample. In conventional machine learning, for example, adaptive boosting is an effective strategy that focuses learning on error-prone regions to improve performance. Similarly, selecting more training samples from error-prone regions could further improve the performance of these segmentation algorithms.
- In addition, the patch size used in these 8 top-ranked methods varies widely, from 24×24×24 to 80×80×80, and could be further optimized to achieve better results.
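The augmentation suggested above (rotation, flipping, simulated motion artifacts) could be sketched roughly as follows. This is a minimal illustration rather than any participant's pipeline: the function name `augment_patch`, the `max_ghost` parameter, and the axis conventions are assumptions, rotations are restricted to multiples of 90 degrees for simplicity, and the "motion" is only a crude ghosting effect (blending the image with a shifted copy), not a physically realistic k-space simulation.

```python
import numpy as np

def augment_patch(image, labels, rng, max_ghost=0.3):
    """Randomly flip and rotate an image/label patch pair in the same way,
    then add a crude motion-like ghost to the image only.

    Assumes cubic 3D patches so that 90-degree rotations preserve shape.
    """
    # Random flip along axis 0 (assumed here to be the left-right axis).
    if rng.random() < 0.5:
        image = np.flip(image, axis=0)
        labels = np.flip(labels, axis=0)

    # Random rotation by a multiple of 90 degrees in the plane of axes 0 and 1.
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k=k, axes=(0, 1))
    labels = np.rot90(labels, k=k, axes=(0, 1))

    # Crude motion stand-in: blend the image with a copy shifted along axis 1.
    # Labels are left untouched because the anatomy itself has not changed.
    shift = int(rng.integers(1, 4))
    alpha = rng.uniform(0.0, max_ghost)
    ghost = np.roll(image, shift, axis=1)
    image = (1.0 - alpha) * image + alpha * ghost

    return np.ascontiguousarray(image), np.ascontiguousarray(labels)


# Example: augment one synthetic 32x32x32 patch with 4 tissue labels.
rng = np.random.default_rng(0)
patch = rng.random((32, 32, 32))
seg = rng.integers(0, 4, size=(32, 32, 32))
aug_patch, aug_seg = augment_patch(patch, seg, rng)
```

Applying the geometric transforms identically to the image and its label map keeps the two aligned; only the intensity perturbation is applied to the image alone.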
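The idea of drawing more training samples from error-prone regions could be sketched as error-weighted patch sampling. This is an illustration inspired by the boosting analogy above, not adaptive boosting itself and not a method used by any participant. The function `sample_patch_centers`, the `error_map` input (for example, a voxel-wise misclassification map from a previous training round), and the `floor` parameter are all hypothetical.

```python
import numpy as np

def sample_patch_centers(error_map, n_samples, patch_size, rng, floor=0.05):
    """Draw 3D patch centers with probability proportional to local error.

    error_map : non-negative array, larger where the current model errs more.
    floor     : small positive constant added to every weight so that
                well-segmented regions are still sampled occasionally.
    """
    half = patch_size // 2
    # Keep only the centers at which a full patch fits inside the volume.
    valid_slices = tuple(
        slice(half, dim - patch_size + half + 1) for dim in error_map.shape
    )
    valid = error_map[valid_slices]

    # Turn the errors into a sampling distribution over the valid centers.
    weights = valid.astype(float).ravel() + floor
    probs = weights / weights.sum()

    # Sample flat indices, then convert them back to volume coordinates.
    flat = rng.choice(weights.size, size=n_samples, p=probs)
    centers = np.stack(np.unravel_index(flat, valid.shape), axis=1) + half
    return centers


# Example: a synthetic error map with one error-prone cube.
rng = np.random.default_rng(0)
errors = np.zeros((64, 64, 64))
errors[20:30, 20:30, 20:30] = 1.0
centers = sample_patch_centers(errors, n_samples=100, patch_size=24, rng=rng)
```

Uniform random sampling corresponds to a constant `error_map`; increasing the contrast between error-prone and well-segmented voxels shifts more of the training patches toward the difficult regions.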
For more details, please refer to the review article [3].
[1] “Volume-Based Analysis of 6-Month-Old Infant Brain MRI for Autism Biomarker Identification and Early Diagnosis,” in MICCAI, 2018, pp. 411-419.
[2] “Anatomy-Guided Joint Tissue Segmentation and Topological Correction for 6-Month Infant Brain MRI with Risk of Autism,” Human Brain Mapping, vol. 39, pp. 2609-2623, Jun 2018.
[3] Li Wang, Dong Nie, Guannan Li, Élodie Puybareau, Jose Dolz, Qian Zhang, Fan Wang, Jing Xia, Zhengwang Wu, Jiawei Chen, Kim-Han Thung, Toan Duc Bui, Jitae Shin, Guodong Zeng, Guoyan Zheng, Vladimir S. Fonov, Andrew Doyle, Yongchao Xu, Pim Moeskops, Josien P.W. Pluim, Christian Desrosiers, Ismail Ben Ayed, Gerard Sanroma, Oualid M. Benkarim, Adrià Casamitjana, Verónica Vilaplana, Weili Lin, Gang Li, and Dinggang Shen, “Benchmark on Automatic 6-month-old Infant Brain Segmentation Algorithms: The iSeg-2017 Challenge,” IEEE Transactions on Medical Imaging, 2019, doi: 10.1109/TMI.2019.2901712.