Urinary Stone Detection on CT Images Using Deep Convolutional Neural Networks: Evaluation of Model Performance and Generalization

Anushri Parakh; Hyunkwang Lee; Jeong Hyun Lee; Brian H Eisner; Dushyant V Sahani; Synho Do

doi:10.1148/ryai.2019180066

Radiol Artif Intell. 2019 Jul 24;1(4):e180066. doi: 10.1148/ryai.2019180066. eCollection 2019 Jul.

Authors

Anushri Parakh¹, Hyunkwang Lee¹, Jeong Hyun Lee¹, Brian H Eisner¹, Dushyant V Sahani¹, Synho Do¹

Affiliation

¹ Departments of Radiology (A.P., H.L., D.V.S., S.D.) and Urology (B.H.E.), Massachusetts General Hospital, 55 Fruit St, White 270, Boston, MA 02114; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Mass (H.L.); and Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea (J.H.L.).

Abstract

Purpose: To investigate the diagnostic accuracy of a cascading convolutional neural network (CNN) for urinary stone detection on unenhanced CT images and to evaluate the performance of pretrained models enriched with labeled CT images across different scanners.

Materials and methods: This HIPAA-compliant, institutional review board-approved, retrospective clinical study used unenhanced abdominopelvic CT scans from 535 adults suspected of having urolithiasis. The scans were obtained on two scanners (scanner 1 [hereafter S1] and scanner 2 [hereafter S2]). A radiologist reviewed clinical reports and labeled cases to establish the reference standard. Stones were present on 279 (S1, 131; S2, 148) and absent on 256 (S1, 158; S2, 98) scans. One hundred scans (50 from each scanner) were randomly reserved as the test dataset, and the rest were used for developing a cascade of two CNNs: The first CNN identified the extent of the urinary tract, and the second CNN detected the presence of stones. Nine model variations were developed by combining three training data sources (S1, S2, or both [hereafter SB]) with three initializations: pretrained CNNs (ImageNet, GrayNet) or no pretraining (Random). First, models were compared for generalizability at the section level. Second, models were assessed by using area under the receiver operating characteristic curve (AUC) and accuracy at the patient level with the test dataset from both scanners (n = 100).
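The study design described above can be illustrated with a minimal Python sketch. It enumerates the nine model variants (three initializations times three training data sources) and shows the general shape of a two-stage cascade. The function names, the 0.5 threshold, and the rule of taking the maximum section-level score to reach a patient-level decision are assumptions made for illustration only; the abstract does not state how section-level outputs were aggregated, and the two stage functions stand in for trained CNNs.

```python
from itertools import product
from typing import Callable, List, Sequence, TypeVar

Section = TypeVar("Section")

# Labels taken from the abstract.
INITIALIZATIONS = ["GrayNet", "ImageNet", "Random"]
TRAINING_SOURCES = ["S1", "S2", "SB"]


def model_variants() -> List[str]:
    """Return the nine initialization/training-source combinations."""
    return [f"{init}-{src}" for init, src in product(INITIALIZATIONS, TRAINING_SOURCES)]


def cascade_predict(
    sections: Sequence[Section],
    in_urinary_tract: Callable[[Section], bool],
    stone_probability: Callable[[Section], float],
    threshold: float = 0.5,  # hypothetical operating point
) -> bool:
    """Two-stage cascade: stage 1 keeps sections within the urinary tract,
    stage 2 scores each kept section for the presence of a stone.

    The patient-level rule (maximum section score >= threshold) is an
    assumption for this sketch, not the authors' stated method.
    """
    tract_sections = [s for s in sections if in_urinary_tract(s)]
    if not tract_sections:
        return False
    return max(stone_probability(s) for s in tract_sections) >= threshold


if __name__ == "__main__":
    print(model_variants())
```

In this sketch, passing the sections of one scan together with the two stage functions yields a single stone-present or stone-absent call for that patient.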

Results: The GrayNet-pretrained model showed higher classifier exactness than did ImageNet-pretrained or Random-initialized models when tested by using data from the same or different scanners at the section level. At the patient level, the AUC for stone detection was 0.92-0.95, depending on the model. Accuracy of GrayNet-SB (95%) was higher than that of ImageNet-SB (91%) and Random-SB (88%). For stones larger than 4 mm, all models showed similar performance (false-negative results: two of 34). For stones smaller than 4 mm, the numbers of false-negative results for GrayNet-SB, ImageNet-SB, and Random-SB were one of 16, three of 16, and five of 16, respectively. GrayNet-SB identified stones in all 22 test cases that had obstructive uropathy.
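The false-negative counts above translate directly into sensitivity, computed as (total - false negatives) / total. The short sketch below performs that arithmetic. It assumes the "two of 34" figure applies to each of the three SB models, and the pooled figure further assumes that the two size groups together cover all stone-positive test cases; neither point is stated explicitly in the abstract, so the pooled values are illustrative rather than reported results.

```python
# False-negative counts as (missed, total) pairs, copied from the Results text.
LARGER_THAN_4MM = (2, 34)  # assumed to apply to every SB model
SMALLER_THAN_4MM = {
    "GrayNet-SB": (1, 16),
    "ImageNet-SB": (3, 16),
    "Random-SB": (5, 16),
}


def sensitivity(false_negatives: int, total: int) -> float:
    """Fraction of stone-positive cases that were detected."""
    return (total - false_negatives) / total


def pooled_sensitivity(model: str) -> float:
    """Sensitivity over both size groups combined (an assumption, see lead-in)."""
    fn_small, n_small = SMALLER_THAN_4MM[model]
    fn_large, n_large = LARGER_THAN_4MM
    return sensitivity(fn_small + fn_large, n_small + n_large)


if __name__ == "__main__":
    for model, (fn, n) in SMALLER_THAN_4MM.items():
        print(f"{model}: <4 mm {sensitivity(fn, n):.1%}, pooled {pooled_sensitivity(model):.1%}")
```

Under these assumptions, the gap between models comes entirely from the smaller stones: 15 of 16 detected for GrayNet-SB versus 13 of 16 for ImageNet-SB and 11 of 16 for Random-SB.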

Conclusion: A cascading model of CNNs can detect urinary tract stones on unenhanced CT scans with a high accuracy (AUC, 0.954). Performance and generalization of CNNs across scanners can be enhanced by using transfer learning with datasets enriched with labeled medical images.

© RSNA, 2019. Supplemental material is available for this article. An earlier incorrect version appeared online. This article was corrected on August 6, 2019.

2019 by the Radiological Society of North America, Inc.