Interreader reliability of LI-RADS version 2014 algorithm and imaging features for diagnosis of hepatocellular carcinoma: A large international multireader study
- Authors: Fowler K.J.; Tang A.; Santillan C.; Bhargavan-Chatfield M.; Heiken J.; Jha R.C.; Weinreb J.; Hussain H.; Mitchell D.G.; Bashir M.R.; Costa E.A.C.; Cunha G.M.; Coombs L.; Wolfson T.; Gamst A.C.; Brancatelli G.; Yeh B.; Sirlin C.B.
- Publication year: 2018
- Type: Journal article
- OA Link: http://hdl.handle.net/10447/435862
Abstract
Purpose: To determine in a large multicenter multireader setting the interreader reliability of Liver Imaging Reporting and Data System (LI-RADS) version 2014 categories, the major imaging features seen with computed tomography (CT) and magnetic resonance (MR) imaging, and the potential effect of reader demographics on agreement with a preselected nonconsecutive image set.
Materials and Methods: Institutional review board approval was obtained, and patient consent was waived for this retrospective study. Ten image sets, comprising 38-40 unique studies (equal number of CT and MR imaging studies, uniformly distributed LI-RADS categories), were randomly allocated to readers. Images were acquired in unenhanced and standard contrast material-enhanced phases, with observation diameter and growth data provided. Readers completed a demographic survey, assigned LI-RADS version 2014 categories, and assessed major features. Intraclass correlation coefficient (ICC) assessed with mixed-model regression analyses was the metric for interreader reliability of assigning categories and major features.
Results: A total of 113 readers evaluated 380 image sets. ICC of final LI-RADS category assignment was 0.67 (95% confidence interval [CI]: 0.61, 0.71) for CT and 0.73 (95% CI: 0.68, 0.77) for MR imaging. ICC was 0.87 (95% CI: 0.84, 0.90) for arterial phase hyperenhancement, 0.85 (95% CI: 0.81, 0.88) for washout appearance, and 0.84 (95% CI: 0.80, 0.87) for capsule appearance. ICC was not significantly affected by liver expertise, LI-RADS familiarity, or years of postresidency practice (ICC range, 0.69-0.70; ICC difference, 0.003-0.01 [95% CI: -0.003 to -0.01, 0.004-0.02]). ICC was borderline higher for private practice readers than for academic readers (ICC difference, 0.009; 95% CI: 0.000, 0.021).
Conclusion: ICC is good for final LI-RADS categorization and high for major feature characterization, with minimal reader demographic effect.
Of note, our results using selected image sets from nonconsecutive examinations are not necessarily comparable with those of prior studies that used consecutive examination series.
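Illustrative note (not part of the published abstract): the study reports ICC values estimated with mixed-model regression. As a rough intuition for what an ICC measures, the sketch below computes the simpler one-way random-effects ICC, often written ICC(1,1), from ANOVA mean squares. This is a swapped-in textbook estimator, not the authors' mixed-model method, and the function name `icc_oneway` and the example ratings are hypothetical.

```python
from typing import Sequence


def icc_oneway(ratings: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC(1,1) for a complete ratings table.

    ratings[i][j] is the score given to observation i by its j-th reader.
    Assumption: every observation has the same number of ratings.
    This is NOT the mixed-model estimator used in the study.
    """
    n = len(ratings)                      # number of rated observations
    if n < 2:
        raise ValueError("need at least two observations")
    k = len(ratings[0])                   # ratings per observation
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 ratings per row")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-observation mean square (variation of the row means).
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-observation mean square (reader disagreement on the same case).
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    denom = msb + (k - 1) * msw
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (msb - msw) / denom


if __name__ == "__main__":
    # Hypothetical LI-RADS categories (1-5) from three readers on four cases.
    example = [[5, 5, 4], [3, 3, 3], [2, 3, 2], [4, 4, 5]]
    print(f"ICC(1,1) = {icc_oneway(example):.2f}")
```

Values near 1 mean that most of the variation in scores comes from real differences between observations rather than from disagreement among readers, which is the sense in which the abstract describes agreement as "good" or "high".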