Methods: Between April 2016 and January 2021, a total of 827 lymph nodes of 259 patients (211 males, 48 females; mean age: 61.1±7.2 years; range, 41 to 79 years) who underwent an endobronchial ultrasound procedure for diagnosis and/or staging of lung cancer and diagnosis of mediastinal lymphadenopathy of unknown origin were retrospectively analyzed. This external validation study was designed to compare the diagnostic yields of the prediction tools developed by Shafiek et al. and Alici et al., and of the Canada Lymph Node Score (CLNS). Endobronchial ultrasound-guided transbronchial needle aspiration results and tool predictions were compared with the gold standard.
Results: Overall, endobronchial ultrasound-guided transbronchial needle aspiration had a sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of 95.6%, 100%, 100%, 97.6%, and 98.4%, respectively. The diagnostic performances of the proposed tools were quite remarkable. Among them, the Alici algorithm had a higher sensitivity and negative predictive value, which were matched by the excellent specificity and positive predictive value offered by CLNS ≥3 and the Shafiek tool. The area under the curve value of CLNS ≥3 was higher than those of the Shafiek tool and CLNS ≥2.
Conclusion: Conventional prediction tools relying on simple real-time sonographic features were found to be consistent in terms of diagnostic performance in this external validation dataset. Although inferior to cytology, they performed well, each with defined individual strengths and weaknesses.
Patients with benign diseases (sarcoidosis, tuberculosis, pneumoconiosis, etc.) and patients who had received chemotherapy and/or radiotherapy were excluded, as these conditions may alter the architecture and, thus, the sonographic appearance of lymph nodes. Malignant lymph nodes diagnosed with EBUS-TBNA were considered true positives. All negative results were confirmed with surgical biopsy or at least six months of progression-free follow-up. The EBUS-TBNA results and the predictions by the tools were compared to the gold standard (surgical results, true positive EBUS-TBNA results, or six months of progression-free follow-up). Positive or surgically confirmed results between July 2020 and January 2021 were included, but negative results without surgical confirmation in this period were excluded.
Endobronchial ultrasound and simultaneous transbronchial needle aspiration were performed under moderate-to-deep sedation with midazolam and remifentanil in the operating room. A convex probe EBUS (BF-UC180F, Olympus, Tokyo, Japan) was used to examine the lymph nodes, and the ultrasound images were processed with a dedicated scanner (EU-ME1, Olympus, Tokyo, Japan).
Statistical analysis
Statistical analysis was performed using IBM SPSS for Windows version 20.0 (IBM Corp., Armonk, NY, USA). Descriptive data were expressed as mean ± standard deviation (SD) for continuous variables and as number and frequency for categorical variables. The three systems were compared using receiver operating characteristic (ROC) curve analysis. The diagnostic performances of the three prediction tools were reported alongside the cytological yield of EBUS-TBNA in terms of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive and negative likelihood ratios, area under the curve (AUC), and accuracy. Estimates were reported with 95% confidence intervals (CI), and a p value of <0.05 was considered statistically significant.
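For readers who wish to reproduce the performance measures listed above, the following minimal sketch shows how they follow from a 2×2 table of test result against gold standard. It is an illustration only and not the SPSS procedure used in the study; the class and function names are hypothetical, and the counts in the usage example are invented for demonstration rather than taken from the study data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: test result versus gold standard."""
    tp: int  # test positive, truly malignant
    fp: int  # test positive, truly benign
    fn: int  # test negative, truly malignant
    tn: int  # test negative, truly benign


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard diagnostic accuracy measures as fractions."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn),
        # LR+ is unbounded when there are no false positives.
        "lr_positive": (
            sensitivity / (1 - specificity) if specificity < 1 else math.inf
        ),
        # LR- is unbounded when there are no true negatives.
        "lr_negative": (
            (1 - sensitivity) / specificity if specificity > 0 else math.inf
        ),
    }


# Hypothetical counts, for illustration only (not study data).
example = diagnostic_metrics(ConfusionCounts(tp=90, fp=10, fn=10, tn=190))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Note that a test with no false positives, such as a cytology result treated as a true positive by definition, has a specificity and PPV of 100% and an unbounded positive likelihood ratio.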
Table 1. Characteristics of lymph nodes and primary condition
Sonographic features of the lymph nodes are given in Table 2. Malignant nodes were larger, more hypoechoic, and more heterogeneous. They were mostly round and had distinct margins. The presence of calcification and necrosis was similar between malignant and benign lymph nodes. The absence of central hilar structure was mostly seen in malignant lymph nodes.
Table 2. Sonographic features of lymph nodes and percent of positive scores
In these patients, EBUS-TBNA had a sensitivity, specificity, PPV, NPV, and accuracy of 95.6%, 100%, 100%, 97.6%, and 98.4%, respectively. The AUC value for malignant disease was 0.978 (95% CI: 0.965-0.987; p=0.0001). The diagnostic performances of the proposed prediction tools were quite good despite being significantly lower than that of cytology (Figure 1, Tables 3 and 4). Compared to each other, all tools performed well. Among them, the Alici algorithm had a higher sensitivity and NPV, which were matched by the excellent specificity and PPV offered by the Canada Lymph Node Score (CLNS) ≥3 and the Shafiek tool. The AUC value of CLNS ≥3 was higher than those of the Shafiek tool and CLNS ≥2.
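As an aid to interpreting the AUC values, note that a dichotomous test (malignant versus benign at a single cut-off) contributes only one operating point to the ROC plot, and the area under the resulting two-segment curve equals the mean of sensitivity and specificity. The sketch below computes this area geometrically. Whether the reported AUC values were obtained in this way is an assumption, as the source does not state it; the function name is hypothetical. The reported EBUS-TBNA figures are nonetheless consistent with it.

```python
def auc_single_cutoff(sensitivity: float, specificity: float) -> float:
    """Area under the ROC polygon (0,0) -> (1 - spec, sens) -> (1,1).

    Algebraically this simplifies to (sensitivity + specificity) / 2.
    """
    fpr = 1.0 - specificity
    # Triangle under the first segment, from (0,0) to (fpr, sens).
    left = 0.5 * fpr * sensitivity
    # Trapezoid under the second segment, from (fpr, sens) to (1,1).
    right = 0.5 * (1.0 - fpr) * (sensitivity + 1.0)
    return left + right


# EBUS-TBNA sensitivity and specificity as reported in the text.
print(round(auc_single_cutoff(0.956, 1.0), 3))  # 0.978
```

The same identity shows why a tool with very high sensitivity but modest specificity, or the reverse, can reach an AUC similar to that of a more balanced tool.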
Figure 1. Comparative receiver operating characteristics (ROC) curves.
Table 3. Comparison of the diagnostic performances of scoring systems
Table 4. Statistical significance of head-to-head comparisons
The first tool was published by Shafiek et al.[5] in 2014. They used two discrete datasets to challenge and validate the tool. One of the main strengths of this study was the presence of two raters to decide on the sonographic features. The authors initially focused on six criteria (round shape, distinct borders, heterogeneous echogenicity, absence of visible central hilar structure, size of ≥10 mm, and presence of hyperechogenic density). The primary analysis revealed that the presence of a hyperechogenic density in the interior of a lymph node had no significant effect. Therefore, they modified the criteria, which were partly borrowed from Schmid-Bindert et al.,[10] and performed a validation in the second arm. Variable multiplier factors were defined for size, shape, margin, echogenicity, and absence of central hilar structure. The computed result was analyzed with the ROC curve, and the modified tool with a score of >5 showed a sensitivity of 78% and a specificity of 86% in the detection of malignant nodes (AUC=0.852; 95% CI: 0.743-0.928; p=0.0001).
The second tool, by Alici et al.,[3] was developed with decision tree analysis and thus took the form of an algorithm. Initially, they randomly divided the cases into two groups: an experimental arm and a study arm. The algorithm was developed in the experimental arm and validated in the study arm. The findings in the second arm led to a modification of the algorithm into its final form. They reported the sensitivity, specificity, PPV, NPV, and accuracy of the algorithm as 100%, 51.2%, 50.6%, 100%, and 67.5%, respectively. Considering its high sensitivity and NPV, they concluded that the tool might be useful in choosing true positive nodes in a particular nodal station rather than indicating an unnecessary sampling. The study attracted attention with its detailed visual classification atlas based on Fujiwara's original criteria.[2] The atlas clearly defines the sonographic features with corresponding images. Despite offering a useful tool that can be used during real-time EBUS procedures, the study did not report inter-rater variability, which the authors clearly acknowledged as a prominent limitation.
Hylton et al.[9] have been studying prediction tools for years. After a systematic review on this subject, they recently reported an easy-to-use and reliable prediction tool, the CLNS.[4] They conducted the study in two parts. The first part was data collection and assessment of validity. In this part, data on sonographic features were collected prospectively according to Fujiwara's criteria,[2] and video files were obtained with a screen-recording device. They developed the predictive tool with logistic regression analyses on those data, taking the β-coefficients into account. In the second part, the video files were re-evaluated by a Canada-wide team of pre-educated raters. The tool showed a cut-off value of ≥2 in the study population, but they decided to increase the cut-off value to ≥3 in view of the probability of malignant disease in the comparative groups. The CLNS (cut-off ≥3) gave a sensitivity, specificity, PPV, and NPV of 31.5%, 96.3%, 65.4%, and 86.5%, respectively. With its high specificity and NPV, the tool seems to be useful in finding true negatives, which may not need to be sampled during the EBUS procedure. The methodology of the study is also remarkable for its unmatched data on inter-rater variability. As an easy-to-use tool, the inter-rater reliability of a CLNS ≥3 was very good: 0.81±0.02 (95% CI: 0.77-0.85).
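To illustrate what such figures mean at the bedside, the sketch below converts the sensitivity and specificity quoted above for the original CLNS ≥3 study into likelihood ratios and then into post-test probabilities using the odds form of Bayes' theorem. The function names are hypothetical, and the pre-test probability of 20% is an assumption chosen only for demonstration; it is not a figure from either study.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (LR+, LR-) for a dichotomous test.

    Assumes 0 < specificity < 1 so that both ratios are finite.
    """
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Sensitivity and specificity of CLNS >= 3 as quoted from the original study.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.315, specificity=0.963)

# Hypothetical pre-test probability of malignancy (illustrative assumption).
pre_test = 0.20
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"P(malignant | score >= 3) = {post_test_probability(pre_test, lr_pos):.2f}")
print(f"P(malignant | score < 3)  = {post_test_probability(pre_test, lr_neg):.2f}")
```

Under these assumptions, a score of ≥3 raises the probability of malignancy substantially, whereas a score below the cut-off lowers it only modestly, which reflects the low sensitivity reported at this threshold.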
There were other prediction tools by Schmid-Bindert et al.,[10] which served as a basis for the study by Shafiek et al.,[5] and by Evison et al.,[11] one of the first studies on this issue. We did not include those tools in the comparative validation analyses for several reasons. First, Schmid-Bindert et al.[10] included benign diseases in their study (sarcoidosis 10% and tuberculosis 10%). These diseases may alter the architecture of the lymph node with enlargement, clarification of the borders, heterogeneous texture (particularly in the case of tuberculosis), necrosis, calcification, and deformation of the hilum. Data based on such lymph nodes would be erroneous. Pre-selection of cases is of particular importance, as the results depend heavily on it. This was recognized in the CLNS study, in which Hylton et al.[4] excluded patients who had received neoadjuvant chemotherapy to avoid a potential confounding factor. In their study, Shafiek et al.[5] did not give detailed inclusion/exclusion criteria, but it seems reasonable to assume that they did not include patients who had received such treatments, as they reported that the patients underwent the EBUS procedure for lung cancer staging and investigation of suspected malignancy.[5] However, there were no detailed data on the composition of the non-malignant group. Therefore, the external validation dataset in this study is of particular importance, as we only included malignant nodes and reactive hyperplasia. Other benign diseases such as sarcoidosis may act as a confounding factor and result in underestimation of the diagnostic performance of the tools. Another reason not to include the Schmid-Bindert tool was that the proposed criteria were already shared by Shafiek et al.[5] with a minimal modification.
Another tool, by Evison et al.,[11] was developed as a risk stratification model for negative EBUS results. The authors aimed to guide multidisciplinary teams in deciding which patients with a negative EBUS result needed further staging procedures. They incorporated CT and PET findings to weigh the risk of false negative results. Endosonographists mostly have information on the CT and/or PET findings of the patient before the EBUS procedure, and these findings may guide them while performing the procedure. However, these are complex scores and could not be compared directly with the simple predictive tools included in this study.
In this head-to-head comparison, all tools failed to rival cytological analysis. No score can yet substitute for cytological analysis. All three tools (Alici, CLNS ≥3, and Shafiek) were accurate. The AUC values were not statistically different between Alici and CLNS ≥3; the AUC of the Shafiek tool was slightly lower than that of CLNS ≥3. All were useful, although not in the same way. The Alici algorithm still overestimates the possibility of malignant nodes, but may be useful in deciding which node to sample. The CLNS ≥3 and Shafiek tools have an excellent specificity, which can be used in deciding not to sample. The CLNS ≥2 offers a good sensitivity to point out malignant nodes; however, it cannot identify truly benign nodes. Its superiority in sensitivity is attenuated by the lowest specificity and accuracy. Of them, the simplest tools are the CLNS and the Alici algorithm, which can be easily used during a real-time EBUS procedure.
This study is remarkable for its high number of lymph nodes and for its methodology, which offers a head-to-head comparison of existing prediction tools. However, there are also several limitations to the study. First, it is a retrospective analysis. A prospective validation should be carried out in the future. Second, the procedures were done by a single endosonographist, and inter-rater variability could not be reported. Finally, the data did not give information on the possible place of non-conventional sonographic analyses such as vascular pattern or elastography. In conclusion, conventional prediction tools relying on simple real-time sonographic features were found to be consistent in terms of diagnostic performance in this validation dataset. Despite individual strengths and weaknesses, they clearly performed well. However, they did not offer the diagnostic yield required to be used instead of cytology. Different characteristics should be studied and applied to serve as a basis for a probable future "digital sampling".
Ethics Committee Approval: The study protocol was approved by the Dr. Suat Seren Chest Diseases and Surgery Training and Research Hospital Local Ethics Committee (date: 12.01.2021, no: 8). The study was conducted in accordance with the principles of the Declaration of Helsinki.
Patient Consent for Publication: A written informed consent was obtained from each patient.
Data Sharing Statement: The data that support the findings of this study are available from the corresponding author upon reasonable request.
Author Contributions: All authors contributed equally to the article.
Conflict of Interest: The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.
Funding: The authors received no financial support for the research and/or authorship of this article.
1) Silvestri GA, Gonzalez AV, Jantz MA, Margolis ML, Gould MK, Tanoue LT, et al. Methods for staging non-small cell lung cancer: Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest 2013;143(5 Suppl):e211S-e250S. doi: 10.1378/chest.12-2355
2) Fujiwara T, Yasufuku K, Nakajima T, Chiyo M, Yoshida S, Suzuki M, et al. The utility of sonographic features during endobronchial ultrasound-guided transbronchial needle aspiration for lymph node staging in patients with lung cancer: A standard endobronchial ultrasound image classification system. Chest 2010;138:641-7. doi: 10.1378/chest.09-2006
3) Alici IO, Yılmaz Demirci N, Yılmaz A, Karakaya J, Özaydın E. The sonographic features of malignant mediastinal lymph nodes and a proposal for an algorithmic approach for sampling during endobronchial ultrasound. Clin Respir J 2016;10:606-13. doi: 10.1111/crj.12267
4) Hylton DA, Turner S, Kidane B, Spicer J, Xie F, Farrokhyar F, et al. The Canada Lymph Node Score for prediction of malignancy in mediastinal lymph nodes during endobronchial ultrasound. J Thorac Cardiovasc Surg 2020;159:2499-507.e3. doi: 10.1016/j.jtcvs.2019.10.205
5) Shafiek H, Fiorentino F, Peralta AD, Serra E, Esteban B, Martinez R, et al. Real-time prediction of mediastinal lymph node malignancy by endobronchial ultrasound. Arch Bronconeumol 2014;50:228-34. doi: 10.1016/j.arbres.2013.12.002
6) El-Sherief AH, Lau CT, Wu CC, Drake RL, Abbott GF, Rice TW. International Association for the Study of Lung Cancer (IASLC) lymph node map: Radiologic review with CT illustration. Radiographics 2014;34:1680-91. doi: 10.1148/rg.346130097
7) Muehling B, Wehrmann C, Oberhuber A, Schelzig H, Barth T, Orend KH. Comparison of clinical and surgical-pathological staging in IIIA non-small cell lung cancer patients. Ann Surg Oncol 2012;19:89-93. doi: 10.1245/s10434-011-1895-9
8) Cetinkaya E, Turna A, Yildiz P, Dodurgali R, Bedirhan MA, Gürses A, et al. Comparison of clinical and surgical-pathologic staging of the patients with non-small cell lung carcinoma. Eur J Cardiothorac Surg 2002;22:1000-5. doi: 10.1016/S1010-7940(02)00581-X
9) Hylton DA, Turner J, Shargall Y, Finley C, Agzarian J, Yasufuku K, et al. Ultrasonographic characteristics of lymph nodes as predictors of malignancy during endobronchial ultrasound (EBUS): A systematic review. Lung Cancer 2018;126:97-105. doi: 10.1016/j.lungcan.2018.10.020