Introduction

Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) [1]. This new infectious disease has dramatically affected global health and national healthcare systems because, besides being a serious infection, it is easily transmitted from human to human. As a result, hospitals are facing overcrowding, rising costs, and inadequate resources to care for COVID-19 patients.

Optimizing the initial triage of patients could reduce both the adverse health impact of the disease, through better clinical management, and the load on healthcare systems, through efficient prioritization of cases and timely discharge of admitted patients. However, the variable and unpredictable progression of COVID-19 [2,3,4] makes it difficult to establish a system that divides patients into different risk groups. Efforts have been made to identify potential factors influencing disease progression or outcome in COVID-19 cases; nevertheless, they have mainly focused on signs and symptoms and/or demographic and laboratory data in relatively small populations [2, 5].

Although chest computed tomography (CT) may be normal in some symptomatic patients [6], its diagnostic value in COVID-19 is already recognized, with higher sensitivity than real-time reverse transcription polymerase chain reaction (rRT-PCR) (97.2% vs. 83.3% [7]; 98% vs. 71% [8]); moreover, it is a relatively accessible imaging modality for pneumonia diagnosis in many secondary and tertiary healthcare facilities. Therefore, we hypothesized that initial chest CT imaging features of COVID-19 patients, particularly when combined with their demographic and clinical characteristics, have a high predictive value in differentiating the outcome of the disease. To test this hypothesis, we assessed the accuracy of a novel semi-quantitative scoring system for pulmonary involvement (PI) in predicting outcome in both highly clinically suspicious and confirmed COVID-19 patients, and compared their outcomes, by studying all patients referred to the respiratory triage of a tertiary referral hospital from 27 March 2020 to 26 April 2020.

Materials and methods

Study design

The current study was carried out in a tertiary referral university hospital from 27 March 2020 to 26 April 2020. The study was reviewed and approved by the Institutional Review Board of our institute and conducted in accordance with the World Medical Association Declaration of Helsinki. Because the medical records were anonymous, the informed consent requirement was waived by the ethics committee of our institute. All diagnostic and admission criteria and therapeutic approaches were based on national protocols.

Participants

Any patient with a dry cough, dyspnea, chills, or pharyngitis, regardless of fever, and with a positive exposure history (travel to recognized epidemic regions or contact with COVID-19 patients within the last 14 days) was considered a clinically highly suspicious case (739 patients) [9]. We included all clinically highly suspicious patients referred to respiratory triage (established after the COVID-19 pandemic began) if they had undergone a chest CT scan in their initial assessment. Patients with incomplete medical documents were excluded. A total of 439 patients were further evaluated and recognized as confirmed COVID-19 cases if they either had a positive rRT-PCR assay or were highly clinically suspicious with typical chest CT scan manifestations [10]. rRT-PCR samples were obtained from a nasopharyngeal or oropharyngeal swab or an endotracheal aspirate. Typical chest CT scan manifestations were defined as peripheral multi-lobar or multifocal ground-glass opacity (GGO) with or without rounded morphology, with or without consolidation, crazy paving, or reverse halo [10].

Data collection

Population characteristics

All patients’ demographic variables (age and gender) and vital signs (heart rate (HR, per minute), respiratory rate (RR, per minute), temperature (T, Celsius), and SpO2 (percent)), recorded at their initial evaluation, were collected from medical documents. Patients with RR ≥ 24 and SpO2 ≤ 93% were separately investigated as clinically severe cases [9]. The site of management was considered as outcome of interest: (a) outpatient (Fig. 1); (b) ordinary-ward admitted (Fig. 2); and (c) ICU admitted (Fig. 3), besides survival status (Fig. 4) during hospitalization.

Fig. 1

A 34-year-old male patient with confirmed COVID-19; managed in outpatient setting; pulmonary involvement: ground-glass opacity (GGO) with peripheral, pleural sparing distribution with reverse halo sign. Total pulmonary involvement (PI) score, PI density score, and SpO2 were 1, 1, and 97%, respectively and he was stratified as a low-risk patient in both ICU and death predictive models

Fig. 2

A 47-year-old female patient with positive COVID-19 PCR; admitted in an ordinary ward; pulmonary involvement: bilateral and peripheral pleural-based ground-glass opacity (GGO) in four lung lobes with reverse halo; pulmonary involvement (PI) score was 11 (5 for GGO, 6 for consolidation), and PI density score and SpO2 were 2.75 and 96%, respectively; she was stratified as a low-risk patient in both ICU and death predictive models

Fig. 3

An 82-year-old male patient with COVID-19; admitted in ICU; pulmonary involvement: diffuse bilateral ground-glass opacity (GGO) with consolidative changes in the base of both lungs and crazy paving in the upper lobes. RR, SpO2, and PI score were 30, 80%, and 13 (11 for GGO, 2 for consolidation), respectively; he was stratified as a high-risk patient in both ICU and death predictive models, indicating a critical condition requiring a higher level of hospital care and monitoring

Fig. 4

A 64-year-old male patient with laboratory-confirmed COVID-19; admitted in ICU and died; pulmonary involvement: diffuse ground-glass opacity with diffuse consolidative changes in both lower lobes; total pulmonary involvement (PI) score of 26 (4 for GGO, 22 for consolidation) and PI density score of 5.2. RR and SpO2 were 28 and 85%, respectively; he received the highest possible risk in the ICU and death predictive models, indicating a critical status requiring more intensive hospital care

Image acquisition and interpretation

Chest CT scan images were obtained at the time of presentation, in supine position, and full inspiration without contrast administration. The examinations were performed on the Lightspeed 64-detector CT (GE Healthcare) or the Siemens SOMATOM Emotion (16 slices) MDCT scanner. The imaging parameters were set at 2–3-mm section thickness; 0.6–2 mm beam collimation; 120 kVp tube voltage; 50–150 mAs tube current; 0.75 s tube rotation speed; and 0.5–0.75 s gantry rotation time.

Two fellowship-trained diagnostic radiologists, with 9 and 13 years of experience in thoracic radiology and blinded to patients’ clinical data, independently interpreted the chest CT scans, which were reviewed in both lung and mediastinal windows. In case of any disagreement, the interpretation was finalized by consensus. Chest CT scan findings were recorded according to the Fleischner Society glossary and published literature on viral pneumonia [11]. Chest CT scan features included (a) predominant pattern: GGO, consolidation, or mixed; (b) dominant distribution pattern: peripheral (peripheral one-third of the lung)/pleural based (Fig. 2), peripheral/pleural sparing (Fig. 1), axial (medial two-thirds of the lung), or diffuse; (c) number of involved lobes; (d) other morphologies: crazy paving (Fig. 3), reverse halo sign (Figs. 1 and 2), intralesional traction bronchiectasis, parenchymal band, and mesh-like opacity; and (e) additional findings: underlying pulmonary disease such as mosaic attenuation, bronchiectasis, emphysema, interstitial lung disease, cardiomegaly, pleural effusion (unilateral or bilateral), subsegmental atelectasis, dilation of pulmonary trunk, mediastinal or hilar lymphadenopathy, pericardial effusion, and pleural thickening.

Pulmonary involvement scoring system

To assess PI, a novel semi-quantitative scoring system was designed. All five lung lobes (RUL, right upper lobe; RML, right middle lobe; RLL, right lower lobe; LUL, left upper lobe; LLL, left lower lobe) were visually reviewed twice and separately for GGO and consolidation and scored from 0 to 5 for each pattern based on involvement percentage (0, no involvement; 1, ≤ 5%; 2, 6–25%; 3, 26–50%, 4, 51–75%; and 5, ≥ 76%). The total GGO and consolidation scores were the sum of all lobes’ scores. Total PI score was calculated as either (Fig. 5):

  • Sum of total GGO scores and total consolidation scores; or

  • Sum of the GGO and consolidation scores of all five lobes.

Fig. 5

Different CT scores of left lower lung lobe (LLL) based on our proposed semi-quantitative scoring system on axial, sagittal, and coronal views. a LLL score: 1 (ground-glass opacity (GGO) ≤ 5%, consolidation: 0%). b LLL score: 2 (GGO: 0%, consolidation: 6–25%). c LLL score: 3 (GGO: 6–25%, consolidation ≤ 5%). d LLL score: 4 (GGO: 51–75%, consolidation: 0%). e LLL score: 5 (GGO: 26–50%, consolidation: 6–25%). f LLL score: 6 (GGO: 51–75%, consolidation: 6–25%). g LLL score: 7 (GGO: 51–75%, consolidation: 26–50%)

As the maximum total score of each lobe could be 7, PI score ranged from 0 (no involvement) to 35 (maximum involvement). Finally, PI density index was calculated by dividing the total PI score by the number of involved lobes.
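The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' software: the function names, the dictionary layout, and the example percentages are hypothetical, and the handling of non-integer percentages (for example 5.5%) is an assumption, since the published bands are stated for whole percentages.

```python
# Illustrative sketch of the semi-quantitative PI scoring arithmetic.
# Names and example inputs are hypothetical, not taken from the study.

LOBES = ("RUL", "RML", "RLL", "LUL", "LLL")


def involvement_score(percent: float) -> int:
    """Map a lobe's involvement percentage for one pattern to a 0-5 score.

    Bands follow the text: 0 = none, 1 = <=5%, 2 = 6-25%, 3 = 26-50%,
    4 = 51-75%, 5 = >=76%. Assumption: fractional values fall into the
    band whose upper bound they do not exceed.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent == 0:
        return 0
    if percent <= 5:
        return 1
    if percent <= 25:
        return 2
    if percent <= 50:
        return 3
    if percent <= 75:
        return 4
    return 5


def pi_scores(ggo_percent: dict, consolidation_percent: dict) -> dict:
    """Return per-lobe scores, total PI score, and PI density index."""
    lobe_scores = {}
    for lobe in LOBES:
        ggo = involvement_score(ggo_percent.get(lobe, 0))
        consolidation = involvement_score(consolidation_percent.get(lobe, 0))
        lobe_scores[lobe] = ggo + consolidation
    total = sum(lobe_scores.values())
    involved = sum(1 for score in lobe_scores.values() if score > 0)
    # Density index: total PI score divided by the number of involved lobes.
    density = total / involved if involved else 0.0
    return {
        "lobe_scores": lobe_scores,
        "total_pi": total,
        "involved_lobes": involved,
        "pi_density": density,
    }


# Hypothetical patient: GGO in two lobes, consolidation in one.
example = pi_scores(
    ggo_percent={"RLL": 60, "LLL": 30},
    consolidation_percent={"LLL": 10},
)
```

For this hypothetical input, RLL scores 4 and LLL scores 5, giving a total PI score of 9 over two involved lobes and a density index of 4.5. The same division reproduces the figure captions: a total score of 26 over five involved lobes gives 5.2.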

Data analyses

We performed analyses in SPSS (Windows ver. 18; IBM SPSS Inc.). Descriptive data are presented as mean ± SD or as frequency and percentage. We evaluated data normality by the Kolmogorov-Smirnov test. We conducted comparisons by (1) independent sample t test or one-way analysis of variance (ANOVA), followed by the Tukey test, for normally distributed continuous variables; (2) Mann-Whitney U test or Kruskal-Wallis test for non-normally distributed continuous variables and ordinal variables, with comparisons between subgroups done by Mann-Whitney U test with Bonferroni correction; and (3) chi-square test for nominal variables. All p values less than 0.05 were considered statistically significant.

Multivariate logistic regression (backward stepwise) was employed to assess the association of independent variables with ICU admission or death in separate models. The final models of the backward stepwise procedure were taken to be the models with the highest accuracy. To define optimum cutoff values for the most significant chest CT scan findings in outcome prediction, receiver operating characteristic (ROC) curves were drawn and Youden’s J index [12] was used. The area under the ROC curve (AUC) was considered the indicator of a variable’s efficacy in ROC analysis.
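As a reminder of how a cutoff is chosen with Youden's J index (J = sensitivity + specificity − 1), the following sketch scans every observed score as a candidate threshold and keeps the one with the largest J. It is a minimal illustration and not the SPSS procedure used in the study; the function name and the toy data are hypothetical, and the rule "positive when score ≥ cutoff" is an assumption suited to variables where higher values indicate worse outcome.

```python
# Minimal sketch of cutoff selection by Youden's J index.
# Function name and toy data are hypothetical illustrations.

def youden_cutoff(scores, outcomes):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    A case is predicted positive when its score is >= cutoff.
    `outcomes` holds 1 for the event (e.g. ICU admission) and 0 otherwise.
    On ties, the lowest cutoff reaching the maximum J is kept.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Toy data (made up for illustration only).
toy_scores = [2, 4, 5, 7, 9, 10, 12, 15]
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j, sens, spec = youden_cutoff(toy_scores, toy_outcomes)
```

With these toy values the selected cutoff is 7 (J = 0.75, sensitivity 1.0, specificity 0.75). For a variable where lower values are worse, such as SpO2, the scores could be negated before calling the function.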

Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy (and their 95% confidence intervals (CIs)) were calculated for PI score and for combinations of 2 or 3 of remaining significant findings. We assessed PI scoring system accuracy in predicting ICU admission and death in patients with confirmed COVID-19.
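For readers less familiar with these indices, the sketch below shows how each one follows from the four cells of a 2 × 2 table (true/false positives and negatives). The function name and the example counts are hypothetical and are not data from this study; confidence intervals are omitted because the interval method is not specified here.

```python
# Sketch: diagnostic accuracy indices from a 2 x 2 confusion table.
# The counts below are hypothetical, not study data.

def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute sensitivity, specificity, PPV, NPV, and accuracy."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a margin of the table is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # events correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-events correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged cases that had the event
        "npv": ratio(tn, tn + fn),          # cleared cases without the event
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical counts for illustration.
metrics = diagnostic_metrics(tp=8, fp=12, tn=70, fn=10)
```

In this made-up table, PPV is 8/20 = 0.40 and NPV is 70/80 = 0.875, which illustrates how a model can have a high NPV and a modest PPV when the event is uncommon.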

For sensitivity analysis, all association and accuracy analyses were repeated in the full group of 739 highly clinically suspicious patients as well as in confirmed cases; the results did not significantly differ, and thus the presented results are limited to confirmed cases unless otherwise indicated.

Results

Study population characteristics

In total, 739 clinically highly suspicious COVID-19 patients were referred to the studied setting, consisting of 419 (56.7%) male patients, with a mean age of 49.2 ± 17.2 years (range, 11–97) (Table 1). Their mean RR and SpO2 were 21.3 ± 3.8/min (12–38) and 93.4% ± 5.3 (64–100), respectively. Among all 739 cases, 439 patients were considered confirmed cases based on chest CT scan and/or PCR (59.6%) results. Table 2 presents confirmed COVID-19 patients’ characteristics. In addition, 248 (33.6%) patients were admitted to hospital; of those, 176 were admitted to an ordinary ward (23.8% of all patients and 71% of all hospitalizations) and 72 were admitted to the ICU (9.7% of all patients and 29% of all hospital admissions). Of the 739 patients, 28 died (3.8%). All of the deceased patients showed positive CT, 25 showed positive PCR (89.3%), 20 were hospitalized in the ICU (71.4%), and 8 were admitted to the ordinary ward (28.6%).

Table 1 Demographic, clinical, and imaging parameters in clinically suggestive patients
Table 2 Demographic, clinical, and imaging parameters in confirmed COVID-19 patients

Chest CT scan findings

The most prevalent CT features among clinically suggestive patients vs. confirmed COVID-19 patients were GGO (55.5% vs. 93.3%), pleural-based peripheral distribution (35.8% vs. 60.3%), multi-lobar (47.4% vs. 79.7%), bilateral (45.1% vs. 76.6%), and lower lobes (RLL and/or LLL) involvement (43.1% vs. 89.1%). The details of the dominant pattern and PI score for each lobe and the whole lung are presented in Tables 1, 2, and 3.

Table 3 Mean scores of GGO, consolidation, and their sum score in different lobes and the whole lung

Patients’ characteristics’ association with ICU admission and death

Gender was not significantly associated with hospitalization (p value = 0.3). However, age, RR, and SpO2 were significantly associated with treatment setting; age and RR were highest and SpO2 was lowest in ICU cases, followed by ordinary-ward-admitted and outpatient cases (all p values < 0.001). Further details of the variables’ association with hospitalization are presented in Table 2.

Comparing the mean GGO and consolidation scores and their sum for each lobe, as well as the total whole-lung GGO, consolidation, and PI scores, across management settings, ICU-admitted patients had the highest mean scores and outpatients the lowest (p values < 0.01, Table 2). In addition, all scores were higher in patients with RR ≥ 24 and SpO2 ≤ 93% (p values < 0.01). ICU-admitted patients had a higher frequency of diffuse distribution pattern, a higher number of involved lobes, and higher PI density scores than others (p value < 0.001). Bilateral lung involvement was more prevalent in ICU-admitted vs. ordinary-ward-admitted patients (p value < 0.001) and in ordinary-ward-admitted vs. outpatient cases (p value < 0.001, Tables 1 and 2).

Ordinary-ward and ICU-admitted patients showed a significantly increasing trend in cardiomegaly, both unilateral and bilateral pleural effusion, pericardial effusion, dilated pulmonary trunk, parenchymal band, crazy paving, intralesional bronchiectasis, and mesh-like opacity (p values < 0.01, Table 2).

ICU admission and death predictive models

In binary logistic regression model—with age, gender, RR, SpO2, number of involved lobes, dominant distribution pattern, and PI score as independent variables—age, SpO2, and total PI score showed significant association with outcome-of-interest, in distinctly specified models (Table 4).

Table 4 Cutoff values based on ROC curve and Youden’s J analyses

In ROC analysis, PI score’s AUCs were 0.77 and 0.8 for ICU admission and death, respectively (Table 4). Using Youden’s J index, cutoff values of age and SpO2 were 53 and 91, respectively, for both outcomes. However, PI score cutoff differed for ICU admission and death (8 vs. 15) (Table 4). Sensitivity, specificity, PPV, NPV, and accuracy were calculated for ICU admission and death predictive models based on only PI score or its combinations with age and SpO2 (Table 5). A combination of all three parameters showed the best accuracy (ICU admission, 81.95; death, 91.46) and specificity (ICU admission, 89.1; death, 95.0) for both outcomes (Figs. 1, 2, 3, 4). Sensitivity, specificity, PPV, NPV, and accuracy of combination of all three findings for ICU admission predictive model were 40.9, 89.1, 39.6, 89.6, and 81.9, respectively. The same values for death model were 40.7, 95.0, 36.6, 95.7, and 91.4, respectively (Table 5).
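The reported cutoffs can be expressed as a simple check of which criteria a patient meets. The sketch below is only an illustration: the inequality directions follow the article's stated models (age ≥ 53, SpO2 ≤ 91, PI score ≥ 8 for ICU admission or ≥ 15 for death), but the function and constant names are hypothetical, and how the individual findings map to "low-risk" or "high-risk" labels is not specified here, so the code reports only the findings and their count.

```python
# Sketch: which of the three reported criteria a patient meets.
# Names are hypothetical; risk labelling rules are deliberately not encoded.

AGE_CUTOFF = 53                          # years, positive when age >= cutoff
SPO2_CUTOFF = 91                         # percent, positive when SpO2 <= cutoff
PI_CUTOFFS = {"icu": 8, "death": 15}     # positive when PI score >= cutoff


def positive_findings(age: float, spo2: float, pi_score: int, outcome: str) -> dict:
    """Return each criterion's status and the number of positive findings."""
    if outcome not in PI_CUTOFFS:
        raise ValueError("outcome must be 'icu' or 'death'")
    findings = {
        "age": age >= AGE_CUTOFF,
        "spo2": spo2 <= SPO2_CUTOFF,
        "pi_score": pi_score >= PI_CUTOFFS[outcome],
    }
    return {"findings": findings, "count": sum(findings.values())}


# Values taken from the figure captions: Fig. 1 (age 34, SpO2 97%, PI score 1)
# and Fig. 3 (age 82, SpO2 80%, PI score 13).
fig1_icu = positive_findings(34, 97, 1, "icu")
fig3_icu = positive_findings(82, 80, 13, "icu")
fig3_death = positive_findings(82, 80, 13, "death")
```

With the caption values, the Fig. 1 patient meets none of the ICU criteria, whereas the Fig. 3 patient meets all three ICU criteria and two of the three death criteria (the PI score of 13 is below the death cutoff of 15).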

Table 5 Sensitivity, specificity, PPV, NPV, and accuracy of CT findings for predicting the ICU admission and death

Discussion

In the current study, we aimed not only to propose a novel semi-quantitative scoring system to accurately predict each patient’s clinical progression at the initial visit, but also to add to the body of literature on the characteristics of COVID-19 patients and their association with disease outcome in a relatively large sample of patients with an established diagnosis. In short, our major finding was that age, SpO2, and PI score were the best predictors of ICU admission and death. A combination of all three showed 89.1% and 95% specificity as well as 81.9% and 91.4% accuracy for ICU admission and death, respectively. To elaborate, patients with negative findings or even low PI scores will not experience ICU admission or death with more than 95% precision (NPVs > 95%) (Figs. 1, 2, 3, 4). In other words, and considering the models’ sensitivity, patients with two positive findings or even only high PI scores are more susceptible to poorer outcomes and should be prioritized for ICU care. However, given the low PPVs of our models, it cannot be assumed that these patients will definitely die; in this situation, not missing critical cases is of the greatest importance.

Comparing major CT findings in our clinically suspicious group with those of other studies [13, 14], we observed that the lung involvement features we report were lower overall. This is probably because previous studies mostly reported chest CT scan features of admitted patients, whereas 66.4% of our cases were managed as outpatients; given these sample characteristics, our cohort could be considered a better representation of all COVID-19 patients in communities. In line with the previous statement, when we limited our analysis to admitted patients, our chest CT findings mostly followed the previously described typical imaging findings, as follows: pulmonary involvement of patients increased from 55.5 to 93.3% for GGO, from 35.8 to 60.3% for pleural-based peripheral distribution, from 47.4 to 79.7% for multi-lobar involvement, and from 45.1 to 76.6% for bilateral involvement. The equivalent measurements in other studies were 88% for GGO, 78% for multi-lobar, 87% for bilateral, and 76% for peripheral distributions of involvement (a systematic review of 30 studies, including 919 patients) [15] and 91% for GGO, 86% for involvement of the dorsal segment of the RLL, and 53% for subpleural distribution (80 patients with COVID-19 diagnosis) [14].

Although the literature is relatively rich in chest CT findings of COVID-19 patients, few studies have investigated the prognostic value of initial chest CT findings in predicting disease outcome. A previous study on 83 laboratory-confirmed patients (25 critical and 58 ordinary cases) showed findings relatively similar to ours [16], as age, chest CT scan score, consolidation, and number of involved lobes were significantly higher in critical patients. However, and in contrast to our findings, GGO incidence, frequency of some lobes’ involvement, and RR did not significantly differ between critical and ordinary cases. Their 25-scale scoring system showed an 80.0% sensitivity and 82.8% specificity with a cutoff value of 7 in discriminating the two groups (AUC = 0.87) [16], which showed lower sensitivity, higher specificity, and approximately similar cutoff value in comparison to our scoring system. In a retrospective cohort study, demographic and imaging variables (including a 24-scale CT involvement score) were compared between non-emergency (n = 87) and emergency (n = 14) groups of patients, classified according to their clinical status [13]. In accordance with our findings, they found a significantly higher CT involvement score (12.8 vs. 5.3), architectural distortion (42% vs. 18%), and traction bronchiectasis (85% vs. 47%) in the emergency group [13]. As they did not evaluate vital signs, further comparisons cannot be made. In another study on 120 COVID-19 patients, the authors compared various CT manifestations between two groups (96 ward-hospitalized patients and 24 ICU-admitted or deceased cases) [17]. Lung involvement was calculated as the average involvement percentage of each lung zone. In line with our findings, a GGO pattern with peripheral and lower lobe involvement was the most prevalent chest CT feature. Additionally, consolidation, total lung involvement, air bronchograms, and crazy paving were significantly higher in the combined ICU and deceased cases [17]. However, they did not report predictive indices of their scoring system for ICU admission or death, which limits our comparison. Finally, considering the additional chest CT scan findings that were significantly higher in ICU-admitted patients, including underlying cardiac disease (cardiomegaly, pericardial effusion, and dilated pulmonary trunk), pleural effusion, parenchymal band, crazy paving, and intralesional bronchiectasis, our findings followed previously reported results, which indicated higher pleural effusion, pericardial effusion, crazy paving, and mesh-like opacity in critical patients [16,17,18,19].

As our findings mainly followed the previous literature, our interpretation is limited to what this study adds: to our knowledge, the most accurate available scoring system and more detailed findings. In detail, patients who do not meet our criteria for high risk will, with high precision, not develop severe disease requiring ICU admission. Besides, consolidation scores were significantly higher in ICU-admitted and deceased cases in our study. This is probably because patients with advanced disease mostly develop necrotizing bronchitis and diffuse alveolar damage leading to consolidation [20], whereas GGO is a common finding that does not distinguish severe cases.

This study is the largest investigation of the predictive value of initial chest CT scan features for clinical outcomes. Moreover, the suggested scoring system can accurately predict the clinical outcome, and another imaging index, the PI density score, considers PI score and number of involved lobes together, thus distinguishing between patients with similar PI scores but different densities of involvement. Besides, the approach integrates demographic, clinical, and imaging features. Taken together, we believe the results of this study could give researchers and clinicians better insight into the surveillance of COVID-19 patients and help them predict a patient’s probable course so that the treatment plan and its evaluation can be adjusted accordingly. Our suggested models for ICU and death outcomes are also clinically feasible, since radiologists are expected to report the PI and PI density scores in addition to qualitative parameters in their everyday practice. Yet, our findings should be interpreted in light of some limitations. Firstly, PCR-positive patients comprised only a small group of enrolled cases, probably due to the low sensitivity of PCR. Secondly, we evaluated the predictive value of the initial chest CT scans for clinical outcomes; however, the interval between symptom onset and CT acquisition was uneven among enrolled patients. Besides, there were admitted patients with a normal on-admission chest CT scan whose later CT scan showed lung involvement. Clinical symptoms, underlying disease (including history of lung diseases, malignancy, chemoradiation, corticosteroid usage, and diabetes), and laboratory data were not available for the present study. Additionally, disease progression and chest CT scan changes during the disease course were not assessed; thus, we recommend that further studies address these limitations.

In conclusion, we proposed a novel 35-scale semi-quantitative scoring system based on the severity and extent of PI to predict patient outcome at the first visit; it can be combined with demographic and clinical features, though it is also valuable when used alone. Our findings revealed that demographic (age), clinical (RR and SpO2), and imaging (each lobe and total PI score, PI density index, predominant distribution pattern, number of involved lobes, and laterality) parameters were significantly associated with patients’ outcome. Finally, we recommend two predictive models for ICU admission (age ≥ 53, SpO2 ≤ 91, PI score ≥ 8) and death (age ≥ 53, SpO2 ≤ 91, PI score ≥ 15) with high accuracy, to give clinicians a better estimate for planning patients’ therapeutic approaches and to direct greater attention and care to higher-risk patients according to the proposed predictive tool.