Computational Phenotyping of Obstructive Airway Diseases: A Systematic Review

Introduction

Chronic obstructive airway diseases, such as asthma and COPD, are heterogeneous conditions that exhibit diverse clinical presentations due to a variety of endogenous and exogenous factors.1,2 Obstructive airway diseases have distinct mechanistic pathways and heterogeneous clinical presentations, known as phenotypes.3 Identifying specific phenotypes of airway diseases is important because it helps to better target therapies, personalize clinical interventions, and improve diagnostic accuracy.4

Over the past two decades, there has been an increase in the use of data-driven approaches to identify phenotypes of chronic obstructive airway diseases.5 These approaches rely on unsupervised methods to extract latent patterns of the disease that are not known beforehand.6 This allows for the identification of disease subgroups that are more reflective of natural disease phenomena and that can guide clinical decision-making.1,6 However, studies employing these methods, and the resulting phenotypes, have been challenging to compare, perhaps due to differences in participants’ profiles, study settings, phenotyping methods employed, and the number and types of variables used.2,6 To gain a clear appreciation of the landscape of computational phenotyping of chronic obstructive airway diseases, a systematic synthesis of the underlying evidence is valuable. Through this, the methodological underpinnings of studies can be ascertained and the quality of evidence appraised, thus helping to identify research gaps and move the field forward.
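To make the idea of an unsupervised method concrete, the following is a minimal sketch of one non-hierarchical clustering technique (k-means), written with the Python standard library only. It is not the method of any particular included study. The patient values, the two variables (age at onset and FEV1 % predicted), the choice of k = 2, and the function names are all hypothetical and chosen purely for illustration.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so variables on different scales contribute equally."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Plain k-means; centroids start at the first k points (an assumption
    made here for reproducibility, not a recommended initialisation)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids

# Hypothetical toy data: (age at onset in years, FEV1 % predicted).
patients = [(4, 92), (6, 88), (5, 95), (42, 61), (38, 58), (45, 65)]
labels, centroids = kmeans(standardize(patients), k=2)
print(labels)
```

No subgroup labels are supplied to the algorithm; the grouping emerges from the data alone. This is also why results are sensitive to the variables selected, their scaling, and the number of clusters requested.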

This review aimed to identify, critically appraise, and synthesize data from studies that have used computational approaches to phenotype chronic obstructive airway diseases in both children and adults. The review set out to characterize and compare the populations included in studies, assess and compare the criteria used to select participants, evaluate and compare the variables used to derive phenotypes of chronic airway diseases across studies, and assess the choices informing inclusion of variables. Additionally, the review described and compared the computational approaches used across studies, and described and assessed the number and characteristics of phenotypes derived across studies in terms of their clinical interpretation.

Methods

Protocol and Registration

We developed a protocol that outlined the review processes and methods before undertaking this work, which was registered in PROSPERO (CRD42020164898) and published.7

Eligibility Criteria

Table 1 gives the full inclusion and exclusion criteria for studies in the review, covering study design, setting, outcome, method of phenotyping, participants’ age, study year, and language.

Table 1 Inclusion and Exclusion Criteria

Information Source

To identify relevant studies for the review, we searched PubMed, Embase, Web of Science, Scopus, and Google Scholar. For unpublished materials, such as conference proceedings, we searched conference proceedings databases and grey literature databases, such as Open Grey. We also contacted experts in the field to request any papers missed by our database searches. Finally, we screened the reference lists of included studies to identify any additional papers.

Search Strategy

We developed search strategies for all the databases to identify relevant studies for the review. The search strategies (Supplementary file 1) were first developed in PubMed and then adapted for the other databases.

Study Records Data Management and Selection Process

The search results from the different databases were exported to EndNote for screening. The first stage of the literature review involved removal of duplicates from the database searches; then, we performed title and abstract screening. Two reviewers independently screened the studies on the basis of the review inclusion and exclusion criteria; any discrepancies were resolved by discussion, or a third reviewer arbitrated if a consensus was not reached. The final stage involved full-text screening of the studies potentially meeting the eligibility criteria on the basis of the titles and abstracts. We documented the screening process using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart.8

Data Collection Process

Reviewers, working in pairs, independently extracted relevant data from included studies onto a data extraction form developed specifically for this review; any discrepancies were resolved by discussion, or a third reviewer arbitrated if consensus was not reached. The form was initially piloted on three included studies, and any necessary amendments were made before it was used on all included studies.

Data Items

Information on the following data items was collected from included studies into the data extraction form: general information (author’s name, publication year and study time, aim of the study, and data source); information describing population characteristics (population size, recruitment characteristics, sample size, children/adults, inclusion and exclusion criteria); type of chronic obstructive airway disease and how the outcome was defined; information about the variables selected for phenotyping (number and description of variables, rationale for selection, variable measurement and definition); type and features of the computational approach used; and information on the derived phenotypes (number of phenotypes, characteristics of each phenotype, and clinical interpretation).

Outcome and Prioritization

We included studies focusing on computational phenotyping of the following chronic obstructive airway diseases:

Asthma

COPD and asthma and COPD overlap

Rhinitis

Emphysema

Quality Assessment of Included Studies

We appraised the general quality of included studies using a checklist developed in-house. Since, to our knowledge, there are no standard tools for assessing the quality of studies on computational disease phenotyping, we developed a checklist that enabled us to assess the quality of reporting of specific aspects of the studies as they relate to performing computational phenotyping. The aspects assessed were subjects’ selection and inclusion in the phenotyping sample; missing data; outcome definition; variables included for the phenotyping; clinical and scientific relevance of the derived phenotypes; and reproducibility of the phenotyping process. To evaluate reproducibility, we examined aspects such as the disclosure of detailed information on methods used for phenotyping, computational aspects of data processing, and the use of software and tools for reproducible research frameworks. Detailed information and the quality assessment form can be found in the supplementary material.

Data Synthesis

Data were synthesized narratively. We used tables and figures to summarize the results and different aspects of the studies, including study characteristics, methods of phenotyping, variables considered in phenotyping, the number of phenotypes and their descriptions, and the results of the quality assessment.

Deviation from the Study Protocol

None of the identified studies addressed emphysema as an outcome to be phenotyped. Instead, emphysematous changes as features of phenotyping obstructive airway diseases were reported within studies on COPD. Further, studies that included subjects with asthma and COPD overlap (ACO) were reported as a separate outcome.

Results

Study Selection

A total of 3320 records were identified from the literature searches. After removal of duplicates, 2619 records were screened by title and/or abstract, of which 2460 records were excluded for not being eligible. A total of 159 records were considered for full-text review, of which 39 were excluded for different reasons, summarized in Table S1 in the supplementary material. Finally, 120 studies were included in this review analysis. Figure 1 shows the screening and selection of studies for this review.
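The counts in the selection flow can be checked for internal consistency with simple arithmetic. The sketch below uses only the figures reported in this review. The variable names are illustrative, and the number of duplicates removed is derived here by subtraction rather than stated in the text.

```python
# Counts as reported in the Study Selection paragraph.
identified = 3320             # records from all literature searches
screened = 2619               # records left after duplicate removal
excluded_at_screening = 2460  # excluded on title and/or abstract
full_text = 159               # records assessed in full text
excluded_at_full_text = 39    # excluded after full-text review
included = 120                # studies in the final synthesis

# Derived value (not stated in the text): duplicates removed.
duplicates_removed = identified - screened

# Each stage should equal the previous stage minus its exclusions.
assert screened - excluded_at_screening == full_text
assert full_text - excluded_at_full_text == included

# Per-condition study counts as reported in the Study Characteristics section.
per_condition = {
    "asthma": 60,
    "severe asthma": 19,
    "COPD": 28,
    "ACO": 4,
    "rhinitis": 9,
}
assert sum(per_condition.values()) == included

print(f"duplicates removed: {duplicates_removed}")
print(f"included studies:   {included}")
```

All three checks pass, so the stage-by-stage counts and the per-condition totals agree with the 120 included studies.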

Figure 1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram illustrating the studies’ selection process.

Note: This figure was adapted from Page, Matthew J., et al. ‘The PRISMA 2020 statement: an updated guideline for reporting systematic reviews.’ BMJ 372 (2021). https://doi.org/10.1136/bmj.n71.

Study Characteristics

Asthma

A total of 60 studies were on asthma.9–68 The average number of subjects included in these studies was 1251, ranging from 50 to 9651 participants per study. The majority of studies were conducted among adults (n = 31),10,12–14,18–20,24,25,27,29–32,35–38,42–44,46–48,50,52,56,59–62 nine were conducted in children,21–23,28,33,34,40,45,51 and the remainder used mixed samples. Most studies were of cohort design (n = 30),12,14,15,17,20,23,24,28,30–33,36–38,41,42,47,50,51,54–60,62,65,66 while the rest were mostly cross-sectional (n = 17).10,11,13,16,18,21,22,25,27,29,33,35,40,43,45,46,48 Most studies were conducted in a clinical setting (n = 33),10–13,15,19–22,24,25,31,33–38,42,43,45,50,52–58,60–62,65 with patients variously recruited from hospitals, pulmonary rehabilitation centers, and primary or tertiary care respiratory or general clinics. Fourteen studies selected subjects from the general population,14,18,23,32,36,40,41,44,46,47,51,59,63,68 while eight studies19,28,29,31,35,40,44,53 did not report the source of their participants. Full information on characteristics of studies in children and adults using unsupervised computational methods to phenotype asthma is presented in Table 2.

Table 2 Characteristics of Studies in Children and Adults Using Unsupervised Computational Methods to Phenotype Asthma and Severe Asthma

Severe Asthma

A total of 19 studies69–87 were on severe asthma. The average number of participants included in each study was 230, ranging from 40 to 1424 subjects per study. Most studies were conducted in a clinical setting (n = 17).69–72,74–77,79–87 One study78 was conducted in a general population setting, and another did not clearly indicate its setting.73 Most were cohort studies (n = 11),71,74–76,78–81,86,87 while the remaining were cross-sectional studies. Characteristics of studies in children and adults using unsupervised computational methods to phenotype severe asthma are presented in Table 2.

COPD

A total of 28 studies4,88–114 were on COPD. The average number of subjects per study was 5218, ranging from 46 to 104143 subjects per study. Most studies were conducted within a clinical setting (n = 17),4,88–93,95,97,98,100,107,109–113 with cohort studies88,89,92–96,98,100,101,109–111,114 being the most reported study design (n = 14), while cross-sectional designs were the second most common (n = 7).4,90,91,97,107,112,113 Full information on characteristics of studies using unsupervised computational methods to phenotype COPD and ACO is given in Table 3.

Table 3 Characteristics of Studies Using Unsupervised Computational Methods to Phenotype COPD

Asthma and COPD Overlap (ACO)

Four of the included studies were on asthma and COPD overlap. The average number of participants included in these studies was 255, ranging from 47 to 435 participants per study. All were cross-sectional studies. Three studies were conducted in a clinical setting,115–117 while one was conducted in a general population setting.118 Characteristics of studies in adults using unsupervised computational methods to phenotype COPD and ACO are presented in Table 3.

Rhinitis

A total of 9 studies were on rhinitis.119–127 The average number of participants included in each study was 516, ranging from 115 to 1831 participants per study. Most studies were conducted in a clinical setting (n = 6),119–121,123,126,127 while three studies122,124,125 were conducted in the general population. Most were cohort studies (n = 7),119,121,123–127 one was cross-sectional,120 and one was a case–control study.122 Characteristics of studies using unsupervised computational methods to phenotype rhinitis are presented in Table 4.

Table 4 Characteristics of Studies Using Unsupervised Computational Methods to Phenotype Rhinitis

Phenotypes of Respiratory Diseases

Asthma

In total, 251 phenotypes were reported in studies on asthma, with a considerable degree of overlap between them. In characterizing asthma phenotypes, atopy was the most commonly included feature,13–15,18,20,22,28,30,31,33,35,37,38,40,41,43,44,47,48,53,55,57,59–62,65,68 resulting in differentiation of atopic and non-atopic asthma phenotypes (reported in 29 studies and featured in 100 of the reported asthma phenotypes). Atopic status was defined mostly based on skin prick test, serum IgE levels or subjects’ report of familial atopy. Atopic asthma phenotype was reported in 28 studies,13–15,18,20,21,28,30,31,33,35,37,38,40,41,43,44,47,48,53,55,57,59–62,65,68 while non-atopic asthma was reported in 22 studies.14,15,18,20–22,30,35–38,40,41,43,44,53,55,57,59,61,65,68 The second most common feature was lung function measures, which featured in 85 of the reported phenotypes and was considered in 30 studies.10,14–16,18–22,24,27,28,30,33,36–38,40–44,50,52,55,60–62,65,68 Time at asthma onset featured in 74 phenotypes and was reported in 27 studies.11,13,15,18,21–23,28,30,31,35–38,43,44,48,50,52,53,55,59–61,65 The definition of early and late onset asthma varied among studies. When studies included both children and adolescents, asthma that developed during childhood and adolescence was referred to as early onset, while asthma that developed in adulthood was referred to as late onset. However, when examining only adults or only children, researchers compared the mean age at asthma onset, and its standard deviation, across phenotypes. In these cases, the terms early and late onset were defined relatively, with the groups of younger individuals labeled as early onset and those of older individuals classified as late-onset asthma.
Early onset asthma phenotype was reported in 19 studies,15,18,21,22,28,35,37,38,43,44,48,50,52,53,55,59–61,65 while late-onset asthma was reported in 18 studies.11,13,18,21–23,30,31,35,37,38,43,50,53,55,59,60,65 Level of asthma control was also a commonly reported feature, occurring in 45 phenotypes and reported in 17 studies.11,13,18,21,24,27,33,42–46,48,50,53,56,60 Well-controlled asthma phenotype was reported in 11 studies,21,24,42–46,48,53,56,60 while uncontrolled asthma was reported in 16 studies.11,13,18,21,24,27,33,42–44,46,48,50,53,56,60
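The relative labeling of onset described above can be sketched as follows. This is an illustrative reading of the approach, not code from any included study. The cluster names and onset ages are hypothetical, and the "intermediate" label for clusters between the youngest and oldest is an assumption added here for completeness.

```python
from statistics import mean, stdev

def label_onset(onset_ages_by_cluster):
    """Label clusters by comparing mean age at onset between clusters.

    Returns (summary, labels), where summary maps each cluster to
    (mean, standard deviation) and labels maps each cluster to a relative
    onset label. No absolute age cutoff is applied.
    """
    if len(onset_ages_by_cluster) < 2:
        raise ValueError("need at least two clusters to compare")
    summary = {
        cluster: (mean(ages), stdev(ages))
        for cluster, ages in onset_ages_by_cluster.items()
    }
    ordered = sorted(summary, key=lambda cluster: summary[cluster][0])
    labels = {cluster: "intermediate" for cluster in ordered}  # assumed label
    labels[ordered[0]] = "early onset"   # youngest mean age at onset
    labels[ordered[-1]] = "late onset"   # oldest mean age at onset
    return summary, labels

# Hypothetical adult cohort: onset ages in years per derived cluster.
adult_cohort = {"A": [3, 5, 8, 6], "B": [31, 40, 36, 45]}
adult_summary, adult_labels = label_onset(adult_cohort)

# Hypothetical paediatric cohort: the same procedure, very different ages.
child_cohort = {"c1": [1, 2, 2, 3], "c2": [8, 9, 10, 9]}
child_summary, child_labels = label_onset(child_cohort)

print(adult_labels, child_labels)
```

In the toy paediatric cohort, the "late onset" cluster has a mean onset age of 9 years, which would count as early onset in the adult cohort. This shows why phenotypes labeled this way are hard to compare across studies.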

Sex featured in 65 of the reported phenotypes. Female asthma phenotype was reported in 20 studies,10,11,13,16,18,20,21,30,31,35,42,43,45,48,52,53,55,60,61,63 while male asthma phenotype featured in 19 studies.10,11,13,18,20,22,28,30,40,42,43,45,47,52,53,55,60,61,63 Eleven studies reported on obesity-related asthma phenotypes.10,11,13,19,30,31,33,43,48,56,60

Disease activity was characterized variously across asthma studies, based on either symptom activity or disease severity. Frequency of symptoms and rate of exacerbation characterized phenotypes of high or low symptom activity, while disease severity, defined using standard criteria of asthma severity, characterized phenotypes of mild, moderate or severe asthma. Severe asthma phenotypes as indicated by investigators were reported in 12 studies,10,16,18,20–22,28,34,35,48,56,67 while 20 studies10,14,15,18,21–24,33,41–43,50,55,57,59,60,62,66,68 reported on asthma phenotypes with high symptoms or exacerbation rates. Across identified studies, labeling a phenotype as severe was not entirely based on standard GINA criteria for defining severe asthma or on physician decision, although some studies applied such an approach.21,33,34,56 Otherwise, most investigators identified severity of phenotypes based on symptom frequency, need for high dosage of treatment and disease impairment of daily life, with no clear reporting on how severity was defined.10,16,20,28,35,48,67

Inflammation was considered in deriving asthma phenotypes using different indicators, such as inflammatory cell counts in peripheral blood or induced sputum samples, fractional exhaled nitric oxide (FeNO),10,18,44 or measures of inflammatory cytokines.58 A total of 36 phenotypes were described based on high or low levels of eosinophilic inflammatory cells in sputum or peripheral blood. Some of those were reported in 17 studies10,11,15,16,20,21,30,33,40,41,43,44,48,50,52,58,65 as asthma with high eosinophilia. Variants of neutrophilic asthma phenotypes, in turn, were less commonly reported, appearing in 10 studies.11,13,30,33,42–44,50,52,58 See full results on number of derived phenotypes and their descriptions for studies on asthma in Table 5.

Table 5 Number of Derived Phenotypes and Their Descriptions for Studies on Asthma

Severe Asthma

The total number of reported severe asthma phenotypes was 61, with a considerable degree of overlap between them. The most reported features that differentiated severe asthma phenotypes were atopy, featuring in 28 phenotypes; age at disease onset, featuring in 25 phenotypes; treatment defined as medication dosage or treatment step; inflammation measures, featuring in 14 phenotypes; disease activity as frequency of symptoms and exacerbations, featuring in 14 phenotypes; and age and sex, which featured in 13 phenotypes.

Regarding allergic status and time at disease onset, 10 studies72,74–76,78–81,86,87 reported phenotypes of atopic severe asthma, while non-atopic severe asthma phenotypes were reported by 8 studies.74–76,78–80,86,87 Early onset severe asthma phenotypes were reported in 8 studies,70,72,74,75,78,79,86,87 while late-onset severe asthma phenotype variants were reported in 7 studies.70,72,74,78,79,86,87 In most studies, age of disease onset was defined by measuring the mean and standard deviation of age at disease onset and comparing these across phenotypes.72,74,79,81,87 Only one study defined onset after 12 years of age as the cutoff for late onset.78

Disease activity in terms of symptoms differentiated severe asthma phenotypes with high symptom presentation in 6 studies70,74,78,79,81,82 and phenotypes with low symptoms in 4 studies.70,74,79,81 Based on medication usage, severe asthma phenotypes requiring higher levels of treatment were described in 6 studies.69,74,76,78,79,81 These took the form of higher doses of inhaled corticosteroids (ICS), oral corticosteroids (OCS), an additional controller, regular use of systemic corticosteroids, or more frequent need for OCS and short controllers. In turn, lower to moderate medication usage or requirement was reported in 5 studies.69,74,76,79,81 Although spirometry measures were not as commonly reported as other indicators of disease activity, highly obstructed variants of severe asthma phenotypes were reported in 7 reports,72,74,75,78,79,86,87 while moderately to mildly obstructed severe asthma was reported in 5 reports.69,72,74,78,79

For demographic characteristics, variants of female severe asthma phenotypes were described in 4 studies,74,75,78,81 and male severe asthma phenotypes in the same number of records.74,75,78,80 Elderly-related variants of severe asthma phenotypes were described in 4 studies,74,75,78,81 and young age severe asthma phenotypes in 3 records.75,78,82 Obesity-related variants of severe asthma phenotypes were reported in 5 studies.70,73,74,78,79 See full results on number of derived phenotypes and their descriptions for studies on severe asthma in Table 6.

Table 6 Number of Derived Phenotypes and Their Descriptions for Studies on Severe Asthma

COPD

The total number of reported COPD phenotypes was 57. The most reported feature for defining COPD phenotypes was lung function measured by spirometry, which differentiated 44 phenotypes. Other commonly reported features were age, featuring in 26 phenotypes; symptoms and frequency of exacerbations, featuring in 24 phenotypes; sex, featuring in 17 phenotypes; and cardiovascular, metabolic, and psychiatric comorbidities, featuring in 14–17 phenotypes.

Based on spirometry lung function measures, COPD phenotypes were classified as mild, moderate, or severely obstructed disease. Severe to moderately obstructed phenotypes of COPD were reported in 10 studies,4,88,91,93,95–97,102,105,109,111,112 while 5 studies91,93,96,102,112 reported mild obstructed COPD phenotypes. Other measures of lung function used for deriving COPD phenotypes were measures of accompanying emphysematous changes like lung diffusion capacity for carbon monoxide (DLCO),88 computed tomography (CT) measure of lung density and airway wall thickness.4,90,97,105 The latter identified COPD phenotypes with high, moderate to low emphysematous changes in 4 studies.4,90,97,105

Demographic and social characteristics like age, sex, body mass index and smoking were also used to define COPD phenotypes. Elderly related COPD phenotype was reported in 9 COPD studies,88,91–93,95,96,98,102,113 while 7 studies88,91,93,95,96,98,113 described COPD phenotypes that were characterized by young age. Variants of female-related COPD phenotypes were reported in 3 records,88,91,97 while male sex-related COPD was reported in 4 studies.88,91,96,102 Both over- and underweight were associated with COPD phenotypes when considering BMI. Obesity-related COPD was reported in two studies by Burgel et al4,91 while under or low weight-related COPD phenotypes were reported in 5 studies.4,93,95,111,113 Heavy, persistent, high rate or long duration smoking-related COPD phenotypes were reported in 3 studies,91,97,105 while 2 studies reported low smoking-related COPD phenotypes.91,102

Disease activity/severity was characterized variously in studies of COPD phenotyping, using frequency of symptoms and exacerbations and level of treatment. COPD phenotypes with high frequency of symptoms and exacerbations were reported in 11 studies of COPD,4,91,93,95,97,100,102,107,109,111,112 while COPD phenotypes with low symptoms were reported in 8 studies.4,91,93,97,100,102,107,112 Four studies91,93,97,105 reported on COPD phenotypes with high treatment doses, and 2 studies reported on phenotypes with low-dose treatment.91,97

Concerning comorbidities, those most commonly reported to differentiate COPD phenotypes were cardiovascular diseases, diabetes and metabolic diseases,4,92,94,95,98 together with depression and anxiety.91,94,98,100,107 Additional features considered in characterizing COPD phenotypes were the impact of disease on physical and daily activity, respiratory health, quality of life and mortality. COPD phenotypes with impaired quality of life were reported in 3 studies,97,109,113 while high mortality-related COPD phenotypes were reported in 5 studies.92,93,96,112,113 See full results on number of derived phenotypes and their descriptions for studies on COPD in Table 7.

Table 7 Number of Derived Phenotypes and Their Descriptions for Studies on COPD and Asthma COPD Overlap (ACO)

Asthma and COPD Overlap (ACO)

A total of 21 phenotypes of ACO were identified. The most reported features considered for differentiating ACO phenotypes were smoking status, which identified 7 phenotypes; inflammation status which identified 9 phenotypes; atopy that identified 7 phenotypes; spirometry measures identifying 5 phenotypes and disease activity/severity as per symptoms identifying 5 phenotypes.

Regarding socio-demographic aspects, smoking-related ACO phenotypes were reported in two studies,116,118 along with a female ACO phenotype and obesity-related ACO,115 each in one record. Lung function measures characterized a highly obstructed ACO phenotype that was reported in two studies,115,116 and a high symptom phenotype of ACO was reported in 1 study.115 With respect to inflammation status, eosinophilic variants of ACO were reported in one study,115 as was a neutrophilic ACO phenotype.115

For other disease characteristics, early onset ACO phenotypes were reported in one study,118 while 3 records reported a variant of atopic ACO phenotype.115,116,118 See full results on number of derived phenotypes and their descriptions for studies on COPD and ACO in Table 7.

Rhinitis

The total number of reported rhinitis phenotypes was 45. The features most often considered for differentiating phenotypes of rhinitis were sex, which featured in 19 phenotypes; disease severity, which featured in 18 phenotypes; impairment of quality of life, which featured in 14 phenotypes; and disease activity per symptoms, which featured in 10 phenotypes.

Considering socio-demographic characteristics, sex, age, and socio-economic status (SES) identified several rhinitis phenotypes. Variants of female-related rhinitis as well as male-related phenotypes of rhinitis were reported in about half of the reports (n = 5).119–121,124,126 Phenotypes of old age-related rhinitis were reported in 2 studies,121,126 as well as young age-related ones.126 SES characterized phenotypes of high and low SES-related rhinitis, which were reported by Lee et al.125 Alcohol intake further identified high intake-related phenotypes of rhinitis, which were reported by Soler et al.126

Disease activity in terms of frequency of symptoms, classification of disease based on severity status as well as medication intake were commonly used to differentiate rhinitis phenotypes. High symptom phenotypes of rhinitis were reported in 3 studies.121,123,125 Severe rhinitis phenotypes, in turn, were reported in 3 studies,120,121,123 while Lee et al125 reported rhinitis phenotypes which require high treatment doses.

Measures of airways or lung function that were used to characterize rhinitis included CT scanning diagnostics of rhinitis, endoscopy score, and FeNO, in addition to spirometry, bronchodilator reversibility (BDR), and bronchial hyperresponsiveness (BHR) in subjects with accompanying asthma.124,125 Two records described rhinitis phenotypes with highly obstructed airways.124,125 Rhinitis phenotypes with high endoscopic and CT scores were reported by two studies.119,126 One study reported on rhinitis phenotypes among asthmatics that are characterized by high inflammation as indicated by FeNO.124 Among asthmatics with rhinitis, phenotypes of rhinitis with low to moderate BDR and BHR were also reported.124,125

Based on disease characteristics such as time of onset, seasonality, atopy status and accompanying nasal polyposis, both early and late onset variants of rhinitis phenotypes were reported in two studies,121,124 while seasonal rhinitis and accompanying nasal polyposis phenotypes were reported in the same number of studies.122,124 Variants of atopic rhinitis were reported in 3 studies,123–125 while polysensitization rhinitis phenotypes were also reported in 3 studies.120,122,123

Disease-related impairment of quality of life (QOL) and comorbidities were also frequently considered in characterizing phenotypes of rhinitis. Phenotypes of rhinitis with impaired QOL were reported in 3 studies.121,126,127 Rhinitis phenotypes with related comorbidities of depression, fibromyalgia, diabetes, and dermatitis were reported in one study by Soler et al.126 See full results on number of derived phenotypes and their descriptions for studies on rhinitis in Table 8.

Table 8 Number of Derived Phenotypes and Their Descriptions for Studies on Rhinitis

Methods of Phenotyping

Various methods of unsupervised computational phenotyping of respiratory diseases were used across the reported studies (Figure 2). The most frequently implemented and reported unsupervised approaches for phenotyping of chronic airway diseases were hierarchical and non-hierarchical clustering10,32,38,40,46,54,70,81,
