This study complies with all relevant ethical regulations. The protocol for the Malawian study was approved by the National Health Scientific Research Committee in Malawi (protocol number 07/09/1913) and by the Medical Veterinary Life Sciences ethics committee in Glasgow (protocol number 200190041). The study protocol for the Brazilian study was approved by the local research ethics committee at Tropical Medicine Foundation Dr. Heitor Vieira Dourado, Manaus, Western Brazilian Amazon (protocol numbers CAAE:30152620.1.0000.0005 and CAAE:32077020.6.0000.0005). Additional studies on this cohort were published separately13,50. We also used open-access de-identified IMC data from a published US-based autopsy study conducted at New York Presbyterian/Weill Cornell Medicine Hospital, for which the study protocol was approved by the institutional review board at Weill Cornell Medical College34. Informed consent was taken from the families of deceased patients for all patients at all sites.
Patients
We recruited patients aged 45–75 years who were admitted to QECH, Blantyre, between October 2020 and July 2021, during which there were two epidemiological waves driven by different SARS-CoV-2 variants: Beta (December 2020–February 2021) and Delta (May–July 2021)41. Patients admitted with respiratory signs were routinely tested for SARS-CoV-2 at QECH. We recruited patients into three groups based on clinical criteria: (1) a COVID-19 group (n = 9) with clinical features suggesting acute respiratory distress (ARDS, oxygen requirement and respiratory signs on either clinical examination or chest X-ray changes or both) and who had at least one nasal swab positive for SARS-CoV-2 on admission; (2) a non-COVID-19 LRTD group (n = 5) with clinical signs of ARDS but negative for SARS-CoV-2 on admission and during hospitalization; and (3) a no-LRTD, COVID-19-negative group (n = 2) with no oxygen requirement and no clinical signs of LRTD and for which the admission and any subsequent nasal swabs were negative for SARS-CoV-2 on polymerase chain reaction (PCR) (Fig. 1b and Extended Data Table 1). Clinical, premortem and postmortem laboratory data were entered into REDCap; double entry was used and checked by a third investigator, with discrepant results resolved by consulting the original source. The study only recruited patients who died between 24:00 and 12:00 to minimize the postmortem interval and to avoid doing any autopsies at night. None of the patients included had received any SARS-CoV-2 vaccine; only approximately 2% of the Malawian population had received a first dose by study completion.
Minimally invasive autopsy
We used minimally invasive tissue sampling (MITS) to conduct autopsies with large-bore needle biopsies of organ samples rather than full autopsy23. Being more culturally acceptable, MITS is widely used to determine cause of death in pediatric studies23,24,25, showing good concordance with full autopsy24. Drawing on our ongoing pediatric MITS studies in Malawi, we adapted the protocol from the Child Health and Mortality Prevention Surveillance (CHAMPS) network for adult patients with COVID-19 to obtain tissue suitable for scRNA-seq and IMC. A larger-caliber needle (11 gauge) was used for biopsies to obtain larger tissue samples. Samples were taken from the brain through supraorbital sampling from both left and right sides. From each lung, samples were taken from lower-middle and upper zones from a single entry point, angling the needle to sample different areas. Nasal cells were collected from the nasal inferior turbinate using curettes (ASL Rhino-Pro, Arlington Scientific). Two curettes were collected from each nostril, and the cells were placed immediately into ice-cold HypoThermosol (STEMCELL Technologies). Cells were transported on ice in a cold box immediately to the laboratory and were spun at 300g for 5 min for either immediate processing for scRNA-seq or storage in CryoStor 10 (see below). Nasal fluid was collected using matrix strips (Nasosorption, Hunt Developments). One strip was used per nostril. Personal protective equipment (PPE) was worn by all staff involved in the autopsies and for all work in the laboratory. Laboratory work on samples was performed in vented laminar flow hoods.
Processing and storage of samples
Biopsies from each organ were collected in three different ways for different downstream workflows: (1) for paraffin embedding for histology and IMC, put in 10% neutral buffered formalin; (2) for viable cells, put in ice-cold HypoThermosol (STEMCELL Technologies) for transport to the laboratory and then slow freeze in CryoStor 10 (STEMCELL Technologies); and (3) for snap-frozen cells, put in cryovials and then seal and immediately submerge in liquid nitrogen.
Biopsies were fixed in 10% neutral buffered formalin for 4–8 h, rinsed in water and then embedded in paraffin blocks. Samples for viable cells were rinsed and cut into pieces of approximately 20–50 mm and then put into ice-cold CryoStor for 15–30 min before transfer to a −80 °C freezer in a chilled cryogenic storage container (CoolCell, Corning).
Blood cells collected into sodium heparin tubes were separated from plasma by spinning at 400g for 10 min. Plasma was then removed and spun for an additional 10 min at 1,500g, and plasma was frozen in aliquots at −80 °C. Cells were resuspended in 10% FBS in PBS, and PBMCs were separated using Ficoll-Paque with a 27-min spin at 450g and either used immediately for scRNA-seq or pelleted and resuspended in ice-cold CryoStor 10 and then moved to a −80 °C freezer in a chilled cryogenic storage container (CoolCell, Corning). The next day, samples were moved from the −80 °C freezer to liquid nitrogen for long-term storage. Snap-frozen samples were transferred in a liquid nitrogen dewar and then moved to liquid nitrogen storage tanks for long-term storage.
Pathology and organ-specific scoring
Formalin-fixed tissues from lung, bone marrow, brain, spleen and liver were paraffin embedded (FFPE) to make blocks. FFPE blocks were sectioned at 2–4-μm thickness, mounted on glass slides and stained with H&E. A medical pathologist (S.K.) reviewed tissue slides, alongside patient histories and antemortem laboratory results per standard clinical practice, and completed an organ-specific scoring proforma that included COVID-19 features (Supplementary Table 3). Then, for a non-biased assessment, two additional pathologists (C.A. and V.H.), blinded to patient history and previous diagnoses, independently scored the lung pathology in all patients using systematic scoring criteria. After individual scoring, any discrepancies were discussed by joint review of the slides until a consensus was reached. The lung scoring was semi-quantitative for the parameters indicated in Extended Data Fig. 1a–c. Subsequently, we characterized each sample by its dominant histological characteristic; for example, fibrinopurulent inflammation/pneumonia where neutrophil infiltration with fibrin extravasation was marked alongside a mild infiltrate of lymphocytes, plasma cells and macrophages. Whole-tissue slides from lung samples in our nine patients with COVID-19 can be accessed in their entirety and visualized at various magnifications, as if they were observed under a microscope, using our virtual microscope tool: https://covid-atlas.cvr.gla.ac.uk (de-identified slides will be uploaded and publicly viewable upon publication).
After scoring, the most representative areas in each lung biopsy were manually selected, based on the scoring performed on the H&E-stained section, to create the TMAs with cores of 1 mm in diameter using the TMA Grand Master (3DHISTECH) and CaseViewer software (version 2.4.0119028). At least eight ROIs were taken from each case (four left, four right). From the newly created TMA-FFPE blocks, 4-μm-thick sections were cut and used for downstream IMC, in situ hybridization or bright-field immunohistochemistry.
Cause of death attribution
A panel consisting of the pathologist who reviewed the patients, a respiratory physician, an intensive care physician, an infectious disease physician and two trainee doctors reviewed all the patients to assign a cause of death. Causes of death were coded according to the International Classification of Diseases (ICD), using the standard coding system for death certification. The review covered the clinical notes, premortem and postmortem laboratory results and the pathology report. Each member reviewed the documents independently and reached an individual verdict. When there were discrepancies, a consensus was reached through discussion.
Multiparameter cytokine assay
Cytokine levels were measured in plasma and nasal fluid samples using the Inflammation 20-Plex Human ProcartaPlex panel (Thermo Fisher Scientific, EPX200-12185-901) according to the manufacturer’s protocol and read on a Luminex MagPix device. Data were log2 transformed and, for visualization with ComplexHeatmap in R, z-scored by cytokine.
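The transformation described above can be sketched as follows. This is a minimal illustration in Python, not the R workflow used in the study; the concentration values and the function name are made up for the example, and strictly positive concentrations are assumed so that the log2 transform is defined.

```python
import numpy as np

# Hypothetical concentrations: rows are samples, columns are cytokines.
conc = np.array([
    [12.0, 340.0, 5.0],
    [48.0, 170.0, 20.0],
    [96.0, 85.0, 10.0],
    [24.0, 680.0, 40.0],
])

def log2_zscore_by_cytokine(values):
    """log2-transform, then z-score each column (cytokine) across samples."""
    logged = np.log2(values)  # assumption: all values are > 0
    centered = logged - logged.mean(axis=0)
    return centered / logged.std(axis=0, ddof=1)

z = log2_zscore_by_cytokine(conc)
```

After this step, each cytokine column has mean 0 and unit sample standard deviation, so heatmap colors are comparable across cytokines with very different dynamic ranges.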
IMC
Sections from TMAs underwent deparaffinization, followed by antigen retrieval at 96 °C for 30 min in Tris-EDTA at pH 8.5. Non-specific binding was blocked with 3% BSA for 45 min, followed by incubation with lanthanide-conjugated primary antibodies (overnight at 4 °C), which were diluted in PBS with 0.5% BSA (Supplementary Information). Antibodies were conjugated with metals using Maxpar Antibody Labeling Kits (Standard BioTools) and were validated with positive control tissue (tonsil and spleen for immune-targeted antibodies). Slides were then washed with 0.1% Triton X-100 in PBS, followed by nuclear staining with iridium (1:400; Intercalator-Ir, Standard BioTools) for 30 min at room temperature and, finally, briefly (10 s) washed with ultrapure water and air dried. Images were acquired on a Hyperion imaging mass cytometer as per the manufacturer’s instructions (Standard BioTools). Each TMA core was imaged in a separate ROI.
IMC analysis
Pre-processing, image denoising, cell segmentation and extraction of single-cell features were performed using a combination of Python and R packages, including ImcSegmentationPipeline, IMC-Denoise51 and DeepCell13,34,52. For the single-cell analysis, the annotated data object was generated, and protein expression raw measurements were normalized at the 99th percentile to remove outliers. In Scanpy (version 1.9.1), principal component analysis (PCA), batch correction and Harmony data integration were performed to compute and plot the uniform manifold approximation and projection (UMAP) embeddings (umap-learn Python package, version 0.5.3). Next, automated cell type assignment using the Python package Astir (version 0.1.4) was applied to identify the major cell types expected to be found in the lung tissue according to the antibody panel used. For cell assignment with Astir, cells were labeled based on a broad ontogeny (metaclusters and major cell types) using the following proteins (lineage markers) expected to be most expressed in each cell type: (1) macrophage: CD163, CD206, CD14, CD16, CD68, CD11c, Iba1; (2) neutrophil: CD66b, Arginase1; (3) CD8 T cells: CD3, CD8; (4) CD4 T cells: CD3, CD4; (5) B cells: CD20; (6) endothelium: CD31; (7) fibroblast: Collagen1; (8) SMC: smooth muscle actin; (9) epithelial: PanCK; (10) RBCs: CD235ab.
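The text does not spell out the exact operation behind normalization at the 99th percentile. One common reading, sketched here as an assumption, is to clip each marker at its own 99th percentile and rescale to the range 0 to 1. The function name and the simulated intensities are hypothetical.

```python
import numpy as np

def normalize_at_percentile(expr, pct=99.0):
    """Clip each marker (column) at its pct-th percentile, then scale to [0, 1]."""
    caps = np.percentile(expr, pct, axis=0)
    caps = np.where(caps > 0, caps, 1.0)  # guard against empty channels
    return np.clip(expr, 0.0, caps) / caps

rng = np.random.default_rng(0)
raw = rng.gamma(shape=2.0, scale=1.0, size=(1000, 3))  # simulated cells x markers
raw[0, 0] = 500.0  # artificial outlier in the first marker
norm = normalize_at_percentile(raw)
```

With this scheme, the artificial outlier is capped at 1.0 rather than compressing the rest of the distribution toward zero.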
After cell assignment, cells labeled as ‘other’ or ‘unknown’ were filtered out from downstream analysis, and the annotated data object was subset into the major cell types identified—that is, macrophages, neutrophils, lymphoid, vascular, epithelial and stromal—and Phenograph Louvain clustering (with 200 nearest neighbors) was performed for each cell population separately using a small set of specific lineage marker and functional proteins. The finer cell type annotation was used to evaluate the frequency and absolute counts of cell types across clinical groups, histopathological lesions and HIV status. Differential abundance analysis was also performed using the scanpro and scCODA Python packages53 and the miloR R package (version 1.4.0)54. Spatial statistics analysis based on the coordinates of the cells in the ROIs was performed using the Python package Squidpy (version 1.2.2)55. These coordinates were used to plot spatial graphs and to calculate and plot neighborhood enrichment scores13.
Integration of Malawian IMC data with other available IMC COVID-19 lung data
IMC COVID-19 data from postmortem lung samples from published Brazilian13 and US34 fatal cohorts were integrated with the Malawian IMC dataset. First, datasets were concatenated in Scanpy taking the ‘inner’ (intersection) of all common protein markers in the panels across the three IMC datasets. Then, with scvi-tools56, we applied different integration methods, including Harmony and the variational autoencoder (VAE)-based methods scVI and scANVI. Analysis of the UMAP embedding of the integrated versus non-integrated data showed that Harmony and scANVI performed better, and, in downstream analysis, we used Harmony-integrated output. Next, cell identities were standardized (label harmonization), that is, labels were checked for consistency across the datasets being integrated. Finally, cell frequencies in the postmortem lung across all three cohorts were plotted, and differential abundance analysis was performed using scanpro (https://github.com/loosolab/scanpro) and scCODA Python packages57 and the miloR R package (version 1.4.0)58.
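The ‘inner’ concatenation step can be illustrated with plain pandas tables standing in for the annotated data objects. This is a sketch only: the marker panels and values below are invented, and the study itself performed this step in Scanpy.

```python
import pandas as pd

# Hypothetical per-cell marker tables; panels differ between cohorts.
malawi = pd.DataFrame({"CD3": [0.1, 0.9], "CD8": [0.2, 0.8], "CD206": [0.7, 0.1]})
brazil = pd.DataFrame({"CD3": [0.4], "CD8": [0.3], "CD163": [0.6]})
us = pd.DataFrame({"CD8": [0.5], "CD3": [0.2], "CD31": [0.9]})

# join="inner" keeps only the markers present in every panel.
combined = pd.concat(
    [malawi, brazil, us],
    join="inner",
    keys=["Malawi", "Brazil", "US"],
    names=["cohort", "cell"],
)
```

Only CD3 and CD8 survive in this toy example, and the `cohort` index level records where each cell came from, which is the covariate a batch-correction method would later use.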
Dissociation of lung cells from frozen samples and single-nuclei preparation
Lung samples were dissociated both from fresh samples and from slow-frozen samples that had been stored in liquid nitrogen. Slow-frozen cells were defrosted in a water bath at 37 °C, and then pieces of tissue were transferred to RPMI 1640 medium with 25 mM HEPES and L-glutamine (Thermo Fisher Scientific) and 40% heat-inactivated FBS (Thermo Fisher Scientific). Fresh or defrosted frozen cells were then dissociated, adapting a previously published protocol for lung dissociation57. Samples were dissociated in a buffer containing 400 mg ml−1 Liberase DL (Sigma-Aldrich), 32 U ml−1 DNase I (Roche) and 1.5% BSA in PBS (without calcium and magnesium). The tissue was put in buffer (four times weight:volume) in a GentleMACS C-tube (Miltenyi Biotec, 130-096-334), minced using scissors and then run on a GentleMACS dissociator (Miltenyi Biotec, 130-093-235) on the manufacturer’s program ‘C-lung 01_02’. Dissociation was achieved by warming the tissue on an orbital shaker in a chamber at 37 °C for 30 min and running ‘C-lung 01_02’ twice more: once at 15 min and once at 30 min. The enzyme was neutralized by diluting with 10 ml of ice-cold 20% FBS, containing 32 U ml−1 DNase. The sample was then filtered through a 100-µm strainer (Corning, 352360), and samples were subsequently kept on ice with all centrifuge and antibody incubation steps at 4 °C. Cells were pelleted by spinning at 300g for 5 min at 4 °C. RBCs were removed by incubating with ACK lysing buffer (Thermo Fisher Scientific, A1049201) for 5 min at room temperature. For frozen cells, debris and dead cells were removed using a debris removal solution (Miltenyi Biotec, 130-109-398) and a dead cell removal kit (Miltenyi Biotec, 130-090-101), respectively, according to the manufacturer’s protocol.
Single nuclei were isolated from snap-frozen lung tissue samples using a previously published method7. Tissue was kept on dry ice/liquid nitrogen until processing was started. Tissue was placed into a GentleMACS C-tube containing 2 ml of freshly prepared nuclei extraction buffer that contained RNAse inhibitors: 0.2 U µl−1 RNaseIN Plus RNAse inhibitor (Promega) and 0.1 U µl−1 SUPERasin RNAse inhibitor (Thermo Fisher Scientific). Dissociation was achieved by running the C-tube on the GentleMACS dissociator on program ‘m_spleen_01’ for 1 min. The sample was filtered using a 40-µm strainer and spun at 500g for 10 min at 4 °C. Pellet was then resuspended in 500 µl of 1× ST without RNAse inhibitor and filtered again using a 35-µm strainer. A 10-µl volume was loaded on a hemocytometer for counting.
Single-cell and single-nuclei partitioning and library preparation
10x 3′ v3 chemistry was used for all samples. For fresh lung samples, we loaded 10,000 cells into one channel of a 10x chip (1000120). For fresh nasal and blood samples, we labeled the nasal and blood samples with different hashtags, pooled them at a 1:1 ratio and loaded 10,000–20,000 cells. For frozen nuclei and single-cell samples, we pooled samples from 3–6 different patients, aiming for equal ratios, and loaded 20,000–40,000 cells or nuclei. Libraries were prepared according to the manufacturer’s protocol and sequenced with an Illumina NextSeq 2000. To make these data available for analysis by others, reads were submitted to ArrayExpress (E-MTAB-13544).
Single-cell data processing
For all tissue compartments, the data were analyzed through the following steps. (1) Processing of the raw reads. 5′ scRNA-seq data along with the 3′ snRNA-seq runs were demultiplexed using Cell Ranger ‘mkfastq’. Reads were mapped to a concatenated human GRCh38, SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, GenBank MN908947.3) and HIV (human immunodeficiency virus 1, GenBank AF033819.3) reference genome to generate count matrices using Cell Ranger ‘cellranger count’ (version 7.0). (2) Ambient RNA removal. To reduce potential noise driven from empty droplets or ambient RNA captured in our samples, we used the tool SoupX (version 1.6.2)58 and used corrected expression matrices in subsequent analyses. (3) Quality control and filtering. Data were analyzed using the Seurat package (version 4.3)59 in R (version 4.2) with mitochondrial gene expression thresholding applied on individual samples. In addition, cells that were expressing more than 150 genes were retained to maximize discovery of cell types. (4) Normalization and variance stabilization. Samples were merged and normalized using the SCTransform() function, selecting the top 3,000 variable genes to drive the downstream clustering. Additionally, effects of mitochondrial gene expression, ribosomal gene expression and cell cycle were regressed out. (5) Integration. PCA was run on all merged data objects. The embeddings were then fed into the standard Harmony (version 0.1.1)56 integration pipeline. (6) Clustering and dimensionality reduction. An appropriate number of principal components (PCs) were selected to generate the UMAP. PCs were used to determine the k-nearest neighbors for each cell for the shared nearest neighbor (SNN) graph construction, followed by clustering at resolution 0.3. (7) Cell type annotation. 
Cluster markers for the lung and nasal datasets were identified by running FindAllMarkers() using MAST, followed by Bonferroni multiple test correction. We specified that genes must be expressed in at least 25% of cells (min.pct = 0.25) with a log fold change of 0.25. Cell types were manually annotated, leveraging canonical cell type markers reported from existing literature and curated datasets. Peripheral blood clusters were annotated using the consensus label transfer algorithm SingleR (version 2.0.0)60 using the Azimuth Reference PBMC atlas (https://zenodo.org/records/4546839). Cells with low mapping scores were reanalyzed and manually annotated as above. (8) Gene Ontology (GO) and pathway analysis. DE genes across conditions were calculated using the FindMarkers() function using MAST. Genes were defined as DE with a significance threshold of less than 0.05 and a log fold change threshold of 0.25, followed by Bonferroni correction. Gene set enrichment analysis (GSEA) was done using the fgsea package (1.3.0)61 using 50 canonical hallmark gene sets as described in the Molecular Signatures Database (MSigDB) (version 7.5.1)62. (9) Module scoring. Gene module scores were calculated using the AddModuleScore() function with gene sets taken from MSigDB and AmiGO 2 (ref. 63) that related to IFN responses (IFN-λ (GO:0034342), IFN-α (GO:0035455), IFN-β (GO:0035456) and IFN-γ (GO:0034341)), IL6/JAK/STAT signaling (HALLMARK_IL6_JAK_STAT3_SIGNALING) and TNF signaling (HALLMARK_TNFA_SIGNALING_VIA_NFKB). Log fold changes in module scores were calculated using the log2 + 1 of the differential means across a cell type. (10) Cell–cell communication analysis. Inference of cellular communications was computed using the multinichenetR (version 1.0.3) package64 with a log fold change cutoff of 0.5 and expression in at least 10% of cells across conditions.
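The DE thresholds in step (8) can be made concrete with a short sketch. It assumes that the 0.05 cutoff is applied to Bonferroni-adjusted P values, that the correction uses the number of genes tested and that the fold change threshold applies to the absolute log fold change; none of these details is stated explicitly above. The function name and the input values are illustrative.

```python
import numpy as np

def call_de_genes(p_values, log_fc, alpha=0.05, min_abs_lfc=0.25):
    """Flag genes passing a Bonferroni-adjusted P cutoff and a |logFC| cutoff."""
    p = np.asarray(p_values, dtype=float)
    lfc = np.asarray(log_fc, dtype=float)
    # Bonferroni: multiply by the number of tests, capped at 1 (assumed: genes tested).
    p_adj = np.minimum(p * p.size, 1.0)
    return (p_adj < alpha) & (np.abs(lfc) >= min_abs_lfc)

# Four toy genes: raw P values and log fold changes.
de_mask = call_de_genes([1e-6, 0.01, 0.04, 1e-4], [1.0, 0.5, 0.3, 0.1])
```

In this toy input, the third gene fails after correction (0.04 × 4 = 0.16), and the fourth fails on effect size despite a small P value.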
Hashtag demultiplexing
Hashtag reads were quantified using CITE-seq-Count (version 1.4.4)65 and demultiplexed using cellHashR (version 1.0.1)66. The following methods were tested: BFFcluster, BFFraw (10), GMM-Demux67, Seurat HTODemux59 and DropletUtils hashedDrops68, with HTODemux resulting in the highest number of singlets that were used for analysis.
Single-nucleotide polymorphism splitting of multiplexed runs
Demultiplexing of runs was carried out using the single-nucleotide polymorphism (SNP) clustering algorithm Souporcell69 to identify distinct genotypes and assign cells to different individuals. For each run, we set the number of clusters (k) to the expected number of genotypes in the run (k = 2–6), and cell barcodes were assigned to each cluster. Cluster barcodes were then used to subset the input BAM file across human leukocyte antigen (HLA) loci of the multiplexed runs, under the assumption that these would be distinct regions of the genome for each individual. Using Integrative Genomics Viewer (IGV), we visualized SNP distributions at a set allele frequency of 0.2 and compared the subset BAM files to BAM files from individual runs. Iteratively, Souporcell clusters were assigned to samples through the following rationale: (1) matching SNP distributions to independent sequencing runs, (2) through mapping to sex chromosomes or (3) through the process of elimination where an independent sequencing run genotype was not available. In scenarios where Souporcell failed to identify the expected number of genomes, we assigned cluster barcodes to matching genotypes from independent sample runs regardless of expected k. After successful demultiplexing, we identified which cells derived from which patient and were able to proceed with downstream single-cell analyses as outlined above (see ‘Single-cell data processing’ subsection).
HLCA integration
The HLCA10 was filtered down, retaining cells that were taken from the lung and lung parenchyma. These included studies originating from the Northern Hemisphere, with lung cell data in COVID-19, pneumonia and healthy controls. Cell type annotations harmonized with our analyses (AT1, AT2, EC arterial, EC capillary, EC venous, Fibroblasts, Innate lymphoid cell, NK, Macrophages, Monocytes, T cell lineage) were selected. To have sufficient power for downstream analyses with our cohort, we randomly subsampled each cell type within each disease condition to create a normalized atlas of 100,000 cells to integrate with our lung atlas. Processing and integration steps were followed as described previously for the Malawian cohort using 38 PCs and a clustering resolution of 0.2. Manual cluster annotation was performed by running FindAllMarkers(), leveraging canonical cell type markers.
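Random subsampling of each cell type within each disease condition can be sketched as below. The per-group cap, the column names and the helper function are assumptions made for illustration; the text gives only the overall target of 100,000 cells, not the per-group sizes.

```python
import pandas as pd

def subsample_per_group(obs, group_cols, n_per_group, seed=0):
    """Draw up to n_per_group cells at random from every group."""
    sampled = [
        grp.sample(n=min(n_per_group, len(grp)), random_state=seed)
        for _, grp in obs.groupby(group_cols, sort=False)
    ]
    return pd.concat(sampled)

# Toy cell metadata: 50 AT1 and 10 AT2 cells in COVID-19, 5 AT1 cells in healthy lung.
obs = pd.DataFrame({
    "cell_type": ["AT1"] * 50 + ["AT2"] * 10 + ["AT1"] * 5,
    "condition": ["COVID-19"] * 60 + ["healthy"] * 5,
})
subset = subsample_per_group(obs, ["cell_type", "condition"], n_per_group=8)
```

Capping each group prevents abundant populations from dominating the reference, while small groups are kept whole.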
Pseudobulking single-cell nasal and blood
To make our nasal and blood scRNA-seq data comparable with the Luminex cytokine data, we assigned all cells to a unified identifier (‘pseudo_cluster’) to pool cells belonging to different cell type clusters together. Then, the average expression of the different cytokines on the Luminex panel was visualized using ComplexHeatmap70 as a z-score of the counts (Supplementary Fig. 5). For the statistical tests of genes associated with the IFN-γ pathway, we used a Welch two-sample t-test.
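For readers unfamiliar with the Welch test, the statistic and its approximate degrees of freedom (Welch–Satterthwaite) can be computed as in the sketch below. The two input groups are arbitrary numbers, not study data, and the P value (obtained from a t distribution with the returned degrees of freedom) is left out to keep the example dependency-free.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```

Unlike Student's t-test, this form does not assume equal variances in the two groups, which is why the degrees of freedom are generally non-integer.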
Exploring viral reads in samples
To identify SARS-CoV-2-infected cells in our lung dataset, we quantified the number of unique molecular identifiers (UMIs) that were detected after mapping with Cell Ranger across our single-cell datasets. A given cell was deemed to be infected if it expressed at least two UMIs of genes mapping to the SARS-CoV-2 genome.
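The infection call reduces to a simple threshold on per-cell viral UMI counts. The sketch below assumes that UMIs are summed across all SARS-CoV-2 genes for each cell before thresholding; the count matrix and the function name are hypothetical.

```python
import numpy as np

def flag_infected(viral_umi_counts, min_umis=2):
    """Mark cells whose summed viral UMI count reaches min_umis."""
    totals = np.asarray(viral_umi_counts).sum(axis=1)  # sum over viral genes
    return totals >= min_umis

# Rows are cells, columns are viral genes (toy counts).
viral = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 5, 2],
])
infected = flag_infected(viral)
```

Requiring two UMIs rather than one makes the call less sensitive to single ambient or misassigned viral molecules.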
Integration of Malawian COVID-19 lung IMC data with Malawian COVID-19 lung snRNA-seq data
Lung IMC and snRNA-seq data, exclusively from Malawian patients with COVID-19, were integrated with the recently developed integration tool MaxFuse, which integrates data across weakly linked modalities, such as protein and RNA expression, through cross-modality matching and iterative smoothed embedding43. Highly variable features (s.d. > 0.3 for the RNA expression and s.d. > 0.1 for the protein expression) shared between both datasets were retrieved based on a protein-to-gene correspondence list, produced by the MaxFuse authors and edited to include specific protein markers in our IMC panel (Supplementary Information). Cell counts used for each modality included IMC (53,762 cells) and snRNA-seq (36,616 cells). Previously normalized and batch-corrected IMC protein expression and snRNA-seq RNA expression were used as MaxFuse input. All values were capped between 5% and 95% quantiles for visualization purposes. With the resulting integration, expression levels of IFN-γ response-related genes (IFNGR1, IFNGR2, HLA-DRA, HLA-DRB1, C1QA, APOE, IFI30 and CD74) and IFN-γ signature score were determined and plotted in the lung cells derived from the IMC data.
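Capping values between the 5% and 95% quantiles for visualization can be sketched as follows. Whether the quantiles were computed per feature or over the whole matrix is not stated; the per-feature version shown here is an assumption, and the data are simulated.

```python
import numpy as np

def cap_to_quantiles(values, lower=0.05, upper=0.95):
    """Clip each feature (column) to its own [lower, upper] quantile range."""
    lo, hi = np.quantile(values, [lower, upper], axis=0)
    return np.clip(values, lo, hi)

rng = np.random.default_rng(1)
x = rng.normal(size=(500, 4))  # simulated cells x features
capped = cap_to_quantiles(x)
```

Capping only changes the extreme 10% of values per feature, so color scales in plots are not stretched by a handful of outlying cells.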
In situ hybridization co-staining for CD3 and IFNG and CD206 (MRC1) and IFNGR2
In situ staining was performed on TMAs with 138 ROIs using the same TMAs and patients used for IMC, covering multiple lung regions from left and right lungs in nine patients with COVID-19, three patients with LRTD and two non-LRTD patients. Consecutive slides were used for two dual staining panels: one for IFNG and CD3E and the other for IFNGR2 and MRC1 (CD206). Slides were stained according to the manufacturer’s instructions (product codes: 322452 and 322500, ACD, Bio Techne) using the probes Hs IFNG-C1, Hs IFNGR2-C1, Hs-MRC1-C2 and Hs CD3E-C2 (product codes: 310501, 553971-C2, 1269501-C1 and 583921-C2, ACD, Bio Techne) and positive and negative control probes PPIB/POLR2A and DapB (product codes: 321641 and 320751, ACD Bio Techne). Slides were digitized and scanned with standard settings at ×80 magnification using the Motic EasyScan Infinity 60 digital slide scanner (I. Miller Microscopes). For quantification of positive cells, we used HALO software (version 3.6.4134.362) with the AI module (3.6.4134) and the FISH module (version 3.2.3) for cell detection after deconvolution.
Immunohistochemistry
Immunohistochemistry was performed in an autostainer using the Envision kit and DAB chromogen (product codes: K4003 and K4001, Agilent Technologies) with anti-CD206/MRC1 (E2L9N) or anti-CD3 antibodies (product codes: 91992, Cell Signaling Technologies, and A0452, Agilent Technologies). Slides were digitized and scanned at ×20 magnification using an Aperio VERSA 8 slide scanner (Leica Biosystems) and Aperio VERSA 1.0.4.125 software (Leica Biosystems).
Statistics and reproducibility
No statistical method was used to predetermine sample size. We excluded nine single-cell sequencing runs that had few to no cells and that did not pass standard quality control metrics. Within our lung atlas, we excluded a population of cells (n = 1,348) that we deemed to be of low quality; these derived almost exclusively from one multiplexed single-nuclei sequencing run that exhibited extremely low UMI counts. Two non-COVID-19 patients with LRTD who had evidence of active TB lung disease were excluded from IMC runs because of theoretical safety concerns, as IMC can generate aerosol. Pathologists were blinded to patient groups for systematic scoring of the lung, and investigators conducting the in situ validation experiments undertook staining and automated scoring on the TMAs blinded to which samples were from which case or group. For other experiments and analyses, investigators were not blinded to case groups. Samples were sequenced in multiplexed runs that included patients from different groups, and IMC was run on TMAs as a single run, in both instances to reduce batch effect.
Ethics and inclusion statement
Malawian researchers with clinical, laboratory, analysis and medical ethics expertise were involved throughout the research process from conception to manuscript preparation. The main research questions were determined by Malawian clinical and laboratory researchers alongside international researchers who were living and working in Malawi. Before conducting the study, we undertook a full sensitization process for the study with all staff on the recruiting wards in our hospital to discuss the study and consider the best way of sensitively conducting recruitment and informed consent. This work was led by two social scientists (L.S. and D.N.), one specialized in bioethics (D.N.). Details of our approach and considerations for recruitment are published as a chapter in a casebook separately71. Extensive research and laboratory infrastructure already exists in Malawi through a medical university (Kamuzu University of Health Sciences) and several internationally funded research programs. Building on this, as part of this project, local research capacity was enhanced by establishing a single-cell platform in Malawi and training local scientists and by additional training of local scientists in tissue processing. As a result, all tissue processing and cell partitioning and library preparation for single-cell and single-nuclei sequencing were done in Malawi. The research protocol was approved by a Malawian research ethics review committee (National Health Service Research Ethics Committee) and, in the United Kingdom, by the University of Glasgow Medicine Veterinary and Life Sciences Research Ethics Committee. Safety of staff was ensured by conducting renovations to create a dedicated autopsy room for COVID-19 autopsies and by providing PPE and cleaning solutions and training all staff that were patient facing or involved in sample collection and handling in their appropriate use. Laboratory work was conducted in a laminar flow hood using PPE.
Local and regional research, including autopsy studies and investigative work, was considered throughout the study and is appropriately cited.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.