The combined impact of common and rare exonic variants in COVID-19 host genetics is currently insufficiently understood. Here, common and rare variants from whole-exome sequencing data of about 4,000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features, depending on the absence or the presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics to COVID-19 severity, as demonstrated through testing in several independent cohorts. Selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Notably, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, while also being able to guide bedside disease management.
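The conversion of variants into gene-level Boolean features can be illustrated with a minimal sketch. The function name, the input layout, and the gene and sample labels are hypothetical and chosen only for illustration; this is not the study's pipeline, which also builds separate feature sets (for example by variant frequency class) that are omitted here.

```python
def boolean_gene_features(sample_variants, genes):
    """Return {sample: {gene: bool}}.

    A feature is True when the sample carries at least one variant
    in the gene, and False otherwise.
    """
    features = {}
    for sample, variants in sample_variants.items():
        # Collect the set of genes in which this sample has any variant.
        carried = {gene for gene, _variant in variants}
        features[sample] = {gene: gene in carried for gene in genes}
    return features


# Hypothetical toy input: each sample maps to (gene, variant label) pairs.
sample_variants = {
    "S1": [("GENE_A", "v1"), ("GENE_B", "v2")],
    "S2": [("GENE_B", "v3")],
    "S3": [],
}
genes = ["GENE_A", "GENE_B"]
features = boolean_gene_features(sample_variants, genes)
```

A matrix of this kind is the sort of input an L1-penalised (LASSO) logistic regression can work with: the penalty drives the coefficients of uninformative features to zero, leaving a sparse set of selected genes.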
Publications

Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity.

Multi-ancestry fine mapping implicates OAS1 splicing in risk of severe COVID-19.
The OAS1/2/3 cluster has been identified as a risk locus for severe COVID-19 among individuals of European ancestry, with a protective haplotype of approximately 75 kilobases (kb) derived from Neanderthals in the chromosomal region 12q24.13. This haplotype contains a splice variant of OAS1, which occurs in people of African ancestry independently of gene flow from Neanderthals. Using trans-ancestry fine-mapping approaches in 20,779 hospitalized cases, we demonstrate that this splice variant is likely to be the SNP responsible for the association at this locus, thus strongly implicating OAS1 as an effector gene influencing COVID-19 severity.

Whole-genome sequencing reveals host factors underlying critical COVID-19.
Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care [1] or hospitalization [2–4] after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1), in critical disease. Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.

A first update on mapping the human genetic architecture of COVID-19.
The COVID-19 pandemic continues to pose a major public health threat, especially in countries with low vaccination rates. To better understand the biological underpinnings of SARS-CoV-2 infection and COVID-19 severity, we formed the COVID-19 Host Genetics Initiative [1]. Here we present a genome-wide association study meta-analysis of up to 125,584 cases and over 2.5 million control individuals across 60 studies from 25 countries, adding 11 genome-wide significant loci compared with those previously identified [2]. Genes at new loci, including SFTPD, MUC5B and ACE2, reveal compelling insights regarding disease susceptibility and severity.

Proteomic characterization of acute kidney injury in patients hospitalized with SARS-CoV2 infection.
This proteomics-based study identified biomarkers of acute and long-term kidney dysfunction in hospitalized COVID-19 patients. In a discovery cohort (N = 437), 443 proteins were associated with stage 2/3 AKI, and 62 of these were validated in an external cohort (N = 261). Elevated markers of tubular injury (e.g., NGAL) and myocardial stress were linked to AKI. Among the validated proteins, 25 were also associated with decreased post-discharge eGFR, including desmocollin-2, trefoil factor 3, and cystatin-C, suggesting persistent tubular dysfunction. Overall, COVID-associated AKI appears to be multifactorial, involving both renal and cardiac injury pathways.

The dynamic changes and sex differences of 147 immune-related proteins during acute COVID-19 in 580 individuals.
This study investigated temporal changes in immune-related proteins during acute COVID-19 and their association with disease severity and sex-based outcome differences. Using the SOMAscan aptamer-based proteomic platform, the researchers measured levels of 147 immune-related proteins in two large hospital-based cohorts from Canada and the U.S., comprising 580 individuals (mean age 64.3 years; 47% male). Generalized additive models with cubic splines were applied over the first 14 days from symptom onset, adjusting for age and sex, to identify proteins significantly associated with severe disease—defined as requiring invasive or non-invasive respiratory support. A total of 69 proteins showed significant differences between severe cases and controls (p < 3.4 × 10⁻⁴), although high intercorrelation among 108 proteins complicated causal inference. Notably, six proteins demonstrated sex-based differences in temporal expression, with three (CCL26, IL1RL2, IL3RA) also associated with severe disease, highlighting potential mediators of sex disparities in outcomes. These findings provide mechanistic insights into immune dysregulation in severe COVID-19 and underscore specific protein targets for further investigation.
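The reported significance threshold (p < 3.4 × 10⁻⁴) is numerically consistent with a Bonferroni correction over the 147 proteins tested. The abstract does not name the correction method, so the sketch below is an inference: the 0.05 family-wise error rate and the function name are assumptions.

```python
N_PROTEINS = 147  # number of immune-related proteins reported in the study
ALPHA = 0.05      # assumed family-wise error rate (not stated in the abstract)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_PROTEINS)
print(f"{threshold:.1e}")  # 3.4e-04
```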

Dehydration is associated with production of organic osmolytes and predicts physical long-term symptoms after COVID-19: a multicenter cohort study.
This study investigated the physiological response to dehydration, specifically the aestivation response, and its association with acute and long-term outcomes in COVID-19 patients. Using data from 165 ICU-admitted patients in the Pronmed cohort, with validation in 1,052 patients from the Biobanque Québécoise de la COVID‑19 (BQC19), dehydration was assessed via estimated osmolality (eOSM = 2Na + 2K + glucose + urea). Higher eOSM correlated with increased reliance on organic osmolytes and reduced contributions from sodium and potassium, consistent with a shift toward aestivation. Dehydration was significantly associated with adverse acute outcomes (death, invasive ventilation, acute kidney injury) and with higher scores of physical long-COVID symptoms, more so than mental symptoms, after adjustment for confounders. Metabolomic profiling further supported this by showing an enrichment of amino acids among metabolites displaying an aestivation-like pattern. These findings suggest that dehydration during acute COVID-19 triggers a metabolic stress response involving protein catabolism, which may contribute to long-term physical sequelae.
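The eOSM formula quoted above can be written out as a short sketch. The abstract does not state units, so the code assumes all four analytes are in mmol/L; the function names and the example values are hypothetical and are not taken from either cohort.

```python
def estimated_osmolality(sodium, potassium, glucose, urea):
    """eOSM = 2Na + 2K + glucose + urea (inputs assumed to be in mmol/L)."""
    return 2 * sodium + 2 * potassium + glucose + urea


def electrolyte_share(sodium, potassium, glucose, urea):
    """Fraction of eOSM contributed by the sodium and potassium terms."""
    total = estimated_osmolality(sodium, potassium, glucose, urea)
    return (2 * sodium + 2 * potassium) / total


# Hypothetical example values in mmol/L, for illustration only.
eosm = estimated_osmolality(sodium=140, potassium=4, glucose=5, urea=5)
share = electrolyte_share(sodium=140, potassium=4, glucose=5, urea=5)
```

With these inputs eOSM is 298, and the electrolyte share is the kind of quantity that would fall as glucose and urea take up a larger part of the total.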

External Validation of the COVID-NoLab and COVID-SimpleLab Prognostic Tools.
Our objective was to externally validate 2 simple risk scores for mortality among a mostly inpatient population with COVID-19 in Canada (588 patients for COVID-NoLab and 479 patients for COVID-SimpleLab). The mortality rates in the low-, moderate-, and high-risk groups for COVID-NoLab were 1.1%, 9.6%, and 21.2%, respectively. The mortality rates for COVID-SimpleLab were 0.0%, 9.8%, and 20.0%, respectively. These values were similar to those in the original derivation cohort. The 2 simple risk scores, now successfully externally validated, offer clinicians a reliable way to quickly identify low-risk inpatients who could potentially be managed as outpatients in the event of a bed shortage. Both are available online (https://ebell-projects.shinyapps.io/covid_nolab/ and https://ebell-projects.shinyapps.io/COVID-SimpleLab/).

Exome-wide association study to identify rare variants influencing COVID-19 outcomes: Results from the Host Genetics Initiative.
Host genetics is a key determinant of COVID-19 outcomes. Previously, the COVID-19 Host Genetics Initiative genome-wide association study used common variants to identify multiple loci associated with COVID-19 outcomes. However, variants with the largest impact on COVID-19 outcomes are expected to be rare in the population. Hence, studying rare variants may provide additional insights into disease susceptibility and pathogenesis, thereby informing therapeutics development. Here, we combined whole-exome and whole-genome sequencing from 21 cohorts across 12 countries and performed rare variant exome-wide burden analyses for COVID-19 outcomes. In an analysis of 5,085 severe disease cases and 571,737 controls, we observed that carrying a rare deleterious variant in the SARS-CoV-2 sensor toll-like receptor TLR7 (on chromosome X) was associated with a 5.3-fold increase in severe disease (95% CI: 2.75–10.05, p = 5.41 × 10⁻⁷). This association was consistent across sexes. These results further support TLR7 as a genetic determinant of severe disease and suggest that larger studies on rare variants influencing COVID-19 outcomes could provide additional insights.
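As a rough consistency check on the TLR7 result, the reported interval can be converted back into a standard error and an approximate p-value. This sketch treats the 5.3-fold increase as an odds ratio with a symmetric Wald interval on the log scale; both are assumptions, and the study's own test statistic may have been computed differently. The function name is hypothetical.

```python
import math


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate (SE, z, two-sided p) from an odds ratio and its 95% CI.

    Assumes the interval is symmetric on the log-odds scale (Wald interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability, computed with the error function.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = wald_p_from_ci(5.3, 2.75, 10.05)
print(f"SE={se:.3f}, z={z:.2f}, p={p:.2e}")
```

The result lands in the same order of magnitude as the reported p-value; an exact match is not expected, because the published estimate and interval are rounded.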

Association between Circulating Amino Acids and COVID-19 Severity.
The severity of the symptoms associated with COVID-19 is highly variable, and has been associated with circulating amino acids as a group of analytes in metabolomic studies. However, for each individual amino acid, there are discordant results among studies. The aims of the present study were: (i) to investigate the association between COVID-19 symptom severity and circulating amino-acid concentrations; and (ii) to assess the ability of circulating amino-acid levels to predict adverse outcomes (intensive-care-unit admission or hospital death). We studied a sample of 736 participants from the Biobanque Québécoise COVID-19. All participants tested positive for COVID-19, and the severity of symptoms was determined using the World Health Organization criteria. Circulating amino acids were measured by HPLC-MS/MS. We used logistic models to assess the association between circulating amino-acid concentrations and the odds of presenting mild vs. severe or mild vs. moderate symptoms, as well as their accuracy in predicting adverse outcomes. Patients with severe COVID-19 symptoms were older on average, and they had a higher prevalence of obesity and type 2 diabetes. Out of 20 amino acids tested, 16 were significantly associated with disease severity, with phenylalanine (positively) and cysteine (inversely) showing the strongest associations. These associations remained significant after adjustment for age, sex and body mass index. Phenylalanine had a fair ability to predict the occurrence of adverse outcomes, similar to traditionally measured laboratory variables. A multivariate model including both circulating amino acids and clinical variables had a 90% accuracy at predicting adverse outcomes in this sample. In conclusion, patients presenting severe COVID-19 symptoms have an altered amino-acid profile, compared to those with mild or moderate symptoms.