Risk factors and prediction of hypoglycaemia using the Hypo-RESOLVE cohort: a secondary analysis of pooled data from insulin clinical trials

Data and cohort

Trial data from 26 clinical trials involving 12,247 people living with type 1 diabetes and 65 trials involving 34,007 people living with type 2 diabetes were provided by industry partners. All trials involved people with diabetes who were taking glucose-lowering medication with hypoglycaemia risk, mostly insulin. The raw trial data were standardised, harmonised and pooled in a single Hypo-RESOLVE database by the Swiss Institute of Bioinformatics, using the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model Implementation Guide (SDTMIG 3.2) format [8] (see electronic supplementary material [ESM] Methods for details). In addition, the bespoke domain XH was created for hypoglycaemia event data, obtained from self-recorded episodes in participants’ diaries and from serious adverse event declarations in the clinical trials. The trials did not use continuous glucose monitoring (CGM). Some episodes were asymptomatic and were identified from self-monitored blood glucose values that met the agreed thresholds for hypoglycaemia; others were symptomatic. Level 3 (see below) episodes did not require a blood glucose measurement as this was not part of the definition, although one was often recorded. Some level 3 episodes were also derived from serious adverse event reporting. Each hypoglycaemia event was characterised by an event date, a blood glucose measurement (if available) and self-treatment status.

Despite the availability of raw data from each clinical trial, many trials had idiosyncratic data structures or collection procedures that precluded data harmonisation into the pooled database. These issues resulted in the exclusion of certain individuals and covariates due to the high levels of missingness introduced when integrating the data from these trials. We therefore first excluded individuals who met the following criteria: did not pass trial screening; lacked observation start or end dates; had missing age, sex or diabetes duration information; or had more than 20% missingness for hypoglycaemia event data. A hypoglycaemia event was considered missing if the event lacked a date of occurrence or it lacked a glucose measurement while simultaneously being either denoted as a self-treated event or the self-treatment status was missing.
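
The event-level missingness rule and the 20% exclusion threshold described above can be sketched as follows. This is a minimal illustration rather than the study code: the function names, the argument layout and the treatment of participants with no recorded events (retained here) are assumptions.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

# One event: (event date, glucose in mmol/l, self-treated flag); None means "not recorded".
Event = Tuple[Optional[date], Optional[float], Optional[bool]]


def event_is_missing(event_date: Optional[date],
                     glucose_mmol_l: Optional[float],
                     self_treated: Optional[bool]) -> bool:
    """Apply the rule in the text: an event is missing if it has no date, or if it
    has no glucose value while being self-treated or of unknown self-treatment status."""
    if event_date is None:
        return True
    if glucose_mmol_l is None and (self_treated is True or self_treated is None):
        return True
    return False


def exceeds_event_missingness(events: Sequence[Event], threshold: float = 0.20) -> bool:
    """True if more than `threshold` of a participant's events are missing.
    Assumption: a participant with no events at all is not excluded."""
    if not events:
        return False
    n_missing = sum(event_is_missing(*e) for e in events)
    return n_missing / len(events) > threshold
```

Note that an event without a glucose value that was explicitly not self-treated is not counted as missing, since such an event can still be classified as level 3.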

Definitions of hypoglycaemia

Blood glucose measurements and whether assistance was required to handle each hypoglycaemia event were used to define hypoglycaemia in our analyses, irrespective of each trial’s own definition in the pooled dataset.

The International Hypoglycaemia Study Group (IHSG) [9] proposed three levels of hypoglycaemia that have been accepted recently by the European Medicines Agency (EMA) [10] and, as draft guidance, by the US Food and Drug Administration (FDA) [11]. Currently, these are as follows:

Level 1 hypoglycaemia alert events, defined as any event with a recorded blood glucose level of less than 3.9 mmol/l but not less than 3.0 mmol/l

Level 2 hypoglycaemia events, defined as any hypoglycaemia event with a recorded blood glucose level below 3.0 mmol/l

Level 3 hypoglycaemia events (severe hypoglycaemia), defined as any hypoglycaemia event in which the individual was unable to self-treat due to severe cognitive impairment, irrespective of glucose measurement

Within the pooled clinical trial dataset, level 3 was any event in the XH table that was both symptomatic and not self-treated.

Using these levels, we considered three separate classifications of hypoglycaemia event in our analyses:

Level 1 or worse: any hypoglycaemia event meeting the criteria of either level 1, level 2 or level 3

Level 2 or worse: any hypoglycaemia event meeting the criteria of either level 2 or level 3

Level 3
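
As an illustration of how the level definitions and the three "or worse" classifications fit together, the following sketch assigns a level to a single event. The function and argument names are hypothetical; the thresholds and the level 3 rule (symptomatic and not self-treated) are taken from the text above.

```python
from typing import Optional


def classify_level(glucose_mmol_l: Optional[float],
                   symptomatic: bool,
                   self_treated: Optional[bool]) -> Optional[int]:
    """Return 1, 2 or 3 for a hypoglycaemia event, or None if no level applies."""
    # Level 3: symptomatic and explicitly not self-treated, irrespective of glucose.
    if symptomatic and self_treated is False:
        return 3
    if glucose_mmol_l is None:
        return None
    # Level 2: below 3.0 mmol/l.
    if glucose_mmol_l < 3.0:
        return 2
    # Level 1: below 3.9 mmol/l but not below 3.0 mmol/l.
    if glucose_mmol_l < 3.9:
        return 1
    return None


def is_level_or_worse(level: Optional[int], minimum: int) -> bool:
    """'Level 1 or worse' is minimum=1, 'level 2 or worse' is minimum=2, 'level 3' is minimum=3."""
    return level is not None and level >= minimum
```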

Candidate covariates

We sought to examine the association of subsequent hypoglycaemia with a wide range of variables that have either been previously reported as associated with hypoglycaemia or for which an association might reasonably be expected and for which data were available in a sufficient number of trials or participants. In addition to age, sex (as reported by the investigator of the clinical trial) and diabetes duration we considered the following candidate covariates in our analysis: total daily insulin dose; insulin regimen (basal, basal bolus, or premix); insulin origin (human vs analogue); self-monitored blood glucose; variability based on self-monitored blood glucose; HbA1c; eGFR as defined by the CKD-EPI equation [12]; systolic BP; diastolic BP; medical history of complications of diabetes (CVD, retinopathy, neuropathy, nephropathy); total cholesterol; LDL-cholesterol; HDL-cholesterol; triglycerides; BMI; ethnicity; and use of concomitant medications (glucose-lowering drugs, antihypertensives, systemic antibiotics, systemic oral anti-inflammatory agents, psychoactive agents, sex hormones, anti-epilepsy drugs, antithyroid drugs, cessation of systemic steroids).

Medical history covariates were defined by relevant Medical Dictionary for Regulatory Activities (MedDRA) terms and drug categories were defined using ATC codes (ESM Tables 1, 2).

Since we considered that an individual’s recent history of hypoglycaemia was likely to be an important predictor of future hypoglycaemia events, and since this information would ordinarily be available in a clinical setting, we used the first 6 weeks following the date of randomisation into their clinical trials (an arbitrary minimum time period in which to estimate a typical hypoglycaemia baseline) to obtain measures of baseline hypoglycaemia incidence, baseline blood glucose and blood glucose variability for each participant. Follow-up time and events after this first 6 weeks were then used in the evaluation of associations and predictions. A simple hypoglycaemia score was arbitrarily defined as the weighted sum of the number of level 1, 2 and 3 hypoglycaemia event counts in a 6 week period, with a 1:2:3 ratio between level 1, 2 and 3 event counts, respectively. Since the hypoglycaemia score was estimated after randomisation, the independent effect of the randomised insulin origin and regimen was not distinguishable in multivariate models.
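
The hypoglycaemia score is a simple weighted count, which can be written as below. The function name is hypothetical, and the sketch assumes that each event is counted once, at its own level, within the 6 week baseline window.

```python
def hypoglycaemia_score(n_level1: int, n_level2: int, n_level3: int) -> int:
    """Weighted sum of baseline event counts with a 1:2:3 ratio for levels 1, 2 and 3."""
    if min(n_level1, n_level2, n_level3) < 0:
        raise ValueError("event counts must be non-negative")
    return 1 * n_level1 + 2 * n_level2 + 3 * n_level3
```

For example, a participant with four level 1 events, two level 2 events and one level 3 event in the baseline period would have a score of 4 + 4 + 3 = 11.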

Blood glucose variability was characterised by the CV calculated as the ratio of the SD to the mean of blood glucose within a 6 week time interval, as the CV is one of the most commonly used measures of this variable.
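
A minimal sketch of the CV calculation for one participant's self-monitored readings in a 6 week window follows. The text does not state whether the sample or population SD was used; the sample SD is assumed here, and the function name is hypothetical.

```python
import statistics
from typing import Optional, Sequence


def glucose_cv(readings_mmol_l: Sequence[float]) -> Optional[float]:
    """Coefficient of variation (SD / mean) of blood glucose readings.
    Returns None when fewer than two readings are available."""
    if len(readings_mmol_l) < 2:
        return None
    mean = statistics.mean(readings_mmol_l)
    if mean <= 0:
        raise ValueError("mean glucose must be positive")
    # Assumption: sample standard deviation (n - 1 denominator).
    return statistics.stdev(readings_mmol_l) / mean
```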

Missingness, evaluability and imputation

All continuous covariates were categorised as either having an evaluable continuous value or as being missing. For covariates such as sex and ethnicity, the covariate was either considered evaluable or missing. For drug exposure and medical history covariates, if at least one person in a given trial had the covariate recorded we considered all the participants in that trial to be evaluable for these covariates, otherwise we regarded the covariates as non-evaluated in a given trial.

Covariates were imputed on a per-trial basis using the R package Amelia (version 1.7.6; https://cran.r-project.org/web/packages/Amelia/index.html), provided the covariate was present for at least 80% of participants in that trial.

Statistical methods

Data set-up

We structured our data in a longitudinal format, with time slices of 6 weeks. Time was measured relative to the entry date of each individual. Individuals exited the study at the earliest of the end of participation in the clinical trial or date of death.
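
The longitudinal structure can be illustrated by assigning each event to a 6 week (42 day) slice counted from the entry date. This is a sketch under assumptions: the names are hypothetical, slices are indexed from zero, and a final partial slice is kept.

```python
from collections import Counter
from datetime import date
from typing import List, Sequence

SLICE_DAYS = 42  # 6 weeks


def slice_index(entry_date: date, event_date: date) -> int:
    """Zero-based index of the 6 week slice containing the event."""
    delta = (event_date - entry_date).days
    if delta < 0:
        raise ValueError("event precedes entry date")
    return delta // SLICE_DAYS


def events_per_slice(entry_date: date, exit_date: date,
                     event_dates: Sequence[date]) -> List[int]:
    """Event counts per slice between entry and exit (inclusive)."""
    n_slices = (exit_date - entry_date).days // SLICE_DAYS + 1
    counts = Counter(slice_index(entry_date, d)
                     for d in event_dates
                     if entry_date <= d <= exit_date)
    return [counts.get(i, 0) for i in range(n_slices)]
```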

Rates of hypoglycaemia

We first examined how much heterogeneity there was in the crude incidence rates of hypoglycaemia events at the three levels across clinical trials. A large degree of heterogeneity was expected given the varying entry criteria across trials and this had important implications for the potential of confounding of association by trial number.

Minimally adjusted associations with hypoglycaemia

To quantify the association of a range of clinical covariates with each hypoglycaemia outcome, we used multivariate generalised linear mixed models (GLMMs). For each analysis, the number of hypoglycaemia events experienced by an individual during a time slice was the measured outcome. We employed a Poisson mixed model for our analysis with random intercept for individual to account for any over-dispersion since the count of hypoglycaemia events is time-updated. This is as opposed to negative-binomial regression, which erroneously assumes that observations across individuals are exchangeable.

All analyses were performed for type 1 and type 2 diabetes separately. Separate GLMMs, adjusted for age, sex and diabetes duration, for participants with known insulin regimen were fit to investigate the adjusted association of each candidate covariate after imputation. We adjusted models for study identifier to account for confounding due to different trial entry criteria and populations. The covariate value for the first 6 weeks from study entry was used in the models, with only events after this time being considered in the analysis. The hypoglycaemia event rate was assumed to be constant across time slices for the same participant.
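
One way to write down the minimally adjusted model is given below. The notation is ours and several details are assumptions rather than statements from the text: the log exposure-time offset, the normally distributed random intercept, and the entry of study identifier as a fixed effect.

```latex
\begin{aligned}
y_{it} \mid b_i &\sim \mathrm{Poisson}(\mu_{it}), \\
\log \mu_{it} &= \log T_{it} + \alpha_{s(i)} + \mathbf{x}_i^{\top}\boldsymbol{\beta} + b_i, \\
b_i &\sim \mathcal{N}(0, \sigma_b^{2}),
\end{aligned}
```

Here $y_{it}$ is the number of hypoglycaemia events for participant $i$ in time slice $t$, $T_{it}$ is the time at risk in that slice, $\alpha_{s(i)}$ is the effect of the study to which participant $i$ belongs, $\mathbf{x}_i$ contains age, sex, diabetes duration and the candidate covariate measured in the first 6 weeks, and $b_i$ is the participant-level random intercept. Because $\mathbf{x}_i$ and $b_i$ do not depend on $t$, the event rate $\mu_{it}/T_{it}$ is constant across time slices for a given participant.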

Prediction modelling

For multivariate prediction modelling, further exclusion criteria to the cohort were applied for each analysis separately. We dropped the following from consideration in our analysis: participants with unknown insulin regimen; any covariates with more than 20% missingness; any individual who had missingness in any retained candidate covariate; and concomitant medications where less than 5% of individuals were recorded as using them. We also dropped all participants in studies where there were 15 or fewer hypoglycaemia events in total across the study of the level corresponding to the outcome of the specific analysis, as such trials had too little information to contribute to the model. Data were partitioned in a 70:30 training:test split stratified by trial.
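
A 70:30 split stratified by trial can be sketched as follows. The unit of splitting (participants), the rounding rule and the fixed seed are assumptions made for illustration, and the function name is hypothetical.

```python
import random
from collections import defaultdict
from typing import Iterable, List, Tuple


def stratified_split(participants: Iterable[Tuple[str, str]],
                     train_frac: float = 0.7,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Split (participant_id, trial_id) pairs into training and test IDs,
    drawing approximately `train_frac` of the participants from every trial."""
    by_trial = defaultdict(list)
    for participant_id, trial_id in participants:
        by_trial[trial_id].append(participant_id)

    rng = random.Random(seed)
    train, test = [], []
    for trial_id in sorted(by_trial):
        ids = list(by_trial[trial_id])
        rng.shuffle(ids)
        cut = round(train_frac * len(ids))
        train.extend(ids[:cut])
        test.extend(ids[cut:])
    return train, test
```

Stratifying in this way keeps each trial represented in both partitions, so that a study identifier term estimated on the training data is available for every test participant.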

The prediction task was to predict the number of hypoglycaemia events from start of study (6 weeks post-randomisation) to end of study.

For each hypoglycaemia outcome, we fitted three models: (1) a baseline model that included age, sex, diabetes duration and study identifier; (2) a baseline model also including the hypoglycaemia score; and (3) a full model (also including the hypoglycaemia score). For the full model, all covariates meeting missingness criteria, separately for type 1 and type 2 diabetes, were included from the candidate set. In all models, the participant was included as a random effect.

Although our models predicted the number of hypoglycaemia events, for summarisation purposes the AUC for the binary outcome of the number of hypoglycaemia events at the threshold of being more or less than the 90th centile within the trial was computed. Prediction modelling included 18 models (two diabetes cohorts, three prediction outcomes and three comparator model types).
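
The summarisation step can be sketched in two parts: dichotomising observed counts at the within-trial 90th centile, then computing a rank-based AUC for the predicted counts. The nearest-rank centile definition and the strict "greater than" rule are assumptions, as the text does not specify how ties at the threshold were handled; the function names are hypothetical.

```python
from typing import List, Sequence


def high_count_labels(counts: Sequence[int], centile: int = 90) -> List[bool]:
    """Label each participant in one trial as above (True) or not above (False)
    the given centile of observed event counts (nearest-rank definition)."""
    if not counts:
        return []
    ordered = sorted(counts)
    rank = -(-centile * len(ordered) // 100)  # ceiling division, 1-based rank
    threshold = ordered[max(rank, 1) - 1]
    return [c > threshold for c in counts]


def auc(labels: Sequence[bool], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen positive has a higher score than a
    randomly chosen negative, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```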

XGBoost implements a tree-based gradient boosting algorithm to fit predictive models [13]. We fitted XGBoost models using the training split to perform a threefold cross-validation grid-search (parameters are given in ESM Methods). The selected model was then evaluated on the test split where test log(likelihood) and AUC were evaluated. The difference in test log(likelihood) between two models provided the strength of evidence that one model had greater predictive performance than the other; a difference in test log(likelihood) of 6.9 natural log units is asymptotically equivalent to a p value less than 0.005 for comparison of nested models [14].
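
The comparison of two models on test log(likelihood) can be illustrated as below. The sketch assumes a Poisson likelihood for the observed counts given each model's predicted mean, which the text does not state explicitly for the XGBoost models; the function names are hypothetical.

```python
import math
from typing import Sequence


def poisson_loglik(observed: Sequence[int], predicted_mean: Sequence[float]) -> float:
    """Sum of Poisson log-probabilities of the observed counts, in natural log units."""
    total = 0.0
    for y, mu in zip(observed, predicted_mean):
        if mu <= 0:
            raise ValueError("predicted means must be positive")
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total


def loglik_difference(observed: Sequence[int],
                      mean_model_a: Sequence[float],
                      mean_model_b: Sequence[float]) -> float:
    """Test log-likelihood of model A minus that of model B.
    Positive values favour model A."""
    return poisson_loglik(observed, mean_model_a) - poisson_loglik(observed, mean_model_b)
```

A difference computed in this way is what would be compared against the 6.9 natural log unit benchmark quoted above.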
