An Introduction to Missing Data in Clinical Trials

Missing Data in Clinical Trials

The approach to missing data in clinical trials has evolved over the past twenty years, particularly in how missing data are incorporated into the interpretation of results. Missing data matter because they can introduce bias, reduce power and efficiency, and inflate the rate of false positive findings (Type I error). Clinical benefit is often measured at the last visit, so when subjects drop out before this visit the picture of the safety and efficacy profile is incomplete, which can lead to inaccurate conclusions about the investigational treatment.

Post-withdrawal data in clinical trials may be assumed to be missing not at random (MNAR), on the reasoning that subjects who withdraw are likely to be worse off than subjects who continue in the study. Under MNAR, the missingness depends on the unobserved data even after accounting for the observed data. If, on the other hand, we assume that the data are missing at random (MAR), where the missingness is independent of the unobserved data given the observed data, then we can model the unobserved outcomes using the observed outcomes as a basis. Reasons for a participant’s drop-out include, but are not limited to, adverse events or death, intolerance, lack of efficacy, or simple convenience.
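The two assumptions can be written compactly. The notation here is an assumption introduced for illustration and does not come from the guideline text: R is the indicator of which values are missing, and Y_obs and Y_mis are the observed and unobserved outcomes.

```latex
% R: missingness indicator; Y_obs / Y_mis: observed / unobserved outcomes
% (illustrative notation, not taken from the source)
\begin{align*}
\text{MAR:}  &\quad P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}}) \\
\text{MNAR:} &\quad P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) \text{ still depends on } Y_{\mathrm{mis}} \text{ given } Y_{\mathrm{obs}}
\end{align*}
```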

The International Conference on Harmonisation (ICH) E9 guideline (1998) emphasizes preventing missing data; acknowledges that there is no single way of handling missing data, because design and measurement characteristics differ between trials; and recommends sensitivity analyses, pre-specification of missing data handling in the protocol, and recording of the reasons for withdrawal. Addressing missing data has become even more important with the recent R1 Addendum to E9 (2017), which requires a precise definition of estimands and of the handling of intercurrent events, and which affects drug applications submitted to the regulators. It is therefore vital that the missing data problem be understood and addressed according to the chosen estimand at the time of trial design. An estimand is a precise definition of the quantity to be estimated in order to address the trial objective. It specifies the population of interest, the variable of interest, how intercurrent events are accounted for, and the population-level summary of the variable.

The approach to missing data should reflect the estimand, which in turn depends on the study objective. Statisticians can contribute to trial design, conduct (including patient retention strategies) and analysis to help prevent missing data, and can apply the knowledge gained to the design of future studies. The quantity being estimated should be of interest and should be estimable from the data that can be collected. Historical data can help to identify patterns of missingness and to suggest plausible approaches.

 

Methods for Handling Missing Data

Several approaches are available for handling missing data. Some of the most common are:

  1. Last observation carried forward (LOCF) - LOCF carries forward the last non-missing value. LOCF can over-estimate efficacy and/or under-estimate safety issues in trials where the subject’s condition is expected to deteriorate over time.
  2. Baseline observation carried forward (BOCF) - BOCF is usually employed in trials where the endpoint is expected to return to the baseline value post-withdrawal, such as in chronic pain trials.
  3. Representative values - A missing score is replaced with a representative value, such as the subject's own average or the treatment-group average.
  4. Mixed models repeated measures (MMRM) - MMRM does not involve any formal imputation. It estimates the treatment effect from all available data, using the subject-specific effects and the correlations between the repeated measurements.
  5. Standard multiple imputation (MI) - Standard MI assumes MAR and, under that assumption, gives results similar to those of the MMRM. The imputations are performed such that the results for a subject with missing data tend towards the mean of the treatment group the subject belongs to, due to the weakening of the within-subject correlation. This also results in an increase in the variance with time, as is expected in clinical trials. The realization that subjects who withdraw are no longer on randomized treatment led to developments that allow imputation based on a clinically plausible post-withdrawal path. One of these is multiple imputation under the copy reference (CR) assumption, where post-withdrawal data are modeled as if the subject were a member of the reference group. Here, the outcome tends towards the mean for the reference group.
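The two carry-forward methods in the list are simple enough to sketch in a few lines of Python. This is a minimal illustration only: the function names `locf` and `bocf` and the pain scores are hypothetical, a missing visit is represented as `None`, and the first entry of each list is assumed to be the baseline value.

```python
from typing import List, Optional

Visits = List[Optional[float]]  # one subject's values in visit order; None = missing


def locf(visits: Visits) -> Visits:
    """Last observation carried forward: fill each missing visit with the
    most recent non-missing value (leading gaps stay missing)."""
    filled: Visits = []
    last: Optional[float] = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def bocf(visits: Visits) -> Visits:
    """Baseline observation carried forward: fill each missing visit with
    the baseline value, assumed here to be the first entry."""
    baseline = visits[0]
    return [baseline if value is None else value for value in visits]


# Hypothetical pain scores at baseline, week 4, week 8 and week 12;
# the subject withdraws after week 4.
scores = [7.0, 5.0, None, None]
print(locf(scores))  # [7.0, 5.0, 5.0, 5.0]
print(bocf(scores))  # [7.0, 5.0, 7.0, 7.0]
```

In this made-up example LOCF keeps the week 4 improvement through to week 12, while BOCF returns the subject to the baseline score, which is why the two methods can lead to different conclusions about efficacy.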

 

A number of resources are available for implementing these methods, including the R package mice (Multivariate Imputation by Chained Equations) and missingdata.org, which includes SAS macros, test data sets and imputation techniques for a wide variety of endpoints from the Drug Information Association Scientific Working Group (DIA SWG). For a more in-depth academic resource, the book “The Prevention and Treatment of Missing Data in Clinical Trials” (2010) has been made freely available by its publisher.
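Whichever tool generates the imputed data sets, multiple imputation ends by combining the per-imputation results, conventionally with Rubin's rules. The sketch below is not taken from any of the resources above; the function name `pool_rubin` and the input numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Combine m per-imputation estimates and their squared standard errors
    using Rubin's rules. Returns (pooled estimate, total variance)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)      # average of the m point estimates
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates) # sample variance across imputations (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up treatment-effect estimates from three imputed data sets
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var)
```

The between-imputation term is what distinguishes MI from single imputation methods: it carries the uncertainty about the missing values into the final standard error.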

 


Quanticate's statistical consultants are among the leaders in their respective areas, allowing clients to choose expertise from a range of consultants to match their needs. If you need these types of services, please submit an RFI and a member of our Business Development team will be in touch with you shortly.
