On 26 May 2023, the FDA released its final guidance on the use of covariates to improve the precision of statistical analyses in clinical trials (1). It is a major step forward in the statistical analysis of clinical trial data. The movement began nearly a decade ago with the publication of the EMA guideline (2), and the FDA has now crystallized the concept with the current guidance. Draft versions had existed before (the first was published in 2019 (3), the second exactly two years ago (4)), but the completion of the final version matters in its own right. As prominently stated in the drafts, they were distributed for review and comment purposes only, and the guidance would reflect the FDA's position only upon finalization. Now that the guidance is finalized, this is it: we have the FDA's official recommendations on the use of covariates to improve the estimation of treatment effects.
Before delving into the details of this guidance, let's provide some background: what exactly are we talking about? The main focus of the guidance is to provide recommendations on the use of prognostic baseline covariates in the analysis of data from randomized, parallel-group clinical trials (RCTs) designed to demonstrate either the superiority or the non-inferiority of a test agent. As defined in the guidance, a prognostic baseline covariate is a covariate measured at baseline that may be associated with the primary endpoint. Using prognostic covariates in a statistical analysis, a so-called adjusted analysis, will generally reduce the variability of the treatment effect estimate and thus lead to narrower confidence intervals and more powerful hypothesis tests. These prognostic covariates encompass any measure, feature, or index that may be associated with a positive (or negative) response of subjects.
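To make the idea concrete, here is a minimal, purely illustrative sketch (simulated data and variable names of our own choosing, not taken from the guidance) comparing an unadjusted analysis of a continuous endpoint with an analysis adjusted for a single prognostic baseline covariate. The adjusted model should give a similar point estimate with a narrower confidence interval.

```python
# Illustrative sketch only: unadjusted vs. covariate-adjusted analysis of a
# continuous endpoint in a 1:1 randomized trial (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 200
treat = rng.integers(0, 2, n)                      # 1:1 randomization
baseline = rng.normal(0, 1, n)                     # prognostic baseline covariate
outcome = 0.5 * treat + 1.0 * baseline + rng.normal(0, 1, n)  # true effect = 0.5
df = pd.DataFrame({"outcome": outcome, "treat": treat, "baseline": baseline})

unadjusted = smf.ols("outcome ~ treat", data=df).fit()
adjusted = smf.ols("outcome ~ treat + baseline", data=df).fit()

# The adjusted confidence interval is typically narrower, because the covariate
# explains part of the outcome variance.
print(unadjusted.conf_int().loc["treat"])
print(adjusted.conf_int().loc["treat"])
```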
Why is such guidance so critical? We know that improvement in RCTs is highly subject-specific and depends on numerous factors. These factors naturally introduce greater variability into the measured response, in both the control and active arms of clinical trials. This increased variance makes efficacy harder to demonstrate, may compromise the sensitivity of clinical trials, and can lead to unnecessary trial failures (5). Using covariates that capture these differences between subjects, and adjusting the endpoint analysis to account for them, is therefore of great importance. However, these adjustments must be performed correctly for the adjusted analysis to be valid. And that is precisely what this guidance does: it provides the framework needed for these analyses to be valid.
So, what are the FDA’s recommendations? There are relatively few differences between this final version and the last draft (dated 2021), indicating that substantial reflection on the topic had already taken place in the years leading up to the final content. The recommendations can be divided into three parts: general recommendations, recommendations for adjusted estimation of treatment effects using linear models, and recommendations for estimation using nonlinear models.
The general recommendations can be summarized easily. An analysis of an efficacy endpoint can be unadjusted. However, an analysis adjusted for baseline covariates can yield a narrower confidence interval for the treatment effect estimate and a more powerful test. Moreover, the adjustment can be made with minimal impact on bias or the Type I error rate. These covariates can be drawn from the scientific literature or defined and/or constructed from previous studies. The stronger the association between the covariates and the endpoint, the greater the gain in precision. However, an analysis adjusted for covariates that turn out not to be associated with the response remains valid. Finally, multiple covariates can be used to adjust an analysis, but ideally their number should remain small relative to the sample size. These covariates may be correlated with each other, but the benefit will generally be greater when their correlation is low.
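The point about the strength of association can be illustrated with a rough simulation (the numbers and set-up are ours, not the Agency's): the more strongly the covariate is correlated with the outcome, the smaller the standard error of the adjusted treatment estimate, while adjusting for an unrelated covariate leaves inference essentially unchanged.

```python
# Rough simulation (illustrative only): precision gain from adjustment grows
# with the covariate-outcome correlation rho.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400

def treatment_se(rho):
    """Return (unadjusted SE, adjusted SE) of the treatment estimate."""
    treat = rng.integers(0, 2, n)
    covariate = rng.normal(0, 1, n)
    noise = rng.normal(0, 1, n)
    # Build an outcome whose correlation with the covariate is roughly rho.
    outcome = 0.3 * treat + rho * covariate + np.sqrt(1 - rho**2) * noise
    unadj = sm.OLS(outcome, sm.add_constant(treat)).fit()
    adj = sm.OLS(outcome, sm.add_constant(np.column_stack([treat, covariate]))).fit()
    return unadj.bse[1], adj.bse[1]

for rho in (0.0, 0.3, 0.6, 0.9):
    se_u, se_a = treatment_se(rho)
    print(f"rho={rho:.1f}  unadjusted SE={se_u:.3f}  adjusted SE={se_a:.3f}")
# Expected pattern: the adjusted SE shrinks roughly by a factor sqrt(1 - rho^2);
# with rho = 0 the two analyses are essentially equivalent, and both remain valid.
```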
The recommendations on linear models largely align with the general recommendations while providing some additional details. Adjustment for baseline covariates using a linear model is an acceptable method for estimating the average treatment effect. The results remain valid even if the model is misspecified (although a well-specified model increases estimation precision). However, if the trial is not a two-arm design with 1:1 randomization, a misspecified model can produce inaccurate standard errors. To account for this, the Agency recommends robust methods for standard error estimation (e.g., the Huber-White ‘sandwich’ standard error).
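As a brief illustration of that recommendation (a sketch with simulated data, not a prescribed implementation), here is how one might request Huber-White (sandwich) standard errors with statsmodels in Python; the 2:1 randomization and the unequal outcome variance between arms are arbitrary choices made for the example.

```python
# Sketch: conventional vs. robust (Huber-White "sandwich") standard errors
# in a covariate-adjusted linear model with 2:1 randomization.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 300
treat = rng.binomial(1, 2 / 3, n)                          # 2:1 randomization
baseline = rng.normal(0, 1, n)
outcome = 0.4 * treat + 0.8 * baseline + rng.normal(0, 1 + 0.5 * treat, n)
df = pd.DataFrame({"outcome": outcome, "treat": treat, "baseline": baseline})

model = smf.ols("outcome ~ treat + baseline", data=df)
model_based = model.fit()                                  # conventional standard errors
robust = model.fit(cov_type="HC1")                         # sandwich standard errors

print("model-based SE:", model_based.bse["treat"])
print("robust SE:     ", robust.bse["treat"])
```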
Regarding nonlinear models, the Agency acknowledges that covariate adjustment is frequently used for binary, ordinal, count, and time-to-event outcomes. Adjustment in these nonlinear models remains valid provided that certain points are taken into consideration. The Agency notably highlights the existence of methods for estimating unconditional treatment effects that are robust to model misspecification for the aforementioned outcome types (i). The use of other, newer analysis methods, on the other hand, requires prior discussion with the Agency.
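To give a flavor of such methods, here is a simplified standardization (g-computation) sketch for a binary endpoint. It follows the general spirit of the covariate-adjusted estimators cited in note (i) rather than reproducing any one of them exactly, and variance estimation (e.g., delta method or bootstrap) is omitted.

```python
# Simplified standardization (g-computation) sketch for a binary endpoint:
# fit a covariate-adjusted logistic model, then average predictions made with
# everyone set to treatment and to control to get an unconditional risk difference.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 500
treat = rng.integers(0, 2, n)
baseline = rng.normal(0, 1, n)
prob = 1 / (1 + np.exp(-(-0.5 + 0.6 * treat + 0.9 * baseline)))
response = rng.binomial(1, prob)
df = pd.DataFrame({"response": response, "treat": treat, "baseline": baseline})

fit = smf.logit("response ~ treat + baseline", data=df).fit(disp=0)

# Predict each subject's response probability under both treatment assignments,
# then average: the difference is the covariate-adjusted marginal risk difference.
p_treated = fit.predict(df.assign(treat=1)).mean()
p_control = fit.predict(df.assign(treat=0)).mean()
print("unconditional risk difference:", p_treated - p_control)
# A confidence interval would typically come from the delta method or bootstrap (omitted).
```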
In conclusion, it is evident that the Agency strongly recommends the use of adjustment covariates to improve the precision of treatment effect estimation. These covariates should be few in number (relative to the sample size), measured at baseline (before randomization and treatment initiation), and defined in an unbiased manner. Furthermore, for the vast majority of outcomes and associated models used in current clinical studies, an adjusted analysis remains valid even if the model is misspecified or if the association between covariates and the endpoint is weaker than expected, provided certain rules in the model definition are respected. Moreover, the Agency now acknowledges that these covariates can be prognostic indices constructed from previous study data. This interesting statement reinforces the idea proposed by J. W. Tukey 30 years ago (6), (7): it is advantageous to use composite covariates that aggregate different patient characteristics based on prior knowledge. As he argued at the time, such a method is valid and can enable near-optimal adjustment. In the light of this new final guidance, composite covariates are considered by the Agency like any other baseline covariate. As such, they can be used with minimal impact on bias or the Type I error rate, and their use in an adjusted analysis remains valid (ii).
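As a schematic illustration of this idea (a generic sketch on simulated data, not any specific proprietary methodology), one can fit a prognostic model on historical data and then use its prediction as a single composite baseline covariate in the analysis of a new trial.

```python
# Schematic sketch of a composite prognostic covariate: build a prognostic score
# from historical data, then adjust the new trial for that single score instead
# of many individual baseline characteristics.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

def simulate(n, treatment_effect):
    """Toy data: several baseline characteristics weakly related to the outcome."""
    x = rng.normal(0, 1, (n, 4))
    treat = rng.integers(0, 2, n)
    outcome = treatment_effect * treat + x @ np.array([0.4, 0.3, 0.2, 0.1]) + rng.normal(0, 1, n)
    cols = {f"x{i}": x[:, i] for i in range(4)}
    return pd.DataFrame({"outcome": outcome, "treat": treat, **cols})

historical = simulate(1000, treatment_effect=0.0)   # previous study data
new_trial = simulate(200, treatment_effect=0.5)

# Step 1: learn a prognostic model on the historical data.
prognostic = smf.ols("outcome ~ x0 + x1 + x2 + x3", data=historical).fit()

# Step 2: collapse the new trial's baseline characteristics into one composite score.
new_trial["score"] = prognostic.predict(new_trial)

# Step 3: adjust the primary analysis for the single composite covariate.
adjusted = smf.ols("outcome ~ treat + score", data=new_trial).fit()
print(adjusted.conf_int().loc["treat"])
```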
All of this clearly highlights the relevance of the Placebell® methodology developed by Cognivia to produce composite covariates. Placebell® aligns perfectly with the guidance and with the FDA’s intention to encourage the correct use of prognostic factors to improve estimation precision. And because Placebell® combines many small prognostic factors into one, it reduces the number of covariates used for adjustment and increases the expected association with the outcome (iii). In a nutshell, exactly what the Agency recommends.
References:
1. FDA. Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products (Final Guidance). May 2023. FDA-2019-D-0934-0043.
2. EMA. Guideline on adjustment for baseline covariates in clinical trials. February 2015. EMA/CHMP/295050/2013.
3. FDA. Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biologics with Continuous Outcomes (Draft Guidance). April 2019. FDA-2019-D-0934-0002.
4. FDA. Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products (Draft Guidance). May 2021. FDA-2019-D-0934-0021.
5. Hwang TJ, et al. Failure of Investigational Drugs in Late-Stage Clinical Development and Publication of Trial Results. JAMA Intern Med. 2016;176(12):1826-1833.
6. Steingrimsson JA, Hanley DF, Rosenblum M. Improving precision by adjusting for prognostic baseline variables in randomized trials with binary outcomes, without regression model assumptions. Contemporary Clinical Trials. 2017;54:18-24.
7. Díaz I, Colantuoni E, Rosenblum M. Enhanced Precision in the Analysis of Randomized Trials with Ordinal Outcomes. Biometrics. 2016;72(2):422-431.
8. Rosenblum M, van der Laan MJ. Using Regression Models to Analyze Randomized Trials: Asymptotically Valid Hypothesis Tests Despite Incorrectly Specified Models. Biometrics. 2009;65(3):937-945.
9. Tangen CM, Koch GG. Nonparametric Analysis of Covariance for Hypothesis Testing with Logrank and Wilcoxon Scores and Survival-Rate Estimation in a Randomized Clinical Trial. Journal of Biopharmaceutical Statistics. 1999;9(2):307-338.
10. Lu X, Tsiatis AA. Improving the Efficiency of the Log-Rank Test Using Auxiliary Covariates. Biometrika. 2008;95(3):679-694.
11. Branders S, et al. Leveraging historical data to optimize the number of covariates and their explained variance in the analysis of randomized clinical trials. Stat Methods Med Res. 2022;31(2):240-252.
(i) For example: the method of Steingrimsson et al. (6) for binary outcomes, the method of Díaz et al. (7) for ordinal outcomes, the method of Rosenblum and van der Laan (8) for count outcomes, and the methods of Tangen and Koch (9) or Lu and Tsiatis (10) for time-to-event outcomes.
(ii) As long as the analysis model complies with the guidance of the Agency.
(iii) It should be noted that the gain obtained by replacing multiple covariates with a single composite covariate is discussed quantitatively by Branders et al. (11).