## Statistical analysis of clinical trial data is a necessary component of a well-designed study. In this blog, we provide an overview of how it works—and how statisticians can minimize noise related to each patient’s individual characteristics and placebo responsiveness.

The main objective of most clinical trials is to evaluate efficacy and safety of an experimental treatment. To do this, statisticians compare the response of the drug-treated group(s) of patients to the response of the placebo-treated group. This is part of a clinical trial statistical analysis—a scientific tool that supports interpretation of study data and informs decision making on the next steps in the drug development process.

**The Role of a Clinical Trial Statistical Analysis**

In the majority of Phase 2 and Phase 3 clinical trials, the statistical analysis serves two major roles:

- Demonstrating compound efficacy
- Demonstrating compound safety

Trial statistical analyses are complex in and of themselves. But when you start to consider the impact of a heterogeneous population, they become even more complicated.

**Variance & Bias in Clinical Trial Data**

People are, by nature, heterogeneous. Everyone differs in age, gender, medical history and psychology, and patients participating in clinical trials are no exception. The variability of these characteristics creates variability, or noise, in clinical trial data. Noise can also come from other factors, such as an unequal distribution of patients with specific characteristics between treatment groups.

Noise in clinical trial data makes it difficult to detect true differences between treatment groups (e.g. between drug treatment and placebo treatment); yet, evaluating experimental therapies in heterogeneous patient populations is necessary to represent the general population. So, statisticians need ways to minimize these differences and biases while still being able to prove efficacy and safety for a generalized population.
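To make the effect of noise concrete, here is a minimal sketch, assuming a simple two-sided z-test on a continuous endpoint with equal group sizes. It approximates the probability of detecting a true difference (what the sections below call study power) as the patient-to-patient standard deviation grows. The function name `approx_power` and all numbers are illustrative assumptions, not values from any real trial.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with equal group sizes."""
    z = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)   # standard error of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    # Probability of rejecting in the direction of the true effect; the
    # negligible chance of rejecting in the wrong direction is ignored.
    return z.cdf(abs(delta) / se - z_crit)

# Same true effect and sample size, increasing noise (hypothetical values).
for sd in (5.0, 10.0, 15.0):
    print(f"SD={sd:>4}: power ~ {approx_power(delta=5.0, sd=sd, n_per_group=50):.2f}")
```

With the treatment effect and sample size held fixed, the printed power falls as the standard deviation rises, which is the problem the rest of this post sets out to address.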

To understand this challenge with more depth—and figure out how to solve it—let’s walk through how a clinical trial statistical analysis works.

**The Clinical Trial Statistical Analysis Process**

**#1. Decide on Hypothesis**

Like all scientific research, a clinical trial starts with a hypothesis.

There are generally three positions a trial can take about a compound: superiority, equivalence, or non-inferiority.

*This statistical analysis discussion is focused on a superiority trial, in which the statistics must demonstrate drug superiority over a placebo (or competitor).*

**#2. Calculate Study Power & Required Sample Size**

The statistical analysis starts by defining the sample size, which is based on the study power to be achieved. Study power is the probability of detecting a difference between study groups, assuming a difference of the specified size exists; equivalently, it is the probability of avoiding a Type II (false negative) error. By convention, clinical studies are designed for a power of at least 80 to 90 percent.

Depending on the study design, statisticians will help clinical trial teams figure out if the sample size required is realistic or not.

As outlined by the FDA’s guidance, E9 Statistical Principles for Clinical Trials, the following should be specified when determining sample size:

- Primary efficacy endpoints (variable)
- The test statistic
- The null hypothesis (no difference in treatments)
- The alternative hypothesis at chosen dose
- The probability of Type I error (conventionally 5 percent or less)
- The probability of Type II error (conventionally 10-20 percent)
- Approach to dealing with treatment withdrawals and protocol violations
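As a rough illustration of how these inputs combine, the sketch below uses the standard normal-approximation formula for comparing two group means, n = 2 (z₁₋α/₂ + z₁₋β)² σ² / δ² per group. It assumes a continuous primary endpoint, equal allocation, a two-sided test and no dropouts. The function name `sample_size_per_group` and the example numbers are hypothetical, and a real calculation would follow the protocol and SAP.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sd, alpha=0.05, power=0.90):
    """Patients per group for a two-sided, two-sample comparison of means.

    delta: smallest treatment difference worth detecting (alternative hypothesis)
    sd:    assumed standard deviation of the primary endpoint
    alpha: Type I error probability; power = 1 - Type II error probability
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Hypothetical endpoint: 5-point difference, SD of 10 points.
print(sample_size_per_group(delta=5.0, sd=10.0, power=0.80))  # 63
print(sample_size_per_group(delta=5.0, sd=10.0, power=0.90))  # 85
```

Because the required sample size scales with the square of the standard deviation, any reduction in unexplained variance translates directly into fewer patients or more power. The approximation ignores the small-sample t correction and anticipated withdrawals, both of which push the number up slightly.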

**#3. Develop a Statistical Analysis Plan (SAP)**

As soon as the study protocol design is outlined, the statistical analysis strategy is discussed and defined. Here are critical elements of an SAP:

- Clinical trial summary, including objectives, endpoints, design and sample size.
- Dataset description, including study variables and data transfer.
- Data analysis considerations, including adjustments for covariates.
- Statistical issues, including outlier detection and handling of dropouts or missing data.
- Study population characteristics, including subject disposition and measurements of treatment compliance.
- Statistical analysis approach descriptions.

The study design, study power and statistical analysis plan are all set before the study starts. This removes the potential bias that could occur if these parameters were adjusted while the clinical trial was ongoing.

**#4. Collect Data & Run Study**

Next, the research begins, starting with pre-trial patient data. Again, it’s important to capture key patient information before treatment begins, so that the measurements cannot be biased by the treatment itself.

For example, if you plan to use a covariate for body mass index (BMI) in an osteoarthritis (OA) study, you need to identify that in the SAP and collect patient BMI before the trial begins.

**#5. Conduct Statistical Analysis & Report Outcomes**

Once you have your results, it’s time for the clinical trial statistical analysis.

Statistical analyses in clinical trials are typically based on estimating confidence intervals, testing hypotheses and drawing conclusions from the observed data. In this type of analysis for a superiority trial, there are generally four statistical methods:

- **ANOVA**: Used to determine how one factor impacts a response variable.
- **ANCOVA**: Includes one or more covariates, which can help statisticians better understand how a factor impacts a response variable after accounting for some relevant, unchanging characteristics.
- **MANOVA**: Identical to an ANOVA, except it uses two or more response variables.
- **MANCOVA**: Identical to a MANOVA, except it also includes one or more covariates.

More often than not, clinical trials analyze data using an ANCOVA, which reduces unexplained variance in a concrete way: it adjusts the outcome for innate patient traits measured at baseline (like age or BMI), giving a cleaner, more precise estimate of the true treatment effect.
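The adjustment can be sketched in a few lines. This is a minimal illustration, assuming one continuous baseline covariate and a common slope in both groups (the classical one-covariate ANCOVA), run on simulated data. The function name `ancova_effect`, the simulated BMI values and the effect size are all hypothetical; a real analysis would use validated statistical software as specified in the SAP.

```python
import random
from statistics import mean

def ancova_effect(y_ctrl, x_ctrl, y_trt, x_trt):
    """Return (unadjusted effect, adjusted effect, residual SS without
    covariate, residual SS with covariate) for a one-covariate ANCOVA."""
    sxx = sxy = syy = 0.0
    for xs, ys in ((x_ctrl, y_ctrl), (x_trt, y_trt)):
        mx, my = mean(xs), mean(ys)
        # Pool the within-group sums of squares and cross-products.
        sxx += sum((x - mx) ** 2 for x in xs)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        syy += sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx                              # common within-group slope
    unadjusted = mean(y_trt) - mean(y_ctrl)        # plain difference in means
    adjusted = unadjusted - slope * (mean(x_trt) - mean(x_ctrl))
    return unadjusted, adjusted, syy, syy - slope * sxy

# Simulated example: the outcome depends on baseline BMI plus a treatment effect of 3.
rng = random.Random(0)

def simulate(n, effect):
    xs = [rng.gauss(28.0, 4.0) for _ in range(n)]
    ys = [10.0 + 1.5 * x + effect + rng.gauss(0.0, 3.0) for x in xs]
    return ys, xs

y0, x0 = simulate(50, effect=0.0)
y1, x1 = simulate(50, effect=3.0)
unadj, adj, ss_plain, ss_ancova = ancova_effect(y0, x0, y1, x1)
print(f"unadjusted effect: {unadj:.2f}, adjusted effect: {adj:.2f}")
print(f"residual SS without covariate: {ss_plain:.0f}, with covariate: {ss_ancova:.0f}")
```

The residual sum of squares with the covariate can never exceed the one without it. When the covariate is strongly related to the outcome, as in this simulation, the drop is large, and that drop is what tightens the confidence interval around the treatment effect.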

As the FDA says, “Sponsors can use ANCOVA to adjust for differences between treatment groups in relevant baseline variables to improve the power of significance tests and the precision of estimates of treatment effect.”

For further reading on clinical trial statistical analysis and covariates, please refer to these industry guidances:

- FDA’s E9 Statistical Principles for Clinical Trials
- EMA’s ICH E9 Statistical Principles for Clinical Trials
- FDA’s Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products
- EMA’s Adjustment for Baseline Covariates in Clinical Trials

**How to Minimize Variance in Statistical Analysis**

There are many decisions and calculations that must be made prior to the trial statistical analysis. Throughout the process, there are opportunities for bias. Even with thorough calculations and preparation, clinical trial data is still fraught with noise, making the final statistical analysis even more difficult and frustrating.

This is because using a covariate to minimize variance in a statistical analysis requires quantifiable data – in other words, information that can be used mathematically. While some patient characteristics, like age and baseline pain level, can be easily quantified, others that are equally if not more impactful on the data cannot.

Rather, they have not been quantifiable until now. Tools4Patient is systematically developing tools to address some of the most common causes of data variability that can lead to clinical trial failure.

One of the most urgent examples is the placebo response, which varies significantly between patients and is a major source of noise in clinical trial data. Before now, this characteristic has been mathematically inaccessible, which means statisticians could not normalize it in the statistical analysis.

Thanks to predictive modeling, it is now possible to calculate placebo responsiveness scores for patients at the beginning of the study (just like you would for pain levels and ages). By combining an understanding of individual patient psychology with a predictive machine learning algorithm, you can calculate a relative placebo responsiveness score for each patient. The score would be specified as a covariate in the statistical analysis plan and calculated during the data collection step, before treatment begins.

**Conclusion**

Statisticians have been able to mathematically account for obvious sources of variance and bias for years. But subtle, innate patient characteristics – like placebo responsiveness – have continued to be a major source of unchecked variance, contributing to Phase 2 and Phase 3 trial failures. Now that placebo responsiveness can be accurately predicted before the study, the statistical analysis can address this critical source of noise and improve the ability to detect treatment efficacy.

*Placebell©™ is a proven solution that helps clinical trial statisticians reduce data variability related to the placebo response. **Contact us** to learn more.*