Blinded Sample Size Re-estimation for Continuous Endpoints - Part 1
- Andrew Yan
Sample size re-estimation (SSR) is often performed in clinical trials to address uncertainty in design-stage assumptions (e.g., effect size). Unblinded SSR is straightforward but often subject to regulatory scrutiny. Blinded SSR is generally more acceptable to regulators, but it typically requires information about nuisance parameters, such as the variance for continuous outcomes. In this series, I will examine the statistical and practical challenges of blinded SSR to clarify its limitations.
Consider a parallel-group study with two equally sized groups (treatment and control) and a continuous endpoint. Suppose the data in the two groups follow normal distributions with means µ₁ and µ₂ and a common variance σ², i.e., 𝑁(µ₁, σ²) for the treatment group and 𝑁(µ₂, σ²) for the control group. The combined data from both groups can then be viewed as a random sample from a two-component Gaussian mixture model with a known mixture proportion. Specifically, let 𝑋 be an observation from the combined data; then
$$X \sim \omega\, N(\mu_1, \sigma^2) + (1 - \omega)\, N(\mu_2, \sigma^2),$$
where 𝜔 is the mixture proportion (i.e., 𝜔 = 1/2). Let 𝛿 = µ₁ - µ₂; then the variance of the mixture distribution is
$$\operatorname{Var}(X) = \sigma^2 + \omega(1 - \omega)\,\delta^2 = \sigma^2 + \frac{\delta^2}{4}. \tag{1}$$
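Eq. (1) implies that, if either σ² or 𝛿² can be reliably obtained from historical studies, then the other can be estimated from the sample variance of the combined (blinded) data, denoted 𝑆². That is,
$$\hat{\delta}^2 = 4\,(S^2 - \sigma_0^2) \tag{2}$$
and
$$\hat{\sigma}^2 = S^2 - \frac{\delta_0^2}{4}, \tag{3}$$
with asymptotic variances of
$$\operatorname{Var}(\hat{\delta}^2) \approx \frac{16\,\sigma^2\,(2\sigma^2 + \delta^2)}{n}$$
and
$$\operatorname{Var}(\hat{\sigma}^2) \approx \frac{\sigma^2\,(2\sigma^2 + \delta^2)}{n},$$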
respectively, where σ₀² and 𝛿₀² are the historical estimates of σ² and 𝛿², and 𝑛 denotes the combined sample size. Note that the right-hand sides of Eq. (2) and Eq. (3) can be negative; in that case, the corresponding estimate is set to zero. Since 𝛿² ≤ σ² generally holds in clinical trials, the following patterns are expected:
Simulations can help shed more light on these patterns. For simplicity, we assume 𝛿 > 0 and a moderate (true) effect size of 𝛿/σ = 0.5 (it suffices to assume that 𝛿 = 0.5 and σ = 1). We consider a total sample size of 𝑛 = 172, which is expected to provide approximately 90% power (based on a two-sample t-test) for the trial. A blinded SSR is planned when approximately 75% of study subjects (𝑛 ≈ 130) have completed the trial. The assumed σ₀ ranges from 0.90 to 1.10, and the assumed 𝛿₀ from 0.45 to 0.55. Simulation results (based on 1000 replications) are shown in the following two tables, including the mean (Mean) and standard deviation (SD) of the estimated parameter values.
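The setup above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the code behind the reported tables. The function names, the seed, the fixed 65/65 split at the interim look, and the estimator forms (4(S² − σ₀²) for 𝛿² and S² − 𝛿₀²/4 for σ², each truncated at zero, with S² the blinded pooled sample variance) are all assumptions. The sample size check uses a normal approximation with a common small-sample correction term rather than an exact noncentral-t calculation.

```python
import math
import random
import statistics
from statistics import NormalDist


def total_n_two_sample(effect_size, alpha=0.05, power=0.90):
    """Approximate total n for a two-sided two-sample t-test with 1:1 allocation.

    Normal approximation plus a small-sample correction of z^2 / 4 per group.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    per_group = 2 * (z_a + z_b) ** 2 / effect_size ** 2 + z_a ** 2 / 4
    return 2 * math.ceil(per_group)


def blinded_estimates(pooled, sigma0, delta0):
    """Return (delta2_hat, sigma2_hat) from pooled data, each truncated at zero."""
    s2 = statistics.variance(pooled)  # sample variance ignoring group labels
    delta2_hat = max(0.0, 4.0 * (s2 - sigma0 ** 2))  # assumed form of Eq. (2)
    sigma2_hat = max(0.0, s2 - delta0 ** 2 / 4.0)    # assumed form of Eq. (3)
    return delta2_hat, sigma2_hat


def simulate(sigma0, delta0, n=130, delta=0.5, sigma=1.0, reps=1000, seed=1):
    """Return ((mean, sd) of delta2_hat, (mean, sd) of sigma2_hat) over reps trials."""
    rng = random.Random(seed)
    half = n // 2  # assumption: exactly equal group sizes at the interim look
    d2, s2 = [], []
    for _ in range(reps):
        pooled = [rng.gauss(delta, sigma) for _ in range(half)]      # treatment
        pooled += [rng.gauss(0.0, sigma) for _ in range(n - half)]   # control
        d2_hat, s2_hat = blinded_estimates(pooled, sigma0, delta0)
        d2.append(d2_hat)
        s2.append(s2_hat)
    return ((statistics.mean(d2), statistics.stdev(d2)),
            (statistics.mean(s2), statistics.stdev(s2)))


if __name__ == "__main__":
    print("total n for 90% power:", total_n_two_sample(0.5))
    for sigma0, delta0 in [(0.90, 0.45), (1.00, 0.50), (1.10, 0.55)]:
        (d2_mean, d2_sd), (s2_mean, s2_sd) = simulate(sigma0, delta0)
        print(f"sigma0={sigma0:.2f}: delta2_hat mean={d2_mean:.3f} sd={d2_sd:.3f}")
        print(f"delta0={delta0:.2f}: sigma2_hat mean={s2_mean:.3f} sd={s2_sd:.3f}")
```

Because the same seed is reused for every scenario, differences between rows reflect only the assumed σ₀ or 𝛿₀, not Monte Carlo noise. Exact numbers will differ from the tables, since the random number generator and allocation scheme used there are not known.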

As expected, the estimator for 𝛿² in Eq. (2) is highly sensitive to misspecification of σ₀. The apparent bias when σ₀ = 1 likely reflects small-sample effects; larger samples would be needed to demonstrate the consistency of the estimator. This estimator also exhibits substantial variability. In contrast, the estimator for σ² in Eq. (3) appears robust to misspecification of 𝛿₀ (at least within the range considered here) and is markedly more precise.
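The contrast described above can be seen without simulation by plugging the true pooled variance σ² + 𝛿²/4 into the two estimators. The sketch below does this for the true values used in the simulation (𝛿 = 0.5, σ = 1). It assumes the estimators take the forms 4(S² − σ₀²) and S² − 𝛿₀²/4, truncated at zero. The function names are illustrative, and the asymptotic SD is derived here from the fourth central moment of the equal-weight mixture under i.i.d. sampling, so treat it as a rough guide rather than the author's exact expression.

```python
import math


def pooled_variance(delta, sigma, omega=0.5):
    """Variance of the blinded mixture: sigma^2 + omega * (1 - omega) * delta^2."""
    return sigma ** 2 + omega * (1 - omega) * delta ** 2


def limit_delta2_hat(sigma0, delta=0.5, sigma=1.0):
    """Large-sample limit of the truncated estimator of delta^2 given sigma0."""
    return max(0.0, 4.0 * (pooled_variance(delta, sigma) - sigma0 ** 2))


def limit_sigma2_hat(delta0, delta=0.5, sigma=1.0):
    """Large-sample limit of the truncated estimator of sigma^2 given delta0."""
    return max(0.0, pooled_variance(delta, sigma) - delta0 ** 2 / 4.0)


def asymptotic_sd_s2(n, delta=0.5, sigma=1.0):
    """Approximate SD of the pooled sample variance for n i.i.d. mixture draws."""
    return math.sqrt(sigma ** 2 * (2 * sigma ** 2 + delta ** 2) / n)


if __name__ == "__main__":
    for sigma0 in (0.90, 1.00, 1.10):
        print(f"sigma0={sigma0:.2f} -> limit of delta2_hat = {limit_delta2_hat(sigma0):.4f}")
    for delta0 in (0.45, 0.50, 0.55):
        print(f"delta0={delta0:.2f} -> limit of sigma2_hat = {limit_sigma2_hat(delta0):.4f}")
    sd = asymptotic_sd_s2(130)
    # The delta^2 estimator is 4 * S^2 plus a constant, so its SD is 4 times larger.
    print(f"approx SD at n=130: sigma2_hat {sd:.3f}, delta2_hat {4 * sd:.3f}")
```

Under these assumptions, a 10% error in σ₀ moves the limiting value of the 𝛿² estimate from 0.25 to about 1.01 or down to 0, whereas a 10% error in 𝛿₀ shifts the limiting σ² estimate by only about 0.01. The factor of 4 in the 𝛿² estimator also multiplies its standard deviation by 4 relative to the σ² estimator.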
This creates a practical dilemma for blinded SSR: historical variance σ₀² is often readily available, but the poor performance of the estimator for 𝛿² in Eq. (2) makes it unattractive; conversely, the estimator for σ² in Eq. (3) is appealing, yet historical information 𝛿₀² is seldom reliable. A natural question follows: can we construct an estimator that does not rely on historical data? This will be addressed in Part 2 of this series.