Comparative Study
J Clin Oncol. 2010 Apr 10;28(11):1936-41.
doi: 10.1200/JCO.2009.25.5489. Epub 2010 Mar 8.

Comparison of error rates in single-arm versus randomized phase II cancer clinical trials

Hui Tang et al. J Clin Oncol. 2010.

Abstract

PURPOSE: To improve the understanding of the appropriate design of phase II oncology clinical trials, we compared error rates in single-arm, historically controlled designs and randomized, concurrently controlled designs.

PATIENTS AND METHODS: We simulated error rates for each design separately, using individual patient data from a large phase III colorectal cancer trial and statistical models that account for random and systematic variation in historical control data.

RESULTS: In single-arm trials, false-positive error rates (type I error) were 2 to 4 times those projected when modest drift or patient selection effects (eg, a 5% absolute shift in control response rate) were included in the statistical models. The power of single-arm designs simulated using actual data was highly sensitive to the fraction of patients from treatment centers with high versus low patient volumes, the presence of patient selection effects or temporal drift in response rates, and random small-sample variation in historical controls. Increasing the sample size did not correct the overoptimism of single-arm studies. The randomized two-arm design conformed to planned error rates.

CONCLUSION: Variability in historical control success rates, outcome drift in patient populations over time, and/or patient selection effects can result in inaccurate false-positive and false-negative error rates in single-arm designs, but leave the performance of the randomized two-arm design largely unaffected, at the cost of 2 to 4 times the sample size of single-arm designs. Given a large enough patient pool, randomized phase II designs provide a more accurate decision for screening agents before phase III testing.
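The effect of drift on a single-arm design can be illustrated with a small exact calculation. The sketch below is not the authors' simulation; it assumes a simple single-stage exact binomial decision rule (the function names `binom_tail` and `critical_value` are hypothetical) and borrows n = 21, a historical control rate of 0.2, and α = 0.2 from the figure captions, together with the 5% absolute shift mentioned in the abstract. It computes the false-positive rate of the planned rule when the true control rate is 0.20 and when it has drifted to 0.25.

```python
from math import comb


def binom_tail(n, r, p):
    """Return P(X >= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def critical_value(n, p0, alpha):
    """Smallest r such that P(X >= r | p0) <= alpha (assumed decision rule)."""
    for r in range(n + 2):  # r = n + 1 gives an empty tail, so a value is always returned
        if binom_tail(n, r, p0) <= alpha:
            return r


# Parameters taken from the figure captions; the test itself is an assumption.
N, P0, ALPHA = 21, 0.20, 0.20
DRIFT = 0.05  # 5% absolute shift in the control response rate

r = critical_value(N, P0, ALPHA)
nominal = binom_tail(N, r, P0)          # type I error if the historical rate is right
drifted = binom_tail(N, r, P0 + DRIFT)  # type I error if the true control rate drifted

print(f"declare activity if responses >= {r}")
print(f"type I error, no drift:   {nominal:.3f}")
print(f"type I error, with drift: {drifted:.3f}")
```

Because the rejection threshold is fixed from the historical rate, any upward shift in the true control rate raises the chance of declaring an inactive agent promising. The size of that inflation under this simplified rule need not match the 2 to 4 times reported in the abstract, which comes from richer models.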

Conflict of interest statement

Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.

Figures

Fig 1. Choice of null hypothesis has large impact on study conclusions based on single-arm trials of N = 50 drawn from N9741 for confirmed response end point.

Fig 2. The selection of patients from different treating location volumes influences the trial conclusions. High-volume centers yield high power estimates compared with mid- and low-volume treating locations.

Fig 3. Error rates under various treatment effects (Delta) when adding sources of bias sequentially. Parameters: α = β = 0.2. W = 0.1. Four historical controls with mean success rate = 0.2, target success rate = 0.4. n = 21 for single-arm trials, n = 47 per arm for randomized trials. DS, drift and selection effects; beta, historical control variability; fixed, fixed historical control rate.

Fig A1. Choice of null hypothesis has large impact on study conclusions based on single-arm trials of N = 50 drawn from N9741 for 6-month overall survival (OS) rate.

Fig A2. Choice of null hypothesis has large impact on study conclusions based on single-arm trials of N = 50 drawn from N9741 for 6-month time-to-progression rate.

Fig A3. Error rates under various treatment effects (Delta) for single-arm versus randomized two-arm designs assuming fixed historical control response rate and no drift and selection effects. Parameters: historical control success rate = 0.2, target success rate = 0.4. α = β = 0.2, n = 21 for single-arm trials, n = 47 per arm for randomized two-arm trials. Fixed, fixed historical control rate.

Fig A4. Error rates when assuming variability in historical controls and patient drift and selection effects. Parameters: 4 historical controls, historical control success rate = 0.2, target success rate = 0.4. W = 0.2. α = β = 0.2, n = 21 for single-arm trials, n = 47 per arm for randomized two-arm trials. Beta, historical control variability; DS, drift and selection effects.

Fig A5. Similar error rate plots as Appendix Figure A4 with larger sample size. α = β = 0.1, n = 40 for single-arm trials, n = 89 per arm for randomized two-arm trials. W = 0.2. Beta, historical control variability; DS, drift and selection effects.
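The randomized counterpart can be sketched in the same spirit. The example below is an illustration under stated assumptions, not the analysis used in the paper: it assumes a one-sided pooled two-proportion z-test at α = 0.2 (the hypothetical function `rejection_probability`), with n = 47 per arm and success rates of 0.2 and 0.4 as in the captions above. It enumerates every possible pair of outcomes to get the exact rejection probability, first with both arms at 0.20, then with both arms drifted to 0.25, then with a true treatment effect.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(n, k, p):
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n, p_control, p_treatment, alpha):
    """Exact probability that a one-sided pooled z-test favors treatment.

    The choice of test is an assumption made for this sketch.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    pmf_c = [binom_pmf(n, x, p_control) for x in range(n + 1)]
    pmf_t = [binom_pmf(n, y, p_treatment) for y in range(n + 1)]
    total = 0.0
    for x in range(n + 1):          # responders in the control arm
        for y in range(n + 1):      # responders in the treatment arm
            pooled = (x + y) / (2 * n)
            if pooled in (0.0, 1.0):
                continue            # zero variance: no evidence either way
            z = ((y - x) / n) / sqrt(2 * pooled * (1 - pooled) / n)
            if z > z_crit:
                total += pmf_c[x] * pmf_t[y]
    return total


N_PER_ARM, ALPHA = 47, 0.20
no_drift = rejection_probability(N_PER_ARM, 0.20, 0.20, ALPHA)
with_drift = rejection_probability(N_PER_ARM, 0.25, 0.25, ALPHA)
power = rejection_probability(N_PER_ARM, 0.20, 0.40, ALPHA)

print(f"type I error, control rate 0.20: {no_drift:.3f}")
print(f"type I error, control rate 0.25: {with_drift:.3f}")
print(f"power at 0.20 versus 0.40:       {power:.3f}")
```

Because the concurrent control arm drifts together with the treatment arm, the false-positive rate stays close to the planned level in both scenarios, which is the qualitative behavior the captions attribute to the randomized two-arm design.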
