Chapter 2 Experimental Design and Statistical Considerations

2.1 Introduction

2.1.1 What

Experimental design encompasses features of a clinical trial that relate to its structure and operations.

2.1.2 Why

Careful planning and documentation of the design of a clinical intervention study are critical for maintaining rigor and obtaining results that are likely to be reproducible. These considerations are intimately connected to the trial’s suitability for formal statistical analysis and the production of valid results. It is thus imperative that all aspects of design receive the highest level of scrutiny during the planning phase.

2.1.3 How

It is critical that aspects of design be tailored to the goals and specific aims of the trial. Below we delineate aspects of experimental and statistical design that should be considered. These are neither comprehensive nor equally applicable to all scenarios, but they should provide a starting point for planning your study.

2.2 Design of the study

2.2.1 What

In this toolkit we explicitly consider randomized experiments conducted with human subjects, most typically clinical trials.

2.2.2 Why

Key design features will ultimately dictate the internal validity and generalizability of trial results.

2.2.3 How

The following items should be considered during the design phase.

  1. Rationale for conduct of the trial. The scientific / biomedical impetus necessitating conduct of the trial.
  2. Overall goal. What, specifically, the study is intended to accomplish. Investigators should note the degree to which the trial will be considered explanatory (that is, focused on estimation of pure causal effects of intervention under controlled circumstances) versus pragmatic (focused on choosing between therapeutic options, often under circumstances mimicking or derived directly from clinical experience). Many studies will have both explanatory and pragmatic aims.
  3. Illness condition or state. The condition whose modification motivates the study.
  4. Sampling population. Individuals on whom sampling should focus, including explicit requirements for inclusion and factors that would preclude enrollment (exclusions).
  5. General approach to design. The general nature of the approach to assessing intervention efficacy and safety. For instance, the trial may be intended to demonstrate superiority, noninferiority or equivalence of a novel intervention relative to standard care; to determine the maximum tolerated dose of a particular agent in a given population; to demonstrate a dose-response relationship between intervention dosing and endpoints; etc.
  6. Specific aims. Detailed aims addressable by evaluation of statistical hypotheses.
  7. Primary and secondary endpoints.
  8. General nature of the planned comparisons. Key features of the comparisons to be made; e.g. mean performance on a primary endpoint.
  9. Safety concerns. Risks, including loss of confidentiality, associated with the intervention or trial activities. Note that these will be explicitly considered by the Institutional Review Board in assessing the risk/benefit ratio of conducting the trial.
  10. Ethical considerations. Additional safety or ethical considerations.
  11. Measurement and quantification of effects. The way in which treatment effects will be quantified.
  12. Determination of statistical significance of comparisons. Summary of the nature of the quantitative comparisons, to be fleshed out in the statistical analysis plan.
  13. Study staffing and environment.
    1. Leadership. Designation of membership and roles of investigative team leaders.
    2. Staffing. Designation of study staff, with specific delineation of responsibilities.
    3. Environment. Physical space in which the study will be conducted. Explicitly considers access and travel to and from the venue, as appropriate.
  14. Measurement. Detailed explication of the way in which measurements will be taken. Makes specific note of resource needs (for instance, whether images or biospecimens must be collected or stored, and the equipment, facilities and human resources necessary to obtain said measures).
  15. Nature and administration of intervention. Specific detail on the intervention, including how, how frequently, and by whom it will be administered and monitored.
  16. Study intervention period. Length of time over which intervention would be administered, including run-in and washout periods as applicable.
  17. Frequency of participant interactions, data collection and endpoint measurement. Number and frequency of participant interactions and measurements; often displayed in tabular format.
  18. Participant allocation or randomization. Methods by which participants will be assigned to trial arms, e.g. by randomization, with detail on the specific procedures used to determine these assignments. Must consider factors (e.g. blocking or stratification) that affect this process; an illustrative sketch of one common scheme follows this list.
  19. Masking / blinding. Determination of whether and how investigators, staff, participants, analysts etc. are aware of intervention assignments.
  20. Data collection and management. Detail on the physical systems for data capture, the methods by which data will be captured, the persons performing data entry, monitoring of data completeness and quality, etc. (see Data Management Module).
  21. Statistical design and sample size. See below.
  22. Measurement of trial adherence. Methods by which completeness of planned participant interactions with randomized activities will be quantified, and plans for analysis, if any.
  23. Training. Provisions for training of study personnel, including the human resources for providing the training.
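
As a concrete illustration of item 18 above, the following is a minimal sketch of stratified, permuted-block randomization, assuming two arms, a block size of four, and illustrative stratum names and seeds; it is not a substitute for a validated randomization system.

    import random

    def permuted_block_schedule(n_blocks, block_size=4, arms=("A", "B"), seed=2024):
        """Generate a permuted-block allocation list for one stratum.

        Each block contains an equal number of assignments to each arm,
        shuffled within the block, so arm totals stay balanced over time.
        """
        assert block_size % len(arms) == 0
        rng = random.Random(seed)  # fixed seed so the schedule is reproducible
        schedule = []
        for _ in range(n_blocks):
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            schedule.extend(block)
        return schedule

    # One independent schedule per stratum (e.g., site by sex), each with its own seed.
    strata_seeds = {"site1_female": 101, "site1_male": 102, "site2_female": 201, "site2_male": 202}
    schedules = {s: permuted_block_schedule(n_blocks=25, seed=sd) for s, sd in strata_seeds.items()}
    print(schedules["site1_female"][:8])  # first eight assignments in that stratum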

2.2.4 Special considerations for older adults

The number of participants who are available and likely to consent to enrollment can be difficult to estimate among older individuals, particularly in the context of numerous exclusionary factors.

2.2.5 Common pitfalls

Lack of specificity in describing the intervention; insufficient attention to the potential for attrition, data missingness, or intervention non-adherence; lack of detail concerning the manner in which participants may be contacted, recruited, and retained in the trial.

2.2.6 Resources

2.3 General Statistical Considerations

2.3.1 What

The statistical design works hand-in-hand with the experimental design to establish procedures for data interpretation and analysis. Major considerations include planning the enrollment and sample size, the general analytic approach, a priori consideration of the way in which results will be interpreted, and plans to resolve expected and unexpected problems, including data missingness and untoward measurements.
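
As a concrete illustration of one of these considerations, the sketch below computes a per-arm sample size for comparing two means with a two-sided test, using the standard normal-approximation formula; the effect size, standard deviation, alpha and power shown are placeholders rather than recommendations, and the result would typically be inflated to allow for anticipated attrition.

    import math

    from scipy.stats import norm

    def n_per_arm(delta, sd, alpha=0.05, power=0.80):
        """Per-arm n for a two-sample comparison of means.

        Standard approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2,
        where delta is the smallest difference worth detecting (e.g., the
        minimum clinically important difference).
        """
        z_alpha = norm.ppf(1 - alpha / 2)
        z_beta = norm.ppf(power)
        return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

    # Example: detect a 5-point difference assuming SD 12, two-sided alpha 0.05, 80% power.
    print(n_per_arm(delta=5, sd=12))  # about 91 per arm before inflating for attrition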

2.3.2 Why

Appropriate sample size and a statistical analysis plan are critical for the validity of conclusions and to prevent bias in operations and conclusions.

2.3.3 How

Derivation of the statistical analysis plan is detailed in the sections below.

2.4 Statistical Analysis Plan (SAP)

2.4.1 What

The project Statistical Analysis Plan (SAP) provides detailed descriptions of statistical analyses to be conducted for the trial, including rationale for choice of methods, plans for dealing with unexpected difficulties, and pre-specified guidance on interpretation of results.

2.4.2 Why

Guidance on data ascertainment, management, storage, analysis and interpretation is critical to preserve the validity of the design and the soundness of scientific conclusions.

2.4.3 How

A version of the SAP may be included as an addendum to or embedded within the study protocol and manual of procedures, but ideally the authoritative SAP should be an independent document with its own formatting, references, etc. The SAP must be assembled prior to enrollment and approved by the appropriate parties, including the trial statistician, the investigative team, and oversight bodies (e.g. the DSMB), and may be included with the package submitted to Institutional Review Boards or equivalent bodies overseeing ethical approval of the trial. Authorship should be by the project statistician, assisted by the investigative team. As with all trial documents, the SAP should be under strict version control, with a ‘living’ electronic, date-stamped version considered the authoritative document (see Essential Documentation Module).

2.4.3.1 Contents of the document

The contents of the SAP will vary from trial to trial. For a conventional intervention trial designed to assess efficacy, in which participants are allocated to one of two or more groups (e.g. intervention vs. control), contents might be as follows.

  1. Introduction.
    1. Background. Provides explanatory information concerning disease target, patient population, etc. Includes references to authoritative literature.
    2. Objectives. Briefly describes overall analytic goals.
      1. Primary objective. Describes primary objective of the trial, being specific concerning such matters as demonstration of feasibility; efficacy / effectiveness; safety. Provides some clarity concerning expected treatment effects and public health relevance.
      2. Secondary objectives. As above, for secondary objectives.
  2. Endpoints.
    1. Primary endpoints. Describes the primary endpoint(s). May give detail as to measurement, validity and reliability, and other performance characteristics.
    2. Secondary endpoints. As above, for secondary endpoints.
  3. Protocol / Grant SAP. Summarizes the existing plan as presented in the original protocol or funding application.
  4. Design elements.
    1. Basic design features. Discusses overall experimental structure for purposes of comparison.
      1. Nature of primary comparison. Describes the basic target of interest, i.e. determination of superiority, equivalence or noninferiority of intervention to control or current standard of care.
      2. Participant allocation. Describes randomization or other procedures by which participants are allocated to trial groups.
    2. Schedule of events. Gives a summary of time-points and measurements that will be used.
    3. Blinding. Describes the degree to which data analysts and study personnel are blinded, and how blinding will be maintained.
    4. Blocking and stratification / matching. Describes blocking, matching, etc. and references plans for consideration of design effects in analysis.
    5. Randomization Scheme. Describes the randomization scheme and procedures.
    6. Intended Sample Size. Describes the intended final sample size, referencing potential hurdles such as attrition, crossover, non-adherence and other sources of bias.
    7. Effects intended to be estimated. Describes inferential targets for the analysis.
      1. Bounds for determination of statistical significance. Gives clinical and statistical thresholds for determination of significance of evidence and, where appropriate, rejection of null hypotheses.
      2. Argument for clinical significance. Establishes clinical relevance of effect to be estimated, potentially using the minimum clinically important difference or similar construct.
      3. A priori determination of interpretation of results.
  5. Analysis Populations / datasets. Describes analysis populations.
    1. Full analysis dataset. Typically all enrolled participants.
    2. Intention-to-treat dataset. Typically all participants allocated to trial groups. Meant to simulate the effect of real-world deployment of intervention.
    3. Per-protocol dataset. Typically participants meeting some bound on adherence to trial procedures. Meant to assess causal effect of intervention under adherent conditions.
  6. Detailed Analytic Plan. Presents the detailed project analytic plan. Where appropriate this may be organized by specific aim or project objectives.
    1. Primary endpoint(s). Describe primary, supporting and exploratory analyses intended for the Primary Endpoint.
      1. Main analysis of primary endpoint(s). Describes the controlling, primary analysis of the primary endpoint(s).
      2. Supportive analysis of primary endpoint(s). Describes secondary, supporting version of main analysis.
      3. Exploratory analysis of primary endpoint(s). Describes additional, perhaps hypothesis-generating analyses, sensitivity assessments, etc.
    2. Secondary endpoint(s). Describe primary, supporting and exploratory analyses intended for the Secondary Endpoints.
      1. Main analysis of secondary endpoint(s). As above, for secondary endpoints.
      2. Supportive analysis of secondary endpoint(s). As above, for secondary endpoints.
      3. Exploratory analysis of secondary endpoint(s). As above, for secondary endpoints.
    3. Safety analysis. Present plan for analyses of safety signals. Refer to relevant guidance.
      1. Exposure to intervention and trial procedures.
      2. Adverse events.
      3. Deaths and Serious Adverse Events.
      4. Other safety parameters.
    4. Interim analysis. Describe the rationale for and operationalization of planned interim analysis. Pay particular attention to the objective – e.g. stopping for futility, demonstrated efficacy, etc.
      1. Reasons for interim analysis. Briefly provide rationale.
      2. Objective of the interim analysis. Describe the objectives of the interim look.
      3. Planned schedule of interim analysis. Provides the schedule and criteria by which this may be revised.
      4. Scope of potential adaptations. Describes ways in which the design may be altered as a result of the interim analysis.
      5. Stopping rule. Provides the threshold, if any, of evidence that may result in early cessation of the trial for futility and/or for early demonstration of efficacy or other reasons.
      6. Adjustments to confidence intervals and p-values. Describes the method by which interim analyses will be acknowledged in subsequent data presentations, including management of type-I error rates and related quantities.
      7. Sample-size re-estimation and conditional power. Describes procedures for sample-size re-estimation and the summary of conditional power computations; an illustrative sketch follows this list.
      8. Documentation of interim analysis results. Provides template for presentation of results.
  7. Other methodological aspects.
    1. Special considerations in measurement. Here or elsewhere, addresses issues such as measurement error, technical aspects of, for instance, biomarker or imaging analysis, and other specialized considerations.
    2. Quantification of adherence. Considers whether adherence to trial procedures should be assessed and analyzed, and if so provides the relevant planning.
    3. Covariates and subgroups. Gives descriptions of covariates and subgroups to be considered.
    4. Handling of missing data. Describes the anticipated effects of attrition and other missing-data mechanisms, and the analytic plans for taking these into account.
    5. Handling of outliers and unresolved queries. Describes approaches to dealing with untoward values or unexplained observations. May consider sensitivity analysis or other approaches; should be very specific and explicit.
    6. Multiplicity adjustments. Describes plans, if any, to address multiple-comparisons issues; a brief sketch of one common adjustment follows this list.
    7. Other considerations.
  8. Reporting conventions. Describes manner in which results will be communicated; may include detailed instructions on units, significant digits and rounding, etc.
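
To make item 6.4.7 above concrete, the following is a minimal sketch of a conditional-power calculation for a single interim look, using the B-value (Brownian-motion) approximation and assuming that the currently observed trend continues; the interim z-value and information fraction are illustrative only.

    from scipy.stats import norm

    def conditional_power(z_interim, info_frac, alpha_one_sided=0.025):
        """Conditional power under the 'current trend' assumption.

        With B(t) = Z(t) * sqrt(t) and the drift estimated at information
        fraction t assumed to persist to the final analysis,
            CP = Phi((Z(t) / sqrt(t) - z_{1-alpha}) / sqrt(1 - t)).
        """
        z_crit = norm.ppf(1 - alpha_one_sided)
        return norm.cdf((z_interim / info_frac ** 0.5 - z_crit) / (1 - info_frac) ** 0.5)

    # Example: interim z = 1.5 with half of the planned information accrued.
    print(round(conditional_power(z_interim=1.5, info_frac=0.5), 3))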
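
Similarly, as an illustration of the multiplicity adjustments in item 7.6, the sketch below applies the Holm step-down procedure to a hypothetical set of p-values; both the p-values and the family-wise alpha are placeholders.

    def holm_adjust(p_values):
        """Return Holm step-down adjusted p-values, in the original order.

        The i-th smallest raw p-value (1-based rank i among m tests) is
        multiplied by (m - i + 1); a running maximum keeps the adjusted
        values monotone, and everything is capped at 1.
        """
        m = len(p_values)
        order = sorted(range(m), key=lambda i: p_values[i])
        adjusted = [0.0] * m
        running_max = 0.0
        for rank, idx in enumerate(order):
            running_max = max(running_max, (m - rank) * p_values[idx])
            adjusted[idx] = min(1.0, running_max)
        return adjusted

    # Example: three secondary-endpoint p-values, family-wise alpha = 0.05.
    print(holm_adjust([0.012, 0.030, 0.045]))  # reject where adjusted p < 0.05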

2.4.4 Special considerations for older adults

Aging populations are subject to a number of factors that may induce bias or difficulty in analysis, including enhanced risk of attrition or competing risks, elevated prevalence of multimorbidity, difficulties with participant recall, and other factors (see Van Ness et al. in the References). These and related issues must be carefully addressed within the SAP and during analysis.

2.4.5 Common pitfalls

Lack of specificity in describing the techniques to be used and the controlling analytic decisions. Examples: failure to specify the method by which standard errors are to be computed (model-based vs. robust vs. resampling-based methods); failure to specify the set of covariates to be used in the definitive analysis (or the method by which that set of covariates will be derived).
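
As an illustration of the distinction drawn above, the sketch below contrasts a model-based standard error for a difference in means with a nonparametric bootstrap (resampling-based) standard error; the simulated data, sample sizes and number of replicates are placeholders.

    import numpy as np

    rng = np.random.default_rng(7)
    treated = rng.normal(1.0, 2.0, size=60)  # placeholder outcome data
    control = rng.normal(0.0, 2.0, size=60)

    # Model-based SE for a difference in means, assuming independent samples.
    diff = treated.mean() - control.mean()
    se_model = np.sqrt(treated.var(ddof=1) / treated.size + control.var(ddof=1) / control.size)

    # Resampling-based SE: bootstrap each group and recompute the difference.
    boot = []
    for _ in range(2000):
        t_star = rng.choice(treated, size=treated.size, replace=True)
        c_star = rng.choice(control, size=control.size, replace=True)
        boot.append(t_star.mean() - c_star.mean())
    se_boot = np.std(boot, ddof=1)

    print(f"difference={diff:.2f}  SE(model)={se_model:.3f}  SE(bootstrap)={se_boot:.3f}")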

2.4.6 Resources

2.5 Presentation of Pre-planned Statistical Analyses

2.5.1 What

Presentation of statistical analyses to collaborators within the trial can range from casual exploratory communications to formal presentation of results as described in the SAP. In general, formal inference should follow the pre-specified plan laid out in the SAP, with particular attention to maintenance of blinding and the potential for introduction of bias, and investigators should consult regulations and guidance as to the degree to which discussion of preliminary analyses by the investigative team is appropriate. Formal presentation of results should follow an established template consistent with the SAP and include the critical elements outlined below.

2.5.2 Why

Protection against confirmation bias and other hazards is critical for the maintenance of rigor and validity of conclusions. Following the agreed-upon template for analyses (as documented in the SAP) provides this protection.

2.5.3 How

Analyses should be presented in memoranda constructed in a reproducible fashion, with carefully managed version control and documentation of data sources and other items as described below. Authorship should be a collaboration between the study statistician and other members of the analytic team, and must include as contributors the persons actually performing the analyses.

2.5.3.1 Contents of the document

Some critical items for inclusion are described below.

  1. Introduction and objectives. Brief background for the proposed analysis, with particular attention to measurement or design issues relevant to the presentation of results.
    1. Data Sources. Describes the studies and data set(s) used. Must specify the version of the data structures (i.e. the date upon which the dataset was frozen and/or transmitted to the analyst).
    2. Software and computing environment. Describes the analytic tools and machinery employed, including version numbers; see the sketch following this list.
    3. Aims and Analysis Objectives. Briefly states the aims and hypotheses of the analysis. Includes a brief prose summary of how each aim was addressed in the analyses.
  2. Methods. Describes analytic sample and methods employed.
    1. Analytic samples and timeframe. Defines samples used. Specifies any subgroups to be used in analysis. Specifies the analysis timeframe (i.e. which visits or measurements were incorporated in analysis).
    2. Outcome measures. Specifies the primary and secondary outcomes used in the analysis. If outcomes are derived, provides clear definitions in natural language. Specifies the timepoints at which measures were obtained if not obvious from text above.
    3. Control Variables. Lists covariates used in the analyses. Where variables were derived, provide clear definitions. Specifies the timepoints at which measures were obtained, if not obvious from above.
    4. Statistical Procedures. Provides the types of statistical methods/models used as well as any statistical tests performed specific to each analytic aim identified above. Identifies primary, supporting and exploratory analyses (see Section XXX).
    5. Sensitivity Analysis. Describes and justifies any sensitivity analyses, i.e. using different samples, covariates, methods/models, etc.
    6. Treatment of missing data. Describes methods by which missing data are acknowledged and/or taken into account.
    7. Deviations from SAP or secondary analytic plan. Describes and justifies any variation in approach from that previously planned.
  3. Presentation of Results. Formal presentation of results. Should track closely with SAP or secondary analytic plan as well as the sections described in Methods (above).
  4. Conclusions. Provides in prose the overall implications of the analysis, with high-level quantitative summaries as appropriate.
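
In support of item 1.2 above (software and computing environment), the sketch below records the interpreter, platform and key package versions at the top of an analysis script so they can be reproduced in the memorandum; the package list is illustrative.

    import platform
    import sys
    from importlib.metadata import PackageNotFoundError, version

    def environment_stamp(packages=("numpy", "pandas", "scipy")):
        """Return a short, printable record of the computing environment."""
        lines = [
            f"python: {sys.version.split()[0]}",
            f"platform: {platform.platform()}",
        ]
        for pkg in packages:
            try:
                lines.append(f"{pkg}: {version(pkg)}")
            except PackageNotFoundError:
                lines.append(f"{pkg}: not installed")
        return "\n".join(lines)

    # Print (or write to the memorandum) before any analytic output is produced.
    print(environment_stamp())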

2.5.4 Special considerations for older adults

See Special considerations for older adults under the Statistical Analysis Plan section.

2.5.5 Common pitfalls

Failure to acknowledge data sources or describe methods with sufficient detail; failure to conduct analyses and document conclusions in a reproducible fashion.

2.5.6 Resources

2.6 Planning, Conduct and Presentation of Secondary Analyses

Following, or in parallel with, completion of the pre-planned analyses, secondary analyses may be requested or conducted. It is recommended that the trial employ a template for the request and design of these analyses that mirrors the structure of the report described in the Contents of the document section above.

2.7 References

Van Ness, P.H., V.R. Towle, and M. Juthani-Mehta (2007). Testing Measurement Reliability in Older Populations: Methods for Informed Discrimination in Instrument Selection and Application. Journal of Aging and Health 20(2): 183-197. DOI: 10.1177/0898264307310448.

Van Ness, P.H., T.E. Murphy, and A. Ali (2016). Attention to Individuals: Mixed Methods for N-of-1 Health Care Interventions. Journal of Mixed Methods Research. DOI: 10.1177/1558689815623685.