Statistical Issues in Investigating Prognostic and Predictive Markers for DCIS
James J. Dignam, PhD
The University of Chicago
Several attributes of DCIS are potentially important in prognostic marker studies: its increasing prevalence as a result of screening, its heterogeneity with respect to various features, its excellent survival prognosis, and the considerable excess risk of invasive breast cancer it confers relative to women without DCIS. DCIS presents a clinical opportunity for early intervention to avoid invasive breast cancer. To take advantage of this opportunity, it is necessary to understand the heterogeneity of DCIS presentation and its characteristics.
A prognostic marker is defined as a characteristic associated with prognosis or outcome, usually expressed as a relative hazard of failure, whereas a predictive marker is defined as a characteristic that is associated with, and predicts, treatment response. Sample size calculations for prognostic markers resemble those in study designs for treatment effects, with some important exceptions: the relative frequency of different levels of the marker cannot be manipulated or controlled; the relative hazard associated with the marker must be larger than is typical for treatment effects for the marker to be of interest or clinically useful; and the marker may be correlated with multiple other known prognostic markers. The challenge in prognostic marker studies is determining how large the sample must be for the study to be useful in discovering and validating markers.
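To illustrate why the uncontrolled marker frequency matters, the following is a minimal sketch that assumes Schoenfeld's approximation for the number of events needed to detect a hazard ratio with a binary marker. The function name and the default significance level and power are illustrative choices, not values taken from the presentation.

```python
from math import ceil, log
from statistics import NormalDist


def events_required(hazard_ratio, prevalence, alpha=0.05, power=0.80):
    """Approximate number of events needed to detect a marker effect.

    Assumes Schoenfeld's approximation for a two-sided test of a binary
    marker under proportional hazards. `prevalence` is the fraction of
    patients who are marker-positive; unlike a treatment allocation
    ratio, it cannot be set by the investigator.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    denominator = prevalence * (1 - prevalence) * log(hazard_ratio) ** 2
    return ceil((z_alpha + z_beta) ** 2 / denominator)


if __name__ == "__main__":
    # Same hazard ratio, different marker prevalence (hypothetical values).
    for p in (0.5, 0.2, 0.1):
        print(f"prevalence={p:.1f}: events needed = {events_required(2.0, p)}")
```

Under these assumptions the required number of events grows as the marker becomes rarer, because the term prevalence × (1 − prevalence) shrinks. Given the excellent prognosis of DCIS, the number of patients needed to observe that many events may be considerably larger still.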
For predictive markers, the situation is more challenging. Detecting an interaction effect requires comparing treated versus untreated patients within each level of the marker and then comparing the size of the treatment effect between marker levels (i.e., the differential treatment effect). It can be shown that the sample size needed to detect such an interaction effect, under some favorable assumptions, is approximately four times larger than that needed to detect a main effect of the same magnitude. Because some interaction effects are large, such studies are not infeasible. Targeted clinical trials, in which the patients most likely to benefit are pre-selected, often can be much smaller than trials with more general eligibility. Similarly, if a treatment is essentially null in one group and effective in another, the interaction effect may be detectable.
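The fourfold figure for interaction effects can be motivated with a short variance argument. The sketch below assumes the most favorable case: equal numbers of events in each of the four treatment-by-marker cells, and the usual approximation that the variance of a log hazard ratio is the sum of the reciprocals of the event counts in the two groups compared. The function names are hypothetical.

```python
from math import log
from statistics import NormalDist


def _z_sum_squared(alpha, power):
    """(z_{1-alpha/2} + z_{power})^2 for a two-sided test."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def events_for_main_effect(hazard_ratio, alpha=0.05, power=0.80):
    # With d total events split evenly between two arms,
    # Var(log HR) ~ 1/(d/2) + 1/(d/2) = 4/d.
    return 4 * _z_sum_squared(alpha, power) / log(hazard_ratio) ** 2


def events_for_interaction(ratio_of_hazard_ratios, alpha=0.05, power=0.80):
    # Each marker stratum holds d/2 events, so each within-stratum
    # log HR has variance ~ 8/d; their difference has variance ~ 16/d.
    return 16 * _z_sum_squared(alpha, power) / log(ratio_of_hazard_ratios) ** 2


if __name__ == "__main__":
    main = events_for_main_effect(2.0)
    inter = events_for_interaction(2.0)
    print(f"main effect: {main:.1f} events, interaction: {inter:.1f} events")
    print(f"ratio: {inter / main:.1f}")
```

The ratio of four holds only under the balanced assumptions stated above. With unequal marker prevalence or unequal event rates across cells, the penalty for detecting an interaction would be larger.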
The advent of modern biotechnology tools (e.g., microarrays) requires the use of statistical methods previously little used in marker studies, as well as the development of novel methods. Studies using these tools tend to focus more on discovering candidate markers than on testing for clinical utility. Once a candidate marker is developed, regardless of origin, analysis from that point forward largely resembles traditional prognostic/predictive marker problems. In these situations, a marker variable may take the form of a prognostic score or index that is a function of several other variables, and general statistical principles for model building need to be followed in developing these scores. Independent validation data are critical, as there may be several equally reasonable scoring algorithms, and the choice among them may be less important than validating one objectively. Additionally, reproducibility of the assay is critical, as the scoring algorithm must be portable to other settings to have utility. Numerous time-to-event endpoints are of interest in DCIS (e.g., ipsilateral recurrence, invasive contralateral tumor, breast cancer death), so it is necessary to determine which are the most important. Competing risks such as second cancers and non-cancer deaths also need to be considered.
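As an illustration of how competing risks enter the analysis, the following is a minimal sketch of a nonparametric cumulative incidence estimate for one event type when other event types can occur first. The function name, the coding of causes (0 for censored, positive integers for event types), and the toy data are assumptions made for this example only.

```python
def cumulative_incidence(times, causes, cause_of_interest):
    """Nonparametric cumulative incidence for one cause among competing risks.

    times  : follow-up time for each patient
    causes : 0 if censored, otherwise a positive integer event type
    Returns a list of (time, cumulative incidence) at each event time.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before time t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_any = d_interest = 0
        j = i
        # Count events of any type and of the type of interest at time t.
        while j < len(data) and data[j][0] == t:
            if data[j][1] != 0:
                d_any += 1
                if data[j][1] == cause_of_interest:
                    d_interest += 1
            j += 1
        if d_any:
            # Increment uses event-free probability just before t.
            cif += event_free * d_interest / n_at_risk
            event_free *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= j - i  # remove events and censored patients at t
        i = j
    return curve


if __name__ == "__main__":
    # Toy data: 1 = event of interest, 2 = competing event, 0 = censored.
    toy_times = [1, 2, 3, 4]
    toy_causes = [1, 2, 1, 0]
    print(cumulative_incidence(toy_times, toy_causes, cause_of_interest=1))
```

Treating competing events as censored observations and reporting one minus the Kaplan-Meier estimate would generally overstate the probability of the event of interest, which is one reason the choice of endpoint and the handling of competing events need to be settled before marker effects are estimated.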
Data sources for DCIS studies may include SEER registries, single-institution or single health care system cohorts, and randomized clinical trial databases. Each source has its strengths and weaknesses. Randomized trials have the advantages of uniform stage at diagnosis and cohort entry criteria, randomized treatment assignment, uniform treatment per a specific protocol, and rigorous follow-up and outcome ascertainment; however, patient diversity is limited, and centralized pathology review is needed to ensure common definitions. Observational cohorts provide a diversity of patients and disease presentations, ancillary data, and control over pathology information. Drawbacks of observational cohorts include treatment selection effects, which threaten the validity of conclusions about the predictive markers in question, differences in institutional or regional pathology definitions, and loss to follow-up.