National Cancer Institute Home at the National Institutes of Health | www.cancer.gov

Approaches to Evaluation and Validation of Therapeutically Relevant Biomarkers

Annette Molinaro, PhD; Yale University School of Medicine

To avoid unnecessary toxicity and expense, clinicians need better tools for selecting individual patients for treatment and for accurately predicting who will and who will not respond. Although new technologies for genomic profiling have been developed, none has made it into clinical practice because biomarker classifiers are difficult to develop and to validate sufficiently.

The main steps to developing a classifier are to:

  • select a prediction model;
  • split the sample data into training and test sets;
  • perform feature selection;
  • fit the model to the training set; and
  • estimate the prediction accuracy with the test set.
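
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenter's method: the data are synthetic, the prediction model is a simple nearest-centroid rule, features are ranked by the absolute difference in class means, and all names (make_data, select_features, fit_centroids, predict) are hypothetical.

```python
import random
import statistics

def make_data(n, p, n_informative, shift, rng):
    """Synthetic data (assumption): class 1 is shifted on the first n_informative features."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                  for j in range(p)])
        y.append(label)
    return X, y

def select_features(X, y, k):
    """Return the k feature indices with the largest absolute class-mean difference."""
    def gap(j):
        m0 = statistics.fmean(row[j] for row, lab in zip(X, y) if lab == 0)
        m1 = statistics.fmean(row[j] for row, lab in zip(X, y) if lab == 1)
        return abs(m1 - m0)
    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]

def fit_centroids(X, y, features):
    """Step 1 model (assumption): nearest centroid, one mean vector per class."""
    return {lab: [statistics.fmean(row[j] for row, l in zip(X, y) if l == lab)
                  for j in features]
            for lab in (0, 1)}

def predict(centroids, features, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(lab):
        return sum((row[j] - c) ** 2 for j, c in zip(features, centroids[lab]))
    return min((0, 1), key=dist)

rng = random.Random(0)
X, y = make_data(n=90, p=50, n_informative=5, shift=2.0, rng=rng)

# Step 2: split the samples into training (two-thirds) and test (one-third) sets.
idx = list(range(len(X)))
rng.shuffle(idx)
cut = (2 * len(idx)) // 3
train, test = idx[:cut], idx[cut:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]

# Step 3: feature selection uses the training set only.
features = select_features(X_train, y_train, k=5)

# Step 4: fit the model to the training set.
centroids = fit_centroids(X_train, y_train, features)

# Step 5: estimate prediction accuracy on the untouched test set.
correct = sum(predict(centroids, features, X[i]) == y[i] for i in test)
accuracy = correct / len(test)
print(f"test accuracy: {accuracy:.2f} on {len(test)} held-out samples")
```

The test samples are used exactly once, at the last step, and play no part in choosing features or fitting the centroids.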

Because it is possible to find a seemingly perfect classifier even when no signal is present, some form of separate training and test sets must be used to guard against over-fitting and chance findings. It is important to remember that the model must not be adjusted or refit on the test set and that feature selection must be done within the training set only.
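
A small simulation can make this warning concrete. The sketch below is an assumption-laden illustration, not part of the presentation: it generates pure noise (labels unrelated to the features), then compares an evaluation in which features are selected and the model is scored on the same samples with one in which selection and fitting are confined to the training part of a split. The nearest-centroid rule, the feature-ranking score, the repetition over 20 random splits (used here only to stabilize the illustration), and the names top_features and accuracy are all hypothetical choices.

```python
import random
import statistics

def top_features(X, y, k):
    """Indices of the k features with the largest absolute class-mean difference."""
    def gap(j):
        m0 = statistics.fmean(row[j] for row, lab in zip(X, y) if lab == 0)
        m1 = statistics.fmean(row[j] for row, lab in zip(X, y) if lab == 1)
        return abs(m1 - m0)
    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]

def accuracy(X_fit, y_fit, X_eval, y_eval, features):
    """Fit a nearest-centroid rule on (X_fit, y_fit) and score it on (X_eval, y_eval)."""
    cents = {lab: [statistics.fmean(r[j] for r, l in zip(X_fit, y_fit) if l == lab)
                   for j in features]
             for lab in (0, 1)}
    def predict(row):
        return min((0, 1), key=lambda lab: sum((row[j] - c) ** 2
                                                for j, c in zip(features, cents[lab])))
    hits = sum(predict(r) == l for r, l in zip(X_eval, y_eval))
    return hits / len(y_eval)

rng = random.Random(1)
n, p, k = 20, 2000, 20
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]  # pure noise
y = [i % 2 for i in range(n)]                                    # labels carry no signal

# Flawed: features are selected, and the model is fit and scored, on the same samples.
leaky_features = top_features(X, y, k)
resub_acc = accuracy(X, y, X, y, leaky_features)

# Sound: selection and fitting see only the training part of each split.
honest = []
for _ in range(20):
    idx = list(range(n))
    rng.shuffle(idx)
    tr, te = idx[:13], idx[13:]          # roughly two-thirds / one-third
    X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
    X_te, y_te = [X[i] for i in te], [y[i] for i in te]
    feats = top_features(X_tr, y_tr, k)
    honest.append(accuracy(X_tr, y_tr, X_te, y_te, feats))
honest_acc = statistics.fmean(honest)

print(f"same-sample accuracy on noise: {resub_acc:.2f}")
print(f"held-out accuracy on noise:    {honest_acc:.2f}")
```

With thousands of candidate features and only 20 samples, some features separate the classes by chance alone, so the same-sample accuracy looks excellent while the held-out accuracy stays near the 50% expected from guessing.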

After statistical significance is assessed and prediction error is estimated, the investigator should determine whether the confidence interval for the prediction error includes the error rate expected by chance. Two methods can be used for internal validation: split sample (in which two-thirds of the sample is placed in the training set and one-third is placed in the test set) and leave-one-out cross-validation (in which the training set is n − 1 observations, the test set is 1 observation, and the validation is repeated n times until each observation has been in the test set once).
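
Leave-one-out cross-validation and the comparison with chance can be sketched as follows. The one-feature data set, the midpoint-of-means classifier, the use of a Wilson score interval, and the 50% chance level (two balanced classes) are assumptions made for illustration; the source does not specify an interval method. Because leave-one-out predictions are not independent, a binomial interval like this one is only approximate.

```python
import math
import statistics

def leave_one_out(n):
    """Yield (training indices, held-out index): n splits, each of size n - 1 and 1."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k / n."""
    phat = k / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = z * math.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Hypothetical one-feature data: 10 samples per class.
values = [0.9, 1.1, 1.3, 0.7, 1.0, 1.6, 1.2, 0.8, 2.4, 1.4,   # class 0
          2.8, 3.1, 2.6, 3.3, 2.9, 1.5, 3.0, 2.7, 3.4, 3.2]   # class 1
labels = [0] * 10 + [1] * 10

errors = 0
for train_idx, held_out in leave_one_out(len(values)):
    # Refit on the n - 1 training observations only.
    m0 = statistics.fmean(values[j] for j in train_idx if labels[j] == 0)
    m1 = statistics.fmean(values[j] for j in train_idx if labels[j] == 1)
    threshold = (m0 + m1) / 2
    predicted = 1 if values[held_out] > threshold else 0
    errors += predicted != labels[held_out]

low, high = wilson_interval(errors, len(values))
chance = 0.5  # assumption: two balanced classes
includes_chance = low <= chance <= high
print(f"LOOCV errors: {errors}/{len(values)}, "
      f"interval ({low:.3f}, {high:.3f}), includes chance: {includes_chance}")
```

In this toy data set two of the 20 held-out predictions are wrong, and the upper end of the interval falls below 0.5, so the estimated error is distinguishable from guessing.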

Internal validation answers questions about the accuracy of the classifier, its ability to enhance prediction accuracy, and whether it is worthy of further investigation. If the genomic classifier is worthy of further investigation, then its broad clinical application can be examined through external validation. This independent validation of prediction accuracy for the completely specified classifier determines whether patients benefit from use of the classifier (e.g., better efficacy, reduced incidence of adverse events) compared with not using it.

Last Modified: 18 Oct 2013