# How to do ROC-analysis in R with a Cox model

on 2011-10-24T10:10:35-04:00
I've created a few Cox regression models and I would like to see how well these models perform and I thought that perhaps a ROC-curve or a c-statistic might be useful similar to this articles use:

http://onlinelibrary.wiley.com/doi/10.1002/bjs.6930/abstract

Armitage used logistic regression but I wonder if it's possible to use a model from the survival package, the http://cran.r-project.org/web/packages/survivalROC/index.html gives a hint of this being possible but I can't figure out how to get that to work with a regular Cox regression.

I would be grateful if someone would show me how to do a ROC-analysis on this example:

library(survival)
data(veteran)

attach(veteran)
surv <- Surv(time, status)
fit <- coxph(surv ~ trt + age + prior, data=veteran)
summary(fit)

If possible I would appreciate both the raw c-statics output and a nice graph

Thanks!

Update Thank you very much for answers. @Dwin: I would just like to be sure that I've understood it right before selecting your answer.

The calculation as I understand it according to DWin's suggestion:

library(survival)
library(rms)
data(veteran)

fit.cph <- cph(surv ~ trt + age + prior, data=veteran, x=TRUE, y=TRUE, surv=TRUE)

# Summary fails!?
#summary(fit.cph)

# Get the Dxy
v <- validate(fit.cph, dxy=TRUE, B=100)
# Is this the correct value?
Dxy = v[rownames(v)=="Dxy", colnames(v)=="index.corrected"]

# The c-statistic according to the Dxy=2(c-0.5)
Dxy/2+0.5

I'm unfamiliar with the validate function and bootstrapping but after looking at prof. Frank Harrel's answer http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tt3316820.html#a3324516 I figured that it's probably the way to get the Dxy. The help for validate states:

... Somers' Dxy rank correlation to be computed at each resample (this takes a bit longer than the likelihood based statistics). The values corresponting to the row Dxy are equal to 2 * (C - 0.5) where C is the C-index or concordance probability.

I guess I'm mostly confused by the columns. I figured that the corrected value is the one I should use but I haven't really understood the validate output:

 index.orig training test optimism index.corrected n
Dxy -0.0137 -0.0715 -0.0071 -0.0644 0.0507 100
R2 0.0079 0.0278 0.0037 0.0242 -0.0162 100
Slope 1.0000 1.0000 0.2939 0.7061 0.2939 100
...

In the http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tt3316820.html#a3324516 I've understood that I should have "surv=TRUE" in the cph if I have strata but I'm uncertain on what the purpose of the "u=60" parameter in the validate function is. I would be grateful if you could help me understand these and check that I haven't made any mistakes.

@chl has pointed to a specific answer to your question. The 'rms' package's cph function will produce a Somers-D which can be transformed trivially into a c-index. However, Harrell (who introduced the c-index to biostatistical practice) thinks this is unwise as a general strategy for assessing prognostic measures, because it has low power for discrimination among alternatives. Instead of relying on the surgical literature for your methodological guidance, it would be wiser to seek out the accumulated wisdom in Harrell's text, "Regression Modeling Strategies" or Steyerberg's "Clinical Prediction Models".
Depending on your needs, embedding a model inside a bigger model and doing a "chunk" likelihood ratio $\chi^2$ test for the added value of the additional variables will give you a powerful test. My book talks about an index arising from this approach (the "adequacy index").