Breast Cancer Survival Analysis
This is a case study of the influence of various patient characteristics
on survival rates for breast cancer. The survival analysis technique employed
is Cox regression. This technique is useful when some observations are
censored, that is, when some of the patients do not die
during the observation period. (If all patients had died during the observation
period, then we could have used another technique, such as linear regression,
to generate a predictive model of survival times.)
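To make the idea of censoring concrete, here is a minimal Python sketch of how such observations are commonly encoded: each patient gets a follow-up time plus a flag saying whether death was actually observed. The records and function names are hypothetical illustrations, not the study data.

```python
# Hypothetical follow-up records as (months, died) pairs; NOT the study data.
# died=False marks a censored patient: still alive when observation ended.
records = [(12.0, True), (40.5, True), (133.8, False), (133.8, False)]

def count_by_status(records):
    """Return (number of observed deaths, number of censored patients)."""
    deaths = sum(1 for _, died in records if died)
    return deaths, len(records) - deaths

def naive_mean_months(records):
    """Mean follow-up time if censored times are (wrongly) treated as death times."""
    return sum(months for months, _ in records) / len(records)

deaths, censored = count_by_status(records)
print(f"{deaths} deaths, {censored} censored")
print(f"naive mean survival: {naive_mean_months(records):.1f} months")
```

A censored patient's true survival time is at least the recorded follow-up time, so the naive mean can only understate survival. That is the reason for using a method that handles censoring explicitly.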
Data and Method
The observation period runs for 133.8 months. The modeling sample
contains 746 patients, including 50 patients who died during the observation
period and 696 who survived beyond the end of the observation period.
Our dependent variable (or "status" variable) has two values:
survived vs. died. In this simple example, we are testing only four predictors:
- Age, in years, at the start of the observation period
- Pathological tumor size, in centimeters
- Number of positive axillary lymph nodes
- Estrogen receptor status (positive vs. negative)
Here are the value ranges for the predictor variables:
- Age: 22 to 88
- Pathological tumor size: 0.10 to 7.00 centimeters
- Number of positive lymph nodes: zero to 35
- Estrogen receptor status: positive vs. negative
The Cox Regression used a backward stepwise likelihood-ratio variable
selection method, based on maximum partial likelihood estimates (-2 log
likelihood). Significance criteria were set at 0.05 for inclusion in the
model, and 0.10 for removal from the model. Here is some of the actual
computer printout from the final step of the stepwise regression analysis:
--------------------- Variables in the Equation ---------------------
Variable       B    S.E.     Wald  df    Sig       R  Exp(B)
AGE       -.0314   .0121   6.7486   1  .0094  -.0893   .9691
PATHSIZE   .3975   .1175  11.4476   1  .0007   .1259  1.4881
LNPOS      .1372   .0361  14.4100   1  .0001   .1443  1.1471
Since this is intended as a non-technical discussion, we will not
explain all the statistics in this table. But some key things to note
are:
- Estrogen status was removed as a predictor because it did not reach
the 0.05 significance criterion for inclusion, and showed no appreciable
correlation with the dependent variable. (The column labeled "Sig"
shows the statistical significance of included variables; the column
labeled "R" shows the degree of unique correlation with the
dependent variable.)
- Number of positive axillary lymph nodes was the strongest predictor
of survival rates over the course of the observation period (R=.1443
/ Sig=.0001)
- Pathological tumor size was the second-best predictor (R=.1259
/ Sig=.0007), and is nearly as strong a predictor as number of positive
axillary lymph nodes
- Age, although significant, is somewhat less influential than the
other two predictors (R=-.0893 / Sig=.0094)
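For readers who want to see where the "Wald" and "Sig" columns come from, the following Python sketch recomputes them from the printed values. It assumes the usual definitions for a single coefficient: Wald = (B / S.E.) squared, with Sig as the upper-tail probability of a chi-square distribution with 1 degree of freedom. The function names are illustrative, and the printout itself does not state which formulas the software used.

```python
import math

# B, S.E., and Wald values copied from the "Variables in the Equation" table.
table = {
    "AGE":      {"B": -0.0314, "SE": 0.0121, "Wald": 6.7486},
    "PATHSIZE": {"B": 0.3975,  "SE": 0.1175, "Wald": 11.4476},
    "LNPOS":    {"B": 0.1372,  "SE": 0.0361, "Wald": 14.4100},
}

def wald_statistic(b, se):
    """Wald chi-square for one coefficient: (B / S.E.) squared."""
    return (b / se) ** 2

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

for name, row in table.items():
    approx_wald = wald_statistic(row["B"], row["SE"])
    sig = chi2_sf_1df(row["Wald"])
    print(f"{name:9s} Wald from rounded B and S.E. = {approx_wald:7.4f}   "
          f"Sig from printed Wald = {sig:.4f}")
```

The recomputed Wald values differ slightly from the printed ones because B and S.E. are shown rounded to four decimals.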
Note that both the number of positive axillary lymph nodes and the
pathological tumor size are positively correlated with the dependent variable,
which means that higher values are associated with more rapid mortality.
In contrast, age is negatively correlated with the dependent variable,
which means that older age is predictive of somewhat longer survival: each
additional year of age multiplies the estimated hazard by .9691 (the Exp(B)
column), a reduction of about 3%.
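The direction and size of each effect can be read from the Exp(B) column, which is the exponential of B. The Python sketch below reproduces that column and, assuming the standard proportional-hazards form of the Cox model, compares two patients by exponentiating the sum of B times the difference in each predictor. The two patient profiles are hypothetical and chosen only for illustration.

```python
import math

# Coefficients (B) copied from the "Variables in the Equation" table.
B = {"AGE": -0.0314, "PATHSIZE": 0.3975, "LNPOS": 0.1372}

def hazard_ratio(variable, change=1.0):
    """Multiplicative change in hazard when `variable` rises by `change` units."""
    return math.exp(B[variable] * change)

def relative_hazard(patient_a, patient_b):
    """Hazard of patient_a relative to patient_b under the fitted coefficients."""
    log_ratio = sum(B[k] * (patient_a[k] - patient_b[k]) for k in B)
    return math.exp(log_ratio)

for name in B:
    print(f"{name:9s} Exp(B) = {hazard_ratio(name):.4f}")

# Hypothetical patients of the same age who differ in tumor size and node count.
patient_a = {"AGE": 50, "PATHSIZE": 3.0, "LNPOS": 4}
patient_b = {"AGE": 50, "PATHSIZE": 2.0, "LNPOS": 0}
print(f"relative hazard, a vs. b: {relative_hazard(patient_a, patient_b):.2f}")
```

A ratio above 1 means the estimated hazard rises with the predictor, and a ratio below 1 means it falls.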
The following chart shows the cumulative survival function during
the observation period:

Several things are immediately apparent from this chart:
- All patients survive through the tenth month of the observation
period, at which time we begin to observe a fairly constant mortality
rate which runs through the fortieth month
- At the fortieth month, the mortality rate increases and continues
at this fairly constant increased rate through the forty-fifth month
- At the forty-fifth month, there is a five-month period without
additional mortality, after which time the mortality continues at a
fairly constant rate until the end of the observation period, by which
time approximately 11% of the original sample has died
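The chart described above comes from the Cox analysis itself. As a rough illustration of how a stepwise survival curve is built from censored data, here is a sketch of the Kaplan-Meier estimator, a simpler and different technique that ignores the predictors. The records are hypothetical, not the study data.

```python
def kaplan_meier(records):
    """Return [(time, survival_probability)] at each observed death time.

    records: list of (months, died) pairs; died=False marks a censored patient.
    """
    death_times = sorted({t for t, died in records if died})
    survival = 1.0
    curve = []
    for t in death_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u, _ in records if u >= t)
        deaths = sum(1 for u, died in records if u == t and died)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up records as (months, died) pairs.
km_records = [(10.0, True), (25.0, True), (30.0, False),
              (42.0, True), (133.8, False), (133.8, False)]

for months, prob in kaplan_meier(km_records):
    print(f"month {months:6.1f}: estimated survival {prob:.3f}")
```

The curve stays flat between deaths and drops at each one, which is why periods without additional mortality appear as plateaus.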
Conclusions and Implications
The case study presented here is relatively simple, and is for illustrative
purposes only. However, with the addition of more candidate predictors
(e.g., progesterone receptor status, histologic grade, etc.), an even
more powerful model could emerge.
By understanding the influence of patient characteristics on mortality
rates over time, we are in a better position to estimate survival times
for individual patients, and to defend using different or more aggressive
therapeutic approaches for some patients.