How to compare median survival between groups?

12

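I am looking at median survival using Kaplan-Meier in different states for a type of cancer. There are quite big differences between the states. How can I compare the median survival between all the states and determine which ones are significantly different from the mean median survival all across the country?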

multiple-comparisons survival

— Misha

Could you tell us a bit about the sample size, time frame, % survival, etc.?

— chl

Are there any censored values in your data, other than the values at the maximum?

— ronaf

The data do indeed have censored values. The total population is about 1500, and the overall median survival is 18 months (range 300-600 days). The time frame is 2000-2007.

— Misha

6

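One thing to keep in mind with the Kaplan-Meier survival curve is that it is basically descriptive, not inferential. It is just a function of the data, with an incredibly flexible model behind it. This is a strength, because it means there are virtually no assumptions that might be broken, but also a weakness, because it is hard to generalise and it fits "noise" as well as "signal". If you want to make an inference, you basically have to introduce something unknown that you wish to know.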

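One way to compare the median survival times is to make the following assumptions:

1. I have an estimate of the median survival time $t_{i}$ for each state $i$, given by the Kaplan-Meier curve.
2. I expect the true median survival time, $T_{i}$, to be equal to this estimate: $E(T_{i}|t_{i})=t_{i}$.
3. I am 100% certain that the true median survival time is positive: $Pr(T_{i}>0)=1$.

Now the "most conservative" way to use these assumptions is the principle of maximum entropy, so you get: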

$p(T_{i}|t_{i})= K \exp(-\lambda T_{i})$

Where $K$ and $\lambda$ are chosen such that the PDF is normalised and the expected value is $t_{i}$. Now we have:

$1=\int_{0}^{\infty}p(T_{i}|t_{i})\,dT_{i} =K \int_{0}^{\infty}\exp(-\lambda T_{i})\,dT_{i}$

$=K \left[-\frac{\exp(-\lambda T_{i})}{\lambda}\right]_{T_{i}=0}^{T_{i}=\infty}=\frac{K}{\lambda}\implies K=\lambda$

$E(T_{i})=\frac{1}{\lambda}\implies \lambda=t_{i}^{-1}$

And so you have a set of probability distributions, one for each state:

$p(T_{i}|t_{i})= \frac{1}{t_{i}} \exp\left(-\frac{T_{i}}{t_{i}}\right)\;\;\;\;\;(i=1,\dots,N)$

Which give a joint probability distribution of:

$p(T_{1},T_{2},\dots,T_{N}|t_{1},t_{2},\dots,t_{N})= \prod_{i=1}^{N}\frac{1}{t_{i}} \exp\left(-\frac{T_{i}}{t_{i}}\right)$
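As a quick sanity check on the derivation above, the per-state density can be integrated numerically to confirm that it is normalised and has expected value $t_{i}$. This is only a sketch: the function names are made up for illustration, and the value of 18 months is simply borrowed from the overall median mentioned in the comments on the question.

```python
import math

def maxent_density(T, t):
    """Maximum-entropy density p(T | t) = (1/t) * exp(-T/t) for T >= 0."""
    return math.exp(-T / t) / t

def trapezoid(f, a, b, n=200_000):
    """Plain trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

t_i = 18.0          # illustrative median estimate in months (assumption)
upper = 40.0 * t_i  # the tail beyond this point is negligible (about exp(-40))

# Should be close to 1 (normalisation) and close to t_i (expected value).
total_mass = trapezoid(lambda T: maxent_density(T, t_i), 0.0, upper)
mean_T = trapezoid(lambda T: T * maxent_density(T, t_i), 0.0, upper)
print(total_mass, mean_T)
```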

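Now it sounds like you want to test the hypothesis $H_{0}:T_{1}=T_{2}=\dots=T_{N}=\overline{t}$, where $\overline{t}=\frac{1}{N}\sum_{i=1}^{N}t_{i}$ is the mean median survival time. The severe alternative hypothesis to test against is the "every state is a unique and beautiful snowflake" hypothesis $H_{A}:T_{1}=t_{1},\dots,T_{N}=t_{N}$, because this is the most likely alternative and thus represents the information lost in moving to the simpler hypothesis (a "minimax" test). The measure of the evidence against the simpler hypothesis is given by the odds ratio: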

$O(H_{A}|H_{0})=\frac{p(T_{1}=t_{1},T_{2}=t_{2},\dots,T_{N}=t_{N}|t_{1},t_{2},\dots,t_{N})}{ p(T_{1}=\overline{t},T_{2}=\overline{t},\dots,T_{N}=\overline{t}|t_{1},t_{2},\dots,t_{N})}$

$=\frac{ \left[\prod_{i=1}^{N}\frac{1}{t_{i}}\right] \exp\left(-\sum_{i=1}^{N}\frac{t_{i}}{t_{i}}\right) }{ \left[\prod_{i=1}^{N}\frac{1}{t_{i}}\right] \exp\left(-\sum_{i=1}^{N}\frac{\overline{t}}{t_{i}}\right) } =\exp\left(N\left[\frac{\overline{t}}{t_{harm}}-1\right]\right)$

Where

$t_{harm}=\left[\frac{1}{N}\sum_{i=1}^{N}t_{i}^{-1}\right]^{-1}\leq \overline{t}$

is the harmonic mean. Note that the odds will always favour the perfect fit, but not by much if the median survival times are reasonably close. Further, this gives you a direct way to state the evidence of this particular hypothesis test:

assumptions 1-3 give maximum odds of $O(H_{A}|H_{0}):1$ against equal median survival times across all states

Combine this with a decision rule, loss function, utility function, etc. which says how advantageous it is to accept the simpler hypothesis, and you've got your conclusion!
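The odds ratio above needs nothing more than the arithmetic and harmonic means of the estimated medians, so it is easy to compute. A minimal sketch follows; the function names and the example medians are hypothetical and not taken from the question's data.

```python
import math

def harmonic_mean(values):
    """Harmonic mean t_harm = [ (1/N) * sum(1/t_i) ]^(-1)."""
    return len(values) / sum(1.0 / v for v in values)

def odds_against_equal_medians(medians):
    """O(H_A | H_0) = exp(N * (t_bar / t_harm - 1)) for positive medians."""
    if any(t <= 0 for t in medians):
        raise ValueError("median survival times must be positive")
    n = len(medians)
    t_bar = sum(medians) / n
    t_harm = harmonic_mean(medians)
    return math.exp(n * (t_bar / t_harm - 1.0))

# Hypothetical state medians in months, for illustration only.
state_medians = [12.0, 15.0, 18.0, 20.0, 25.0]
print(odds_against_equal_medians(state_medians))
```

Because the harmonic mean never exceeds the arithmetic mean, the result is always at least 1, and it equals 1 only when all the medians coincide.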

There is no limit to the number of hypotheses you can test, and give similar odds for. Just change $H_{0}$ to specify a different set of possible "true values". You could do "significance testing" by choosing the hypothesis as:

$H_{S,i}:T_{i}=t_{i},T_{j}=T=\overline{t}_{(i)}=\frac{1}{N-1}\sum_{j\neq i}t_{j}$

So this hypothesis says, in words, "state $i$ has a different median survival time, but all other states are the same". You then redo the odds ratio calculation I did above. You should be careful about what the alternative hypothesis is, though. Any one of those below is "reasonable" in the sense that it might be a question you are interested in answering (and they will generally have different answers):

my $H_{A}$ defined above - how much worse is $H_{S,i}$ compared to the perfect fit?
my $H_{0}$ defined above - how much better is $H_{S,i}$ compared to the average fit?
a different $H_{S,k}$ - how much is state $k$ "more different" compared to state $i$ ?
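For the first of these comparisons, the same odds calculation goes through with $H_{S,i}$ in place of $H_{0}$. State $i$ contributes a factor $\exp(-1)$ under both hypotheses, so it cancels and the odds reduce to $\exp\left(\sum_{j\neq i}\overline{t}_{(i)}/t_{j}-(N-1)\right)$. A sketch of that calculation, with a hypothetical function name and made-up medians:

```python
import math

def odds_perfect_fit_vs_single_outlier(medians, i):
    """Odds O(H_A | H_{S,i}): perfect fit versus 'only state i differs'.

    Under H_{S,i}, state i keeps its own estimate t_i and every other
    state is set to the leave-one-out mean of the remaining estimates.
    """
    others = [t for j, t in enumerate(medians) if j != i]
    t_bar_i = sum(others) / len(others)  # leave-one-out mean
    # State i contributes exp(-1) to both hypotheses, so it drops out.
    log_odds = sum(t_bar_i / t for t in others) - len(others)
    return math.exp(log_odds)

# Hypothetical medians (months): the last state looks like the odd one out.
state_medians = [12.0, 13.0, 12.5, 30.0]
for i in range(len(state_medians)):
    print(i, odds_perfect_fit_vs_single_outlier(state_medians, i))
```

Odds close to 1 for a given $i$ mean that "only state $i$ differs" loses almost nothing relative to the perfect fit.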

Now one thing which has been overlooked here is correlations between states: this structure assumes that knowing the median survival time in one state tells you nothing about the median survival time in another state. While this may seem "bad", it is not too difficult to improve on, and the above calculations are good initial results which are easy to calculate.

Adding connections between states will change the probability models, and you will effectively see some "pooling" of the median survival times. One way to incorporate correlations into the analysis is to separate the true survival times into two components, a "common part" or "trend" and an "individual part":

$T_{i}=T+U_{i}$

You then constrain the individual part $U_{i}$ to have average zero over all units and an unknown variance $\sigma$, which is integrated out using a prior describing what you know about the individual variability before observing the data (or a Jeffreys prior if you know nothing, and a half-Cauchy prior if the Jeffreys prior causes problems).

— probabilityislogic

(+1) Very interesting. Your post also made me insert a comment in my answer.

— GaBorgulya

Perhaps I have missed it, but where is $M_1$ defined?

— cardinal

@cardinal, my apologies - it's a typo. Will be removed.

— probabilityislogic

no apology necessary. Just wasn't sure if I had skipped over it while reading or was simply missing something obvious.

— cardinal

4

Thought I'd just add to this topic that you might be interested in quantile regression with censoring. Bottai & Zhang 2010 proposed a "Laplace Regression" that can do just this task; you can find a PDF on this here. There is a package for Stata for this. It has not yet been translated to R, although the quantreg package in R has a function for censored quantile regression, crq, that could be an option.

I think the approach is very interesting and might be much more intuitive to patients than hazard ratios. Knowing, for instance, that 50% of those on the drug survive 2 months longer than those who don't take it, while the side effects force you to stay 1-2 months in hospital, might make the choice of treatment much easier.

— Max Gordon

I don't know "Laplace Regression", but regarding your 2nd paragraph I wonder if I'm understanding it correctly. Usually in survival analysis (thinking in terms of accelerated failure time), we would say something like 'the 50th percentile for the drug group comes 2 months later than the 50th % for the control group'. Is that what you mean, or does the output of LR afford a different interpretation?

— gung - Reinstate Monica

@gung: I think you're right in your interpretation - changed the text, better? I haven't used the regression models myself, although I've encountered them recently in a course. It's an interesting alternative to the regular Cox models that I've used a lot. Although I probably need to spend more time digesting the idea, I feel that it's probably easier for me to explain to my patients, since I frequently use KM curves when explaining things to them. HR demands that you really understand the difference between relative and absolute risks - a concept that can take some time to explain...

— Max Gordon

econ.uiuc.edu/~roger/research/crq/note.pdf

— Misha

Thank you @Misha for the link. The author has a reply here: onlinelibrary.wiley.com/doi/10.1002/bimj.201100103/abstract

— Max Gordon

3

First I would visualize the data: calculate confidence intervals and standard errors for the median survivals in each state and show CIs on a forest plot, medians and their SEs using a funnel plot.

The “mean median survival all across the country” is a quantity that is estimated from the data and thus has uncertainty, so you cannot take it as a sharp reference value during significance testing. Another difficulty with the mean-of-all approach is that when you compare a state median to it, you are comparing the median to a quantity that already includes that median as a component. So it is easier to compare each state to all other states combined. This can be done by performing a log rank test (or one of its alternatives) for each state.
(Edit after reading the answer of probabilityislogic: the log rank test does compare survival in two (or more) groups, but it is not strictly the median that it is comparing. If you are sure it is the median that you want to compare, you may rely on his equations or use resampling here, too)
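If it really is the median you want, the Kaplan-Meier median and a resampling-based interval for it can be sketched in a few lines of plain Python. The function names, the percentile bootstrap and the toy data below are all assumptions for illustration. A real analysis would more likely use an established survival package, and other interval methods exist.

```python
import random

def km_median(times, events):
    """Kaplan-Meier median: first time at which estimated S(t) <= 0.5.

    events[i] is 1 for an observed death and 0 for a censored case.
    Returns None if the survival curve never drops to 0.5.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:  # group ties at time t
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            if surv <= 0.5:
                return t
        at_risk -= removed
    return None

def bootstrap_median_ci(times, events, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the Kaplan-Meier median."""
    rng = random.Random(seed)
    n = len(times)
    meds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        m = km_median([times[j] for j in idx], [events[j] for j in idx])
        if m is not None:  # skip resamples where the median is undefined
            meds.append(m)
    if not meds:
        raise ValueError("median not estimable in any resample")
    meds.sort()
    lo = meds[int((1 - level) / 2 * (len(meds) - 1))]
    hi = meds[int((1 + level) / 2 * (len(meds) - 1))]
    return lo, hi

# Toy data for one state: survival in days, 0 marks a censored case.
times = [120, 200, 250, 310, 330, 400, 450, 520, 600, 640]
events = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
print(km_median(times, events), bootstrap_median_ci(times, events))
```

Running this per state gives the point estimates and intervals needed for a forest plot.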

You labelled your question [multiple comparisons], so I assume you also want to adjust (increase) your p values in a way that if you see at least one adjusted p value less than 5% you could conclude that “median survival across states is not equal” at the 5% significance level. You may use generic and overly conservative methods like Bonferroni, but the optimal correction scheme will take the correlations of the p values into consideration. I assume that you don't want to build any a priori knowledge into the correction scheme, so I will discuss a scheme where the adjustment is multiplying each p value by the same C constant.

As I don't know how to derive the formula for the optimal multiplier C, I would use resampling. Under the null hypothesis, the survival characteristics are the same across all states, so you can permute the state labels of the cancer cases and recalculate the medians. After obtaining many resampled vectors of state p values, I would numerically find the multiplier C at which 95% of the vectors contain no significant adjusted p value: below it, fewer than 95% do, and above it, more than 95% do. While the plausible range for C still looks wide, I would repeatedly increase the number of resamples by an order of magnitude.
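One way to read the last step is as a "min-p" calculation: a resampled vector contains a significant adjusted p value exactly when C times its smallest p value falls below 5%, so C can be read off the 5% quantile of the per-vector minima. The sketch below assumes that reading. The function name is hypothetical, and the uniform random numbers only stand in for p values that would really come from rerunning the per-state tests on label-permuted data.

```python
import random

def min_p_multiplier(resampled_pvalues, alpha=0.05):
    """Common multiplier C such that, across the resampled vectors,
    roughly a fraction alpha contain at least one C * p below alpha."""
    min_ps = sorted(min(row) for row in resampled_pvalues)
    k = int(alpha * len(min_ps))  # index of the empirical alpha-quantile
    q = min_ps[k]                 # fails if this minimum is exactly 0
    return alpha / q

# Stand-in for permutation output: independent uniform p values for
# 10 states and 20000 resamples (assumption, not real test results).
rng = random.Random(1)
fake_pvalues = [[rng.random() for _ in range(10)] for _ in range(20_000)]

C = min_p_multiplier(fake_pvalues)
print(C)
```

With independent p values C comes out a little below the number of states, close to the Bonferroni factor. Positively correlated p values from real permutations would typically give a smaller C.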

— GaBorgulya

Good advice about visualising the data. (+1)

— probabilityislogic

@probabilityislogic Thanks! I also welcome criticism, particularly if constructive.

— GaBorgulya

the only criticism I have is the use of p-values, but this is more a "chip on my shoulder" than anything in your answer - seems like if you are going to use p-values, then what you recommend is good. I just don't think using p-values is good. see here for my exchange with @eduardo in the comments about p-values.

— probabilityislogic