How to define a rejection region when there's no UMP test?

Consider the linear regression model $\mathbf{y}=\mathbf{X\beta}+\mathbf{u}$, $\mathbf{u}\sim N(\mathbf{0},\sigma^2\mathbf{I})$, $E(\mathbf{u}\mid\mathbf{X})=\mathbf{0}$.

Let $H_0: \sigma_0^2=\sigma^2$ vs $H_1: \sigma_0^2\neq\sigma^2$.

We can deduce that $\frac{\mathbf{y}^T\mathbf{M_X}\mathbf{y}}{\sigma^2}\sim \chi^2(n-k)$, where $dim(\mathbf{X})=n\times k$. Here $\mathbf{M_X}$ is the typical notation for the annihilator matrix, $\mathbf{M_X}\mathbf{y}=\hat{\mathbf{y}}$, where $\hat{\mathbf{y}}$ denotes the residual vector from regressing the dependent variable $\mathbf{y}$ on $\mathbf{X}$.

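The book I'm reading states the following:

I asked earlier which criteria should be used to define a rejection region (RR); see the answers to that question. The main one was to choose the RR that makes the test as powerful as possible.

In this case, with the alternative being a bilateral composite hypothesis, there's usually no UMP test. Also, from the answer given in the book, the authors don't show whether they studied the power of their RR. Nevertheless, they chose a two-tailed RR. Why is that, given that the hypothesis does not "unilaterally" determine the RR?

Edit: This image is given in the book's solutions manual as the solution to exercise 4.14.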

— An old man in the sea.

Please add a reference to the book. Related: P-value in a two-tail test with asymmetric null distribution.

— Scortchi - Reinstate Monica

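@Scortchi Thanks for the link. Can I ask you something about this question? Do you find it interesting? I'm trying to assess whether I'm asking interesting questions or whether I should direct my attention to other areas...

— An old man in the sea.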

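Well, of course not everyone's interested in theory, but some people are (me included); that's what the mathematical-statistics tag is for. So a fine q. IMO. It's a little broad, but I think a good answer would survey different approaches & considerations, & a motivating example is a great help. (I'd have picked as simple an example as possible though: testing for the variance of a normal distribution with known mean, or for the mean of an exponential distribution.) [BTW .]

— Scortchi - Reinstate Monica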

@Scortchi Thanks for the feedback. I'm studying this on my own, so sometimes I'm not sure whether I'm framing my questions well.

— An old man in the sea.

You need to define $M_X$.

— Taylor

Answers:

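It's easier to work first through the case where the regression coefficients are known, so that the null hypothesis is simple. Then the sufficient statistic is $T=\sum z^2$, where $z$ is the residual; its distribution under the null is a chi-squared scaled by $\sigma^2_0$ & with degrees of freedom equal to the sample size $n$.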

Write down the ratio of the likelihoods under $\sigma=\sigma_1$ & $\sigma=\sigma_2$ & confirm that it's an increasing function of $T$ for any $\sigma_2 > \sigma_1$. The log-likelihood ratio is

$\ell(\sigma_2;T,n)-\ell(\sigma_1;T,n)=\frac{n}{2} \cdot \left[\log \left(\frac{\sigma_1^2}{\sigma_2^2}\right) + \frac{T}{n} \cdot \left(\frac{1}{\sigma_1^2} - \frac{1}{\sigma_2^2}\right) \right]$

which is linear in $T$, with positive gradient when $\sigma_2>\sigma_1$.

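So by the Karlin-Rubin theorem each of the one-tailed tests $H_0:\sigma=\sigma_0$ vs $H_\mathrm{A}:\sigma < \sigma_0$ & $H_0:\sigma = \sigma_0$ vs $H_\mathrm{A}:\sigma > \sigma_0$ is uniformly most powerful. Clearly there's no UMP test of $H_0:\sigma = \sigma_0$ vs $H_\mathrm{A}:\sigma \neq \sigma_0$. As discussed here, carrying out both one-tailed tests & applying a multiple-comparisons correction leads to the commonly used test with equally sized rejection regions in both tails, & it's quite reasonable when you're going to claim either that $\sigma>\sigma_0$ or that $\sigma<\sigma_0$ when you reject the null.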

Next find the ratio of the likelihoods under $\sigma=\hat\sigma$, the maximum-likelihood estimate of $\sigma$, & $\sigma=\sigma_0$. As $\hat\sigma^2=\frac{T}{n}$, the log-likelihood ratio statistic is

$\ell(\hat\sigma;T,n)-\ell(\sigma_0;T,n)=\frac{n}{2} \cdot \left[\log \left(\frac{n\sigma_0^2}{T}\right) + \frac{T}{n\sigma_0^2} - 1 \right]$

This is a fine statistic for quantifying how much the data support $H_\mathrm{A}:\sigma \neq \sigma_0$ over $H_0:\sigma = \sigma_0$ . And confidence intervals formed from inverting the likelihood-ratio test have the appealing property that all parameter values inside the interval have higher likelihood than those outside. The asymptotic distribution of twice the log-likelihood ratio is well known, but for an exact test, you needn't try to work out its distribution—just use the tail probabilities of the corresponding values of $T$ in each tail.
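The exact version of the likelihood-ratio test can be sketched numerically. The following is a minimal illustration, not a reference implementation: it assumes the known-coefficients setting above, where $T/\sigma_0^2$ is chi-squared with $n$ degrees of freedom under the null, and to stay within the standard library it uses the closed-form chi-squared CDF that only holds for even $n$. The function names (`chi2_cdf`, `log_lr`, `matching_value`, `lrt_exact_p_value`) and the numbers in the usage line are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """Chi-squared CDF via the closed form that holds for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def log_lr(t, n):
    """l(sigma_hat) - l(sigma_0), written in terms of t = T / sigma_0^2."""
    return 0.5 * n * (math.log(n / t) + t / n - 1.0)


def matching_value(t, n, iters=200):
    """Value on the other side of n with the same log-likelihood ratio as t."""
    target = log_lr(t, n)
    if t < n:
        # log_lr increases on (n, inf): widen the bracket, then bisect.
        lo, hi = float(n), 2.0 * n
        while log_lr(hi, n) < target:
            hi *= 2.0
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if log_lr(mid, n) < target:
                lo = mid
            else:
                hi = mid
    else:
        # log_lr decreases on (0, n): bisect there.
        lo, hi = 0.0, float(n)
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if log_lr(mid, n) > target:
                lo = mid
            else:
                hi = mid
    return 0.5 * (lo + hi)


def lrt_exact_p_value(T, n, sigma0):
    """Sum of the two tail probabilities beyond the pair of T values
    that share the observed log-likelihood ratio."""
    t = T / sigma0 ** 2
    other = matching_value(t, n)
    low, high = min(t, other), max(t, other)
    return chi2_cdf(low, n) + 1.0 - chi2_cdf(high, n)


# Illustrative numbers only (not from the original post).
print(lrt_exact_p_value(T=3.5, n=10, sigma0=1.0))
```

The two tails generally carry different probabilities here, because the cut-offs are matched on the likelihood ratio rather than on tail area.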

If you can't have a uniformly most powerful test, you might want one that's most powerful against the alternatives closest to the null. Find the derivative of the log-likelihood function with respect to $\sigma$ —the score function:

$\frac{\mathrm{d}\,\ell(\sigma;T,n)}{\mathrm{d}\,\sigma}=\frac{T}{\sigma^3} - \frac{n}{\sigma}$

Evaluating its magnitude at $\sigma_0$ gives a locally most powerful test of $H_0:\sigma=\sigma_0$ vs $H_\mathrm{A}:\sigma \neq \sigma_0$ . Because the test statistic's bounded below, with small samples the rejection region may be confined to the upper tail. Again, the asymptotic distribution of the squared score is well known, but you can get an exact test in the same way as for the LRT.
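A small sketch of the exact score test, under the same assumptions as the rest of this answer (known coefficients, so $T/\sigma_0^2$ is chi-squared with $n$ degrees of freedom under the null). The score at $\sigma_0$ is $(T/\sigma_0^2 - n)/\sigma_0$, so its magnitude is proportional to $|T/\sigma_0^2 - n|$, and the exact p-value adds up the null probability of values at least that far from $n$. The function names are hypothetical, and the closed-form CDF used here is valid for even $n$ only.

```python
import math


def chi2_cdf(x, df):
    """Chi-squared CDF via the closed form that holds for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def score_exact_p_value(T, n, sigma0):
    """Null probability of a score at sigma_0 at least as large in magnitude."""
    t = T / sigma0 ** 2
    distance = abs(t - n)  # |score| at sigma_0 is proportional to this
    # T cannot be negative, so the lower tail is empty once n - distance <= 0.
    lower_tail = chi2_cdf(n - distance, n)
    upper_tail = 1.0 - chi2_cdf(n + distance, n)
    return lower_tail + upper_tail


# Illustrative numbers only: with t = 25 and n = 10 the lower tail is empty.
print(score_exact_p_value(T=25.0, n=10, sigma0=1.0))
```

In the usage line the matching lower value would be negative, which is the situation described above where the rejection region is confined to the upper tail.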

Another approach is to restrict your attention to unbiased tests, viz those for which the power under any alternative exceeds the size. Check your sufficient statistic has a distribution in the exponential family; then for a size $\alpha$ test, $\phi(T)= 1$ if $T<c_1$ or $T>c_2$ , else $\phi(T)= 0$ , you can find the uniformly most powerful unbiased test by solving

$\begin{aligned} \operatorname{E}(\phi(T)) &= \alpha \\ \operatorname{E}(T\phi(T)) &= \alpha \operatorname{E} T \end{aligned}$
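As a rough numerical illustration, the two conditions can be solved for $c_1$ and $c_2$ on the scale of $T/\sigma_0^2$, which is chi-squared with $n$ degrees of freedom under the null. The sketch below relies on the identity $t f_n(t) = n f_{n+2}(t)$ for chi-squared densities, which turns the second condition into $F_{n+2}(c_2) - F_{n+2}(c_1) = 1-\alpha$. The function names, the choice $n=10$, $\alpha=0.05$, and the restriction to even $n$ (so that a closed-form CDF is available without extra libraries) are all assumptions made for the example.

```python
import math


def chi2_cdf(x, df):
    """Chi-squared CDF via the closed form that holds for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def bisect(f, lo, hi, iters=200):
    """Root of a continuous f with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi2_quantile(p, df, upper=1000.0):
    """Quantile by bisection on the CDF (upper is a generous search bound)."""
    return bisect(lambda x: chi2_cdf(x, df) - p, 0.0, upper)


def umpu_cutoffs(n, alpha):
    """Cut-offs (c1, c2) for T / sigma_0^2 satisfying both conditions."""

    def upper_for(c1):
        # First condition: the acceptance region has probability 1 - alpha.
        return chi2_quantile(1.0 - alpha + chi2_cdf(c1, n), n)

    def second_condition(c1):
        # E(T phi(T)) = alpha E(T), rewritten with the chi-squared(n + 2) CDF.
        c2 = upper_for(c1)
        return chi2_cdf(c2, n + 2) - chi2_cdf(c1, n + 2) - (1.0 - alpha)

    c1 = bisect(second_condition, 0.0, chi2_quantile(alpha, n))
    return c1, upper_for(c1)


n, alpha = 10, 0.05  # illustrative values, not from the original post
c1, c2 = umpu_cutoffs(n, alpha)
print(f"reject when T / sigma_0^2 < {c1:.4f} or > {c2:.4f}")
```

Multiplying both cut-offs by $\sigma_0^2$ gives the rejection region for $T$ itself.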

A plot helps show the bias in the equal-tail-areas test & how it arises:

At values of $\sigma$ a little under $\sigma_0$ the increased probability of the test statistic's falling in the lower-tail rejection region doesn't compensate for the reduced probability of its falling in the upper-tail rejection region, & the power of the test drops below its size.
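The power of the equal-tail-areas test can also be tabulated directly, which makes it possible to check on which side of $\sigma_0$, and by how much, it falls below the size. This is only a sketch under the known-coefficients assumption used above: when the true standard deviation is $\sigma$, $T/\sigma_0^2$ is $(\sigma/\sigma_0)^2$ times a chi-squared variable with $n$ degrees of freedom. The function names and the values $n=10$, $\alpha=0.05$ are made up for illustration, and the closed-form CDF requires even $n$.

```python
import math


def chi2_cdf(x, df):
    """Chi-squared CDF via the closed form that holds for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_quantile(p, df, upper=1000.0, iters=200):
    """Quantile by bisection on the CDF (upper is a generous search bound)."""
    lo, hi = 0.0, upper
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equal_tails_power(sigma_ratio, n, alpha):
    """Rejection probability of the equal-tail-areas test
    when sigma / sigma_0 equals sigma_ratio."""
    lower = chi2_quantile(alpha / 2.0, n)
    upper = chi2_quantile(1.0 - alpha / 2.0, n)
    scale = sigma_ratio ** 2  # T / sigma_0^2 is scale * chi-squared(n)
    return chi2_cdf(lower / scale, n) + 1.0 - chi2_cdf(upper / scale, n)


# Illustrative values only (not from the original post).
for ratio in (0.90, 0.95, 0.98, 1.00, 1.02, 1.05, 1.10):
    power = equal_tails_power(ratio, n=10, alpha=0.05)
    print(f"sigma/sigma_0 = {ratio:.2f}  power = {power:.4f}")
```

At a ratio of exactly 1 the function returns the size $\alpha$, so any printed value below $\alpha$ marks a region where the test is biased.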

Being unbiased is good; but it's not self-evident that having a power slightly lower than the size over a small region of the parameter space within the alternative is so bad as to rule out a test altogether.

Two of the above two-tailed tests coincide (for this case, not in general):

The LRT is UMP among unbiased tests. In cases where this isn't true the LRT may still be asymptotically unbiased.

I think all, even the one-tailed tests, are admissible, i.e. there's no test more powerful or as powerful under all alternatives—you can make the test more powerful against alternatives in one direction only by making it less powerful against alternatives in the other direction. As the sample size increases, the chi-squared distribution becomes more & more symmetric, & all the two-tailed tests will end up being much the same (another reason for using the easy equal-tailed test).

With the composite null hypothesis, the arguments become a little more complicated, but I think you can get practically the same results, mutatis mutandis. Note that one but not the other of the one-tailed tests is UMP!

— Scortchi - Reinstate Monica

Scortchi thanks for your answer. I still have some doubts, though. Firstly, could you elaborate a bit more on the following sentence? «applying a multiple-comparisons correction leads to the commonly used test with equally sized rejection regions in both tails, & it's quite reasonable when you're going to claim either that σ>σ0 or that σ<σ0 when you reject the null.» Also why do you say it's reasonable? I think this is the core of my question if I'm not mistaken. ;)

— An old man in the sea.

I read this paragraph from your linked answer, but I did not understand it well: «Doubling the lowest one-tailed p-value can be seen as a multiple-comparisons correction for carrying out two one-tailed tests.» I would be thankful if you could explain it a bit more. ;)

— An old man in the sea.

See Bonferroni correction. If you carry out two separate size $\alpha/2$ tests the family-wise Type I error is no more than $\alpha$, & when the rejection regions are disjoint it's exactly $\alpha$. I wanted to point out that the equal-tail-areas test can be seen in this way because people sometimes seem to think the only reasons to use it are ease of calculation & approximation to the other tests. In fact each test has its own rationale: so I wouldn't say this was the core of your question; it's a matter of horses for courses.

— Scortchi - Reinstate Monica

In this case, with the alternative being a bilateral composite hypothesis there's usually no UMP test.

I am not sure if that is true in general. Certainly, a lot of the classical results (Neyman-Pearson, Karlin-Rubin) are based on either simple or one-sided hypotheses, but generalizations to two-sided composite hypotheses do exist. You can find some notes on that here, and more discussion in the textbook here.

For your problem specifically, I don't know whether a UMP test exists or not. But intuitively, it seems to me that under 0-1 loss, a one-sided test will probably be inadmissible, and thus the class of admissible tests will be all two-sided tests. Given the class of two-sided tests, the goal is to find the one with the largest power, which should automatically happen by choosing quantiles around the one mode of the $\chi^2$. (This is all based on intuition.)

— Greenparker

There's clearly not a uniformly most powerful test in this case because of the existence of different tests most powerful against particular alternatives in different directions from $\sigma_0$. For a "best" test defined in terms of power you'd have to look for the uniformly most powerful test of all unbiased tests, or of all invariant tests; or for a locally most powerful test; or something like that - & perhaps end up settling for any admissible test.

— Scortchi - Reinstate Monica