How can I draw random samples from a non-parametrically estimated distribution?

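I have a sample of 100 points from a continuous, one-dimensional distribution. I estimated its non-parametric density using kernel methods. How can I draw random samples from this estimated distribution?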

r sampling kernel-smoothing

— Lovekesh

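A kernel density estimate is a mixture distribution: there is one kernel for every observation. If the kernel is a scaled density, this leads to a simple algorithm for sampling from the kernel density estimate: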

repeat nsim times:
  sample (with replacement) a random observation from the data
  sample from the kernel, and add the previously sampled random observation
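To make the mixture structure explicit, here is the usual form of a kernel density estimate. The notation is assumed for illustration: $n$ observations $x_1, \dots, x_n$, a bandwidth $h$, and a kernel $K$ that integrates to one.

$$\hat f_h(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h} K\left(\frac{x - x_i}{h}\right)$$

Each term is a density centred at $x_i$ with weight $1/n$, so picking an index $i$ uniformly at random and then returning $x_i + h\,\varepsilon$ with $\varepsilon \sim K$ gives a draw from $\hat f_h$.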

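If (for example) you used a Gaussian kernel, your density estimate is a mixture of 100 normals, each centred at one of your sample points and all having a standard deviation $h$ equal to the estimated bandwidth. To draw a sample, you can sample one of your sample points with replacement (say $x_i$) and then sample from a $N(\mu = x_i, \sigma = h)$. In R: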

# Original distribution is exp(rate = 5)
N = 1000
x <- rexp(N, rate = 5)

hist(x, prob = TRUE)
lines(density(x))

# Store the bandwidth of the estimated KDE
bw <- density(x)$bw

# Draw from the sample and then from the kernel
means <- sample(x, N, replace = TRUE)
hist(rnorm(N, mean = means, sd = bw), prob = TRUE)
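As a rough sanity check on the snippet above, the sketch below wraps the two steps in a helper and compares the moments of the draws with what the mixture implies. The name `rkde_gauss` is made up for this illustration and is not a function from any package. Under a Gaussian kernel the mean of the draws should match `mean(x)`, and their variance should be close to the variance of the data (computed with divisor `N`) plus `bw^2`, because the kernel noise is added independently of the resampled point.

```r
set.seed(1)

# Hypothetical helper: draw n values from a Gaussian KDE built on `x`
# with bandwidth `bw` (the standard deviation of each kernel).
rkde_gauss <- function(n, x, bw) {
  centres <- sample(x, n, replace = TRUE)  # pick mixture components
  rnorm(n, mean = centres, sd = bw)        # add kernel noise
}

x  <- rexp(1000, rate = 5)
bw <- density(x)$bw  # default Gaussian kernel; bw is the kernel's sd

draws <- rkde_gauss(1e5, x, bw)

# Moments implied by the mixture
kde_mean <- mean(x)
kde_var  <- mean((x - mean(x))^2) + bw^2

c(sample_mean = mean(draws), kde_mean = kde_mean)
c(sample_var  = var(draws),  kde_var  = kde_var)
```

The extra `bw^2` term is why draws from a KDE are slightly more spread out than a plain bootstrap resample of the data.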

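Strictly speaking, given that the mixture's components are equally weighted, you could avoid the sampling-with-replacement step and simply draw a sample of size $M$ from each component of the mixture: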

M = 10
hist(rnorm(N * M, mean = x, sd = bw))
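The one-liner above relies on R recycling the `mean` vector: when `rnorm` is asked for `N * M` values but given only `N` means, it cycles through `x` repeatedly, so each observation ends up as the centre of exactly `M` draws. A tiny illustration with made-up centres and `sd = 0` (which `rnorm` treats as a point mass at the mean) makes the pattern visible:

```r
centres <- c(0, 100, 200)  # toy stand-ins for the data points
M <- 2

# With sd = 0 every draw equals its centre, exposing the recycling order
draws <- rnorm(length(centres) * M, mean = centres, sd = 0)
draws

# Count how many draws each centre received
table(draws)
```

The resulting sample comes out in a cyclic order rather than a random one, so shuffle it with `sample(draws)` if the order matters downstream.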

If for some reason you can't draw from your kernel (e.g. your kernel is not a density), you can try importance sampling or MCMC. For example, using importance sampling:

# Draw from proposal distribution which is normal(mu, sd = 1)
sam <- rnorm(N, mean(x), 1)

# Weight the sample using ratio of target and proposal densities
w <- sapply(sam, function(input) sum(dnorm(input, mean = x, sd = bw)) / 
                                 dnorm(input, mean(x), 1))

# Resample according to the weights to obtain an un-weighted sample
finalSample <- sample(sam, N, replace = TRUE, prob = w)

hist(finalSample, prob = TRUE)
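Importance resampling only works well when the proposal covers the target, so it can be worth checking how uneven the weights are. One common diagnostic, which is not part of the original answer, is the effective sample size $(\sum w)^2 / \sum w^2$; a value far below `N` suggests the proposal is a poor match. A self-contained sketch that recomputes the weights the same way as above:

```r
set.seed(1)
N  <- 1000
x  <- rexp(N, rate = 5)
bw <- density(x)$bw

# Proposal: normal centred at the sample mean with sd = 1, as in the example
sam <- rnorm(N, mean(x), 1)

# Unnormalised weights: KDE (up to a constant) over the proposal density
w <- sapply(sam, function(s) sum(dnorm(s, mean = x, sd = bw)) /
                             dnorm(s, mean(x), 1))

# Effective sample size: equals N for uniform weights,
# and approaches 1 when a single weight dominates
ess <- sum(w)^2 / sum(w^2)
ess
```

The weights only need to be proportional to the target-to-proposal ratio, which is why the missing `1/N` factor in the KDE does not matter for either the resampling step or this diagnostic.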

P.S. My thanks to Glen_b, who contributed to this answer.

— Matteo Fasiolo

Sorry, I went straight for importance sampling and then realized that sampling directly is usually simpler than that. I have added the first explanation to the answer. Many thanks.

— Matteo Fasiolo '14

@Matteo Fasiolo: is there a reference to a paper that I can cite for this method?

— Pallavi