For the second part of your question (estimating the entropy difference between distributions), you may be able to use the identity
$$F = \langle E \rangle - TS,$$
where $\langle E \rangle$ is the average energy, $T$ is the temperature (in $p \propto e^{\theta E}$, the parameter $\theta$ plays the role of $-1/(k_B T)$, so $T$ is inversely proportional to $\theta$), and $S$ is the entropy. For details, see: Jaynes, E. T. (1957). Information Theory and Statistical Mechanics. Physical Review, 106(4), 620–630. http://doi.org/10.1103/PhysRev.106.620
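To see the identity in action, here is a minimal sketch on a toy discrete Gibbs distribution (the energies are made up for illustration, with $k_B = 1$), verifying that $F = -T \log Z$ equals $\langle E \rangle - TS$:

```python
import numpy as np

# Toy discrete Gibbs distribution (energies made up for illustration).
rng = np.random.default_rng(0)
T = 1.5                          # temperature (k_B = 1)
E = rng.uniform(0.0, 3.0, 8)     # energies of 8 states
w = np.exp(-E / T)
Z = w.sum()                      # partition function
p = w / Z                        # Gibbs probabilities

F = -T * np.log(Z)               # free energy
avg_E = (p * E).sum()            # average energy <E>
S = -(p * np.log(p)).sum()       # Gibbs/Shannon entropy (k_B = 1)

assert np.isclose(F, avg_E - T * S)   # F = <E> - T S
```

The assertion holds exactly (up to floating-point error) because substituting $p_i = e^{-E_i/T}/Z$ into $-\sum_i p_i \log p_i$ reproduces $(\langle E \rangle - F)/T$.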
The idea, then, is to use one of the methods from the computational statistical physics literature (see the links in the sidebar of that page) to estimate the free energy difference $\Delta F$, and then obtain $\Delta S$ from $\Delta F$ and $\Delta\langle E \rangle$ via the formula above. Keep in mind that restricting to a subset $A_1$ of $A$ is equivalent to modifying the energy function $E$ so that it becomes infinite on the complement of $A_1$.
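The restriction trick has a nice closed form on a discrete toy system (states, energies, and the subset $A_1$ below are all made up for illustration): setting $E = +\infty$ outside $A_1$ shrinks the partition function to $Z' = Z \cdot P(A_1)$, so $\Delta F = -T \log P(A_1)$, and $\Delta S = (\Delta\langle E \rangle - \Delta F)/T$ matches the direct entropy difference:

```python
import numpy as np

# Toy discrete system; energies and the subset A1 are made up.
rng = np.random.default_rng(0)
T = 1.5
E = rng.uniform(0.0, 3.0, 8)
p = np.exp(-E / T)
p /= p.sum()

A1 = np.array([0, 2, 3])         # hypothetical subset of the state space A
pA1 = p[A1].sum()

# With E set to +infinity outside A1, the partition function becomes
# Z' = Z * P(A1), so the free energy difference is known in closed form:
dF = -T * np.log(pA1)

p_r = p[A1] / pA1                            # restricted, renormalized distribution
dE = (p_r * E[A1]).sum() - (p * E).sum()     # change in <E>
dS = (dE - dF) / T                           # entropy difference via F = <E> - T S

# Cross-check against the direct entropy difference.
S_full = -(p * np.log(p)).sum()
S_restr = -(p_r * np.log(p_r)).sum()
assert np.isclose(dS, S_restr - S_full)
```

In a real problem $\Delta F$ would come from a sampling-based estimator rather than the exact sum, but the bookkeeping is the same.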
Here are two additional references on algorithms for computing free energy:
Lelièvre, T., Rousset, M., & Stoltz, G. (2010). Free Energy Computations. Imperial College Press. http://doi.org/10.1142/9781848162488
Chipot, C., & Pohorille, A. (Eds.). (2007). Free Energy Calculations (Vol. 86). Berlin, Heidelberg: Springer. http://doi.org/10.1007/978-3-540-38448-9
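As a minimal illustration of the simplest such algorithm, free energy perturbation via Zwanzig's identity $\Delta F = -T \log \langle e^{-(E_1 - E_0)/T} \rangle_0$ (covered in both books), here is a sketch on a made-up discrete toy system, comparing the sampling estimate against the exact value:

```python
import numpy as np

# Toy discrete system; energies are made up for illustration.
rng = np.random.default_rng(1)
T = 1.0
E0 = rng.uniform(0.0, 2.0, 6)          # reference energies
E1 = E0 + rng.uniform(0.0, 0.5, 6)     # perturbed energies

w0 = np.exp(-E0 / T)
p0 = w0 / w0.sum()
dF_exact = -T * np.log(np.exp(-E1 / T).sum() / w0.sum())

# Zwanzig's identity: dF = -T log < exp(-(E1 - E0)/T) >_0,
# estimated by sampling states from the reference distribution p0.
states = rng.choice(len(E0), size=200_000, p=p0)
dF_est = -T * np.log(np.exp(-(E1[states] - E0[states]) / T).mean())

assert abs(dF_est - dF_exact) < 0.01
```

In practice the estimator's variance blows up when the two distributions overlap poorly, which is exactly the situation the more sophisticated methods in these references (thermodynamic integration, stratification, adaptive biasing) are designed to handle.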