コルモゴロフ複雑性定義の同等性

Kolmogorov-Complexityを定義するには多くの方法があり、通常、これらすべての定義は加法定数まで同等です。場合であることを $K_1$ 及び $K_2$ コルモゴロフ複雑機能である（別の言語又はモデルを介して定義された）、その後一定存在 $c$ ようにそのすべての文字列のための $x$ 、 $|K_1(x) - K_2(x)| < c$ 。これは、すべてのコルモゴロフ複雑度関数 $K$ およびごとに $x$ 、（定数。 $K(x) \le |x| +c$ $c$

チューリングマシンに基づいたの次の定義に興味があります $K$

状態の数：状態のTMが空の文字列でを出力するように、を最小数定義します。 $K_1(x)$ $q$ $q$ $x$
プログラムの長さ：定義：出力がその最短「プログラム」されるように。つまり、TMをバイナリ文字列にエンコードする方法を修正します。マシンののようにコードの（バイナリ）を示す。ここで、最小値は、空の入力でを出力するすべてののものです。 $K_2(x)$ $x$ $M$ $\langle M \rangle$ $K_2(x) = \min |\langle M \rangle|$ $M$ $x$

あるとと同等か？それらの間の関係は何ですか？また、コルモゴロフの複雑さの概念をよりよく把握しているのは、それらが同等でない場合です。 $K_1$ $K_2$

特に私を悩ませているのは、での増加率で、これは超線形ではないように思われます（またはではなく、ような定数で少なくとも線形）。を出力する最も単純なTMを考えてみましょう- 状態と遷移関数の一部としてを単にエンコードするTMです。それは見てすぐにあるという $K_2$ $x$ $C>1$ $K_2 < C|x|$ $|x|+c$ $x$ $x$ $K_1(x) \le |x|+1$ 。しかし、同じマシンのエンコーディングははるかに大きく、私が得る自明な限界は。 $K_2(x) \le |x|\log |x|$

computability kolmogorov-complexity

— ランG.
ソース

状態を持つ

台を超えるマシンがあり、それらの平均サイズは少なくとも

であるため、これらが加算定数によってのみ異なることはほとんどありません。2n2 $2^{n^2}$

n $n$

n2 $n^2$

— カベ

There is a well known bound that

K2(x)≤c+2|x| $K_2(x) \leq c + 2|x|$ for some fixed

c $c$ not depending on

x $x$ . This is because we can encode

x $x$ into a prefix-free language by just doubling each bit of

x $x$ and then ending with

01 $01$ . This takes

2|x|+2 $2|x| + 2$ bits to represent

x $x$ . Thus, because

K2 $K_2$ is defined in terms of a universal prefix-free machine,

K2(x)≤2|x|+2+c′ $K_2(x) \leq 2|x| + 2 + c'$ for some fixed

c $c$ . This can be improved some by using a more intelligent way to encode

x $x$ into a prefix free language.

— Carl Mummert

I can't see how. It seems that either

x $x$ is given as part of the encoding (as raw data), or you must construct

x $x$ by your state-machine. The first option seems to be cheating and I don't see how it can be comparable to the second option (which implies

K1 $K_1$ )

— Ran G.

@Ran G.: The key point is the invariance theorem described at en.wikipedia.org/wiki/Invariance_theorem . If I can describe any effective system that has a growth rate of

2|x| $2|x|$ then a universal Turing machine (as you describe for

K2 $K_2$ ) will meet this within an additive constant. The universal machine is the one that takes

⟨M⟩ $\langle M\rangle$ are input and returns the output of

M $M$ if

M $M$ halts.

— Carl Mummert

I apologize in advance for that I give way too many details, but I'm about to contradict people.

About $K(x)≤K'(x)+c$

The fact that $K_1(x)≤K_2(x)+c$ usually comes from an interpreter of the description language #2 into the description language #1 and not from a translation from programs of #2 into programs of #1.

For example $K_\mathtt{C}(x)≤K_\mathtt{Python}(x)+c_\mathtt{py2c}$ and you get this inequality as simply as this:

void py_run(char * s) {
    // code of your Python interpreter
}

int main(void) {
    py_run("Put here your Python program of size Kpython(x)");
}

Then your constant $c_\mathtt{py2c}$ will be something like $528+490240688$ where $528$ is the number of bits for this code and $490240688$ bits is the size the official Python interpreter written in C. Of course you only need to interpret what is possible in your description language for Python so you can do better than 69 MB :-)

What is important is that you can write your Python program linearly in your C code. For example, a language where you need to put "BANANA" between every character is not a very good description program and the property is then false. (But if the description language authorizes you to write data in a separate file or in a block, this problem disappear)

Why your $K_1(x)=q$ is flawed

The problem with your definition of $K_1$ is that you may need more than $q$ bits to describe a Turing machine with $q$ states because you need to encode transitions.

So no $K_1$ and $K_2$ are probably not equivalent, but that's mainly $K_1$ 's fault. I think that we can prove that for all $a>0$ there is a $c_a$ such that $K_1(x)≤a|x|+c_a$ . Of course any $a<1$ is enough to disprove the fact that $K_1$ is not a valid function, since it would mean that we can encode more all $2^n$ possible strings of length $n$ into $an+c_a$ bits.

But the size is an incredibly tight bound when building Turing machines. The idea is that in a block of $b$ states there are $b^{2b}$ ways to find transitions for each state and that's better than the usual $2^b$ ways you can fill $b$ bits. Then you can store in each block $\log_2 b$ bits of information. (not $2\log_2 b$ because you have to go in and out the block one way or another)

So yeah... With blocks of size $2^{1/a}$ you could probably prove $K_1(x)≤a|x|+c_a$ . But I already written way too much about why the number of states is not a valid Kolmogorov complexity function. If you want me to elaborate, I will.

Now about $K_2$

The naive descriptive language corresponds roughly to $K_2(x)=q\cdot 2 \cdot (\log_2 q + 2)$ (i.e. $\log_2q$ for each next state and details about write and termination).

As you seem to be, I'm convinced that a better/cheater way would be to authorize to encode "data" into Turing machine, maybe by adding a binary tag in the description language that says whether if a state is a data state (that just writes a bit and go to the next state) or if it does something else. That way you could store one bit of your $x$ in one bit of your descriptive language.

However if you keep the same $K_2$ you could use the same technique I used in the previous part to save a few bits, but I seem to be stuck at $K_2(x)≤a|x|\log|x|+c$ (for any $a>0$ ) .. maybe less than $\log|x|$ , even, but obtaining $O(|x|)$ seems hard. (And I expect it should be $|x|$ , not even $O(|x|)$ .)

— jmad
ソース

do you claim that

$K_1$ is not a kolmogorov-complexity function? This is very surprising to me, since

$K_1$ is actually the definition used in some intoduction to computability course I've once taken (not that it says anything about its correctness).

— Ran G.

Well the fact that

$K_1(x)≤\frac12|x|+c$ is quite disturbing. Consider this: there are

$2^n$ possible words of

$n$ bits and you could encode them using a

$\frac12n+c$ bits? That would imply

$2^n=O(2^{\frac12n})$ (you encoding has to be injective)

— jmad

What if the python program has characters reserved by C?

— PyRulez

コルモゴロフ複雑性定義の同等性

About K(x)≤K′(x)+cK(x)≤K′(x)+cK(x)≤K'(x)+c

Why your K1(x)=qK1(x)=qK_1(x)=q is flawed

Now about K2K2K_2

About $K(x)≤K'(x)+c$

Why your $K_1(x)=q$ is flawed

Now about $K_2$