Extra output layer in a neural network (decimal to binary)

17

I'm working through a question from the online book:

http://neuralnetworksanddeeplearning.com/chap1.html

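I can see that if the extra output layer had 5 output neurons, I could probably set each bias to 0.5 and each weight from the previous layer to 0.5. But the question asks for a new layer of four output neurons, which is more than enough to represent 10 possible outputs, since $2^{4} = 16$.

Can someone explain the steps needed to understand and solve this problem?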

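The exercise question:

There is a way of determining the bitwise representation of a digit by adding an extra layer to the three-layer network above. The extra layer converts the output from the previous layer into a binary representation, as illustrated in the figure below. Find a set of weights and biases for the new output layer. Assume that the first 3 layers of neurons are such that the correct output in the third layer (i.e., the old output layer) has activation at least 0.99, and incorrect outputs have activation less than 0.01.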

neural-network

— Victor Yip

16

The question asks you to make the following mapping between the old representation and the new representation:

Represent    Old                     New
0            1 0 0 0 0 0 0 0 0 0     0 0 0 0
1            0 1 0 0 0 0 0 0 0 0     0 0 0 1
2            0 0 1 0 0 0 0 0 0 0     0 0 1 0
3            0 0 0 1 0 0 0 0 0 0     0 0 1 1
4            0 0 0 0 1 0 0 0 0 0     0 1 0 0
5            0 0 0 0 0 1 0 0 0 0     0 1 0 1
6            0 0 0 0 0 0 1 0 0 0     0 1 1 0
7            0 0 0 0 0 0 0 1 0 0     0 1 1 1
8            0 0 0 0 0 0 0 0 1 0     1 0 0 0
9            0 0 0 0 0 0 0 0 0 1     1 0 0 1

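Because the old output layer has a simple form, this is quite easy to achieve. Each new output neuron should have a positive weight from every old output neuron whose digit needs that bit switched on, and a negative weight from every old output neuron whose digit needs it switched off. The values should combine to be large enough to switch cleanly on or off, so I would use largish weights, such as +10 and -10.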
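The weight rule described above can be sketched in plain Python. The names `build_weights` and `HIGH` are illustrative and not part of the original answer; the bit order (most significant bit first) is an assumption taken from the column order of the table.

```python
HIGH = 10.0  # "largish" magnitude suggested in the answer; an assumption, not a tuned value


def build_weights(n_digits=10, n_bits=4, high=HIGH):
    """Return weights[i][j]: weight from old neuron i (digit i) to new neuron j.

    New neuron j holds bit j of the digit, most significant bit first.
    """
    weights = []
    for digit in range(n_digits):
        row = []
        for j in range(n_bits):
            bit = (digit >> (n_bits - 1 - j)) & 1
            # Positive weight if this digit needs bit j on, negative otherwise.
            row.append(high if bit else -high)
        weights.append(row)
    return weights


W = build_weights()
print(W[3])  # weights leaving the old "3" neuron: [-10.0, -10.0, 10.0, 10.0]
```

Each row of `W` holds the four weights leaving one old output neuron, so the full layer needs 40 weights in total.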

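If you have sigmoid activations here, the bias is not that relevant. You simply want to saturate each neuron towards on or off. The question lets you assume very clear signals in the old output layer.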

So, taking the example of representing a 3, and using zero-indexing for the neurons in the order I am showing them (these choices are not fixed by the question), I might have weights going from the activation of old output $i=3$, $A_3^{Old}$, to the logits of the new outputs $Z_j^{New}$, where $Z_j^{New} = \sum_{i=0}^{9} W_{ij} A_i^{Old}$, as follows:

$W_{3,0} = -10$

$W_{3,1} = -10$

$W_{3,2} = +10$

$W_{3,3} = +10$

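This should clearly produce an output close to 0 0 1 1 when only the old output layer's neuron representing a "3" is active. The question lets you assume an activation of at least 0.99 for one neuron and less than 0.01 for the competing neurons in the old layer. So, if you use the same magnitude of weights throughout, the relatively small contributions of at most ±0.1 (0.01 × 10) from each of the other old-layer activations will not seriously affect the ±9.9 value, and the outputs of the new layer will saturate very close to either 0 or 1.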
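The arithmetic in the paragraph above can be checked for every digit and bit with a short sketch. It assumes the ±10 weights from the answer, no bias, old activations that are never negative (as with sigmoid outputs), and most-significant-bit-first ordering; the helper names `bits_of` and `worst_margin` are hypothetical.

```python
import math

HIGH = 10.0  # weight magnitude used in the answer


def bits_of(digit, n_bits=4):
    """Binary digits of `digit`, most significant bit first."""
    return [(digit >> (n_bits - 1 - j)) & 1 for j in range(n_bits)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def worst_margin(digit, j):
    """Lower bound on the logit magnitude, in the correct direction, for bit j.

    The correct old neuron contributes at least 0.99 * HIGH toward the target.
    Each other old neuron whose bit j differs pushes the wrong way by less
    than 0.01 * HIGH; neurons whose bit agrees can only help.
    """
    target = bits_of(digit)[j]
    opposing = sum(1 for d in range(10)
                   if d != digit and bits_of(d)[j] != target)
    return 0.99 * HIGH - 0.01 * HIGH * opposing


margins = [worst_margin(d, j) for d in range(10) for j in range(4)]
smallest = min(margins)
print(smallest)           # bound on the weakest logit, in the right direction
print(sigmoid(smallest))  # activation of a "1" bit sitting exactly at that bound
```

Under these assumptions the weakest logit has a magnitude of about 9.1 (the most significant bit when an 8 or 9 is shown, since eight other neurons oppose it), so every new output stays within roughly 0.0002 of 0 or 1.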

— Neil Slater

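Thank you. I couldn't quite follow this part; could you elaborate further? "I might have weights going from the activation of old output $i=3$, $A_3^{Old}$, to the logits of the new outputs $Z_j^{New}$, where $Z_j^{New} = \sum_{i=0}^{9} W_{ij} A_i^{Old}$, as follows: $W_{3,0} = -10$, $W_{3,1} = -10$, $W_{3,2} = +10$, $W_{3,3} = +10$"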

— Victor Yip

[The text of this comment was lost in extraction; only two fragments survive: $A_i = f(Z_i)$ and $f$.]

@NeilSlater - will your sample weights work for outputs that are not 3? I don't see that they will. Please elaborate. Thanks.

— FullStack

[The text of this comment was lost in extraction; only the fragment $A_3^{old}$ survives.]

1

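@Rrz0: Because I am assuming a sigmoid layer on the output, as this is a binary classification: each bit is either on or off. So in your example you get sigmoid((0 * 10) * 1), which is 0.5. By choosing suitably large numbers, you ensure a very high or very low value before the sigmoid, which will then output very close to 0 or 1. This is more robust, IMO, than the linear output assumed in FullStack's answer; otherwise the two answers are the same.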

— Neil Slater
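The point about saturation in the comment above can be illustrated numerically. This is a minimal sketch using only the standard library; the logit values 0 and ±10 are the ones mentioned in the thread.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


print(sigmoid(0.0))    # 0.5: a zero logit leaves the bit undecided
print(sigmoid(10.0))   # about 0.99995: saturated "on"
print(sigmoid(-10.0))  # about 0.00005: saturated "off"
```

A zero weight therefore does not switch a bit off under a sigmoid; only a clearly negative logit does.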

4

The code below from SaturnAPI answers this question. See and run the code at https://saturnapi.com/artitw/neural-network-decimal-digits-to-binary-bitwise-conversion


% Exercise from http://neuralnetworksanddeeplearning.com/chap1.html
% There is a way of determining the bitwise representation of a digit by adding an extra layer to the three-layer network above. The extra layer converts the output from the previous layer into a binary representation, as illustrated in the figure below. Find a set of weights and biases for the new output layer. Assume that the first 3 layers of neurons are such that the correct output in the third layer (i.e., the old output layer) has activation at least 0.99, and incorrect outputs have activation less than 0.01.

% Inputs from 3rd layer
xj = eye(10,10)

% Weights matrix
wj = [0 0 0 0 0 0 0 0 1 1 ;
      0 0 0 0 1 1 1 1 0 0 ;
      0 0 1 1 0 0 1 1 0 0 ;
      0 1 0 1 0 1 0 1 0 1 ]';

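% Results: row k+1 is the 4-bit code for digit k.
% xj is 10x10 and wj (after the transpose) is 10x4, so the product must be xj*wj.
xj*wj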


% Confirm results
integers = 0:9;
dec2bin(integers)

— FullStack

Note that this implements a set of weights for a linear output layer. In contrast, my answer assumes sigmoid activation in the output layer. Otherwise the two answers are equivalent.

— Neil Slater
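To make the contrast between the two answers concrete, here is a hedged sketch that feeds the same noisy input (correct neuron at 0.99, all others at 0.01, an illustrative boundary case) through a linear layer with 0/1 weights and through a sigmoid layer with ±10 weights. The function names are hypothetical, and bits are ordered most significant first.

```python
import math


def bits_of(digit, n_bits=4):
    """Binary digits of `digit`, most significant bit first."""
    return [(digit >> (n_bits - 1 - j)) & 1 for j in range(n_bits)]


def linear_layer(old):
    """0/1 weights and no activation, as in the MATLAB/Octave answer."""
    return [sum(old[d] * bits_of(d)[j] for d in range(10)) for j in range(4)]


def sigmoid_layer(old, high=10.0):
    """+high/-high weights followed by a sigmoid, no bias."""
    outputs = []
    for j in range(4):
        z = sum(old[d] * (high if bits_of(d)[j] else -high) for d in range(10))
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


old = [0.01] * 10   # illustrative noise level on the wrong neurons
old[7] = 0.99       # the network reports a 7

lin = linear_layer(old)
sig = sigmoid_layer(old)
print([round(v, 4) for v in lin])  # drifts a few hundredths away from 0 and 1
print([round(v, 4) for v in sig])  # stays very close to 0 and 1
```

Both layers recover 0 1 1 1 after rounding; the difference is only in how far the raw outputs sit from exact 0 and 1.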

What is the meaning of the input eye(10,10)?

— Rrz0

Yes, it does indeed work like a charm; I just tried it in Octave Online and confirmed it, thanks! P.S. A little explanation would also be good, in case someone gets stuck :)

— Anaximandro Andrade

1

@Rrz0 It's a Matlab/Octave function that creates an identity matrix (ones on the main diagonal, zeros elsewhere).

— Anaximandro Andrade
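For readers without Octave, a rough Python equivalent of `eye(10,10)` is sketched below. The helper `eye` here is a hand-written stand-in, not a library function. Each row of the identity matrix is the ideal one-hot output of the old layer for one digit, which is why the answer uses it as test input.

```python
def eye(n):
    """n-by-n identity matrix as a list of rows."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


I = eye(10)
print(I[3])  # ideal old-layer output for the digit 3: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```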

0

A Pythonic proof for the exercise above:

"""
NEURAL NETWORKS AND DEEP LEARNING by Michael Nielsen

Chapter 1

http://neuralnetworksanddeeplearning.com/chap1.html#exercise_513527

Exercise:

There is a way of determining the bitwise representation of a digit by adding an extra layer to the three-layer network above. The extra layer converts the output from the previous layer into a binary representation, as illustrated in the figure below. Find a set of weights and biases for the new output layer. Assume that the first 3 layers of neurons are such that the correct output in the third layer (i.e., the old output layer) has activation at least 0.99, and incorrect outputs have activation less than 0.01.

"""
import numpy as np


def sigmoid(x):
    return(1/(1+np.exp(-x)))


def new_representation(activation_vector):
    a_0 = np.sum(w_0 * activation_vector)
    a_1 = np.sum(w_1 * activation_vector)
    a_2 = np.sum(w_2 * activation_vector)
    a_3 = np.sum(w_3 * activation_vector)

    return a_3, a_2, a_1, a_0


def new_repr_binary_vec(new_representation_vec):
    sigmoid_op = np.apply_along_axis(sigmoid, 0, new_representation_vec)
    return (sigmoid_op > 0.5).astype(int)


w_0 = np.full(10, -1, dtype=np.int8)
w_0[[1, 3, 5, 7, 9]] = 1
w_1 = np.full(10, -1, dtype=np.int8)
w_1[[2, 3, 6, 7]] = 1
w_2 = np.full(10, -1, dtype=np.int8)
w_2[[4, 5, 6, 7]] = 1
w_3 = np.full(10, -1, dtype=np.int8)
w_3[[8, 9]] = 1

activation_vec = np.full(10, 0.01, dtype=float)  # np.float was removed in NumPy 1.24
# correct number is 5
activation_vec[5] = 0.99

new_representation_vec = new_representation(activation_vec)
print(new_representation_vec)
# approximately (-1.04, 0.96, -1.0, 0.98)
print(new_repr_binary_vec(new_representation_vec))
# [0 1 0 1]

# if you wish to convert binary vector to int
b = new_repr_binary_vec(new_representation_vec)
print(b.dot(2**np.arange(b.size)[::-1]))
# 5

— NpnSaddy

0

A small correction to FullStack's answer, following Neil Slater's comment, using Octave:

% gzanellato
% Octave

% 3rd layer:
A = eye(10,10);

% Weights matrix:

fprintf('\nSet of weights:\n\n')

wij = [-10 -10 -10 -10 -10 -10 -10 -10 10 10;
       -10 -10 -10 -10 10 10 10 10 -10 -10;
       -10 -10 10 10 -10 -10 10 10 -10 -10;
       -10 10 -10 10 -10 10 -10 10 -10 10]

% Any bias between -9.999.. and +9.999.. runs ok

bias=5

Z=wij*A+bias;

% Sigmoid function:

for j=1:10;
  for i=1:4;
    Sigma(i,j)=int32(1/(1+exp(-Z(i,j))));
  end
end

fprintf('\nBitwise representation of digits:\n\n')

disp(Sigma')
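The comment in the code about the admissible bias range can be checked with a small sweep. This sketch assumes ideal one-hot inputs (the identity matrix), ±10 weights, and a 0.5 threshold in place of the `int32` rounding; the names `decode` and `works` are hypothetical.

```python
import math


def bits_of(digit, n_bits=4):
    """Binary digits of `digit`, most significant bit first."""
    return [(digit >> (n_bits - 1 - j)) & 1 for j in range(n_bits)]


def decode(digit, bias, high=10.0):
    """Thresholded sigmoid outputs for an ideal one-hot input."""
    out = []
    for j in range(4):
        # With a one-hot input, only the weight from the active neuron matters.
        z = (high if bits_of(digit)[j] else -high) + bias
        out.append(1 if 1.0 / (1.0 + math.exp(-z)) > 0.5 else 0)
    return out


def works(bias):
    """True if every digit 0..9 is decoded to its binary code."""
    return all(decode(d, bias) == bits_of(d) for d in range(10))


for bias in (-11, -9, 0, 5, 9, 11):
    print(bias, works(bias))
```

With ideal inputs, any bias strictly between -10 and +10 leaves the sign of every logit unchanged, so the decoding is correct; outside that range the bias overrides the weights.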

— gzanellato