[Supervised Learning / python / not use tensorflow] MNIST - Softmax regression

JaykayChoi 2017. 3. 28. 22:34

I implemented MNIST softmax regression with numpy instead of tensorflow. (Only the part that loads the MNIST images uses tensorflow's source.)

The code follows the tutorial provided by tensorflow and covers the same content.

[TensorFlow] MNIST For ML Beginners - Softmax regression

python 3.6

import numpy as np
# Only the MNIST loader comes from TensorFlow (TF 1.x tutorial module).
from tensorflow.examples.tutorials.mnist import input_data


class NN:
    def __init__(self):
        # Small uniform random weights: 784 input pixels -> 10 classes.
        self.W = np.random.uniform(low=-0.01, high=0.01, size=(784, 10))
        self.b = np.zeros(10)
        self.learningRate = 0.001

    def softmax(self, x):
        if x.ndim == 1:
            x = x.reshape([1, x.size])
        # Subtract the row-wise max before exp() for numerical stability.
        shifted = x - np.max(x, axis=1, keepdims=True)
        exps = np.exp(shifted)
        return exps / np.sum(exps, axis=1, keepdims=True)

    def getCrossEntropy(self, logits, labelY):
        # Expects raw logits (xW + b), not softmax outputs.
        return np.mean(-np.sum(labelY * np.log(self.softmax(logits)), axis=1))

    def feedForward(self, x):
        y = np.dot(x, self.W) + self.b
        return self.softmax(y)

    def backpropagation(self, x, labelY, y):
        # Gradients are summed over the batch (not averaged).
        error = y - labelY
        dW = x.T.dot(error)
        db = np.sum(error, axis=0)
        return dW, db

    def update(self, dW, db):
        self.W -= self.learningRate * dW
        self.b -= self.learningRate * db


if __name__ == '__main__':

    mnist = input_data.read_data_sets('/tmp/tensorflow/mnist/input_data', one_hot=True)

    np.random.seed(777)

    # Use a separate name so the NN class is not shadowed by its instance.
    model = NN()

    for _ in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        y = model.feedForward(batch_xs)
        dW, db = model.backpropagation(batch_xs, batch_ys, y)
        model.update(dW, db)

    # Evaluate on the test set: fraction of images whose argmax matches the label.
    y = model.feedForward(mnist.test.images)
    correct_prediction = np.equal(np.argmax(y, 1), np.argmax(mnist.test.labels, 1))
    accuracy = np.mean(correct_prediction)
    print(accuracy)

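The softmax in the code above subtracts each row's maximum before calling exp(). The following is a minimal sketch of why that step matters, using hypothetical helper names (softmax_naive, softmax_stable) and made-up logits rather than anything from the post: softmax is unchanged when a constant is added to every logit in a row, but the direct formula overflows once the logits are large.

```python
import numpy as np

def softmax_naive(x):
    # Direct definition: exp() overflows once any logit is large.
    e = np.exp(x)
    return e / np.sum(e, axis=1, keepdims=True)

def softmax_stable(x):
    # Subtracting the per-row max leaves the result unchanged
    # (softmax is shift-invariant) but keeps exp() in a safe range.
    shifted = x - np.max(x, axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

# Made-up logits: the second row is the first shifted by a large constant.
small = np.array([[1.0, 2.0, 3.0]])
large = small + 1000.0

# Silence the overflow warnings that the naive version triggers on purpose.
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = softmax_naive(large)

stable_small = softmax_stable(small)
stable_large = softmax_stable(large)

print(naive_large)   # contains nan because inf / inf is undefined
print(stable_large)  # same probabilities as stable_small
```

Both versions agree on the small logits; only the shifted version still returns valid probabilities for the large ones.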

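The one difference is the initialization: when I initialized the weights to 0 the model did not train properly, so I used a uniform distribution instead.

The parts that differ from the tensorflow implementation are backpropagation and the training step.

With tensorflow these can be written as simply as shown below, but to implement them myself I had to properly understand backpropagation and the training step.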

cross_entropy = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
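As a rough numpy counterpart of those two TensorFlow lines, the sketch below computes the mean softmax cross-entropy and applies a single gradient descent step with the same rate of 0.5. The batch is synthetic random data with hypothetical sizes (not MNIST), and the function names are illustrative. Note that the numpy code in this post sums the gradient over the batch rather than averaging it, so its learningRate of 0.001 with batches of 100 corresponds to a rate of 0.1 on the mean loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_cross_entropy(logits, labels):
    # Counterpart of tf.reduce_mean(softmax_cross_entropy_with_logits(...)).
    return np.mean(-np.sum(labels * np.log(softmax(logits)), axis=1))

# Synthetic stand-in for one batch (random values, hypothetical sizes).
batch_size, n_features, n_classes = 100, 20, 10
x = rng.standard_normal((batch_size, n_features))
labels = np.eye(n_classes)[rng.integers(0, n_classes, size=batch_size)]

W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
learning_rate = 0.5  # same value as in the TensorFlow snippet

loss_before = mean_cross_entropy(x @ W + b, labels)

# Gradient of the mean loss: average the per-sample gradients over the batch.
error = softmax(x @ W + b) - labels
dW = x.T @ error / batch_size
db = error.mean(axis=0)

# One gradient descent step, which is what minimize() builds for each run.
W -= learning_rate * dW
b -= learning_rate * db

loss_after = mean_cross_entropy(x @ W + b, labels)
print(loss_before, loss_after)
```

With all-zero parameters every class gets probability 0.1, so the starting loss equals log(10); the step then lowers it on this batch.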

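First, suppose the expression that predicts y is Wx + b, the value it produces is predictY, and the correct label for the image is labelY.

The cost can then be written as (predictY - labelY)^2. (It is squared because predictY - labelY can be negative.)

To apply gradient descent (https://en.wikipedia.org/wiki/Gradient_descent) to this cost, we repeatedly subtract the derivative of the cost from the weights, which moves the weights toward the point where the cost is smallest (ideally 0).

For this I took the derivative of (predictY - labelY)^2 with respect to w, which is proportional to (predictY - labelY) * x, multiplied it by the learning rate, and used the result to train the weights. The constant factor 2 from the derivative is absorbed into the learning rate.

d/dW (xW - y)^2 = 2x(xW - y)  =>  x(xW - y)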
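The explanation above is phrased in terms of squared error, while the code passes xW + b through softmax. For softmax combined with cross-entropy, the gradient with respect to W has the same form, x^T (predictY - labelY), where predictY is the softmax output. The following is a minimal sketch that checks this numerically on a tiny made-up problem (the sizes, names, and random data are assumptions for illustration, and the bias is omitted for brevity).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def summed_cross_entropy(W, x, labels):
    # Cross-entropy summed (not averaged) over the batch, matching dW below.
    return -np.sum(labels * np.log(softmax(x @ W)))

# Small hypothetical problem: 5 samples, 4 features, 3 classes.
x = rng.standard_normal((5, 4))
labels = np.eye(3)[rng.integers(0, 3, size=5)]
W = rng.standard_normal((4, 3)) * 0.1

# Analytic gradient in the form the post uses: x^T (predictY - labelY).
analytic = x.T @ (softmax(x @ W) - labels)

# Central finite differences, one weight at a time.
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        step = np.zeros_like(W)
        step[i, j] = eps
        numeric[i, j] = (summed_cross_entropy(W + step, x, labels)
                         - summed_cross_entropy(W - step, x, labels)) / (2 * eps)

max_abs_diff = np.max(np.abs(analytic - numeric))
print(max_abs_diff)  # should be tiny if the analytic form is right
```

A finite-difference check like this is a cheap way to catch sign or transpose mistakes before training on real data.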

Below is the code with one more layer added.

import numpy as np
# Only the MNIST loader comes from TensorFlow (TF 1.x tutorial module).
from tensorflow.examples.tutorials.mnist import input_data


class NN:
    def __init__(self):
        # 784 input pixels -> 100 hidden units -> 10 classes (no bias terms).
        self.W1 = np.random.uniform(low=-0.01, high=0.01, size=(784, 100))
        self.W2 = np.random.uniform(low=-0.01, high=0.01, size=(100, 10))
        self.learningRate = 0.001

    def sigmoid(self, x):
        return 1.0 / (1.0 + np.exp(-x))

    def dsigmoid(self, x):
        # Derivative of sigmoid, expressed in terms of the sigmoid OUTPUT x.
        return x * (1. - x)

    def softmax(self, x):
        if x.ndim == 1:
            x = x.reshape([1, x.size])
        # Subtract the row-wise max before exp() for numerical stability.
        shifted = x - np.max(x, axis=1, keepdims=True)
        exps = np.exp(shifted)
        return exps / np.sum(exps, axis=1, keepdims=True)

    def getCrossEntropy(self, logits, labelY):
        # Expects raw logits, not softmax outputs.
        return np.mean(-np.sum(labelY * np.log(self.softmax(logits)), axis=1))

    def feedForward(self, x):
        # Hidden layer: linear map followed by sigmoid.
        y1 = np.dot(x, self.W1)
        sigmoidY1 = self.sigmoid(y1)

        # Output layer: linear map followed by softmax.
        y2 = np.dot(sigmoidY1, self.W2)
        softmaxY2 = self.softmax(y2)

        return sigmoidY1, softmaxY2

    def backpropagation(self, x, labelY, predictY1, predictY2):
        # Gradient of the cross-entropy with respect to the output logits.
        error = predictY2 - labelY

        # Propagate the error back through W2 and the sigmoid.
        dY2 = np.matmul(error, self.W2.T)
        dY1 = self.dsigmoid(predictY1)

        dW1 = x.T.dot(dY2 * dY1)

        dW2 = predictY1.T.dot(error)

        return dW1, dW2

    def update(self, dW1, dW2):
        self.W1 -= self.learningRate * dW1
        self.W2 -= self.learningRate * dW2


if __name__ == '__main__':

    mnist = input_data.read_data_sets('/tmp/tensorflow/mnist/input_data', one_hot=True)

    np.random.seed(777)

    # Use a separate name so the NN class is not shadowed by its instance.
    model = NN()

    for _ in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        y1, y2 = model.feedForward(batch_xs)
        dW1, dW2 = model.backpropagation(batch_xs, batch_ys, y1, y2)
        model.update(dW1, dW2)

    # Evaluate on the test set: fraction of images whose argmax matches the label.
    y1, y2 = model.feedForward(mnist.test.images)
    correct_prediction = np.equal(np.argmax(y2, 1), np.argmax(mnist.test.labels, 1))
    accuracy = np.mean(correct_prediction)
    print(accuracy)

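The choice of uniform random initialization matters in a particular way for this two-layer version. The following is a minimal sketch, on a toy problem with made-up sizes and random data (the train function and its parameters are hypothetical, not part of the post's code), suggesting what goes wrong with all-zero weights: every hidden unit receives exactly the same update, so the columns of W1 never become different from each other.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(W1, W2, x, labels, steps=50, lr=0.01):
    # Same forward/backward structure as the two-layer code above (no biases).
    W1, W2 = W1.copy(), W2.copy()
    for _ in range(steps):
        h = sigmoid(x @ W1)
        error = softmax(h @ W2) - labels
        dW1 = x.T @ ((error @ W2.T) * h * (1.0 - h))
        dW2 = h.T @ error
        W1 -= lr * dW1
        W2 -= lr * dW2
    return W1, W2

# Hypothetical toy sizes: 20 samples, 6 inputs, 4 hidden units, 3 classes.
x = rng.standard_normal((20, 6))
labels = np.eye(3)[rng.integers(0, 3, size=20)]

W1_zero, W2_zero = train(np.zeros((6, 4)), np.zeros((4, 3)), x, labels)
W1_rand, W2_rand = train(rng.uniform(-0.01, 0.01, (6, 4)),
                         rng.uniform(-0.01, 0.01, (4, 3)), x, labels)

# Compare every column of W1 against its first column.
zero_columns_identical = np.allclose(W1_zero, W1_zero[:, :1])
rand_columns_identical = np.allclose(W1_rand, W1_rand[:, :1])
print(zero_columns_identical, rand_columns_identical)
```

With identical columns, the hidden layer behaves like a single unit repeated several times, which is one plausible reason a zero-initialized network of this shape fails to train well.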

License: Attribution, NonCommercial, NoDerivatives
