[Supervised Learning / TensorFlow tutorial] Deep MNIST for Experts (CNN)

JaykayChoi 2017. 2. 4. 15:45

The Deep MNIST for Experts tutorial uses a model called a Convolutional Neural Network (CNN) to improve learning performance.

https://www.tensorflow.org/tutorials/mnist/pros/

https://en.wikipedia.org/wiki/Convolutional_neural_network

https://en.wikipedia.org/wiki/Deep_learning

https://ko.wikipedia.org/wiki/%EB%94%A5_%EB%9F%AC%EB%8B%9D

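Rather than using the input data as-is, a CNN transforms it and subsamples it to reduce the amount of data to learn from. This approach suits two-dimensional input data, so CNNs perform well in image and speech tasks.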

http://cs231n.github.io/convolutional-networks/#conv

https://nrupatunga.github.io/fcn-segmentation/

As illustrated in the sources linked above, a CNN transforms the data step by step through convolution, ReLU, and max pooling.

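Convolution divides the data into patches and processes each region in turn with a filter. The values inside one patch are fed into a formula (Wx+b in deep learning) and reduced to a single value, and these values make up the new data. As a result, each value of the processed data reflects its neighboring values. The processed data is called a feature map. The same data is processed not with a single filter but with several different filters, and the resulting feature maps are stacked.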
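The patch-by-patch computation can be sketched in plain Python. This is only an illustration, not how TensorFlow implements it: the helper name convolve2d_valid, the 4x4 input, and the two 2x2 filters are made up for the example, and it assumes a stride of 1 with no padding.

```python
def convolve2d_valid(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Each output value is sum(w * x) + b over one patch, i.e. Wx + b.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            total = bias
            for i in range(k_h):
                for j in range(k_w):
                    total += kernel[i][j] * image[top + i][left + j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 "image" used only for this example.
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
# Two different filters applied to the same image give two feature maps.
filters = [
    [[1, 0], [0, 1]],  # adds the main diagonal of each patch
    [[0, 1], [1, 0]],  # adds the anti-diagonal of each patch
]
feature_maps = [convolve2d_valid(image, f) for f in filters]
```

Each 2x2 filter turns the 4x4 input into a 3x3 feature map, and using two filters yields a stack of two such maps.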

Next, max pooling resizes the data. As with convolution, the data is divided into a grid, and the largest value in each cell is taken to build new, smaller data. As with convolution, the output size depends on the patch size and the shift (stride).
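A minimal plain-Python sketch of max pooling, under the assumption of no padding. The helper name max_pool and the 4x4 values are invented for illustration.

```python
def max_pool(feature_map, size=2, stride=2):
    """Take the maximum of each size x size window, moving `stride` cells at a time."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            window = [
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map used only for this example.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
pooled = max_pool(feature_map)             # 2x2 window, stride 2: 4x4 -> 2x2
overlapping = max_pool(feature_map, 2, 1)  # 2x2 window, stride 1: 4x4 -> 3x3
```

The same window size gives a 2x2 output with stride 2 but a 3x3 output with stride 1, which is the dependence on patch size and shift mentioned above.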

# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
 
"""A very simple MNIST classifier.
This version adds convolutional layers, following the tutorial at
https://www.tensorflow.org/tutorials/mnist/pros/
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
 
import argparse
import sys
 
from tensorflow.examples.tutorials.mnist import input_data
 
import tensorflow as tf
 
FLAGS = None
 
 
def main(_):
  # Import data
  mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
 
  # Placeholder for the input images, flattened to 784 (28 * 28) pixels
  x = tf.placeholder(tf.float32, [None, 784])

  # Placeholder for the one-hot ground-truth labels (10 digit classes)
  y_ = tf.placeholder(tf.float32, [None, 10])
 
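  def weight_variable(shape):
    # Small random initial weights to break symmetry
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

  def bias_variable(shape):
    # Slightly positive initial bias so ReLU units start out active
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

  def conv2d(x, W):
    # Stride 1 with SAME padding keeps the height and width unchanged
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

  def max_pool_2x2(x):
    # 2x2 window with stride 2 halves the height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')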
 
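  # First convolutional layer: 32 filters of size 5x5 over 1 input channel
  W_conv1 = weight_variable([5, 5, 1, 32])
  b_conv1 = bias_variable([32])

  # Reshape the flat input to [batch, height, width, channels]
  x_image = tf.reshape(x, [-1, 28, 28, 1])

  h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
  h_pool1 = max_pool_2x2(h_conv1)

  # Second convolutional layer: 64 filters of size 5x5 over 32 input channels
  W_conv2 = weight_variable([5, 5, 32, 64])
  b_conv2 = bias_variable([64])

  h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
  h_pool2 = max_pool_2x2(h_conv2)

  # Fully-connected layer with 1024 neurons on the flattened 7x7x64 output
  W_fc1 = weight_variable([7 * 7 * 64, 1024])
  b_fc1 = bias_variable([1024])

  h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
  h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

  # Dropout; keep_prob is fed at run time (0.5 for training, 1.0 for evaluation)
  keep_prob = tf.placeholder(tf.float32)
  h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

  # Readout layer producing one logit per digit class
  W_fc2 = weight_variable([1024, 10])
  b_fc2 = bias_variable([10])

  y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2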
 
  # The raw formulation of cross-entropy,
  #
  #   tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y_conv)),
  #                                 reduction_indices=[1]))
  #
  # can be numerically unstable.
  #
  # So here we use tf.nn.softmax_cross_entropy_with_logits on the raw
  # outputs 'y_conv', and then average across the batch.
  cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
 
  train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
  correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
 
  sess = tf.InteractiveSession()
  tf.global_variables_initializer().run()
 
  # Train
  for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
      train_accuracy = accuracy.eval(feed_dict={
        x: batch[0], y_: batch[1], keep_prob: 1.0})
      print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
 
  print("test accuracy %g" % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
 
 
 
if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data',
                      help='Directory for storing input data')
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
 

This source is a version of https://github.com/tensorflow/tensorflow/blob/56fc8834c736878af34f00caa95e7d4a57ab01d2/tensorflow/examples/tutorials/mnist/mnist_softmax.py modified to follow the tutorial.

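To perform convolution, simply use the nn.conv2d function. Its parameters are, in order: the input data, the weights (filter), the filter's shift interval (strides), and the padding. With strides of [1, 1, 1, 1], the filter moves one cell to the right at a time and, once it has reached the right edge, moves down one cell, building the activation maps as it goes. As for padding, a CNN commonly surrounds the border of the image with zeros to set off the actual data. With padding set to SAME and a stride of 1*1, the amount of zero padding is chosen automatically so that the output data has the same size as the input data.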

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
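As a rough check on what SAME padding does, the sketch below computes the output size and the zero padding along one dimension, using the usual rule that the output size is ceil(input / stride). The helper name same_padding is made up for this example, and the code is plain Python rather than a TensorFlow call.

```python
import math


def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) along one dimension for SAME padding."""
    out_size = math.ceil(in_size / stride)
    # Zeros needed so that the last window still fits inside the padded input.
    total_pad = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = total_pad // 2
    pad_after = total_pad - pad_before
    return out_size, pad_before, pad_after


conv_28 = same_padding(28, 5, 1)  # 5-wide filter, stride 1 on a 28-wide input
pool_28 = same_padding(28, 2, 2)  # 2-wide pooling window, stride 2
```

For the 5x5 filter with stride 1, two zeros are added on each side and the width stays 28. For the 2x2 window with stride 2, no padding is needed and the width drops to 14.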

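Next, max pooling uses the nn.max_pool function, whose parameters are, in order: the input data, the filter (window) size, the filter's shift interval (strides), and the padding. With the values used here, the window is 2*2, the stride is 2*2, and the padding is SAME, so the output data is 1/4 the size of the input data (width and height are each halved).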

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

Now we build the first layer. It uses 32 filters of size 5*5, so W and b are created as follows.

(The third value of W_conv1 is the number of input channels. MNIST images are grayscale and have only one channel.)

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
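To make the weight shape concrete, the following sketch counts the trainable values in a convolution layer whose weights have shape [filter_height, filter_width, in_channels, out_channels] with one bias per output channel. The helper name conv_param_count is hypothetical.

```python
def conv_param_count(filter_h, filter_w, in_channels, out_channels):
    """Number of trainable values in one convolution layer (weights plus biases)."""
    weights = filter_h * filter_w * in_channels * out_channels
    biases = out_channels
    return weights + biases


# First layer: W_conv1 has shape [5, 5, 1, 32] and b_conv1 has shape [32].
conv1_params = conv_param_count(5, 5, 1, 32)
```

Here conv1_params comes to 832: 800 filter weights plus 32 biases.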

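Next, the data x must also be turned into a 4-D tensor of shape [batch, height, width, channels]. The -1 lets the batch size be inferred from the data.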

x_image = tf.reshape(x, [-1,28,28,1])
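The effect of this reshape can be imitated with nested Python lists. This is a sketch only: the helper name reshape_to_images and the fake pixel values are invented for the example, and it assumes the row-major order that tf.reshape uses.

```python
def reshape_to_images(flat_batch, height=28, width=28):
    """Turn a batch of flat pixel lists into [batch][height][width][1] nested lists."""
    images = []
    for flat in flat_batch:
        if len(flat) != height * width:
            raise ValueError("expected %d pixels per image" % (height * width))
        image = [
            [[flat[r * width + c]] for c in range(width)]
            for r in range(height)
        ]
        images.append(image)
    return images


# Two fake "images" with 784 values each; the batch size (2) is inferred from the data.
flat_batch = [list(range(784)), [0.0] * 784]
x_image = reshape_to_images(flat_batch)
shape = (len(x_image), len(x_image[0]), len(x_image[0][0]), len(x_image[0][0][0]))
```

The resulting shape is (2, 28, 28, 1): pixel 28 of the flat input becomes the first pixel of the second row, and the trailing dimension holds the single grayscale channel.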

Now apply ReLU and max pooling after the convolution to build the first convolution layer.

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

Next, build the second convolution layer. Because max pooling is applied twice, the 28*28 data becomes 7*7, and because the second convolution layer has 64 filters, the 28*28 input ends up as 64 feature maps of size 7*7.

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
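The shapes after each step can be traced with a few lines of plain Python. The sketch assumes SAME padding throughout, so each dimension becomes ceil(size / stride). The helper names same_out and trace_shapes are made up for this example.

```python
import math


def same_out(size, stride):
    """Output size along one dimension under SAME padding."""
    return math.ceil(size / stride)


def trace_shapes(height=28, width=28):
    """List (name, (height, width, channels)) after each conv and pool step."""
    shapes = [("input", (height, width, 1))]
    for name, filters in (("conv1", 32), ("conv2", 64)):
        # Convolution with stride 1 keeps height and width; channels = number of filters.
        height, width = same_out(height, 1), same_out(width, 1)
        shapes.append((name, (height, width, filters)))
        # 2x2 max pooling with stride 2 halves height and width.
        height, width = same_out(height, 2), same_out(width, 2)
        shapes.append((name.replace("conv", "pool"), (height, width, filters)))
    return shapes


shapes = trace_shapes()
```

The trace goes 28x28x1, 28x28x32, 14x14x32, 14x14x64, and finally 7x7x64, which is where the 7 * 7 * 64 in the fully-connected layer comes from.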

Next, for classification, build a fully-connected layer with 1024 neurons.

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
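What the fully-connected layer computes for one example, relu(x W + b), can be sketched in plain Python. The helper name dense_relu and the tiny 3-input, 2-neuron layer are invented for the example; the real layer maps 3136 inputs to 1024 neurons.

```python
def dense_relu(inputs, weights, biases):
    """Compute relu(inputs @ weights + biases) for a single example."""
    outputs = []
    for j in range(len(biases)):
        z = biases[j] + sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        outputs.append(max(0.0, z))  # ReLU clips negative values to zero
    return outputs


# Hypothetical toy layer: 3 inputs, 2 neurons.
flat = [1.0, 2.0, 3.0]
weights = [[1.0, -1.0], [0.5, 0.0], [0.0, -1.0]]
biases = [0.5, 1.0]
hidden = dense_relu(flat, weights, biases)

# Size of the flattened input in the tutorial's layer.
flat_size = 7 * 7 * 64
```

The first neuron outputs 2.5, while the second has a negative pre-activation of -3.0 and is clipped to 0.0.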

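Next, apply dropout to prevent overfitting. keep_prob is a placeholder so that dropout can be turned on during training (0.5 in the full listing above) and turned off during evaluation (1.0).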

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
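The behavior of dropout can be sketched in plain Python: each value is kept with probability keep_prob and scaled by 1 / keep_prob, otherwise set to 0, so the expected sum stays the same. The helper name dropout, the seed, and the all-ones activations are assumptions made for this example.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale kept values by 1/keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)       # fixed seed so the example is repeatable
activations = [1.0] * 10000  # hypothetical activations

dropped = dropout(activations, 0.5, rng)    # training: about half are zeroed
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
unchanged = dropout(activations, 1.0, rng)  # evaluation: nothing is dropped
```

With keep_prob set to 1.0 the input passes through unchanged, which is why the listing feeds 1.0 when measuring accuracy.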

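Finally, define the output y_conv, using the readout weights W_fc2 and biases b_fc2 (shapes [1024, 10] and [10] in the full listing above).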

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

The training code follows the same pattern as the earlier tutorial, and through this training the weights and filters are fitted to the data.
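The comment in the full listing notes that the raw cross-entropy formula can be numerically unstable. One common way to avoid this is the log-sum-exp trick, sketched below in plain Python for a single example. The helper name softmax_cross_entropy is made up here, and this is not necessarily how tf.nn.softmax_cross_entropy_with_logits is implemented internally.

```python
import math


def softmax_cross_entropy(labels, logits):
    """Cross-entropy between one-hot `labels` and softmax(`logits`), via log-sum-exp."""
    max_logit = max(logits)
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    # log softmax(z_k) = z_k - log_sum_exp
    return -sum(y * (z - log_sum_exp) for y, z in zip(labels, logits))


loss_uniform = softmax_cross_entropy([0, 1, 0], [0.0, 0.0, 0.0])  # equals log(3)
loss_large = softmax_cross_entropy([1, 0], [1000.0, 0.0])         # naive exp(1000) would overflow
```

With three equal logits the loss is log(3), about 1.0986, and the very large logit is handled without overflow, giving a loss of essentially zero.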

License: Attribution, NonCommercial, NoDerivatives
