Lec 07-1: Learning Rate, Overfitting and Regularization

1. Determining Learning Rate

    1) Try several learning rates (start with 0.01; see the sketch below)

    2) Observe the cost function

    3) Check that it goes down at a reasonable rate

      - Too large: divergence; too small: very slow convergence
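A minimal sketch of this try-and-observe procedure, assuming the usual X/Y placeholders, weight variables, and a cost tensor are already defined (as in Lab 07-1 below):

# Try a few candidate learning rates and watch how the cost moves.
for lr in [1.5, 0.1, 1e-10]:
    train = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(cost)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for step in range(201):
            cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
            if step % 50 == 0:
                print('lr =', lr, 'step =', step, 'cost =', cost_val)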


2. Data(X) Preprocessing for gradient descent 

   1) If the raw data is skewed, some features can end up disproportionately sensitive (or insensitive) during gradient descent.

   2) Normalization: rescale each feature into the [0, 1] range (see MinMaxScaler in Lab 07-1 below)


    3) Standardization: rescale each feature to zero mean and unit variance

           X_std[:,0] = (X[:,0] - X[:,0].mean()) / X[:,0].std()
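A runnable column-wise version of the same idea with NumPy (the sample matrix is made up for illustration):

import numpy as np

# Made-up data: rows are examples, columns are features on very
# different scales.
X = np.array([[828.66, 908100.0],
              [823.02, 1828100.0],
              [819.93, 1438100.0]])

# Standardize all columns at once: zero mean, unit variance per feature.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_std)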


3. Regularization

   1) Overfitting

      - Our model performs very well on the training data set (memorization)

      - but performs poorly on the test dataset or in real use

    2) Solutions for overfitting 

       - More Training data 

       - Reduce the number of features 

       - Regularization 


    3) Regularization 

       - Let's not have too big numbers in the weights

         (By including the size of the weights in the loss function, the optimizer is driven to minimize the weights as well.)

       - with TensorFlow

          l2reg = 0.001 * tf.reduce_sum(tf.square(W)) 
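To actually use it, the penalty is added onto the base loss, so gradient descent minimizes both (a sketch; base_cost is a hypothetical name for whatever loss the model already uses, e.g. cross-entropy):

l2reg = 0.001 * tf.reduce_sum(tf.square(W))   # 0.001 = regularization strength
cost = base_cost + l2reg                      # base_cost: hypothetical base loss
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)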


     


Lec 07-2: Training/Testing Data Set


1. Data set

    1) Split the data into a training set and a held-out test set: train only on the training set, and evaluate on the test set (see Lab 07-1 below).


2. Online Learning 

    1) Split the data set into several chunks and train on them incrementally

    2) What the model learned from earlier chunks must carry over, so that it has the same influence when training continues on the new chunks (see the sketch below)
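A sketch of the idea in TensorFlow, assuming the model's X/Y placeholders and train op (optimizer) are already defined; data_chunks is a hypothetical stream of data:

sess = tf.Session()
sess.run(tf.global_variables_initializer())
# The variables keep their learned values between calls, so each new
# chunk updates the existing model instead of retraining from scratch.
for chunk_x, chunk_y in data_chunks:   # hypothetical stream of (x, y) chunks
    sess.run(optimizer, feed_dict={X: chunk_x, Y: chunk_y})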



3. MNIST Data set





Lab 07-1: training/test dataset, learning rate, normalization


1. Test Dataset & Learning Rate

  0) Test Data

   # Evaluate our model using this test dataset

   x_test = [[2, 1, 1],
             [3, 1, 2],
             [3, 3, 4]]

   y_test = [[0, 0, 1],

             [0, 0, 1],

             [0, 0, 1]]


   1) optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-10).minimize(cost)

Too small: no progress (can get stuck at a local minimum)


200 5.73203 [[ 0.80269569  0.67861289 -1.21728313]

 [-0.3051686  -0.3032113   1.50825703]

 [ 0.75722361 -0.7008909  -2.10820389]]

Prediction: [0 0 0]

Accuracy:  0.0

   2) optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.5).minimize(cost)

Too large: divergence (cost becomes nan)


200 nan [[ nan  nan  nan]

 [ nan  nan  nan]

 [ nan  nan  nan]]

Prediction: [0 0 0]

Accuracy:  0.0

  3) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

Proper learning rate:


200 0.670909 [[-1.15377057  0.2814692   1.13632655]

 [ 0.37484601  0.18958248  0.33544892]

 [-0.35609847 -0.43973017 -1.256042  ]]

Prediction: [2 2 2]

Accuracy:  1.0
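For context, a compact sketch of the surrounding lab code that produces runs like the above (a softmax classifier; x_data/y_data here are made-up training examples in the same 3-feature / 3-class one-hot format as the test set):

import tensorflow as tf

x_data = [[1, 2, 1], [1, 3, 2], [1, 3, 4], [1, 5, 5]]   # made-up training set
y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0]]
x_test = [[2, 1, 1], [3, 1, 2], [3, 3, 4]]
y_test = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]

X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))

hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

prediction = tf.argmax(hypothesis, 1)
is_correct = tf.equal(prediction, tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(201):
        cost_val, _ = sess.run([cost, optimizer], feed_dict={X: x_data, Y: y_data})
    # Evaluate on the held-out test set, never on the training data.
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    print("Accuracy:", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))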


2. Normalized input

  1) Large-valued, unscaled raw data (training diverges)

     xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973],
                    [823.02002, 828.070007, 1828100, 821.655029, 828.070007],
                    [819.929993, 824.400024, 1438100, 818.97998, 824.159973],
                    [816, 820.958984, 1008100, 815.48999, 819.23999],
                    [819.359985, 823, 1188100, 818.469971, 818.97998],
                    [819, 823, 1198100, 816, 820.450012],
                    [811.700012, 815.25, 1098100, 809.780029, 813.669983],
                    [809.51001, 816.659973, 1398100, 804.539978, 809.559998]])

100 Cost:  nan 

Prediction:

 [[ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]]


Process finished with exit code 0


  2) Normalized input 

import numpy as np

def MinMaxScaler(data):
    # Rescale each column to the [0, 1] range independently.
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    # noise term prevents the zero division
    return numerator / (denominator + 1e-7)


...


xy = MinMaxScaler(xy)


100 Cost:  0.0136869 

Prediction:

 [[ 1.12295258]

 [ 0.63500863]

 [ 0.53340685]

 [ 0.4315863 ]

 [ 0.53191048]

 [ 0.55868214]

 [ 0.15761785]

 [ 0.14425412]]


Process finished with exit code 0
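One detail worth noting: np.min(data, 0) and np.max(data, 0) in MinMaxScaler reduce along rows, so each column (feature) is rescaled to [0, 1] on its own scale. A tiny made-up demo, reusing the MinMaxScaler defined above:

import numpy as np

data = np.array([[1.0, 100.0],
                 [2.0, 300.0],
                 [3.0, 500.0]])
print(MinMaxScaler(data))
# approximately (the 1e-7 noise term shifts values negligibly):
# [[0.   0. ]
#  [0.5  0.5]
#  [1.   1. ]]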



Lab 07-2: Meet MNIST Dataset


1. MNIST Image

  1) 28 * 28 * 1 image (single-channel bitmap)



# MNIST data image of shape 28 * 28 = 784

X = tf.placeholder(tf.float32, [None, 784])

# 0 - 9 digits recognition = 10 classes

Y = tf.placeholder(tf.float32, [None, nb_classes])



2. Reading Data and Setting Variables


from tensorflow.examples.tutorials.mnist import input_data

# Check out https://www.tensorflow.org/get_started/mnist/beginners for

# more information about the mnist dataset

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


nb_classes = 10


# MNIST data image of shape 28 * 28 = 784

X = tf.placeholder(tf.float32, [None, 784])

# 0 - 9 digits recognition = 10 classes

Y = tf.placeholder(tf.float32, [None, nb_classes])


W = tf.Variable(tf.random_normal([784, nb_classes]))

b = tf.Variable(tf.random_normal([nb_classes]))


...
batch_xs, batch_ys = mnist.train.next_batch(100)
...
print("Accuracy: ", accuracy.eval(session=sess, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))


3. SoftMax

# Hypothesis (using softmax)

hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)


cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)


# Test model

is_correct = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))

# Calculate accuracy

accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))


4. Epoch / Batch


1) When there is a lot of data, loading it all at once takes too much memory, so the data is split into batches for training

2) epoch: one complete training pass over the entire dataset

3) iterations per epoch = [number of training examples] / [batch_size]


# parameters

training_epochs = 15

batch_size = 100


with tf.Session() as sess:

   # Initialize TensorFlow variables

   sess.run(tf.global_variables_initializer())

   # Training cycle

   for epoch in range(training_epochs):

       avg_cost = 0

       total_batch = int(mnist.train.num_examples / batch_size)


       for i in range(total_batch):

           batch_xs, batch_ys = mnist.train.next_batch(batch_size)

           c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys})

           avg_cost += c / total_batch


       print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))



5. Result


1) Accuracy: 0.8951


Epoch: 0007 cost = 0.591160339
Epoch: 0008 cost = 0.563868978
Epoch: 0009 cost = 0.541745167
Epoch: 0010 cost = 0.522673571
Epoch: 0011 cost = 0.506782322
Epoch: 0012 cost = 0.492447640
Epoch: 0013 cost = 0.479955830
Epoch: 0014 cost = 0.468893666
Epoch: 0015 cost = 0.458703479
Learning finished
Accuracy: 0.8951
Label: [3]
Prediction: [5]


In one run the model recognized a 3 as a 5; in another, a 7 as a 2. Not surprising, since the accuracy is still only about 89%...


  2) Accuracy can be raised further by adjusting the learning rate and the number of epochs.

     With Learning Rate = 0.4 and Epoch = 100: 92.25%

Epoch: 0098 cost = 0.243552753

Epoch: 0099 cost = 0.243438786

Epoch: 0100 cost = 0.243145558

Learning finished

Accuracy:  0.9225

Label:  [5]

Prediction:  [5]

