Sung Kim (김성훈) Deep Learning 5 - Defining the Hypothesis Function for Logistic Classification


하늘이푸른오늘 2017. 11. 16. 23:08

Lec 05-1 - Defining the Hypothesis Function for Logistic Classification

https://www.youtube.com/watch?v=PIjno6paszY

Closely related to neural networks.
Binary classification means sorting inputs into two categories -> 0, 1 encoding

Spam or Ham
Show or Hide
Stocks: Buy/Sell

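Can this be done with linear regression?

For example, we could treat any output of about 0.5 or below as Fail. But a single extreme input such as 50 breaks the symmetry of the data and tilts the fitted line, so the pass/fail cutoff can move.
The output can also fall below 0 or rise above 1, which is not good for a 0/1 label.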
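Both problems can be reproduced with a small least-squares fit. The sketch below is only an illustration: the study-hours data, the extra point at 50, and the helper names `fit_line` and `cutoff` are assumptions made for this example, not taken from the lecture.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cutoff(slope, intercept, threshold=0.5):
    """Input value at which the fitted line crosses the threshold."""
    return (threshold - intercept) / slope


# Hypothetical data: hours studied and pass (1) / fail (0).
hours = [1, 2, 3, 4, 5, 6]
passed = [0, 0, 0, 1, 1, 1]

w, b = fit_line(hours, passed)
# Same data plus one extreme student who studied 50 hours and passed.
w_out, b_out = fit_line(hours + [50], passed + [1])

print("cutoff without outlier:", cutoff(w, b))
print("cutoff with outlier:   ", cutoff(w_out, b_out))
print("prediction for 4 hours with outlier: ", w_out * 4 + b_out)
print("prediction for 50 hours with outlier:", w_out * 50 + b_out)
```

With the extreme point included, the cutoff moves past 4 hours, so a student who actually passed is now classified as Fail, and the prediction for 50 hours exceeds 1.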

So we need a logistic hypothesis, whose output ranges from 0 to 1.

The logistic (sigmoid) function below does this.

$$ g(z) = \frac{1}{\left(1+e^{-z} \right)} $$
For any z, it has the property that 0 < g(z) < 1.
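As a quick check of that range, here is a minimal sketch of g(z) in plain Python. The branch on the sign of z is an implementation choice made here to avoid overflow in `math.exp`; it is not part of the lecture.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equal form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)


for z in (-10, -1, 0, 1, 10):
    print(z, sigmoid(z))
```

Mathematically the output never reaches 0 or 1, although floating-point arithmetic rounds it to exactly 0.0 or 1.0 for inputs of very large magnitude.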

Lec 5-2 The cost function for Logistic Regression

https://www.youtube.com/watch?v=6vzchGYEJBc

With the sigmoid hypothesis, the squared-error cost function below is no longer bowl-shaped: it becomes a bumpy curve with many local minima.

$$ cost(W,b) = {1 \over m} \sum_{i=1}^m \left( H ( x^{(i)} ) - y^{(i)} \right)^2$$
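Gradient descent can get stuck in one of those local minima, so this cost function cannot be used.

A cost function should be small when the prediction matches the actual value and large when it does not. So the following cost function is used instead.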

$$\begin{matrix}cost(W) = {1 \over m} \sum c(H(x), y) \\
c(H(x), y) = \begin{cases} -\log(H(x)) &: y=1\\ -\log(1-H(x)) &: y=0 \end{cases} \end{matrix} $$

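This cost function has the following properties.

y=1 and H(x)=1: the prediction is correct, so cost = -log(1) = 0
y=1 and H(x)=0: the prediction is wrong, so cost = -log(0) => very large
y=0 and H(x)=0: the prediction is correct, so cost = -log(1-0) = 0
y=0 and H(x)=1: the prediction is wrong, so cost = -log(1-1) => very large
This cost function is also convex (bowl-shaped), so the gradient descent algorithm can be applied.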
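The four cases above can be checked numerically. This is a small sketch in plain Python; `sample_cost` is a name chosen here for illustration, and 0.999 and 0.001 stand in for H(x)=1 and H(x)=0 because log(0) is undefined.

```python
import math


def sample_cost(h, y):
    """Case-wise cost for one sample: -log(h) if y == 1, else -log(1 - h)."""
    if y == 1:
        return -math.log(h)
    return -math.log(1.0 - h)


# (prediction, label) pairs covering the four cases.
for h, y in [(0.999, 1), (0.001, 1), (0.001, 0), (0.999, 0)]:
    print("H(x) =", h, " y =", y, " cost =", sample_cost(h, y))
```

Correct predictions give a cost close to 0, while confident wrong predictions give a cost that grows without bound as H(x) approaches the wrong extreme.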

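Conclusion: the cost function is as follows.

In the last expression, when y=1 only the first term remains, and when y=0 only the second term remains, so it ends up identical to the case-wise expression in the middle.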

$$ \begin{matrix} cost(W) &=& {1 \over m} \sum c(H(x), y) \\
c(H(x), y) &=& {\begin{cases} -\log(H(x)) &: y=1\\ -\log(1-H(x)) &: y=0\end{cases}} \\
c(H(x), y) &=& -y \log(H(x)) - (1-y) \log (1-H(x)) \end{matrix}$$
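Substituting the two possible labels into the single-line form makes the equivalence explicit (a worked check using the same symbols as above):

$$ \begin{matrix} y=1 &:& -1 \cdot \log(H(x)) - (1-1) \log(1-H(x)) &=& -\log(H(x)) \\
y=0 &:& -0 \cdot \log(H(x)) - (1-0) \log(1-H(x)) &=& -\log(1-H(x)) \end{matrix}$$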

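Applying the gradient descent algorithm

This requires taking a derivative, but the derivation is skipped here. In practice we just apply the update W := W - α ∂cost(W)/∂W.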
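For readers who want to see the update without TensorFlow, here is a minimal sketch in plain Python. It assumes the standard gradient of this cost, (1/m) Σ (H(x) - y) x, which the lecture does not derive. The helper names, the zero initialization, the learning rate of 0.1, and the 5000 steps are all choices made for this example; the six data points are the ones used in the lab of this post.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(w, b, x):
    """H(x) = sigmoid(w . x + b) for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def mean_cost(w, b, xs, ys):
    """Average of -y*log(H(x)) - (1-y)*log(1-H(x)) over all samples."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for x, y in zip(xs, ys):
        h = min(max(predict(w, b, x), eps), 1.0 - eps)
        total += -y * math.log(h) - (1 - y) * math.log(1.0 - h)
    return total / len(xs)


def gradient_step(w, b, xs, ys, lr):
    """One update W := W - lr * dcost/dW (and the same for the bias)."""
    m = len(xs)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in zip(xs, ys):
        err = predict(w, b, x) - y  # H(x) - y for this sample
        for j, xj in enumerate(x):
            grad_w[j] += err * xj / m
        grad_b += err / m
    new_w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return new_w, b - lr * grad_b


xs = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
ys = [0, 0, 0, 1, 1, 1]

w, b = [0.0, 0.0], 0.0
initial_cost = mean_cost(w, b, xs, ys)
for _ in range(5000):
    w, b = gradient_step(w, b, xs, ys, lr=0.1)
final_cost = mean_cost(w, b, xs, ys)

print("cost before training:", initial_cost)
print("cost after training: ", final_cost)
```

Starting from zero weights, every prediction is 0.5, so the initial cost is log(2); the cost then falls as the updates are applied.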

Lab 05: Implementing Logistic Classification with TensorFlow

https://www.youtube.com/watch?v=2FeWGgnyLSw

Source code: https://github.com/hunkim/DeepLearningZeroToAll
Review: the hypothesis and cost function of Logistic Classification

$$\begin{matrix} Hypothesis &:& H(x) = \frac{1}{1+e^{-W^T X}} \\
Cost &:& cost(W) = -{1 \over m} \sum \left[ y \log (H(x)) + (1-y) \log (1- H(x)) \right] \\
Gradient &:& W := W - \alpha {\partial \over {\partial W}} cost (W) \end{matrix}$$

Running the example.

# Uses the TensorFlow 1.x API (tf.placeholder, tf.Session).
import tensorflow as tf
tf.set_random_seed(777)

x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [[0], [0], [0], [1], [1], [1]]  # the important point is that the output is 0/1, true/false, etc.

# x_data has 2 features per row, so X is [None, 2] and W is [2, 1].
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])

W = tf.Variable(tf.random_normal([2, 1]), name='weight')  # [number of inputs, number of outputs]
b = tf.Variable(tf.random_normal([1]), name='bias')  # always matches the number of outputs

# Equivalent to tf.div(1., 1. + tf.exp(-(tf.matmul(X, W) + b)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# Direct transcription of the cost formula.
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# hypothesis is a real number between 0 and 1; convert it to 0/1.
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(10001):
    cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
    if step % 200 == 0:
        print(step, cost_val)

h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data})
print("\nHypothesis: \n", h, "\nCorrect (Y): \n", c, "\nAccuracy: ", a)

Loading the data from a file

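import tensorflow as tf
import numpy as np

tf.set_random_seed(777)

# The first 8 columns are features; the last column is the 0/1 label.
xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]

X = tf.placeholder(tf.float32, shape=[None, 8])
Y = tf.placeholder(tf.float32, shape=[None, 1])

W = tf.Variable(tf.random_normal([8, 1]), name='weight')  # [number of inputs, number of outputs]
b = tf.Variable(tf.random_normal([1]), name='bias')  # always matches the number of outputs

# Same hypothesis, cost, and training steps as in the first example.
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(10001):
    sess.run(train, feed_dict={X: x_data, Y: y_data})

h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data})
print("\nHypothesis: \n", h, "\nCorrect (Y): \n", c, "\nAccuracy: ", a)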

data-03-diabetes.csv

-0.294118,0.487437,0.180328,-0.292929,0,0.00149028,-0.53117,-0.0333333,0
-0.882353,-0.145729,0.0819672,-0.414141,0,-0.207153,-0.766866,-0.666667,1
-0.0588235,0.839196,0.0491803,0,0,-0.305514,-0.492741,-0.633333,0
-0.882353,-0.105528,0.0819672,-0.535354,-0.777778,-0.162444,-0.923997,0,1
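To make the column split concrete, the sketch below parses just the four rows shown above with the standard library and separates features from labels the same way `xy[:, 0:-1]` and `xy[:, [-1]]` do. Using `csv` and `io.StringIO` instead of NumPy is a choice made here so the example runs without the data file.

```python
import csv
import io

# The four sample rows from data-03-diabetes.csv shown above.
sample = """\
-0.294118,0.487437,0.180328,-0.292929,0,0.00149028,-0.53117,-0.0333333,0
-0.882353,-0.145729,0.0819672,-0.414141,0,-0.207153,-0.766866,-0.666667,1
-0.0588235,0.839196,0.0491803,0,0,-0.305514,-0.492741,-0.633333,0
-0.882353,-0.105528,0.0819672,-0.535354,-0.777778,-0.162444,-0.923997,0,1
"""

rows = [[float(v) for v in row] for row in csv.reader(io.StringIO(sample)) if row]

x_data = [row[:-1] for row in rows]    # every column except the last: features
y_data = [[row[-1]] for row in rows]   # last column, kept as a 1-element list: label

print("features per row:", len(x_data[0]))
print("labels:", y_data)
```

Each row has 8 feature columns and one label column, which is why the placeholders use shapes [None, 8] and [None, 1].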

To test with more data, take a look at https://www.kaggle.com.

