Loss Functions in Linear Regression
This post compares the L1 and L2 loss functions under a given learning rate (see the plot produced at the end). A low learning rate yields higher accuracy but takes correspondingly more time and compute, while a learning rate that is too high can make training diverge instead of converge. Choosing an appropriate learning rate is therefore important.
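To make this concrete, below is a minimal NumPy sketch (not from the original post; the toy data and step sizes are made up for illustration) of plain gradient descent on a one-variable least-squares problem, once with a small and once with an overly large learning rate:

import numpy as np

# Toy 1-D least-squares problem: minimize mean((y - a*x)^2) over the slope a.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x  # true slope is 2

def gradient_descent(lr, steps=20):
    a = 0.0
    for _ in range(steps):
        grad = -2.0 * np.mean(x * (y - a * x))  # d/da of the mean squared error
        a -= lr * grad
    return a

print(gradient_descent(0.01))  # creeps toward 2, but slowly
print(gradient_descent(0.2))   # each step overshoots further; the iterates blow up

With the small rate the slope approaches the true value of 2 but needs many steps; with the large rate every update overshoots the minimum and the estimate diverges.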
Linear Regression: L1 vs L2
Source: https://github.com/nfmcclure/tensorflow_cookbook
The L1 loss for linear least squares is defined as
$$S = \sum_{i=1}^{N} \left| y_{i} - \hat{y}_{i} \right|$$
Here $N$ is the number of data points, $y_{i}$ is the $i$-th actual value of $y$, and $\hat{y}_{i}$ is the $i$-th predicted value.
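As a quick numeric check of this formula (a small made-up example, not from the original post):

import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

l1 = np.sum(np.abs(y_true - y_pred))  # |0.5| + |-0.5| + |0| + |-1| = 2.0
print(l1)                             # 2.0

Note that the TensorFlow code below averages over the batch with tf.reduce_mean rather than summing; that rescales the loss by $1/N$ but does not move its minimum.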
Let's apply the L1 loss function to the iris dataset and check the results:
from sklearn.datasets import load_iris
import tensorflow as tf  # TensorFlow 1.x graph-mode API (under TF 2.x, use tf.compat.v1)
import numpy as np
from tensorflow.python.framework import ops
ops.reset_default_graph()  # start from a clean computation graph
iris = load_iris()
print(iris.keys())
# dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])
print(iris.feature_names)
# ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
x_vals = iris.data[:, 3] # petal width
y_vals = iris.data[:, 0] # sepal length
batch_size = 25
lrn_rate = 0.1  # training diverges at around 0.4
iterations = 100
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)    # mini-batch of petal widths
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)  # corresponding sepal lengths
A = tf.Variable(tf.random_normal(shape=[1, 1]))  # slope
b = tf.Variable(tf.random_normal(shape=[1, 1]))  # intercept
with tf.Session() as sess:
    model_output = tf.add(tf.matmul(x_data, A), b)  # y_hat = A*x + b
    loss_l1 = tf.reduce_mean(tf.abs(y_target - model_output))  # mean absolute error
    my_opt_l1 = tf.train.GradientDescentOptimizer(learning_rate=lrn_rate)
    train_step_l1 = my_opt_l1.minimize(loss_l1)
    init = tf.global_variables_initializer()
    init.run()
    loss_vec_l1 = []
    for i in range(iterations):
        rnd_idx = np.random.choice(len(x_vals), size=batch_size)  # random mini-batch
        rnd_x = x_vals[rnd_idx].reshape(-1, 1)
        rnd_y = y_vals[rnd_idx].reshape(-1, 1)
        feed = {x_data: rnd_x, y_target: rnd_y}
        sess.run(train_step_l1, feed_dict=feed)
        temp_loss_l1 = sess.run(loss_l1, feed_dict=feed)
        loss_vec_l1.append(temp_loss_l1)
        if (i+1) % 10 == 0:
            print('Step {}: A={}, b={}'.format(i+1, A.eval(), b.eval()))
# Step 10: A=[[-0.4101382]], b=[[-0.2588229]]
# Step 20: A=[[0.7642619]], b=[[0.74117714]]
# Step 30: A=[[1.9774618]], b=[[1.7411773]]
# Step 40: A=[[2.3398612]], b=[[2.3091774]]
# Step 50: A=[[2.252261]], b=[[2.6771777]]
Next, let's look at the L2 loss function. The L2 loss for linear least squares is
$$S = \sum_{i=1}^{N} \left( y_{i} - \hat{y}_{i} \right)^{2}$$
where, as above, $N$ is the number of data points, $y_{i}$ is the $i$-th actual value of $y$, and $\hat{y}_{i}$ is the $i$-th predicted value.
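Running the same made-up example as before under the L2 loss shows how squaring penalizes the larger residual more heavily:

import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

l2 = np.sum((y_true - y_pred) ** 2)  # 0.25 + 0.25 + 0.0 + 1.0 = 1.5
print(l2)                            # 1.5; the unit residual alone contributes 1.0

A practical consequence is that the L2 gradient is proportional to the residual, so gradient steps shrink automatically as the fit improves.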
The following code applies the L2 loss function to the same iris data.
ops.reset_default_graph()  # build a fresh graph; x_vals, y_vals, batch_size, lrn_rate, and iterations are reused from above
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[1, 1]))
b = tf.Variable(tf.random_normal(shape=[1, 1]))
with tf.Session() as sess:
    model_output = tf.add(tf.matmul(x_data, A), b)  # y_hat = A*x + b
    loss_l2 = tf.reduce_mean(tf.square(y_target - model_output))  # mean squared error
    opt = tf.train.GradientDescentOptimizer(lrn_rate)
    train_step_l2 = opt.minimize(loss_l2)
    init = tf.global_variables_initializer()
    init.run()
    loss_vec_l2 = []
    for i in range(iterations):
        rand_idx = np.random.choice(len(x_vals), size=batch_size)  # random mini-batch
        rand_x = x_vals[rand_idx].reshape(-1, 1)
        rand_y = y_vals[rand_idx].reshape(-1, 1)
        my_dict = {x_data: rand_x, y_target: rand_y}
        sess.run(train_step_l2, feed_dict=my_dict)
        temp_loss_l2 = sess.run(loss_l2, feed_dict=my_dict)
        loss_vec_l2.append(temp_loss_l2)
        if (i+1) % 20 == 0:
            print('step {}: A={}, b={}'.format(i+1, A.eval()[0], b.eval()[0]))
# step 20: A=[1.876194], b=[3.1437955]
# step 40: A=[1.3978924], b=[4.1212244]
# step 60: A=[1.1288545], b=[4.4799623]
# step 80: A=[0.9617775], b=[4.64575]
# step 100: A=[0.88910973], b=[4.7133427]
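As a sanity check (this step is not in the original cookbook code), the estimates above can be compared with the closed-form least-squares solution, which np.polyfit computes directly from x_vals and y_vals; A and b should be converging toward these values:

# Closed-form least-squares fit: returns slope first, then intercept
slope, intercept = np.polyfit(x_vals, y_vals, 1)
print('polyfit: slope={:.4f}, intercept={:.4f}'.format(slope, intercept))
# roughly slope=0.89, intercept=4.78 for this regression of sepal length on petal width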
The following code plots the L1 and L2 loss curves together:
import matplotlib.pyplot as plt
plt.plot(loss_vec_l1, c='k', label='L1 Loss')
plt.plot(loss_vec_l2, c='red', ls='--', label='L2 Loss')
plt.title('L1 and L2 Loss per Generation, learning rate={}'.format(lrn_rate))
plt.xlabel('Generation')
plt.ylabel('Loss')
plt.legend(loc=1)
plt.show()
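In the resulting plot the L2 curve typically settles and flattens near the optimum, while the L1 curve keeps oscillating: the gradient of the absolute value has constant magnitude, so fixed-size L1 steps never shrink, whereas the L2 gradient is proportional to the residual and decays as the fit improves.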
References:
[1] Nick McClure, TensorFlow Machine Learning Cookbook, Packt Publishing
[2] https://github.com/nfmcclure/tensorflow_cookbook