2020 Artificial Neural Networks, Assignment 1 - Reference Answers, Part 5
This is Part 5 of the reference answers for Assignment 1 of the 2020 Artificial Neural Networks course.
➤05 Reference Answer for Problem 5
1. Problem Analysis
MATLAB's peaks function is a function of two variables. A BP network is constructed to approximate it, with 2 input nodes and 1 output node. The number of hidden-layer nodes $N_h$ may be chosen freely; following the parameters given in the lecture notes, it is set to $N_h = 7$.
$$f(x, y) = 3(1 - x)^2 e^{-\left[x^2 + (y+1)^2\right]} - 10\left(\frac{x}{5} - x^3 - y^5\right) e^{-(x^2 + y^2)} - \frac{1}{3} e^{-\left[(x+1)^2 + y^2\right]}$$
▲ Surface of the peaks function
(1) Training sample collection
$N_s = 200$ training samples are drawn uniformly at random from the square region $-3 < x_i < 3,\ i = 1, 2$.
▲ Collected data samples
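For reference, the sampling step can be sketched in a few lines of plain NumPy (a minimal sketch mirroring the appendix program, which pulls these names from headm; `peaks` and `Ns` are illustrative names chosen here):

import numpy as np

def peaks(x, y):
    # the peaks function defined above
    return (3 * (1 - x)**2 * np.exp(-(x**2 + (y + 1)**2))
            - 10 * (x / 5 - x**3 - y**5) * np.exp(-(x**2 + y**2))
            - 1 / 3 * np.exp(-((x + 1)**2 + y**2)))

Ns = 200                                 # number of training samples
xs = np.random.uniform(-3, 3, Ns)        # x1 coordinates
ys = np.random.uniform(-3, 3, Ns)        # x2 coordinates
zs = peaks(xs, ys)                       # target values
x_train = np.column_stack((xs, ys))      # shape (Ns, 2): one sample per row
y_train = zs.reshape(1, -1)              # shape (1, Ns): one target per column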
2. Network Structure
A single-hidden-layer network is constructed with 7 hidden neurons. The transfer function is the hyperbolic tangent:
$$\tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}}$$
Its derivative is:
$$f'(x) = \left[\tanh(x)\right]' = 1 - f^2(x)$$
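The identity $f' = 1 - f^2$ is easy to confirm numerically; a standalone check with NumPy (independent of the appendix code):

import numpy as np

x = np.linspace(-3, 3, 7)
f = np.tanh(x)
h = 1e-6
numeric = (np.tanh(x + h) - np.tanh(x - h)) / (2 * h)   # central difference
print(np.max(np.abs(numeric - (1 - f**2))))             # ~1e-11: identity holds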
▲ Single-hidden-layer BP network
The implementation of this network is given in the appendix below: Experiment Program for Assignment 1-6.
3. Network Training
The network is trained with the basic BP algorithm, using learning rate $\eta = 0.5$ and 3000 training iterations. (The program listed in the appendix uses $\eta = 0.35$ and 4500 iterations.)
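Stripped of the plotting code, each training iteration performs one full-batch gradient step. Schematically, in terms of the appendix subroutines (this assumes x_train, num_iterations, and lr are defined as in the main program below):

parameters = initialize_parameters(n_x=2, n_h=7, n_y=1)
XX, YY = shuffledata(x_train, y_train)
for i in range(num_iterations):
    A2, cache = forward_propagate(XX, parameters)           # forward pass
    cost = calculate_cost(A2, YY, parameters)               # mean squared error
    grads = backward_propagate(parameters, cache, XX, YY)   # error backpropagation
    parameters = update_parameters(parameters, grads, lr)   # gradient descent step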
▲ Evolution of the network's input-output surface during training
▲ Input-output mapping of the trained network
▲ Evolution of the network error during training
The training results show that, because few data samples were collected, the trained network cannot reproduce the multiple peaks of the original peaks function: the original surface has three peaks (one high and two lower), but the trained network only roughly captures a single peak.
4. Increasing the Number of Training Samples
To improve the trained network's accuracy, the amount of training data must be increased.
$N_s = 1000$ sample points are now drawn uniformly at random from the same region $-3 < x_i < 3,\ i = 1, 2$, and the number of hidden nodes is increased to 10.
▲ 1000 collected training samples
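In terms of the appendix program, this experiment changes only two constants (SAMPLE_NUMBER in TEST1.PY and n_h inside train()):

SAMPLE_NUMBER = 1000   # was 200: draw 1000 training points
n_h = 10               # was 7: ten hidden neurons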
The figure below shows how the network's input-output function evolves as training proceeds.
Observing the function after training, the training error is clearly reduced.
▲ Network error versus training iterations
5. Increasing the Number of Hidden Neurons
200 training samples are used again, but the number of hidden nodes is raised to 20. The training results are shown below.
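Again, only the same two constants of the appendix program change:

SAMPLE_NUMBER = 200    # back to the original sample count
n_h = 20               # was 7: twenty hidden neurons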
▲ Evolution of the network's input-output mapping during training
▲ Function surface after training
▲ Network training error curve
6. Conclusion
For a complex mapping, more training samples are needed to effectively improve the trained network's accuracy. Appropriately increasing the number of hidden nodes can also improve training accuracy to some extent.
➤Experiment Program for Assignment 1-6
1. Main program
#!/usr/local/bin/python
# -*- coding: gbk -*-
#============================================================
# TEST1.PY -- by Dr. ZhuoQing 2020-11-19
#
# Note:
#============================================================
from headm import *
from mpl_toolkits.mplot3d import axes3d
from matplotlib import cm
from bp1tanh import *
#------------------------------------------------------------
def func(x, y):
    return 3 * (1-x)**2 * exp(-(x**2 + (y+1)**2)) -\
           10 * (x/5 - x**3 - y**5) * exp(-(x**2 + y**2)) -\
           1/3 * exp(-((x+1)**2 + y**2))
#------------------------------------------------------------
x = arange(-4, 4, 0.05)
y = arange(-4, 4, 0.05)
xx,yy = meshgrid(x, y)
zz = func(xx, yy)
#------------------------------------------------------------
SAMPLE_NUMBER = 200
xs = random.uniform(-3, 3, SAMPLE_NUMBER)
ys = random.uniform(-3, 3, SAMPLE_NUMBER)
zs = func(xs, ys)
x_train = array(list(zip(xs, ys)))
y_train = zs.reshape(1,-1)
#printf(x_train.shape, y_train.shape)
#------------------------------------------------------------
ax = plt.axes(projection='3d')
'''
#ax.plot_surface(xx,yy,zz, cmap='coolwarm', linewidth=0, antialiased=False)
plt.contour(xx, yy, zz)
ax.scatter(xs, ys, zs, color = 'r')
ax.set_xlabel('X Axes')
ax.set_ylabel('Y Axes')
ax.set_zlabel('Z Axes')
plt.show()
'''
#------------------------------------------------------------
#------------------------------------------------------------
# Define the training
DISP_STEP = 50
#------------------------------------------------------------
pltgif = PlotGIF()
#------------------------------------------------------------
def train(X, Y, num_iterations, learning_rate, print_cost=False):
    n_x = 2                                  # input nodes
    n_y = 1                                  # output nodes
    n_h = 7                                  # hidden nodes
    lr = learning_rate
    parameters = initialize_parameters(n_x, n_h, n_y)
    XX, YY = shuffledata(x_train, y_train)
    costdim = []
    for i in range(0, num_iterations):
        A2, cache = forward_propagate(XX, parameters)
        cost = calculate_cost(A2, YY, parameters)
        grads = backward_propagate(parameters, cache, XX, YY)
        parameters = update_parameters(parameters, grads, lr)
        if print_cost and i % DISP_STEP == 0:
            printf('Cost after iteration:%i: %f'%(i, cost))
            costdim.append(cost)
            XX, YY = shuffledata(X, Y)       # reshuffle the samples
            zzz = []                         # evaluate the network on the grid
            for xxx, yyy in zip(xx, yy):
                xy3 = array(list(zip(xxx, yyy)))
                a2, cache = forward_propagate(xy3, parameters)
                zzz.append(a2[0])
            zz = array(zzz)
            plt.clf()
            ax = plt.axes(projection='3d')
            ax.plot_surface(xx, yy, zz, cmap='coolwarm', linewidth=0, antialiased=False)
            plt.title('Step:%d, Error:%4.3f'%(i, cost))
            ax.set_xlabel('X Axes')
            ax.set_ylabel('Y Axes')
            ax.set_zlabel('Z Axes')
            plt.draw()
            plt.pause(.01)
            pltgif.append(plt)
    return parameters, costdim
#------------------------------------------------------------
parameter,costdim = train(x_train, y_train, 4500, 0.35, True)
pltgif.save(r'd:\temp\1.gif')
printf('cost:%f'%costdim[-1])
plt.show()
#------------------------------------------------------------
plt.clf()
plt.plot(arange(len(costdim))*DISP_STEP, costdim)
plt.xlabel("Step")
plt.ylabel("Cost")
plt.grid(True)
plt.tight_layout()
plt.show()
#------------------------------------------------------------
# END OF FILE : TEST1.PY
#============================================================
2. BP network subroutines
#!/usr/local/bin/python
# -*- coding: gbk -*-
#============================================================
# BP1TANH.PY -- by Dr. ZhuoQing 2020-11-17
#
# Note:
#============================================================
from headm import *
#------------------------------------------------------------
# Samples data construction
random.seed(int(time.time()))
#------------------------------------------------------------
def shuffledata(X, Y):
    id = list(range(X.shape[0]))
    random.shuffle(id)
    return X[id], (Y.T[id]).T
#------------------------------------------------------------
# Define and initialization NN
def initialize_parameters(n_x, n_h, n_y):
    W1 = random.randn(n_h, n_x) * 0.5        # dot(W1, X.T)
    W2 = random.randn(n_y, n_h) * 0.5        # dot(W2, Z1)
    b1 = zeros((n_h, 1))                     # Column vector
    b2 = zeros((n_y, 1))                     # Column vector
    parameters = {'W1':W1,
                  'b1':b1,
                  'W2':W2,
                  'b2':b2}
    return parameters
#------------------------------------------------------------
# Forward propagattion
# X:row->sample;
# Z2:col->sample
def forward_propagate(X, parameters):
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    Z1 = dot(W1, X.T) + b1                   # X:row-->sample; Z1:col-->sample
    # A1 = 1/(1+exp(-Z1))
    A1 = (1-exp(-2*Z1))/(1+exp(-2*Z1))       # tanh(Z1); its derivative is 1-A1**2
    Z2 = dot(W2, A1) + b2                    # Z2:col-->sample
    A2 = Z2                                  # Linear output
    cache = {'Z1':Z1,
             'A1':A1,
             'Z2':Z2,
             'A2':A2}
    return Z2, cache
#------------------------------------------------------------
# Calculate the cost
# A2,Y: col->sample
def calculate_cost(A2, Y, parameters):
    err = [x1-x2 for x1,x2 in zip(A2.T, Y.T)]
    cost = [dot(e,e) for e in err]
    return mean(cost)
#------------------------------------------------------------
# Backward propagattion
def backward_propagate(parameters, cache, X, Y):
    m = X.shape[0]                           # Number of the samples
    W1 = parameters['W1']
    W2 = parameters['W2']
    A1 = cache['A1']
    A2 = cache['A2']
    dZ2 = (A2 - Y)
    dW2 = dot(dZ2, A1.T) / m
    db2 = sum(dZ2, axis=1, keepdims=True) / m
    # dZ1 = dot(W2.T, dZ2) * (A1 * (1-A1))
    dZ1 = dot(W2.T, dZ2) * (1-A1**2)         # tanh'(Z1) = 1 - A1**2
    dW1 = dot(dZ1, X) / m
    db1 = sum(dZ1, axis=1, keepdims=True) / m
    grads = {'dW1':dW1,
             'db1':db1,
             'dW2':dW2,
             'db2':db2}
    return grads
#------------------------------------------------------------
# Update the parameters
def update_parameters(parameters, grads, learning_rate):
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    dW1 = grads['dW1']
    db1 = grads['db1']
    dW2 = grads['dW2']
    db2 = grads['db2']
    W1 = W1 - learning_rate * dW1
    W2 = W2 - learning_rate * dW2
    b1 = b1 - learning_rate * db1
    b2 = b2 - learning_rate * db2
    parameters = {'W1':W1,
                  'b1':b1,
                  'W2':W2,
                  'b2':b2}
    return parameters
#------------------------------------------------------------
# END OF FILE : BP1TANH.PY
#============================================================
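As a quick sanity check, the subroutines above can be exercised on a toy batch (a sketch; it assumes NumPy's names are in scope, as headm provides in the original files):

from numpy import *
random.seed(0)

X = random.uniform(-3, 3, (5, 2))         # 5 samples, 2 inputs (row = sample)
Y = (X[:, 0] * X[:, 1]).reshape(1, -1)    # toy targets, shape (1, 5)

params = initialize_parameters(2, 7, 1)
for _ in range(100):
    A2, cache = forward_propagate(X, params)
    grads = backward_propagate(params, cache, X, Y)
    params = update_parameters(params, grads, 0.1)
print(calculate_cost(forward_propagate(X, params)[0], Y, params))   # cost after 100 steps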
Source: zhuoqing.blog.csdn.net. Author: 卓晴. Copyright belongs to the original author; please contact the author for reprint permission.
Original link: zhuoqing.blog.csdn.net/article/details/109820010