【小白学习keras教程】五、基于reuters数据集训练不同RNN循环神经网络模型
【摘要】 @Author:Runsen@[toc] 循环神经网络RNN前馈神经网络(例如 MLP 和 CNN)功能强大,但它们并未针对处理“顺序”数据进行优化换句话说,它们不具有先前输入的“记忆”例如,考虑翻译语料库的情况。 你需要考虑 “context” 来猜测下一个出现的单词RNN 适合处理顺序格式的数据,因为它们具有 循环 结构换句话说,他们保留序列中早期输入的记忆但是,为了减少参数数量,不同时...
@Author:Runsen
@[toc]
循环神经网络RNN
-
前馈神经网络(例如 MLP 和 CNN)功能强大,但它们并未针对处理“顺序”数据进行优化
-
换句话说,它们不具有先前输入的“记忆”
-
例如,考虑翻译语料库的情况。 你需要考虑 “context” 来猜测下一个出现的单词
-
RNN 适合处理顺序格式的数据,因为它们具有 循环 结构
-
换句话说,他们保留序列中早期输入的记忆
- 但是,为了减少参数数量,不同时间步长的每一层需要共享相同的参数
Load Dataset
import numpy as np
from sklearn.metrics import accuracy_score
from tensorflow.keras.datasets import reuters
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical
# parameters for data load
num_words = 30000
maxlen = 50
test_split = 0.3
(X_train, y_train), (X_test, y_test) = reuters.load_data(num_words = num_words, maxlen = maxlen, test_split = test_split)
# pad the sequences with zeros
#填充参数设置为“post”=> 0 被附加到序列的末尾
X_train = pad_sequences(X_train, padding = 'post')
X_test = pad_sequences(X_test, padding = 'post')
X_train = np.array(X_train).reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = np.array(X_test).reshape((X_test.shape[0], X_test.shape[1], 1))
y_data = np.concatenate((y_train, y_test))
y_data = to_categorical(y_data)
y_train = y_data[:1395]
y_test = y_data[1395:]
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
1. Vanilla RNN
- Vanilla RNN 结构简单
- 然而,他们遭受“长期依赖”的问题
- 因此,无法长时间保持顺序内存
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN, Activation
from tensorflow.keras import optimizers
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
def vanilla_rnn():
model = Sequential()
model.add(SimpleRNN(50, input_shape = (49,1), return_sequences = False))
model.add(Dense(46))
model.add(Activation('softmax'))
adam = optimizers.Adam(lr = 0.001)
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])
return model
model = KerasClassifier(build_fn = vanilla_rnn, epochs = 200, batch_size = 50, verbose = 1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_test_ = np.argmax(y_test, axis = 1)
print(accuracy_score(y_pred, y_test_))
0.7495826377295493
2. Stacked Vanilla RNN
- 可以堆叠 RNN 层以形成更深的网络
def stacked_vanilla_rnn():
model = Sequential()
model.add(SimpleRNN(50, input_shape = (49,1), return_sequences = True)) # return_sequences parameter has to be set True to stack
model.add(SimpleRNN(50, return_sequences = False))
model.add(Dense(46))
model.add(Activation('softmax'))
adam = optimizers.Adam(lr = 0.001)
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])
return model
model = KerasClassifier(build_fn = stacked_vanilla_rnn, epochs = 200, batch_size = 50, verbose = 1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(accuracy_score(y_pred, y_test_))
0.7746243739565943
3. LSTM
- LSTM(long short-term memory)是一种改进的结构,用于解决长期依赖问题
from tensorflow.keras.layers import LSTM
def lstm():
model = Sequential()
model.add(LSTM(50, input_shape = (49,1), return_sequences = False))
model.add(Dense(46))
model.add(Activation('softmax'))
adam = optimizers.Adam(lr = 0.001)
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])
return model
model = KerasClassifier(build_fn = lstm, epochs = 200, batch_size = 50, verbose = 1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
# accuracy improves by adopting LSTM structure
print(accuracy_score(y_pred, y_test_))
0.8464106844741235
4. Stacked LSTM
- LSTM 层也可以堆叠
def stacked_lstm():
model = Sequential()
model.add(LSTM(50, input_shape = (49,1), return_sequences = True))
model.add(LSTM(50, return_sequences = False))
model.add(Dense(46))
model.add(Activation('softmax'))
adam = optimizers.Adam(lr = 0.001)
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])
return model
model = KerasClassifier(build_fn = stacked_lstm, epochs = 200, batch_size = 50, verbose = 1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(accuracy_score(y_pred, y_test_))
0.8480801335559266
【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱:
cloudbbs@huaweicloud.com
- 点赞
- 收藏
- 关注作者
评论(0)