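Google Colab Development Notes (5): TensorFlow Development with GPU and TPU
Preface
Google Colab ships with TensorFlow preinstalled, in both 1.x and 2.x versions. This article covers how to switch between TensorFlow 1 and 2, and how to develop on a GPU and on a TPU.
1. Switching TensorFlow versions
Colab comes with two TensorFlow versions preinstalled: 2.x and 1.x. It uses TensorFlow 2.x by default, and you can switch to 1.x with the magic command shown below.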
1.1 Using TensorFlow 1.x
%tensorflow_version 1.x
1.2 Using TensorFlow 2.x
%tensorflow_version 2.x
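Avoid pip install with GPUs and TPUs
Avoid using pip install to pin a specific TensorFlow version for the GPU and TPU backends. Colab builds TensorFlow from source to ensure compatibility with its range of accelerators. TensorFlow versions fetched from PyPI with pip may suffer from performance problems or may not work at all.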
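Before deciding whether to touch pip at all, it can help to check which version the runtime already has. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of Colab or TensorFlow.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# On a Colab runtime this should print the preinstalled TensorFlow version;
# elsewhere it prints None when TensorFlow is not installed.
print("tensorflow:", installed_version("tensorflow"))
```

Reading the package metadata avoids importing TensorFlow itself, so the check is cheap and does not initialize any accelerator.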
2. TensorFlow on GPU
First click Edit → Notebook settings and select GPU from the Hardware accelerator drop-down. Then connect to the GPU from TensorFlow:
%tensorflow_version 2.x
import tensorflow as tf

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))
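This example builds a typical convolutional neural network layer over random images and manually places the resulting ops on either the CPU or the GPU to compare execution speed.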
%tensorflow_version 2.x
import tensorflow as tf
import timeit

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    print(
        '\n\nThis error most likely means that this notebook is not '
        'configured to use a GPU. Change this in Notebook Settings via the '
        'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
    raise SystemError('GPU device not found')

def cpu():
    with tf.device('/cpu:0'):
        random_image_cpu = tf.random.normal((100, 100, 100, 3))
        net_cpu = tf.keras.layers.Conv2D(32, 7)(random_image_cpu)
        return tf.math.reduce_sum(net_cpu)

def gpu():
    with tf.device('/device:GPU:0'):
        random_image_gpu = tf.random.normal((100, 100, 100, 3))
        net_gpu = tf.keras.layers.Conv2D(32, 7)(random_image_gpu)
        return tf.math.reduce_sum(net_gpu)

# Run each op once to warm up before timing.
cpu()
gpu()

# Run the op several times.
print('Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images '
      '(batch x height x width x channel). Sum of ten runs.')
print('CPU (s):')
cpu_time = timeit.timeit('cpu()', number=10, setup="from __main__ import cpu")
print(cpu_time)
print('GPU (s):')
gpu_time = timeit.timeit('gpu()', number=10, setup="from __main__ import gpu")
print(gpu_time)
print('GPU speedup over CPU: {}x'.format(int(cpu_time/gpu_time)))
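The timing pattern above (warm up once, time several runs, report the ratio) can be tried without a GPU. The sketch below is only an illustration: `slow_sum` and `fast_sum` are hypothetical stand-ins for `cpu()` and `gpu()`, and `total_time` is a helper introduced here. It passes the callable straight to `timeit.timeit`, which removes the need for the `from __main__ import ...` setup string.

```python
import timeit

N = 20000


def slow_sum():
    """Stand-in for the slower device: sum of squares computed in a loop."""
    return sum(i * i for i in range(N))


def fast_sum():
    """Stand-in for the faster device: the same sum via its closed form."""
    return (N - 1) * N * (2 * N - 1) // 6


def total_time(fn, number=10):
    """Call fn once to warm up, then return the total time of `number` runs."""
    fn()
    return timeit.timeit(fn, number=number)


slow_time = total_time(slow_sum)
fast_time = total_time(fast_sum)
print('slow (s):', slow_time)
print('fast (s):', fast_time)
# Formatting the ratio as a float keeps small speedups visible;
# int() would truncate a 1.8x speedup to 1x.
print('speedup: {:.1f}x'.format(slow_time / fast_time))
```

Actual timings depend on the machine, so no expected numbers are given here.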
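3. TensorFlow on TPU
In this example, we train a model to classify images of flowers on Google's fast Cloud TPUs. The model takes a photo of a flower as input and returns whether it is a daisy, dandelion, rose, sunflower, or tulip. We use the Keras framework, which is newly supported on TPUs in TF 2.
3.1 Enabling and testing the TPU
First click Edit → Notebook settings and select TPU from the Hardware accelerator drop-down. Then connect to the TPU from TensorFlow.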
%tensorflow_version 2.x
import tensorflow as tf
import re
import numpy as np
from matplotlib import pyplot as plt

print("Tensorflow version " + tf.__version__)

try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()  # TPU detection
    print('Running on TPU ', tpu.cluster_spec().as_dict()['worker'])
except ValueError:
    raise BaseException('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')

tf.config.experimental_connect_to_cluster(tpu)
tf.tpu.experimental.initialize_tpu_system(tpu)
tpu_strategy = tf.distribute.experimental.TPUStrategy(tpu)
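3.2 Input data
The input data is stored on Google Cloud Storage. To make fuller use of the parallelism TPUs provide and to avoid data-transfer bottlenecks, the input data is stored in TFRecord files, 230 images per file. Below, we make heavy use of tf.data.experimental.AUTOTUNE to optimize different parts of input loading.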
AUTO = tf.data.experimental.AUTOTUNE

IMAGE_SIZE = [331, 331]
batch_size = 16 * tpu_strategy.num_replicas_in_sync

gcs_pattern = 'gs://flowers-public/tfrecords-jpeg-331x331/*.tfrec'
validation_split = 0.19
filenames = tf.io.gfile.glob(gcs_pattern)
split = len(filenames) - int(len(filenames) * validation_split)
train_fns = filenames[:split]
validation_fns = filenames[split:]

def parse_tfrecord(example):
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),  # tf.string means bytestring
        "class": tf.io.FixedLenFeature([], tf.int64),  # shape [] means scalar
        "one_hot_class": tf.io.VarLenFeature(tf.float32),
    }
    example = tf.io.parse_single_example(example, features)
    decoded = tf.image.decode_jpeg(example['image'], channels=3)
    normalized = tf.cast(decoded, tf.float32) / 255.0  # convert each 0-255 value to floats in [0, 1] range
    image_tensor = tf.reshape(normalized, [*IMAGE_SIZE, 3])
    one_hot_class = tf.reshape(tf.sparse.to_dense(example['one_hot_class']), [5])
    return image_tensor, one_hot_class

def load_dataset(filenames):
    # Read from TFRecords. For optimal performance, we interleave reads from multiple files.
    records = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO)
    return records.map(parse_tfrecord, num_parallel_calls=AUTO)

def get_training_dataset():
    dataset = load_dataset(train_fns)

    # Create some additional training images by randomly flipping and
    # increasing/decreasing the saturation of images in the training set.
    def data_augment(image, one_hot_class):
        modified = tf.image.random_flip_left_right(image)
        modified = tf.image.random_saturation(modified, 0, 2)
        return modified, one_hot_class
    augmented = dataset.map(data_augment, num_parallel_calls=AUTO)

    # Prefetch the next batch while training (autotune prefetch buffer size).
    return augmented.repeat().shuffle(2048).batch(batch_size).prefetch(AUTO)

training_dataset = get_training_dataset()
validation_dataset = load_dataset(validation_fns).batch(batch_size).prefetch(AUTO)
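The train/validation split and the batch size above are plain arithmetic, which the following sketch reproduces without TensorFlow. The file names, the file count of 16, and the replica count of 8 are assumptions made for illustration; the real values come from the bucket listing and from `tpu_strategy.num_replicas_in_sync`.

```python
def split_filenames(filenames, validation_split):
    """Split a file list so that the last `validation_split` fraction is held out."""
    split = len(filenames) - int(len(filenames) * validation_split)
    return filenames[:split], filenames[split:]


# Hypothetical file list: 16 files named in the same style as the dataset.
filenames = ['flowers{:02d}-230.tfrec'.format(i) for i in range(16)]
train_fns, validation_fns = split_filenames(filenames, 0.19)
print(len(train_fns), 'training files,', len(validation_fns), 'validation files')

# Assumed replica count; the global batch is 16 images per replica.
num_replicas = 8
batch_size = 16 * num_replicas
print('global batch size:', batch_size)
```

Because `int()` truncates, the validation share is rounded down: with 16 files, 19% yields 3 validation files and 13 training files.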
Take a look at the training dataset:
CLASSES = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

def display_one_flower(image, title, subplot, color):
    plt.subplot(subplot)
    plt.axis('off')
    plt.imshow(image)
    plt.title(title, fontsize=16, color=color)

# Display nine images in a 3x3 grid; title_colors optionally sets each title's color.
def display_nine_flowers(images, titles, title_colors=None):
    plt.figure(figsize=(13,13))
    for i in range(9):
        color = 'black' if title_colors is None else title_colors[i]
        display_one_flower(images[i], titles[i], 331+i, color)
    plt.tight_layout()
    plt.subplots_adjust(wspace=0.1, hspace=0.1)
    plt.show()

def get_dataset_iterator(dataset, n_examples):
    return dataset.unbatch().batch(n_examples).as_numpy_iterator()

training_viz_iterator = get_dataset_iterator(training_dataset, 9)

# Re-run this cell to show a new batch of images
images, classes = next(training_viz_iterator)
class_idxs = np.argmax(classes, axis=-1)  # transform from one-hot array to class number
labels = [CLASSES[idx] for idx in class_idxs]
display_nine_flowers(images, labels)
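The step from a one-hot vector to a class name is what `np.argmax` does in the cell above. As a small sketch in pure Python, with a hypothetical helper named `one_hot_to_label`:

```python
CLASSES = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']


def one_hot_to_label(one_hot):
    """Return the class name at the position of the largest entry."""
    idx = max(range(len(one_hot)), key=lambda i: one_hot[i])
    return CLASSES[idx]


print(one_hot_to_label([0.0, 0.0, 1.0, 0.0, 0.0]))
```

The same function also works on a vector of predicted probabilities, since it only looks for the largest entry.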
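3.3 Model
For maximum accuracy, we use a pretrained image recognition model (here, Xception). We drop the ImageNet-specific top layers (include_top=False) and add a global average pooling layer and a softmax layer to predict our 5 classes.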
def create_model():
    pretrained_model = tf.keras.applications.Xception(input_shape=[*IMAGE_SIZE, 3], include_top=False)
    pretrained_model.trainable = True
    model = tf.keras.Sequential([
        pretrained_model,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(5, activation='softmax')
    ])
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model

with tpu_strategy.scope():  # creating the model in the TPUStrategy scope means we will train the model on the TPU
    model = create_model()
model.summary()
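To make the classification head concrete, here is a rough sketch of what global average pooling computes and how many parameters the final dense layer carries. It uses nested Python lists instead of tensors; both helper names are hypothetical, and the figure of 2048 feature channels for Xception's last block is stated as an assumption to check against the output of `model.summary()`.

```python
def global_average_pool(feature_map):
    """Average an H x W x C nested list over H and W, returning C channel means."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [total / (height * width) for total in totals]


def dense_param_count(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


# A tiny 2 x 2 feature map with 2 channels.
feature_map = [[[1.0, 10.0], [2.0, 20.0]],
               [[3.0, 30.0], [4.0, 40.0]]]
print(global_average_pool(feature_map))

# Assuming 2048 pooled features feeding 5 softmax units.
print(dense_param_count(2048, 5))
```

Pooling collapses each channel to a single number, so the dense layer's size depends only on the channel count, not on the input image resolution.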
3.4 Training
Count the number of images in each dataset.
def count_data_items(filenames):
    # The number of data items is written in the name of the .tfrec files, i.e. flowers00-230.tfrec = 230 data items
    n = [int(re.compile(r"-([0-9]*)\.").search(filename).group(1)) for filename in filenames]
    return np.sum(n)

n_train = count_data_items(train_fns)
n_valid = count_data_items(validation_fns)
train_steps = n_train // batch_size
print("TRAINING IMAGES: ", n_train, ", STEPS PER EPOCH: ", train_steps)
print("VALIDATION IMAGES: ", n_valid)
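The counting trick relies on the image count being embedded in each file name. The sketch below walks through it with three made-up file names and an assumed global batch size of 128. It uses `[0-9]+` rather than `[0-9]*` in the pattern so that at least one digit is required; that is a small tightening chosen for this sketch, not the notebook's own pattern.

```python
import re

# Matches the digits between the last "-" and the "." of names like flowers00-230.tfrec.
COUNT_PATTERN = re.compile(r"-([0-9]+)\.")


def count_data_items(filenames):
    """Sum the per-file item counts encoded in the file names."""
    return sum(int(COUNT_PATTERN.search(name).group(1)) for name in filenames)


# Hypothetical file names; the real list comes from the bucket listing.
example_fns = ['flowers00-230.tfrec', 'flowers01-230.tfrec', 'flowers02-212.tfrec']
n_items = count_data_items(example_fns)
batch_size = 128  # assumed: 16 images per replica on 8 replicas
steps_per_epoch = n_items // batch_size
print('items:', n_items, 'steps per epoch:', steps_per_epoch)
```

Integer division drops the final partial batch, which is harmless here because the training dataset repeats indefinitely.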
Compute and display the learning rate schedule.
EPOCHS = 12

start_lr = 0.00001
min_lr = 0.00001
max_lr = 0.00005 * tpu_strategy.num_replicas_in_sync
rampup_epochs = 5
sustain_epochs = 0
exp_decay = .8

def lrfn(epoch):
    if epoch < rampup_epochs:
        return (max_lr - start_lr)/rampup_epochs * epoch + start_lr
    elif epoch < rampup_epochs + sustain_epochs:
        return max_lr
    else:
        return (max_lr - min_lr) * exp_decay**(epoch-rampup_epochs-sustain_epochs) + min_lr

lr_callback = tf.keras.callbacks.LearningRateScheduler(lambda epoch: lrfn(epoch), verbose=True)

rang = np.arange(EPOCHS)
y = [lrfn(x) for x in rang]
plt.plot(rang, y)
print('Learning rate per epoch:')
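The shape of this schedule (linear ramp-up, optional plateau, exponential decay) is easy to inspect without a TPU. The sketch below wraps the same formula in a hypothetical factory function, `make_lrfn`, and assumes 8 replicas, so the peak rate is 0.00005 * 8 = 0.0004.

```python
def make_lrfn(num_replicas, start_lr=0.00001, min_lr=0.00001,
              rampup_epochs=5, sustain_epochs=0, exp_decay=0.8):
    """Build the per-epoch learning rate function for a given replica count."""
    max_lr = 0.00005 * num_replicas

    def lrfn(epoch):
        if epoch < rampup_epochs:
            # Linear ramp from start_lr toward max_lr.
            return (max_lr - start_lr) / rampup_epochs * epoch + start_lr
        elif epoch < rampup_epochs + sustain_epochs:
            # Optional plateau at max_lr (skipped when sustain_epochs is 0).
            return max_lr
        else:
            # Exponential decay from max_lr toward min_lr.
            return (max_lr - min_lr) * exp_decay ** (epoch - rampup_epochs - sustain_epochs) + min_lr

    return lrfn


lrfn = make_lrfn(num_replicas=8)  # assumed replica count
for epoch in range(12):
    print('epoch {:2d}: lr = {:.6f}'.format(epoch, lrfn(epoch)))
```

With these settings the rate starts at `start_lr`, reaches its peak at epoch 5 (the first epoch of the decay branch), and shrinks by a factor of 0.8 of its distance to `min_lr` each epoch after that.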
Train the model.
history = model.fit(training_dataset, validation_data=validation_dataset,
                    steps_per_epoch=train_steps, epochs=EPOCHS, callbacks=[lr_callback])

final_accuracy = history.history["val_accuracy"][-5:]
print("FINAL ACCURACY MEAN-5: ", np.mean(final_accuracy))

def display_training_curves(training, validation, title, subplot):
    ax = plt.subplot(subplot)
    ax.plot(training)
    ax.plot(validation)
    ax.set_title('model '+ title)
    ax.set_ylabel(title)
    ax.set_xlabel('epoch')
    ax.legend(['training', 'validation'])

plt.subplots(figsize=(10,10))
plt.tight_layout()
display_training_curves(history.history['accuracy'], history.history['val_accuracy'], 'accuracy', 211)
display_training_curves(history.history['loss'], history.history['val_loss'], 'loss', 212)
Accuracy goes up and loss goes down. Looks good!
3.5 Prediction
Use the trained model to make predictions and display 9 images from the validation set.
def flower_title(label, prediction):
    # Both prediction (probabilities) and label (one-hot) are arrays with one item per class.
    class_idx = np.argmax(label, axis=-1)
    prediction_idx = np.argmax(prediction, axis=-1)
    if class_idx == prediction_idx:
        return f'{CLASSES[prediction_idx]} [correct]', 'black'
    else:
        return f'{CLASSES[prediction_idx]} [incorrect, should be {CLASSES[class_idx]}]', 'red'

def get_titles(images, labels, model):
    predictions = model.predict(images)
    titles, colors = [], []
    # Pair each true label (the function argument, not the global `classes`) with its prediction.
    for label, prediction in zip(labels, predictions):
        title, color = flower_title(label, prediction)
        titles.append(title)
        colors.append(color)
    return titles, colors
validation_viz_iterator = get_dataset_iterator(validation_dataset, 9)
# Re-run this cell to show a new batch of images
images, classes = next(validation_viz_iterator)
titles, colors = get_titles(images, classes, model)
display_nine_flowers(images, titles, colors)
3.6 Saving and reloading the trained model
# We can save our model with:
model.save('model.h5')
# and reload it with:
reloaded_model = tf.keras.models.load_model('model.h5')
# Re-run this cell to show a new batch of images
images, classes = next(validation_viz_iterator)
titles, colors = get_titles(images, classes, reloaded_model)
display_nine_flowers(images, titles, colors)