Using TensorBoard in Notebooks

##### Copyright 2019 The TensorFlow Authors.
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

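TensorBoard can be used directly within notebook environments such as Colab and Jupyter. This is helpful for sharing results, integrating TensorBoard into existing workflows, and using TensorBoard without installing anything locally.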

Setup#

Start by installing TF 2.0 and loading the TensorBoard notebook extension:

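For Jupyter users: If you have installed Jupyter and TensorBoard into the same virtualenv, no further setup is needed. If you are using a more complicated setup, such as a global Jupyter installation with kernels for different Conda/virtualenv environments, you must ensure that the tensorboard binary is on your PATH inside the Jupyter notebook context. One way to do this is to modify the kernel_spec to prepend the environment's bin directory to PATH.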
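The PATH adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not an official procedure: the helper name prepend_env_bin_to_path, the example kernel spec, and the /opt/envs/tf/bin directory are all hypothetical. It relies only on the fact that a Jupyter kernel spec (kernel.json) may carry an "env" dictionary of environment variables for the kernel.

```python
import copy
import os


def prepend_env_bin_to_path(kernel_spec, env_bin, base_path=None):
    """Return a copy of kernel_spec whose PATH starts with env_bin.

    kernel_spec is the parsed content of a kernel.json file. The helper
    name and arguments are hypothetical, chosen for this illustration.
    """
    if base_path is None:
        base_path = os.environ.get("PATH", "")
    spec = copy.deepcopy(kernel_spec)  # leave the caller's dict untouched
    env = spec.setdefault("env", {})
    env["PATH"] = (env_bin + os.pathsep + base_path) if base_path else env_bin
    return spec


# Hypothetical kernel spec and environment directory, for illustration only.
example_spec = {
    "argv": ["/opt/envs/tf/bin/python", "-m", "ipykernel_launcher",
             "-f", "{connection_file}"],
    "display_name": "Python (tf)",
    "language": "python",
}
patched_spec = prepend_env_bin_to_path(
    example_spec, "/opt/envs/tf/bin", base_path="/usr/bin")
print(patched_spec["env"]["PATH"])
```

The patched dictionary would then be written back to the kernel's kernel.json file (for example with json.dump), so that the tensorboard binary from that environment is found first when the kernel starts.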

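For Docker users: If you are running a Docker image of a Jupyter Notebook server using TensorFlow's nightly build, you need to expose not only the notebook's port but also TensorBoard's port. Run the container with the following command: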

docker run -it -p 8888:8888 -p 6006:6006 \
tensorflow/tensorflow:nightly-py3-jupyter

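Here, -p 6006 refers to TensorBoard's default port. This allocates one port, which is enough to run a single TensorBoard instance. To run concurrent instances, you must allocate more ports. In addition, pass --bind_all to %tensorboard to expose the port outside the container.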
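As a rough sketch of what allocating more ports looks like, the helper below assembles a docker run command with one -p mapping per TensorBoard instance. The function name docker_run_command and the choice of 6007 as a second port are assumptions made for this example. The image name and the 8888 and 6006 ports come from the command above.

```python
import shlex


def docker_run_command(tensorboard_ports, notebook_port=8888,
                       image="tensorflow/tensorflow:nightly-py3-jupyter"):
    """Build a docker run command exposing the notebook and TensorBoard ports.

    Hypothetical helper: it only formats a command string and does not
    invoke Docker.
    """
    args = ["docker", "run", "-it", "-p", f"{notebook_port}:{notebook_port}"]
    for port in tensorboard_ports:
        # One host:container mapping per concurrent TensorBoard instance.
        args += ["-p", f"{port}:{port}"]
    args.append(image)
    return " ".join(shlex.quote(arg) for arg in args)


command = docker_run_command([6006, 6007])
print(command)
# docker run -it -p 8888:8888 -p 6006:6006 -p 6007:6007 tensorflow/tensorflow:nightly-py3-jupyter
```

Inside the notebook, a second instance could then be pointed at the extra port with TensorBoard's --port flag, for example %tensorboard --logdir other_logs --port 6007 --bind_all, where other_logs is a placeholder directory name.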

# Load the TensorBoard notebook extension
%load_ext tensorboard

Import TensorFlow, datetime, and os:

import tensorflow as tf
import datetime, os

TensorBoard in notebooks#

Download the FashionMNIST dataset and scale it:

fashion_mnist = tf.keras.datasets.fashion_mnist

(x_train, y_train),(x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
8192/5148 [===============================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
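To make the scaling step concrete, here is a small NumPy sketch with a stand-in array instead of the real dataset. It assumes the images are 8-bit grayscale values in the range 0 to 255. Dividing by 255.0 converts them to floating point values between 0 and 1.

```python
import numpy as np

# Stand-in for a tiny batch of 8-bit grayscale pixels (not the real dataset).
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing a uint8 array by a Python float yields a float64 array.
scaled = pixels / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```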

Create a very simple model:

def create_model():
  return tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28), name='layers_flatten'),
    tf.keras.layers.Dense(512, activation='relu', name='layers_dense'),
    tf.keras.layers.Dropout(0.2, name='layers_dropout'),
    tf.keras.layers.Dense(10, activation='softmax', name='layers_dense_2')
  ])
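To get a feel for how small this model is, its trainable parameter count can be worked out by hand. The sketch below uses plain Python rather than Keras, and the helper name dense_params is introduced only for this example. A fully connected layer has one weight per input-unit pair plus one bias per unit, while the Flatten and Dropout layers have no trainable parameters.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a dense layer: weights plus biases."""
    return n_inputs * n_units + n_units


flattened = 28 * 28                      # Flatten turns a 28x28 image into 784 values
hidden = dense_params(flattened, 512)    # layers_dense
output = dense_params(512, 10)           # layers_dense_2
total = hidden + output

print(hidden, output, total)  # 401920 5130 407050
```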

Train the model using Keras and the TensorBoard callback:

def train_model():
  
  model = create_model()
  model.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])

  logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
  tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)

  model.fit(x=x_train, 
            y=y_train, 
            epochs=5, 
            validation_data=(x_test, y_test), 
            callbacks=[tensorboard_callback])

train_model()
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 11s 182us/sample - loss: 0.4976 - accuracy: 0.8204 - val_loss: 0.4143 - val_accuracy: 0.8538
Epoch 2/5
60000/60000 [==============================] - 10s 174us/sample - loss: 0.3845 - accuracy: 0.8588 - val_loss: 0.3855 - val_accuracy: 0.8626
Epoch 3/5
60000/60000 [==============================] - 10s 175us/sample - loss: 0.3513 - accuracy: 0.8705 - val_loss: 0.3740 - val_accuracy: 0.8607
Epoch 4/5
60000/60000 [==============================] - 11s 177us/sample - loss: 0.3287 - accuracy: 0.8793 - val_loss: 0.3596 - val_accuracy: 0.8719
Epoch 5/5
60000/60000 [==============================] - 11s 178us/sample - loss: 0.3153 - accuracy: 0.8825 - val_loss: 0.3360 - val_accuracy: 0.8782
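The train_model function above writes each run into its own timestamped subdirectory of logs, so that pointing TensorBoard at logs shows every training run as a separate run. The naming scheme can be tried on its own with the sketch below. The helper name make_run_logdir and the fixed example timestamp are assumptions for illustration.

```python
import datetime
import os


def make_run_logdir(root="logs", now=None):
    """Return a per-run log directory path such as logs/YYYYmmdd-HHMMSS."""
    if now is None:
        now = datetime.datetime.now()
    return os.path.join(root, now.strftime("%Y%m%d-%H%M%S"))


# A fixed timestamp keeps the example reproducible.
example_logdir = make_run_logdir(now=datetime.datetime(2019, 6, 1, 14, 30, 5))
print(example_logdir)
```

Because the timestamp has one-second resolution, two runs started within the same second would share a directory. This is rarely a concern for interactive notebook use.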

Start TensorBoard within the notebook using magics:

%tensorboard --logdir logs

You can now view dashboards such as Time Series, Graphs, and Distributions. Some dashboards are not yet available in Colab (for example, the profile plugin).

The %tensorboard magic has essentially the same format as the TensorBoard command line invocation, except that it starts with a % sign.
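The correspondence between the magic and the command line can be illustrated with a short sketch that strips the leading % and splits the remaining arguments the way a POSIX shell would. The function magic_to_argv is a hypothetical helper written for this example. It shows the mapping only and is not a description of how the notebook extension is implemented.

```python
import shlex


def magic_to_argv(magic_line):
    """Convert a %tensorboard magic line into the equivalent CLI argument list."""
    line = magic_line.strip()
    prefix = "%tensorboard"
    if not line.startswith(prefix):
        raise ValueError("expected a line starting with %tensorboard")
    # Everything after the magic name is passed through as ordinary arguments.
    return ["tensorboard"] + shlex.split(line[len(prefix):])


argv = magic_to_argv("%tensorboard --logdir logs")
print(argv)  # ['tensorboard', '--logdir', 'logs']
```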

You can also start TensorBoard before training to monitor it in progress:

%tensorboard --logdir logs

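Issuing the same command reuses the same TensorBoard backend. If a different log directory is chosen, a new TensorBoard instance is opened. Ports are managed automatically.

Start training a new model and watch TensorBoard update automatically every 30 seconds, or refresh it with the button at the top right: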

train_model()
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 11s 184us/sample - loss: 0.4968 - accuracy: 0.8223 - val_loss: 0.4216 - val_accuracy: 0.8481
Epoch 2/5
60000/60000 [==============================] - 11s 176us/sample - loss: 0.3847 - accuracy: 0.8587 - val_loss: 0.4056 - val_accuracy: 0.8545
Epoch 3/5
60000/60000 [==============================] - 11s 176us/sample - loss: 0.3495 - accuracy: 0.8727 - val_loss: 0.3600 - val_accuracy: 0.8700
Epoch 4/5
60000/60000 [==============================] - 11s 179us/sample - loss: 0.3282 - accuracy: 0.8795 - val_loss: 0.3636 - val_accuracy: 0.8694
Epoch 5/5
60000/60000 [==============================] - 11s 176us/sample - loss: 0.3115 - accuracy: 0.8839 - val_loss: 0.3438 - val_accuracy: 0.8764

You can use the tensorboard.notebook APIs for a bit more control:

from tensorboard import notebook
notebook.list() # View open TensorBoard instances
Known TensorBoard instances:
  - port 6006: logdir logs (started 0:00:54 ago; pid 265)
# Control TensorBoard display. If no port is provided, 
# the most recently launched TensorBoard is used
notebook.display(port=6006, height=1000)