The Sequential model

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Setup#

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

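When to use a Sequential model#

A Sequential model is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor.

Schematically, the following Sequential model: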

# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
# Call model on a test input
x = tf.ones((3, 3))
y = model(x)

is equivalent to this function:

# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = tf.ones((3, 3))
y = layer3(layer2(layer1(x)))

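A Sequential model is not appropriate when:

Your model has multiple inputs or multiple outputs
Any of your layers has multiple inputs or multiple outputs
You need to do layer sharing
You want a non-linear topology (e.g. a residual connection, a multi-branch model)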
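
To make the last case concrete, here is a minimal sketch of a residual connection written with the Functional API. The layer sizes and variable names are illustrative assumptions, not part of the original guide. The input tensor is consumed twice (once by the Dense layer, once by the Add layer), which a plain stack of layers cannot express.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical sizes chosen only for illustration.
inputs = keras.Input(shape=(8,))
hidden = layers.Dense(8, activation="relu")(inputs)

# The Add layer takes two input tensors, so this graph is not a linear stack.
outputs = layers.Add()([inputs, hidden])
residual_model = keras.Model(inputs=inputs, outputs=outputs)

# Call the model on a test input.
y = residual_model(tf.ones((2, 8)))
result_shape = tuple(y.shape)
print(result_shape)
```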

Creating a Sequential model#

You can create a Sequential model by passing a list of layers to the Sequential constructor:

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

Its layers are accessible via the layers attribute:

model.layers

You can also create a Sequential model incrementally via the add() method:

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

Note that there is also a corresponding pop() method to remove layers: a Sequential model behaves very much like a list of layers.

model.pop()
print(len(model.layers))  # 2

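Also note that the Sequential constructor accepts a name argument, just like any layer or model in Keras. This is useful for annotating TensorBoard graphs with semantically meaningful names.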

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))
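
Names are also handy for looking layers up later. The sketch below rebuilds the model above and retrieves one layer with get_layer(); the variable names layer_names and second_layer are illustrative assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))

# Collect the layer names in stacking order.
layer_names = [layer.name for layer in model.layers]
print(model.name, layer_names)

# Retrieve a single layer by the name it was given.
second_layer = model.get_layer(name="layer2")
print(second_layer.units)
```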

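Specifying the input shape in advance#

Generally, all layers in Keras need to know the shape of their inputs in order to create their weights. So when you create a layer like this, it initially has no weights: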

layer = layers.Dense(3)
layer.weights  # Empty

It creates its weights the first time it is called on an input, since the shape of the weights depends on the shape of the inputs:

# Call layer on a test input
x = tf.ones((1, 4))
y = layer(x)
layer.weights  # Now it has weights, of shape (4, 3) and (3,)

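Naturally, this also applies to Sequential models. When you instantiate a Sequential model without an input shape, it isn't "built": it has no weights (and calling model.weights results in an error stating just this). The weights are created when the model first sees some input data: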

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)  # No weights at this stage!

# At this point, you can't do this:
# model.weights

# You also can't do this:
# model.summary()

# Call the model on a test input
x = tf.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights))  # 6

Once a model is "built", you can call its summary() method to display its contents:

model.summary()
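
If no sample data is at hand, a model can also be built by calling build() with an explicit input shape. This is a minimal sketch that is not part of the original guide; the shape (None, 4) is an assumption matching the test input used above, with None standing for the batch dimension.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

# Build the model without calling it on data.
# None is the (unknown) batch size, 4 is the number of input features.
model.build(input_shape=(None, 4))

# Each Dense layer contributes a kernel and a bias.
num_weights = len(model.weights)
print("Number of weights after build():", num_weights)
```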

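However, when building a Sequential model incrementally, it can be very useful to display a summary of the model so far, including the current output shape. In this case, you should start your model by passing an Input object to it, so that it knows its input shape from the start: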

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()

Note that the Input object is not displayed as part of model.layers, since it isn't a layer:

model.layers

A simple alternative is to pass an input_shape argument to your first layer:

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu", input_shape=(4,)))

model.summary()

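Models built with a predefined input shape like this always have weights (even before seeing any data) and always have a defined output shape.

In general, it is a recommended best practice to always specify the input shape of a Sequential model in advance if you know what it is.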
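
As a quick check of that claim, the sketch below inspects a model that starts with an Input object before it has seen any data. The variable names are illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

# No data has been passed through the model yet.
num_weights = len(model.weights)          # kernel and bias of the Dense layer
output_shape = tuple(model.output_shape)  # batch dimension is None
print(num_weights, output_shape)
```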

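A common debugging workflow: `add()` + `summary()`#

When building a new Sequential architecture, it is useful to stack layers incrementally with add() and print model summaries frequently. For instance, this lets you monitor how a stack of Conv2D and MaxPooling2D layers downsamples image feature maps: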

model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3)))  # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))

# Can you guess what the current output shape is at this point? Probably not.
# Let's just print it:
model.summary()

# The answer was: (40, 40, 32), so we can keep downsampling...

model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

# And now?
model.summary()

# Now that we have 4x4 feature maps, time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())

# Finally, we add a classification layer.
model.add(layers.Dense(10))

Very practical, right?
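
The shapes reported by summary() can also be derived by hand. The sketch below assumes the Keras defaults used in the example above: "valid" padding (no padding) for Conv2D and MaxPooling2D, and a pooling stride equal to the pool size. The helper output_size is a hypothetical name introduced only for this illustration.

```python
def output_size(size, window, stride=1):
    """Spatial size after a 'valid' (unpadded) convolution or pooling window."""
    return (size - window) // stride + 1


# (window, stride) for each layer in the example above, in order.
stack = [
    (5, 2),  # Conv2D(32, 5, strides=2)
    (3, 1),  # Conv2D(32, 3)
    (3, 3),  # MaxPooling2D(3): stride defaults to the pool size
    (3, 1),  # Conv2D(32, 3)
    (3, 1),  # Conv2D(32, 3)
    (3, 3),  # MaxPooling2D(3)
    (3, 1),  # Conv2D(32, 3)
    (3, 1),  # Conv2D(32, 3)
    (2, 2),  # MaxPooling2D(2)
]

size = 250  # 250x250 input images
sizes = []
for window, stride in stack:
    size = output_size(size, window, stride)
    sizes.append(size)

print(sizes)  # [123, 121, 40, 38, 36, 12, 10, 8, 4]
```

The third entry matches the (40, 40, 32) output shape after the first pooling layer, and the last entry matches the 4x4 feature maps before global max pooling.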

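What to do once you have a model#

Once your model architecture is ready, you will want to:

Train your model, evaluate it, and run inference. See our guide to training and evaluation with the built-in loops.
Save your model to disk and restore it. See our guide to serialization and saving.
Speed up model training by leveraging multiple GPUs. See our guide to multi-GPU and distributed training.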

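Feature extraction with a Sequential model#

Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an input and an output attribute. These attributes can be used to do neat things, like quickly creating a model that extracts the outputs of all intermediate layers in a Sequential model: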

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
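
To see what the extractor returns, the sketch below repeats the example and collects the shape of each intermediate output. The variable name feature_shapes is illustrative; the expected sizes follow from unpadded convolutions (250 -> 123 -> 121 -> 119).

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# One output tensor per layer, in stacking order.
features = feature_extractor(tf.ones((1, 250, 250, 3)))
feature_shapes = [tuple(f.shape) for f in features]
print(feature_shapes)
```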

Here is a similar example that only extracts features from one layer:

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)

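Transfer learning with a Sequential model#

Transfer learning consists of freezing the bottom layers in a model and only training the top layers. If you aren't familiar with it, make sure to read our guide to transfer learning.

Here are two common transfer learning blueprints involving Sequential models.

First, let's say that you have a Sequential model, and you want to freeze all layers except the last one. In this case, you would simply iterate over model.layers and set layer.trainable = False on each layer except the last one. Like this: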

model = keras.Sequential([
    keras.Input(shape=(784,)),  # shape must be a tuple, hence the trailing comma
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])

# Presumably you would want to first load pre-trained weights.
model.load_weights(...)

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
  layer.trainable = False

# Recompile and train (this will only update the weights of the last layer).
model.compile(...)
model.fit(...)
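
One way to confirm that the freeze took effect is to count trainable and non-trainable weights before compiling. This is a minimal sketch that skips the weight-loading step; the counter names are illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
    layer.trainable = False

# Only the kernel and bias of the last Dense layer remain trainable.
num_trainable = len(model.trainable_weights)
num_frozen = len(model.non_trainable_weights)
print(num_trainable, num_frozen)
```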

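Another common blueprint is to use a Sequential model to stack a pre-trained model and some freshly initialized classification layers. Like this: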

# Load a convolutional base with pre-trained weights
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')

# Freeze the base model
base_model.trainable = False

# Use a Sequential model to add a trainable classifier on top
model = keras.Sequential([
    base_model,
    layers.Dense(1000),
])

# Compile & train
model.compile(...)
model.fit(...)

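If you do transfer learning, you will probably find yourself using these two patterns frequently.

That's about all you need to know about Sequential models!

To find out more about building models in Keras, see: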