{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "id": "6bYaCABobL5q" }, "outputs": [], "source": [ "##### Copyright 2021 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "FlUw7tSKbtg4", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "_-fogOi3K7nR" }, "source": [ "# 在 TF2 工作流中使用 TF1.x 模型\n" ] }, { "cell_type": "markdown", "metadata": { "id": "MfBg1C5NB3X0" }, "source": [ "\n", " \n", " \n", " \n", " \n", "
在 TensorFlow.org 上查看 在 Google Colab 中运行 在 GitHub 上查看源代码 下载笔记本
" ] }, { "cell_type": "markdown", "metadata": { "id": "7-GwECUqrkqT" }, "source": [ "本指南提供了[建模代码 shim](https://en.wikipedia.org/wiki/Shim_(computing)) 的概述和示例,您可以使用这些模型在 TF2 工作流(例如 Eager Execution、`tf.function` 和分布策略)中使用现有 TF1.x 模型,只需对建模代码进行少量的更改。" ] }, { "cell_type": "markdown", "metadata": { "id": "k_ezCbogxaqt" }, "source": [ "## 使用范围\n", "\n", "本指南中介绍的 shim 是为 TF1.x 模型设计的,它依赖于:\n", "\n", "1. `tf.compat.v1.get_variable` 和 `tf.compat.v1.variable_scope` 来控制变量的创建和重用,以及\n", "2. `tf.compat.v1.global_variables()`、`tf.compat.v1.trainable_variables`、`tf.compat.v1.losses.get_regularization_losses()` 和 `tf.compat.v1.get_collection()` 等基于计算图集合的 API 来跟踪权重和正则化损失\n", "\n", "这包括大多数在 `tf.compat.v1.layer`、`tf.contrib.layers` API 和 [TensorFlow-Slim](https://github.com/google-research/tf-slim) 上构建的模型。\n", "\n", "以下 TF1.x 模型**不**需要 shim:\n", "\n", "1. 已经分别通过 `model.trainable_weights` 和 `model.losses` 跟踪所有可训练权重和正则化损失的独立 Keras 模型。\n", "2. 已经通过 `module.trainable_variables` 跟踪其所有可训练权重,并且仅在尚未创建时才创建权重的 `tf.Module`。\n", "\n", "这些模型很可能在 TF2 中使用 Eager Execution 和开箱即用的 `tf.function`。" ] }, { "cell_type": "markdown", "metadata": { "id": "3OQNFp8zgV0C" }, "source": [ "## 安装\n", "\n", "导入 TensorFlow 和其他依赖项。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "EG2n3-qlD5mA", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "!pip uninstall -y -q tensorflow" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mVfR3MBvD9Sc", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "# Install tf-nightly as the DeterministicRandomTestTool is available only in\n", "# Tensorflow 2.8\n", "\n", "!pip install -q tf-nightly" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "PzkV-2cna823", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "import tensorflow as tf\n", "import tensorflow.compat.v1 as v1\n", "import sys\n", "import numpy as np\n", "\n", "from contextlib import contextmanager" ] }, { "cell_type": "markdown", "metadata": { "id": "Ox4kn0DK8H0f" }, "source": [ "## `track_tf1_style_variables` 装饰器\n", "\n", "本指南中介绍的关键 shim 是 `tf.compat.v1.keras.utils.track_tf1_style_variables`,这是一个装饰器,您可以在属于 `tf.keras.layers.Layer` 和 `tf.Module` 的方法中利用它来跟踪 TF1.x 样式的权重和捕获正则化损失。\n", "\n", "使用 `tf.compat.v1.keras.utils.track_tf1_style_variables` 装饰 `tf.keras.layers.Layer` 或 `tf.Module` 的调用方法允许通过 `tf.compat.v1.get_variable`(以及扩展程序 `tf.compat.v1.layers`)在装饰方法内部正常工作,而不是总是在每次调用时创建一个新变量。此外,它还将导致层或模块隐式跟踪通过装饰方法内部的 `get_variable` 创建或访问的任何权重。\n", "\n", "除了在标准 `layer.variable`/`module.variable`/ 等属性下跟踪权重本身外,如果该方法属于 `tf.keras.layers.Layer`,则通过 `get_variable` 或 `tf.compat.v1.layers` 正则化器参数指定的任何正则化损失都将由标准 `layer.losses` 属性下的层跟踪。\n", "\n", "即使启用了 TF2 行为,这种跟踪机制也允许在 TF2 中的 Keras 层或 `tf.Module` 内使用大量 TF1.x 样式的模型前向传递代码。\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Sq6IqZILmGmO" }, "source": [ "## 用法示例\n", "\n", "下面的用法示例演示了用于装饰 `tf.keras.layers.Layer` 方法的建模 shim,但除了它们与 Keras 功能特别交互的情况外,它们在装饰 `tf.Module` 方法时也适用。" ] }, { "cell_type": "markdown", "metadata": { "id": "YWGPh6KmkHq6" }, "source": [ "### 使用 tf.compat.v1.get_variable 构建的层\n", "\n", "想象一下,您有一个直接在 `tf.compat.v1.get_variable` 上实现的层,代码如下所示:\n", "\n", "```python\n", "def dense(self, inputs, units):\n", " out = inputs\n", " with tf.compat.v1.variable_scope(\"dense\"):\n", " # The weights are created with a `regularizer`,\n", " kernel = tf.compat.v1.get_variable(\n", " shape=[out.shape[-1], units],\n", " regularizer=tf.keras.regularizers.L2(),\n", " initializer=tf.compat.v1.initializers.glorot_normal,\n", " name=\"kernel\")\n", " bias = tf.compat.v1.get_variable(\n", " shape=[units,],\n", " initializer=tf.compat.v1.initializers.zeros,\n", " name=\"bias\")\n", " out = tf.linalg.matmul(out, kernel)\n", " out = tf.compat.v1.nn.bias_add(out, bias)\n", " return out\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "6sZWU7JSok2n" }, "source": [ "使用 shim 将其转换成一个层并在输入上调用它。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Q3eKkcKtS_N4", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class DenseLayer(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs):\n", " out = inputs\n", " with tf.compat.v1.variable_scope(\"dense\"):\n", " # The weights are created with a `regularizer`,\n", " # so the layer should track their regularization losses\n", " kernel = tf.compat.v1.get_variable(\n", " shape=[out.shape[-1], self.units],\n", " regularizer=tf.keras.regularizers.L2(),\n", " initializer=tf.compat.v1.initializers.glorot_normal,\n", " name=\"kernel\")\n", " bias = tf.compat.v1.get_variable(\n", " shape=[self.units,],\n", " initializer=tf.compat.v1.initializers.zeros,\n", " name=\"bias\")\n", " out = tf.linalg.matmul(out, kernel)\n", " out = tf.compat.v1.nn.bias_add(out, bias)\n", " return out\n", "\n", "layer = DenseLayer(10)\n", "x = tf.random.normal(shape=(8, 20))\n", "layer(x)" ] }, { "cell_type": "markdown", "metadata": { "id": "JqXAlWnYgwcq" }, "source": [ "像标准 Keras 层一样访问跟踪的变量和捕获的正则化损失。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZNz5HmkXg0B5", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "layer.trainable_variables\n", "layer.losses" ] }, { "cell_type": "markdown", "metadata": { "id": "W0z9GmRlhM9X" }, "source": [ "为了确保权重在每次调用该层时都得到重用,请将所有权重设置为零,然后再次调用该层。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZJ4vOu2Rf-I2", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "print(\"Resetting variables to zero:\", [var.name for var in layer.trainable_variables])\n", "\n", "for var in layer.trainable_variables:\n", " var.assign(var * 0.0)\n", "\n", "# Note: layer.losses is not a live view and\n", "# will get reset only at each layer call\n", "print(\"layer.losses:\", layer.losses)\n", "print(\"calling layer again.\")\n", "out = layer(x)\n", "print(\"layer.losses: \", layer.losses)\n", "out" ] }, { "cell_type": "markdown", "metadata": { "id": "WwEprtA-lOh6" }, "source": [ "您也可以直接在 Keras 函数式模型构造中使用转换后的层。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7E7ZCINHlaHU", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "inputs = tf.keras.Input(shape=(20))\n", "outputs = DenseLayer(10)(inputs)\n", "model = tf.keras.Model(inputs=inputs, outputs=outputs)\n", "\n", "x = tf.random.normal(shape=(8, 20))\n", "model(x)\n", "\n", "# Access the model variables and regularization losses\n", "model.weights\n", "model.losses" ] }, { "cell_type": "markdown", "metadata": { "id": "ew5TTEyZkZGU" }, "source": [ "### 使用 `tf.compat.v1.layers` 构建的模型\n", "\n", "想象一下,您有一个直接在 `tf.compat.v1.layers` 上实现的层或模型,代码如下所示:\n", "\n", "```python\n", "def model(self, inputs, units):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = tf.compat.v1.layers.conv2d(\n", " inputs, 3, 3,\n", " kernel_regularizer=\"l2\")\n", " out = tf.compat.v1.layers.flatten(out)\n", " out = tf.compat.v1.layers.dense(\n", " out, units,\n", " kernel_regularizer=\"l2\")\n", " return out\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "gZolXllfpVx6" }, "source": [ "使用 shim 将其转换成一个层并在输入上调用它。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cBpfSHWTTTCv", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class CompatV1LayerModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = tf.compat.v1.layers.conv2d(\n", " inputs, 3, 3,\n", " kernel_regularizer=\"l2\")\n", " out = tf.compat.v1.layers.flatten(out)\n", " out = tf.compat.v1.layers.dense(\n", " out, self.units,\n", " kernel_regularizer=\"l2\")\n", " return out\n", "\n", "layer = CompatV1LayerModel(10)\n", "x = tf.random.normal(shape=(8, 5, 5, 5))\n", "layer(x)" ] }, { "cell_type": "markdown", "metadata": { "id": "OkG9oLlblfK_" }, "source": [ "警告:出于安全原因,请确保将所有 `tf.compat.v1.layers` 都置于非空字符串 `variable_scope` 内。这是因为具有自动生成名称的 `tf.compat.v1.layers` 将始终在任何变量范围之外使名称自动递增。这意味着每次调用层/模块时请求的变量名称都会不匹配。因此,它不会重用已经创建的权重,而是会在每次调用时创建一组新的变量。" ] }, { "cell_type": "markdown", "metadata": { "id": "zAVN6dy3p7ik" }, "source": [ "像标准 Keras 层一样访问跟踪的变量和捕获的正则化损失。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "HTRF99vJp7ik", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "layer.trainable_variables\n", "layer.losses" ] }, { "cell_type": "markdown", "metadata": { "id": "kkNuEcyIp7ik" }, "source": [ "为了确保权重在每次调用该层时都得到重用,请将所有权重设置为零,然后再次调用该层。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4dk4XScdp7il", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "print(\"Resetting variables to zero:\", [var.name for var in layer.trainable_variables])\n", "\n", "for var in layer.trainable_variables:\n", " var.assign(var * 0.0)\n", "\n", "out = layer(x)\n", "print(\"layer.losses: \", layer.losses)\n", "out" ] }, { "cell_type": "markdown", "metadata": { "id": "7zD3a8PKzU7S" }, "source": [ "您也可以直接在 Keras 函数式模型构造中使用转换后的层。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Q88BgBCup7il", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "inputs = tf.keras.Input(shape=(5, 5, 5))\n", "outputs = CompatV1LayerModel(10)(inputs)\n", "model = tf.keras.Model(inputs=inputs, outputs=outputs)\n", "\n", "x = tf.random.normal(shape=(8, 5, 5, 5))\n", "model(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "2cioB6Zap7il", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "# Access the model variables and regularization losses\n", "model.weights\n", "model.losses" ] }, { "cell_type": "markdown", "metadata": { "id": "NBNODOx9ly6r" }, "source": [ "### 捕获批次归一化更新和模型 `training` 参数\n", "\n", "在 TF1.x 中,您可以按如下方式执行批次归一化:\n", "\n", "```python\n", " x_norm = tf.compat.v1.layers.batch_normalization(x, training=training)\n", "\n", " # ...\n", "\n", " update_ops = tf.compat.v1.get_collection(tf.GraphKeys.UPDATE_OPS)\n", " train_op = optimizer.minimize(loss)\n", " train_op = tf.group([train_op, update_ops])\n", "```\n", "\n", "注意:\n", "\n", "1. 批次归一化移动平均值更新由与层分开调用的 `get_collection` 跟踪\n", "2. `tf.compat.v1.layers.batch_normalization` 需要一个 `training` 参数(使用 TF-Slim 批次归一化层时一般称为 `is_training`)\n", "\n", "在 TF2 中,由于 [Eager Execution](https://tensorflow.google.cn/guide/eager) 和自动控制依赖项,批次归一化移动平均值更新将立即执行。无需从更新集合中单独收集它们并将它们添加为显式控制依赖项。\n", "\n", "此外,如果您为 `tf.keras.layers.Layer` 的前向传递方法提供一个 `training` 参数,Keras 能够将当前训练阶段和任何嵌套层传递给它,就像它对任何其他层所做的那样。有关 Keras 如何处理 `training` 参数的更多信息,请参阅 `tf.keras.Model` 的 API 文档。\n", "\n", "如果您正在装饰 `tf.Module` 方法,则需要确保根据需要手动传递所有 `training` 参数。但是,批次归一化移动平均值更新仍将自动应用,无需显式控制依赖项。\n", "\n", "以下代码段演示了如何在 shim 中嵌入批次归一化层以及如何在 Keras 模型中使用它(适用于 `tf.keras.layers.Layer`)。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "CjZE-J7mkS9p", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class CompatV1BatchNorm(tf.keras.layers.Layer):\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs, training=None):\n", " print(\"Forward pass called with `training` =\", training)\n", " with v1.variable_scope('batch_norm_layer'):\n", " return v1.layers.batch_normalization(x, training=training)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NGuvvElmY-fu", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "print(\"Constructing model\")\n", "inputs = tf.keras.Input(shape=(5, 5, 5))\n", "outputs = CompatV1BatchNorm()(inputs)\n", "model = tf.keras.Model(inputs=inputs, outputs=outputs)\n", "\n", "print(\"Calling model in inference mode\")\n", "x = tf.random.normal(shape=(8, 5, 5, 5))\n", "model(x, training=False)\n", "\n", "print(\"Moving average variables before training: \",\n", " {var.name: var.read_value() for var in model.non_trainable_variables})\n", "\n", "# Notice that when running TF2 and eager execution, the batchnorm layer directly\n", "# updates the moving averages while training without needing any extra control\n", "# dependencies\n", "print(\"calling model in training mode\")\n", "model(x, training=True)\n", "\n", "print(\"Moving average variables after training: \",\n", " {var.name: var.read_value() for var in model.non_trainable_variables})\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Gai4ikpmeRqR" }, "source": [ "### 基于变量范围的变量重用\n", "\n", "在基于 `get_variable` 的前向传递中创建的任何变量都将保持与 TF1.x 中变量作用域相同的变量命名和重用语义。只要任何具有自动生成名称的 `tf.compat.v1.layers` 至少有一个非空的外部范围,情况就如上面所述。\n", "\n", "注:命名和重用的范围将限定在单个层/模块实例内。在一个 shim 装饰的层或模块内调用 `get_variable` 将无法引用在层或模块内创建的变量。如果需要,您可以通过直接使用 Python 对其他变量的引用来解决此问题,而不是通过 `get_variable` 访问变量。" ] }, { "cell_type": "markdown", "metadata": { "id": "6PzYZdX2nMVt" }, "source": [ "### Eager Execution 和 `tf.function`\n", "\n", "如上所示,`tf.keras.layers.Layer` 和 `tf.Module` 的装饰方法在 Eager Execution 内部运行,并且也与 `tf.function` 兼容。这意味着您可以使用 [pdb](https://docs.python.org/3/library/pdb.html) 和其他交互式工具在前向传递运行时单步执行。\n", "\n", "警告:尽管从 `tf.function` *内部*调用 shim 装饰的层/模块方法是完全安全的,但如果这些 `tf.functions` 包含 `get_variable` 调用,则将 `tf.function` 置于 shim 装饰的方法中是不安全的。进入 `tf.function` 会重置 `variable_scope`,这意味着 shim 模仿的 TF1.x 样式基于变量范围的变量重用将在此设置中失效。" ] }, { "cell_type": "markdown", "metadata": { "id": "aPytVgZWnShe" }, "source": [ "### 分布策略\n", "\n", "在 `@track_tf1_style_variables` 装饰的层或模块方法中调用 `get_variable` 会在底层使用标准 `tf.Variable` 变量创建。这意味着您可以将它们与 `MirroredStrategy` 和 `TPUStrategy` 等 `tf.distribute` 提供的各种分发策略一起使用。" ] }, { "cell_type": "markdown", "metadata": { "id": "_DcK24FOA8A2" }, "source": [ "## 在装饰调用中嵌套 `tf.Variable`、`tf.Module`、`tf.keras.layers` 和 `tf.keras.models`\n", "\n", "在 `tf.compat.v1.keras.utils.track_tf1_style_variables` 中装饰您的层调用只会添加对通过 `tf.compat.v1.get_variable` 创建(和重用)的变量的自动隐式跟踪。它不会捕获由 `tf.Variable` 调用直接创建的权重,例如典型的 Keras 层和大多数 `tf.Module` 使用的权重。本部分介绍如何处理这些嵌套情况。\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Azxza3bVOZlv" }, "source": [ "### (预先存在的用法)`tf.keras.layers` 和 `tf.keras.models`\n", "\n", "对于嵌套 Keras 层和模型的预先存在的用法,请使用 `tf.compat.v1.keras.utils.get_or_create_layer`。这仅建议用于简化现有 TF1.x 嵌套 Keras 用法的迁移;新代码应当使用如下所述的 tf.Variables 和 tf.Modules 的显式特性设置。\n", "\n", "要使用 `tf.compat.v1.keras.utils.get_or_create_layer`,请将构造嵌套模型的代码封装到一个方法内,并将其传递给该方法。示例如下:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "LN15TcRgHKsq", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class NestedModel(tf.keras.Model):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", "\n", " def build_model(self):\n", " inp = tf.keras.Input(shape=(5, 5))\n", " dense_layer = tf.keras.layers.Dense(\n", " 10, name=\"dense\", kernel_regularizer=\"l2\",\n", " kernel_initializer=tf.compat.v1.ones_initializer())\n", " model = tf.keras.Model(inputs=inp, outputs=dense_layer(inp))\n", " return model\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs):\n", " # Get or create a nested model without assigning it as an explicit property\n", " model = tf.compat.v1.keras.utils.get_or_create_layer(\n", " \"dense_model\", self.build_model)\n", " return model(inputs)\n", "\n", "layer = NestedModel(10)\n", "layer(tf.ones(shape=(5,5)))" ] }, { "cell_type": "markdown", "metadata": { "id": "DgsKlltPHI8z" }, "source": [ "这种方法可确保这些嵌套层被 TensorFlow 正确重用和跟踪。请注意,在适当的方法上仍然需要 `@track_tf1_style_variables` 装饰器。传递给 `get_or_create_layer` 的模型构建器方法(在本例中为 `self.build_model`)不应带参数。\n", "\n", "跟踪权重:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3zO5A78MJsqO", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "assert len(layer.weights) == 2\n", "weights = {x.name: x for x in layer.variables}\n", "\n", "assert set(weights.keys()) == {\"dense/bias:0\", \"dense/kernel:0\"}\n", "\n", "layer.weights" ] }, { "cell_type": "markdown", "metadata": { "id": "o3Xsi-JbKTuj" }, "source": [ "以及正则化损失:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mdK5RGm5KW5C", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "tf.add_n(layer.losses)" ] }, { "cell_type": "markdown", "metadata": { "id": "J_VRycQYJrXu" }, "source": [ "### 增量迁移:`tf.Variables` 和 `tf.Modules`\n", "\n", "如果您需要在修饰方法中嵌入 `tf.Variable` 调用或 `tf.Module`(例如,如果您遵循本指南后面介绍的向非传统 TF2 API 的增量迁移),您仍然需要根据下面的要求显式跟踪它们:\n", "\n", "- 显式确保变量/模块/层只创建一次\n", "- 就像定义[典型模块或层](https://tensorflow.google.cn/guide/intro_to_modules#defining_models_and_layers_in_tensorflow)时一样,将它们显式附加为实例特性\n", "- 在后续调用中显式重用已创建的对象\n", "\n", "这确保了每次调用都不会创建新的权重并且可以正确地重用权重。此外,这还可以确保跟踪现有的权重和正则化损失。\n", "\n", "下面是一个展现外观的示例:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mrRPPoJ5ap5U", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class NestedLayer(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def __call__(self, inputs):\n", " out = inputs\n", " with tf.compat.v1.variable_scope(\"inner_dense\"):\n", " # The weights are created with a `regularizer`,\n", " # so the layer should track their regularization losses\n", " kernel = tf.compat.v1.get_variable(\n", " shape=[out.shape[-1], self.units],\n", " regularizer=tf.keras.regularizers.L2(),\n", " initializer=tf.compat.v1.initializers.glorot_normal,\n", " name=\"kernel\")\n", " bias = tf.compat.v1.get_variable(\n", " shape=[self.units,],\n", " initializer=tf.compat.v1.initializers.zeros,\n", " name=\"bias\")\n", " out = tf.linalg.matmul(out, kernel)\n", " out = tf.compat.v1.nn.bias_add(out, bias)\n", " return out\n", "\n", "class WrappedDenseLayer(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, **kwargs):\n", " super().__init__(**kwargs)\n", " self.units = units\n", " # Only create the nested tf.variable/module/layer/model\n", " # once, and then reuse it each time!\n", " self._dense_layer = NestedLayer(self.units)\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs):\n", " with tf.compat.v1.variable_scope('outer'):\n", " outputs = tf.compat.v1.layers.dense(inputs, 3)\n", " outputs = tf.compat.v1.layers.dense(inputs, 4)\n", " return self._dense_layer(outputs)\n", "\n", "layer = WrappedDenseLayer(10)\n", "\n", "layer(tf.ones(shape=(5, 5)))" ] }, { "cell_type": "markdown", "metadata": { "id": "Lo9h6wc6bmEF" }, "source": [ "请注意,即使使用 `track_tf1_style_variables` 装饰器装饰嵌套模块,也需要显式跟踪它。这是因为带有修饰方法的每个模块/层都有自己的变量存储与之关联。\n", "\n", "正确跟踪权重:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Qt6USaTVbauM", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "assert len(layer.weights) == 6\n", "weights = {x.name: x for x in layer.variables}\n", "\n", "assert set(weights.keys()) == {\"outer/inner_dense/bias:0\",\n", " \"outer/inner_dense/kernel:0\",\n", " \"outer/dense/bias:0\",\n", " \"outer/dense/kernel:0\",\n", " \"outer/dense_1/bias:0\",\n", " \"outer/dense_1/kernel:0\"}\n", "\n", "layer.trainable_weights" ] }, { "cell_type": "markdown", "metadata": { "id": "dHn-bJoNJw7l" }, "source": [ "以及正则化损失:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pq5GFtXjJyut", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "layer.losses" ] }, { "cell_type": "markdown", "metadata": { "id": "p7VKJj3JOCEk" }, "source": [ "请注意,如果 `NestedLayer` 是非 Keras `tf.Module`,仍会跟踪变量,但不会自动跟踪正则化损失,因此您必须单独显式跟踪它们。" ] }, { "cell_type": "markdown", "metadata": { "id": "FsTgnydkdezQ" }, "source": [ "### 变量名称指南\n", "\n", "显式 `tf.Variable` 调用和 Keras 层使用不同于 `get_variable` 和 `variable_scopes` 组合中的层名/变量名自动生成机制。尽管即使从 TF1.x 计算图转到 TF2 Eager Execution 和 `tf.function`,shim 也会使您的变量名称与 `get_variable` 创建的变量匹配,但它无法保证为 `tf.Variable` 调用和 Keras 层生成的变量名称与您嵌入到方法装饰器中的变量名称相同。多个变量甚至可以在 TF2 Eager Execution 和 `tf.function` 中共享相同的名称。\n", "\n", "在本指南后面有关验证正确性和映射 TF1.x 检查点的部分中,您应该特别注意这一点。" ] }, { "cell_type": "markdown", "metadata": { "id": "CaP7fxoUWfMm" }, "source": [ "### 在装饰方法中使用 `tf.compat.v1.make_template`\n", "\n", "**强烈建议您直接使用 `tf.compat.v1.keras.utils.track_tf1_style_variables` 而不是使用 `tf.compat.v1.make_template`,因为它是 TF2 上一个较薄的层**。\n", "\n", "请按照本部分中的指南获取已经依赖于 `tf.compat.v1.make_template` 的先前 TF1.x 代码。\n", "\n", "由于 `tf.compat.v1.make_template` 封装使用 `get_variable` 的代码,`track_tf1_style_variables` 装饰器允许您在层调用中使用这些模板并成功跟踪权重和正则化损失。\n", "\n", "但是,请确保只调用一次 `make_template`,然后在每个层调用中重用相同的模板。否则,每次调用层时都会创建一个新模板以及一组新变量。\n", "\n", "例如:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "iHEQN8z44dbK", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class CompatV1TemplateScaleByY(tf.keras.layers.Layer):\n", "\n", " def __init__(self, **kwargs):\n", " super().__init__(**kwargs)\n", " def my_op(x, scalar_name):\n", " var1 = tf.compat.v1.get_variable(scalar_name,\n", " shape=[],\n", " regularizer=tf.compat.v1.keras.regularizers.L2(),\n", " initializer=tf.compat.v1.constant_initializer(1.5))\n", " return x * var1\n", " self.scale_by_y = tf.compat.v1.make_template('scale_by_y', my_op, scalar_name='y')\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs):\n", " with tf.compat.v1.variable_scope('layer'):\n", " # Using a scope ensures the `scale_by_y` name will not be incremented\n", " # for each instantiation of the layer.\n", " return self.scale_by_y(inputs)\n", "\n", "layer = CompatV1TemplateScaleByY()\n", "\n", "out = layer(tf.ones(shape=(2, 3)))\n", "print(\"weights:\", layer.weights)\n", "print(\"regularization loss:\", layer.losses)\n", "print(\"output:\", out)" ] }, { "cell_type": "markdown", "metadata": { "id": "3vKTJ7IsTEe8" }, "source": [ "警告:避免在多个层实例之间共享 `make_template` 创建的相同模板,因为它可能会破坏 shim 装饰器的变量和正则化损失跟踪机制。此外,如果您计划在多个层实例中使用相同的 `make_template` 名称,那么您应当将所创建模板的用法嵌套在 `variable_scope` 内。如果不这么做,则为模板的 `variable_scope` 生成的名称将随着层的每个新实例而递增。这可能会以意想不到的方式改变权重名称。" ] }, { "cell_type": "markdown", "metadata": { "id": "P4E3-XPhWD2N" }, "source": [ "## 到原生 TF2 的增量迁移\n", "\n", "如前文所述,`track_tf1_style_variables` 允许您将 TF2 样式的面向对象的 `tf.Variable`/`tf.keras.layers.Layer`/`tf.Module` 用法与传统的 `tf.compat.v1.get_variable`/`tf.compat.v1.layers` 样式用法混合在同一个装饰模块/层内。\n", "\n", "这意味着在使 TF1.x 模型与 TF2 完全兼容后,您可以使用原生(非 `tf.compat.v1`)TF2 API 编写所有新模型组件,并让它们与旧代码互操作。\n", "\n", "但是,如果您继续修改旧模型组件,您也可以选择将传统样式的 `tf.compat.v1` 用法逐步切换到推荐用于新编写的 TF2 代码的纯原生面向对象 API。\n", "\n", "`tf.compat.v1.get_variable` 用法可以替换为 `self.add_weight` 调用(如果您正在装饰 Keras 层/模型),或者 `tf.Variable` 调用(如果您正在装饰 Keras 对象或 `tf.Module`)。\n", "\n", "函数式和面向对象的 `tf.compat.v1.layers` 通常都可以替换为等效的 `tf.keras.layers` 层,无需更改任何参数。\n", "\n", "在逐步迁移到本身可能使用 `track_tf1_style_variables` 的纯原生 API 的过程中,您还可以考虑将模型的一部分或常见模式拆分单独的层/模块。\n", "\n", "### 关于 Slim 和 contrib.layers 的注意事项\n", "\n", "大量早期 TF 1.x 代码使用 [Slim](https://ai.googleblog.com/2016/08/tf-slim-high-level-library-to-define.html) 库,该库与 TF 1.x 一起打包为 `tf.contrib.layers`。使用 Slim 将代码转换为原生 TF 2 比转换 `v1.layers` 更复杂。事实上,先将您的 Slim 代码转换为 `v1.layers`,然后再转换为 Keras 可能是有意义的。下面是一些转换 Slim 代码的一般指南。\n", "\n", "- 确保所有参数都是显式的。如果可能,移除 `arg_scopes`。如果您仍然需要使用它们,请将 `normalizer_fn` 和 `activation_fn` 拆分为它们自己的层。\n", "- 可分离的卷积层映射至一个或多个不同的 Keras 层(深度、逐点和可分离 Keras 层)\n", "- Slim 与 `v1.layers` 有不同的参数名和默认值。\n", "- 请注意,某些参数具有不同的比例。" ] }, { "cell_type": "markdown", "metadata": { "id": "RFoULo-gazit" }, "source": [ "### 在忽略检查点兼容性的情况下迁移到原生 TF2\n", "\n", "以下代码示例演示了在不考虑检查点兼容性的情况下将模型逐步迁移到纯原生 API。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dPO9YJsb6r-D", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class CompatModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs, training=None):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = tf.compat.v1.layers.conv2d(\n", " inputs, 3, 3,\n", " kernel_regularizer=\"l2\")\n", " out = tf.compat.v1.layers.flatten(out)\n", " out = tf.compat.v1.layers.dropout(out, training=training)\n", " out = tf.compat.v1.layers.dense(\n", " out, self.units,\n", " kernel_regularizer=\"l2\")\n", " return out\n" ] }, { "cell_type": "markdown", "metadata": { "id": "fp16xK6Oa8k9" }, "source": [ "接下来,以分段方式将 `compat.v1` API 替换为其原生的面向对象的对应项。首先,将卷积层切换为在层构造函数中创建的 Keras 对象。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "LOj1Swe16so3", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class PartiallyMigratedModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", " self.conv_layer = tf.keras.layers.Conv2D(\n", " 3, 3,\n", " kernel_regularizer=\"l2\")\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs, training=None):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = self.conv_layer(inputs)\n", " out = tf.compat.v1.layers.flatten(out)\n", " out = tf.compat.v1.layers.dropout(out, training=training)\n", " out = tf.compat.v1.layers.dense(\n", " out, self.units,\n", " kernel_regularizer=\"l2\")\n", " return out\n" ] }, { "cell_type": "markdown", "metadata": { "id": "kzJF0H0sbce8" }, "source": [ "使用 [`v1.keras.utils.DeterministicRandomTestTool`](https://tensorflow.google.cn/api_docs/python/tf/compat/v1/keras/utils/DeterministicRandomTestTool) 类来验证这种增量更改是否使模型具有与以前相同的行为。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "MTJq0qW9_Tz2", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "random_tool = v1.keras.utils.DeterministicRandomTestTool(mode='num_random_ops')\n", "with random_tool.scope():\n", " tf.keras.utils.set_random_seed(42)\n", " layer = CompatModel(10)\n", "\n", " inputs = tf.random.normal(shape=(10, 5, 5, 5))\n", " original_output = layer(inputs)\n", "\n", " # Grab the regularization loss as well\n", " original_regularization_loss = tf.math.add_n(layer.losses)\n", "\n", "print(original_regularization_loss)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "X4Wq3wuaHjEV", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "random_tool = v1.keras.utils.DeterministicRandomTestTool(mode='num_random_ops')\n", "with random_tool.scope():\n", " tf.keras.utils.set_random_seed(42)\n", " layer = PartiallyMigratedModel(10)\n", "\n", " inputs = tf.random.normal(shape=(10, 5, 5, 5))\n", " migrated_output = layer(inputs)\n", "\n", " # Grab the regularization loss as well\n", " migrated_regularization_loss = tf.math.add_n(layer.losses)\n", "\n", "print(migrated_regularization_loss)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mMMXS7EHjvCy", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "# Verify that the regularization loss and output both match\n", "np.testing.assert_allclose(original_regularization_loss.numpy(), migrated_regularization_loss.numpy())\n", "np.testing.assert_allclose(original_output.numpy(), migrated_output.numpy())" ] }, { "cell_type": "markdown", "metadata": { "id": "RMxiMVFwbiQy" }, "source": [ "您现在已经用原生 Keras 层替换了所有单独的 `compat.v1.layers`。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3dFCnyYc9DrX", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class NearlyFullyNativeModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", " self.conv_layer = tf.keras.layers.Conv2D(\n", " 3, 3,\n", " kernel_regularizer=\"l2\")\n", " self.flatten_layer = tf.keras.layers.Flatten()\n", " self.dense_layer = tf.keras.layers.Dense(\n", " self.units,\n", " kernel_regularizer=\"l2\")\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = self.conv_layer(inputs)\n", " out = self.flatten_layer(out)\n", " out = self.dense_layer(out)\n", " return out\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QGPqEjkGHgar", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "random_tool = v1.keras.utils.DeterministicRandomTestTool(mode='num_random_ops')\n", "with random_tool.scope():\n", " tf.keras.utils.set_random_seed(42)\n", " layer = NearlyFullyNativeModel(10)\n", "\n", " inputs = tf.random.normal(shape=(10, 5, 5, 5))\n", " migrated_output = layer(inputs)\n", "\n", " # Grab the regularization loss as well\n", " migrated_regularization_loss = tf.math.add_n(layer.losses)\n", "\n", "print(migrated_regularization_loss)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "uAs60eCdj6x_", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "# Verify that the regularization loss and output both match\n", "np.testing.assert_allclose(original_regularization_loss.numpy(), migrated_regularization_loss.numpy())\n", "np.testing.assert_allclose(original_output.numpy(), migrated_output.numpy())" ] }, { "cell_type": "markdown", "metadata": { "id": "oA6viSo3bo3y" }, "source": [ "最后,移除任何其余的(不再需要的)`variable_scope` 用法和 `track_tf1_style_variables` 装饰器本身。\n", "\n", "您现在得到了一个完全使用原生 API 的模型版本。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mIHpHWIRDunU", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class FullyNativeModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, units, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.units = units\n", " self.conv_layer = tf.keras.layers.Conv2D(\n", " 3, 3,\n", " kernel_regularizer=\"l2\")\n", " self.flatten_layer = tf.keras.layers.Flatten()\n", " self.dense_layer = tf.keras.layers.Dense(\n", " self.units,\n", " kernel_regularizer=\"l2\")\n", "\n", " def call(self, inputs):\n", " out = self.conv_layer(inputs)\n", " out = self.flatten_layer(out)\n", " out = self.dense_layer(out)\n", " return out\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ttAmiCvLHW54", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "random_tool = v1.keras.utils.DeterministicRandomTestTool(mode='num_random_ops')\n", "with random_tool.scope():\n", " tf.keras.utils.set_random_seed(42)\n", " layer = FullyNativeModel(10)\n", "\n", " inputs = tf.random.normal(shape=(10, 5, 5, 5))\n", " migrated_output = layer(inputs)\n", "\n", " # Grab the regularization loss as well\n", " migrated_regularization_loss = tf.math.add_n(layer.losses)\n", "\n", "print(migrated_regularization_loss)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ym5DYtT4j7e3", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "# Verify that the regularization loss and output both match\n", "np.testing.assert_allclose(original_regularization_loss.numpy(), migrated_regularization_loss.numpy())\n", "np.testing.assert_allclose(original_output.numpy(), migrated_output.numpy())" ] }, { "cell_type": "markdown", "metadata": { "id": "oX4pdrzycIsa" }, "source": [ "### 在迁移到原生 TF2 期间保持检查点兼容性\n", "\n", "上述向原生 TF2 API 的迁移过程更改了变量名称(因为 Keras API 产生了极为不同的权重名称),以及指向模型中不同权重的面向对象路径。这些更改的影响是它们将破坏任何现有 TF1 样式的基于名称的检查点或 TF2 样式的面向对象的检查点。\n", "\n", "但是,在某些情况下,您可以使用原始的基于名称的检查点,并使用[重用 TF1.x 检查点指南](./migrating_checkpoints.ipynb)中详述的方式找到变量与其新名称的映射。\n", "\n", "使这种方法可行的一些技巧如下:\n", "\n", "- 变量仍然具有可以设置的 `name` 参数。\n", "- Keras 模型还采用 `name` 参数,并将其设置为变量的前缀。\n", "- v1.name_scope 函数可用于设置变量名前缀,这与 tf.variable_scope 截然不同。它只影响名称,而不跟踪变量和重用。\n", "\n", "考虑到以上几点,以下示例代码演示了一个工作流,您可以调整您的代码以增量更新模型的一部分,同时更新检查点。\n", "\n", "注:由于使用 Keras 层命名变量的复杂性,这不能保证适用于所有用例。" ] }, { "cell_type": "markdown", "metadata": { "id": "EFmMY3dcx3mR" }, "source": [ "1. 首先,将函数式 `tf.compat.v1.layers` 切换到面向对象的版本。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cRxCFmNjl2ta", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class FunctionalStyleCompatModel(tf.keras.layers.Layer):\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs, training=None):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = tf.compat.v1.layers.conv2d(\n", " inputs, 3, 3,\n", " kernel_regularizer=\"l2\")\n", " out = tf.compat.v1.layers.conv2d(\n", " out, 4, 4,\n", " kernel_regularizer=\"l2\")\n", " out = tf.compat.v1.layers.conv2d(\n", " out, 5, 5,\n", " kernel_regularizer=\"l2\")\n", " return out\n", "\n", "layer = FunctionalStyleCompatModel()\n", "layer(tf.ones(shape=(10, 10, 10, 10)))\n", "[v.name for v in layer.weights]" ] }, { "cell_type": "markdown", "metadata": { "id": "QvzUyXxjydAd" }, "source": [ "1. 接下来,将 compat.v1.layer 对象和由 `compat.v1.get_variable` 创建的任何变量分配为 `tf.keras.layers.Layer`/`tf.Module` 对象的属性,其方法用 `track_tf1_style_variables` 装饰(注意,任何面向对象的 TF2 样式检查点现在会同时保存按变量名的路径和新的面向对象的路径)。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "02jMQkJFmFwl", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class OOStyleCompatModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.conv_1 = tf.compat.v1.layers.Conv2D(\n", " 3, 3,\n", " kernel_regularizer=\"l2\")\n", " self.conv_2 = tf.compat.v1.layers.Conv2D(\n", " 4, 4,\n", " kernel_regularizer=\"l2\")\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs, training=None):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = self.conv_1(inputs)\n", " out = self.conv_2(out)\n", " out = tf.compat.v1.layers.conv2d(\n", " out, 5, 5,\n", " kernel_regularizer=\"l2\")\n", " return out\n", "\n", "layer = OOStyleCompatModel()\n", "layer(tf.ones(shape=(10, 10, 10, 10)))\n", "[v.name for v in layer.weights]" ] }, { "cell_type": "markdown", "metadata": { "id": "8evFpd8Nq63v" }, "source": [ "1. 此时重新保存加载的检查点,以按照变量名称(对于 compat.v1.layers)或面向对象的对象计算图保存路径。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7neFr-9pqmJX", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "weights = {v.name: v for v in layer.weights}\n", "assert weights['model/conv2d/kernel:0'] is layer.conv_1.kernel\n", "assert weights['model/conv2d_1/bias:0'] is layer.conv_2.bias" ] }, { "cell_type": "markdown", "metadata": { "id": "pvsi743Xh9wn" }, "source": [ "1. 您现在可以将面向对象的 `compat.v1.layers` 替换为原生 Keras 层,同时仍然能够加载最近保存的检查点。通过继续记录被替换层的自动生成的 `variable_scopes`,确保为其余的 `compat.v1.layers` 保留变量名称。这些切换的层/变量现在将仅使用检查点中变量的对象特性路径,而不是变量名称路径。\n", "\n", "通常,您可以通过以下方式替换附加到属性的变量中的 `compat.v1.get_variable` 用法:\n", "\n", "- 将它们切换为使用 `tf.Variable`,**或者**\n", "- 使用 [`tf.keras.layers.Layer.add_weight`](https://tensorflow.google.cn/api_docs/python/tf/keras/layers/Layer#add_weight) 更新它们。请注意,如果您没有一次性切换所有层,这可能会更改缺少 `name` 参数的其余 `compat.v1.layers` 的自动生成层/变量命名。如果是这种情况,您必须通过手动打开和关闭与已移除的 `compat.v1.layer` 生成的作用域名称相对应的 `variable_scope` 来保持其余 `compat.v1.layers` 的变量名称相同。否则,来自现有检查点的路径可能会发生冲突,并且检查点加载的行为不正确。\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NbixtIW-maoH", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "def record_scope(scope_name):\n", " \"\"\"Record a variable_scope to make sure future ones get incremented.\"\"\"\n", " with tf.compat.v1.variable_scope(scope_name):\n", " pass\n", "\n", "class PartiallyNativeKerasLayersModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.conv_1 = tf.keras.layers.Conv2D(\n", " 3, 3,\n", " kernel_regularizer=\"l2\")\n", " self.conv_2 = tf.keras.layers.Conv2D(\n", " 4, 4,\n", " kernel_regularizer=\"l2\")\n", "\n", " @tf.compat.v1.keras.utils.track_tf1_style_variables\n", " def call(self, inputs, training=None):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = self.conv_1(inputs)\n", " record_scope('conv2d') # Only needed if follow-on compat.v1.layers do not pass a `name` arg\n", " out = self.conv_2(out)\n", " record_scope('conv2d_1') # Only needed if follow-on compat.v1.layers do not pass a `name` arg\n", " out = tf.compat.v1.layers.conv2d(\n", " out, 5, 5,\n", " kernel_regularizer=\"l2\")\n", " return out\n", "\n", "layer = PartiallyNativeKerasLayersModel()\n", "layer(tf.ones(shape=(10, 10, 10, 10)))\n", "[v.name for v in layer.weights]" ] }, { "cell_type": "markdown", "metadata": { "id": "2eaPpevGs3dA" }, "source": [ "在构造变量后,在这一步保存检查点将使其***仅***包含当前可用的对象路径。\n", "\n", "确保记录已移除的 `compat.v1.layers` 的作用域,以便为其余的 `compat.v1.layers` 保留自动生成的权重名称。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "EK7vtWBprObA", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "weights = set(v.name for v in layer.weights)\n", "assert 'model/conv2d_2/kernel:0' in weights\n", "assert 'model/conv2d_2/bias:0' in weights" ] }, { "cell_type": "markdown", "metadata": { "id": "DQ5-SfmWFTvY" }, "source": [ "1. 重复上述步骤,直到您将模型中的所有 `compat.v1.layers` 和 `compat.v1.get_variable` 替换为完全原生的对应项。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "PA1d2POtnTQa", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "class FullyNativeKerasLayersModel(tf.keras.layers.Layer):\n", "\n", " def __init__(self, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " self.conv_1 = tf.keras.layers.Conv2D(\n", " 3, 3,\n", " kernel_regularizer=\"l2\")\n", " self.conv_2 = tf.keras.layers.Conv2D(\n", " 4, 4,\n", " kernel_regularizer=\"l2\")\n", " self.conv_3 = tf.keras.layers.Conv2D(\n", " 5, 5,\n", " kernel_regularizer=\"l2\")\n", "\n", "\n", " def call(self, inputs, training=None):\n", " with tf.compat.v1.variable_scope('model'):\n", " out = self.conv_1(inputs)\n", " out = self.conv_2(out)\n", " out = self.conv_3(out)\n", " return out\n", "\n", "layer = FullyNativeKerasLayersModel()\n", "layer(tf.ones(shape=(10, 10, 10, 10)))\n", "[v.name for v in layer.weights]" ] }, { "cell_type": "markdown", "metadata": { "id": "vZejG7rTsTb6" }, "source": [ "请记得进行测试以确保新更新的检查点的行为仍然符合预期。在此过程的每个增量步骤中应用[验证数字正确性指南](./validate_correctness.ipynb)中介绍的技术,以确保您的迁移代码正确运行。" ] }, { "cell_type": "markdown", "metadata": { "id": "Ewi_h-cs6n-I" }, "source": [ "## 处理建模 shim 未涵盖的 TF1.x 到 TF2 行为更改\n", "\n", "本指南中介绍的建模 slim 可以确保使用 `get_variable`、`tf.compat.v1.layers` 和 `variable_scope` 语义创建的变量、层和正则化损失在使用 Eager Execution 和 `tf.function` 时继续像以前一样有效,无需依赖集合。\n", "\n", "本文未涵盖您的模型前向传递可能依赖的***所有*** TF1.x 特定语义。在某些情况下,shim 可能不足以让您的模型前向传递在 TF2 中自行运行。阅读 [TF1.x 与 TF2 行为指南](./tf1_vs_tf2),详细了解 TF1.x 与 TF2 之间的行为差异。" ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "model_mapping.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }