{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "id": "rX8mhOLljYeM" }, "outputs": [], "source": [ "##### Copyright 2022 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "BZSlp3DAjdYf", "vscode": { "languageId": "python" } }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "3wF5wszaj97Y" }, "source": [ "# TensorFlow Core API 快速入门" ] }, { "cell_type": "markdown", "metadata": { "id": "DUNzJc4jTj6G" }, "source": [ "
![]() | \n",
" ![]() | \n",
" ![]() | \n",
" ![]() | \n",
"
tf.random.shuffle
重排数据集,以避免有偏差的拆分。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "0mJU4kt6YiAp",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"dataset_shuffled = tf.random.shuffle(dataset_tf, seed=22)\n",
"train_data, test_data = dataset_shuffled[100:], dataset_shuffled[:100]\n",
"x_train, y_train = train_data[:, 1:], train_data[:, 0]\n",
"x_test, y_test = test_data[:, 1:], test_data[:, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Bscb2Vsbi3TE"
},
"source": [
"通过对 `\"Origin\"` 特征进行独热编码来执行基本[特征工程](https://developers.google.com/machine-learning/crash-course/representation/feature-engineering)。`tf.one_hot` 函数可用于将此分类列转换为 3 个单独的二进制列。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_B8N9IV1i6IV",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"def onehot_origin(x):\n",
" origin = tf.cast(x[:, -1], tf.int32)\n",
" # Use `origin - 1` to account for 1-indexed feature\n",
" origin_oh = tf.one_hot(origin - 1, 3)\n",
" x_ohe = tf.concat([x[:, :-1], origin_oh], axis = 1)\n",
" return x_ohe\n",
"\n",
"x_train_ohe, x_test_ohe = onehot_origin(x_train), onehot_origin(x_test)\n",
"x_train_ohe.numpy()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qnoCDzzedite"
},
"source": [
"此示例显示了一个多元回归问题,其中预测器或特征具有截然不同的尺度。因此,标准化数据以使每个特征具有零均值和单位方差会有所帮助。使用 `tf.reduce_mean` 和 `tf.math.reduce_std` 函数进行标准化。然后,可以对回归模型的预测进行非标准化以获得其用原始单位表示的值。"
]
},
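{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "As a brief sketch of the standardization above (standard z-score notation, assumed rather than taken from the text): each feature column $x$ is transformed as\n",
  "\n",
  "$$x' = \\frac{x - \\mu}{\\sigma}$$\n",
  "\n",
  "where $\\mu$ and $\\sigma$ are the column's mean and standard deviation, and $x = x'\\sigma + \\mu$ inverts the transform to recover the original units."
 ]
},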
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dJJFdvqydhyp",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"class Normalize(tf.Module):\n",
" def __init__(self, x):\n",
" # Initialize the mean and standard deviation for normalization\n",
" self.mean = tf.math.reduce_mean(x, axis=0)\n",
" self.std = tf.math.reduce_std(x, axis=0)\n",
"\n",
" def norm(self, x):\n",
" # Normalize the input\n",
" return (x - self.mean)/self.std\n",
"\n",
" def unnorm(self, x):\n",
" # Unnormalize the input\n",
" return (x * self.std) + self.mean"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5BONV6fYYwZb",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"norm_x = Normalize(x_train_ohe)\n",
"norm_y = Normalize(y_train)\n",
"x_train_norm, y_train_norm = norm_x.norm(x_train_ohe), norm_y.norm(y_train)\n",
"x_test_norm, y_test_norm = norm_x.norm(x_test_ohe), norm_y.norm(y_test)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BPZ68wASog_I"
},
"source": [
"## 构建机器学习模型\n",
"\n",
"使用 TensorFlow Core API 构建线性回归模型。多元线性回归的方程如下:\n",
"\n",
"$${\\mathrm{Y}} = {\\mathrm{X}}w + b$$\n",
"\n",
"其中\n",
"\n",
"- $\\underset{m\\times 1}{\\mathrm{Y}}$:目标向量\n",
"- $\\underset{m\\times n}{\\mathrm{X}}$:特征矩阵\n",
"- $\\underset{n\\times 1}w$:权重向量\n",
"- $b$:偏差\n",
"\n",
"通过使用 `@tf.function` 装饰器,跟踪相应的 Python 代码以生成可调用的 TensorFlow 计算图。这种方式有利于在训练后保存和加载模型。它还可以为具有多层和复杂运算的模型带来性能提升。 "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "h3IKyzTCDNGo",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"class LinearRegression(tf.Module):\n",
"\n",
" def __init__(self):\n",
" self.built = False\n",
"\n",
" @tf.function\n",
" def __call__(self, x):\n",
" # Initialize the model parameters on the first call\n",
" if not self.built:\n",
" # Randomly generate the weight vector and bias term\n",
" rand_w = tf.random.uniform(shape=[x.shape[-1], 1])\n",
" rand_b = tf.random.uniform(shape=[])\n",
" self.w = tf.Variable(rand_w)\n",
" self.b = tf.Variable(rand_b)\n",
" self.built = True\n",
" y = tf.add(tf.matmul(x, self.w), self.b)\n",
" return tf.squeeze(y, axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "l2hiez2eIUz8"
},
"source": [
"对于每个样本,该模型通过计算其特征的加权和加上一个偏差项来返回对输入汽车 MPG 的预测值。然后,可以对该预测值进行非标准化以获得其用原始单位表示的值。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OeOrNdnkEEcR",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"lin_reg = LinearRegression()\n",
"prediction = lin_reg(x_train_norm[:1])\n",
"prediction_unnorm = norm_y.unnorm(prediction)\n",
"prediction_unnorm.numpy()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FIHANxNSvWr9"
},
"source": [
"## 定义损失函数\n",
"\n",
"现在,定义一个损失函数来评估模型在训练过程中的性能。\n",
"\n",
"由于回归问题处理的是连续输出,均方误差 (MSE) 是损失函数的理想选择。MSE 由以下方程定义:\n",
"\n",
"$$MSE = \\frac{1}{m}\\sum_{i=1}^{m}(\\hat{y}_i -y_i)^2$$\n",
"\n",
"其中\n",
"\n",
"- $\\hat{y}$:预测向量\n",
"- $y$:真实目标向量\n",
"\n",
"此回归问题的目标是找到最小化 MSE 损失函数的最优权重向量 $w$ 和偏差 $b$。 "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8tYNVUkmw35s",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"def mse_loss(y_pred, y):\n",
" return tf.reduce_mean(tf.square(y_pred - y))"
]
},
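{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "As a quick worked example with made-up values (not drawn from the dataset): for predictions $\\hat{y} = (1, 2)$ and targets $y = (1, 4)$, the squared errors are $(0, 4)$, so\n",
  "\n",
  "$$MSE = \\frac{0 + 4}{2} = 2$$"
 ]
},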
{
"cell_type": "markdown",
"metadata": {
"id": "htI-7aJPqclK"
},
"source": [
"## 训练并评估模型\n",
"\n",
"使用 mini-batch 进行训练既可以提高内存效率,又能加快收敛速度。`tf.data.Dataset` API 具有用于批处理和重排的有用函数。借助该 API,您可以从简单、可重用的部分构建复杂的输入流水线。在[此指南](https://tensorflow.google.cn/guide/data)中详细了解如何构建 TensorFlow 输入流水线。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "kxST2w_Nq0C5",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"batch_size = 64\n",
"train_dataset = tf.data.Dataset.from_tensor_slices((x_train_norm, y_train_norm))\n",
"train_dataset = train_dataset.shuffle(buffer_size=x_train.shape[0]).batch(batch_size)\n",
"test_dataset = tf.data.Dataset.from_tensor_slices((x_test_norm, y_test_norm))\n",
"test_dataset = test_dataset.shuffle(buffer_size=x_test.shape[0]).batch(batch_size)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "C9haUW8Yq3xD"
},
"source": [
"接下来,编写一个训练循环,通过使用 MSE 损失函数及其相对于输入参数的梯度来迭代更新模型的参数。\n",
"\n",
"这种迭代方法称为[梯度下降](https://developers.google.com/machine-learning/glossary#gradient-descent){:.external}。在每次迭代中,模型的参数通过在其计算梯度的相反方向上迈出一步来更新。这一步的大小由学习率决定,学习率是一个可配置的超参数。回想一下,函数的梯度表示其最陡上升的方向;因此,向相反方向迈出一步表示最陡下降的方向,这最终有助于最小化 MSE 损失函数。"
]
},
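{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "As a sketch of the update rule described above (notation assumed, not stated in the original text), each parameter update with learning rate $\\eta$ takes the form\n",
  "\n",
  "$$w \\leftarrow w - \\eta\\,\\nabla_w L, \\qquad b \\leftarrow b - \\eta\\,\\nabla_b L$$\n",
  "\n",
  "which is what `v.assign_sub(learning_rate * g)` computes for each variable in the loop."
 ]
},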
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "y7suUbJXVLqP",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"# Set training parameters\n",
"epochs = 100\n",
"learning_rate = 0.01\n",
"train_losses, test_losses = [], []\n",
"\n",
"# Format training loop\n",
"for epoch in range(epochs):\n",
" batch_losses_train, batch_losses_test = [], []\n",
"\n",
" # Iterate through the training data\n",
" for x_batch, y_batch in train_dataset:\n",
" with tf.GradientTape() as tape:\n",
" y_pred_batch = lin_reg(x_batch)\n",
" batch_loss = mse_loss(y_pred_batch, y_batch)\n",
" # Update parameters with respect to the gradient calculations\n",
" grads = tape.gradient(batch_loss, lin_reg.variables)\n",
" for g,v in zip(grads, lin_reg.variables):\n",
" v.assign_sub(learning_rate * g)\n",
" # Keep track of batch-level training performance \n",
" batch_losses_train.append(batch_loss)\n",
" \n",
" # Iterate through the testing data\n",
" for x_batch, y_batch in test_dataset:\n",
" y_pred_batch = lin_reg(x_batch)\n",
" batch_loss = mse_loss(y_pred_batch, y_batch)\n",
" # Keep track of batch-level testing performance \n",
" batch_losses_test.append(batch_loss)\n",
"\n",
" # Keep track of epoch-level model performance\n",
" train_loss = tf.reduce_mean(batch_losses_train)\n",
" test_loss = tf.reduce_mean(batch_losses_test)\n",
" train_losses.append(train_loss)\n",
" test_losses.append(test_loss)\n",
" if epoch % 10 == 0:\n",
"    print(f'Mean squared error for epoch {epoch}: {train_loss.numpy():0.3f}')\n",
"\n",
"# Output final losses\n",
"print(f\"\\nFinal train loss: {train_loss:0.3f}\")\n",
"print(f\"Final test loss: {test_loss:0.3f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4mDAAPFqVVgn"
},
"source": [
"绘制 MSE 损失随时间变化的图。计算指定[验证集](https://developers.google.com/machine-learning/glossary#validation-set){:.external}或[测试集](https://developers.google.com/machine-learning/glossary#test-set){:.external}上的性能指标可确保模型不会对训练数据集过拟合,并且可以很好地泛化到未知数据。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "F7dTAzgHDUh7",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"matplotlib.rcParams['figure.figsize'] = [9, 6]\n",
"\n",
"plt.plot(range(epochs), train_losses, label = \"Training loss\")\n",
"plt.plot(range(epochs), test_losses, label = \"Testing loss\")\n",
"plt.xlabel(\"Epoch\")\n",
"plt.ylabel(\"Mean squared error loss\")\n",
"plt.legend()\n",
"plt.title(\"MSE loss vs training iterations\");"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Aj8NrlzlJqDG"
},
"source": [
"看起来该模型在拟合训练数据方面做得很好,同时也良好地泛化了未知测试数据。"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AUNIPubuPYDR"
},
"source": [
"## 保存和加载模型\n",
"\n",
"首先,构建一个接受原始数据并执行以下运算的导出模块:\n",
"\n",
"- 特征提取\n",
"- 归一化\n",
"- 预测\n",
"- 非归一化"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "g-uOrGa9ZehG",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"class ExportModule(tf.Module):\n",
" def __init__(self, model, extract_features, norm_x, norm_y):\n",
" # Initialize pre and postprocessing functions\n",
" self.model = model\n",
" self.extract_features = extract_features\n",
" self.norm_x = norm_x\n",
" self.norm_y = norm_y\n",
"\n",
" @tf.function(input_signature=[tf.TensorSpec(shape=[None, None], dtype=tf.float32)]) \n",
" def __call__(self, x):\n",
" # Run the ExportModule for new data points\n",
" x = self.extract_features(x)\n",
" x = self.norm_x.norm(x)\n",
" y = self.model(x)\n",
" y = self.norm_y.unnorm(y)\n",
" return y "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YPYYLQ8EZiU8",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"lin_reg_export = ExportModule(model=lin_reg,\n",
" extract_features=onehot_origin,\n",
" norm_x=norm_x,\n",
" norm_y=norm_y)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6v8xi06XZWiC"
},
"source": [
"如果要将模型保存为当前状态,请使用 `tf.saved_model.save` 函数。要加载保存的模型并进行预测,请使用 `tf.saved_model.load` 函数。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "K1IvMoHbptht",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"import tempfile\n",
"import os\n",
"\n",
"models = tempfile.mkdtemp()\n",
"save_path = os.path.join(models, 'lin_reg_export')\n",
"tf.saved_model.save(lin_reg_export, save_path)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rYb6DrEH0GMv",
"vscode": {
"languageId": "python"
}
},
"outputs": [],
"source": [
"lin_reg_loaded = tf.saved_model.load(save_path)\n",
"test_preds = lin_reg_loaded(x_test)\n",
"test_preds[:10].numpy()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-47O6_GLdRuT"
},
"source": [
"## 结论\n",
"\n",
"恭喜!您已经使用 TensorFlow Core 低级 API 训练了一个回归模型。\n",
"\n",
"有关使用 TensorFlow Core API 的更多示例,请查看以下指南:\n",
"\n",
"- 二元分类的[逻辑回归](./logistic_regression_core.ipynb)\n",
"- 用于手写数字识别的[多层感知器](./mlp_core.ipynb)\n"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"rX8mhOLljYeM"
],
"name": "quickstart_core.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}