{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 快速入门\n", "\n", "{guilabel}`参考`:[quickstart_tutorial](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html)\n", "\n", "本节介绍机器学习中常见任务的 API。\n", "\n", "## 数据准备\n", "\n", "PyTorch 有两个原语来处理数据:{class}`~torch.utils.data.DataLoader` 和 {class}`~torch.utils.data.Dataset`。{class}`~torch.utils.data.Dataset` 存储样本及其对应的标签,而 {class}`~torch.utils.data.DataLoader` 在 {class}`~torch.utils.data.Dataset` 周围包装了可迭代对象。" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\n", "from torch import nn\n", "from torch.utils.data import DataLoader\n", "from torchvision import datasets\n", "from torchvision.transforms import ToTensor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "PyTorch 提供了领域专用库,如 {mod}`TorchText `、{mod}`TorchVision ` 和 [TorchAudio](torchaudio),所有这些库都包含了 `datasets`。比如:\n", "\n", "[torchvision.datasets](datasets) 模块包含许多真实世界的视觉数据的 {class}`~torchvision.datasets.vision.VisionDataset` 对象,如 CIFAR, COCO。在本教程中,我们使用 FashionMNIST 数据集。每个 {class}`~torchvision.datasets.vision.VisionDataset` 包括两个参数:``transform`` 和 ``target_transform`` 来分别修改样本和标签。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz\n", "Using downloaded and verified file: data/FashionMNIST/raw/train-images-idx3-ubyte.gz\n", "Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw\n", "\n", "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz\n", "Using downloaded and verified file: data/FashionMNIST/raw/train-labels-idx1-ubyte.gz\n", "Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n", "\n", "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz\n", "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100.0%\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw\n", "\n", "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz\n", "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "119.3%" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# 下载训练数据集\n", "training_data = datasets.FashionMNIST(\n", " root=\"data\",\n", " train=True,\n", " download=True,\n", " transform=ToTensor(),\n", ")\n", "\n", "# 下载测试数据集\n", "test_data = datasets.FashionMNIST(\n", " root=\"data\",\n", " train=False,\n", " download=True,\n", " transform=ToTensor(),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "将 ``Dataset`` 作为参数传递给 ``DataLoader``。它在数据集上包装了可迭代对象,并支持自动批处理、采样、洗牌和多进程数据加载。\n", "\n", "比如定义 64 的批处理大小,即 dataloader 可迭代对象中的每个元素将返回包含 64 个特征和标签的批处理。" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "X [N, C, H, W] 的 shape:torch.Size([64, 1, 28, 28])\n", "y 的 shape 和 dtype:torch.Size([64]) torch.int64\n" ] } ], "source": [ "batch_size = 64\n", "\n", "# 创建数据 loader\n", "train_dataloader = DataLoader(training_data, batch_size=batch_size)\n", "test_dataloader = DataLoader(test_data, batch_size=batch_size)\n", "\n", "for X, y in test_dataloader:\n", " print(f\"X [N, C, H, W] 的 shape:{X.shape}\")\n", " print(f\"y 的 shape 和 dtype:{y.shape} {y.dtype}\")\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 构建模型\n", "\n", "要在 PyTorch 中定义神经网络,需要创建继承自 {class}`torch.nn.Module` 的类。在 {func}`__init__` 函数中定义网络层,并在 {func}`forward` 函数中指定数据将如何传递给网络。为了加速神经网络的运算,将其移动到 GPU(如果可用的话)。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "使用 cuda 设备\n", "NeuralNetwork(\n", " (flatten): Flatten(start_dim=1, end_dim=-1)\n", " (linear_relu_stack): Sequential(\n", " (0): Linear(in_features=784, out_features=512, bias=True)\n", " (1): ReLU()\n", " (2): Linear(in_features=512, out_features=512, bias=True)\n", " (3): ReLU()\n", " (4): Linear(in_features=512, out_features=10, bias=True)\n", " )\n", ")\n" ] } ], "source": [ "# 获取 CPU 或 GPU 设备进行训练。\n", "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n", "print(f\"使用 {device} 设备\")\n", "\n", "# 定义模型\n", "class NeuralNetwork(nn.Module):\n", " def __init__(self):\n", " super().__init__()\n", " self.flatten = nn.Flatten()\n", " self.linear_relu_stack = nn.Sequential(\n", " nn.Linear(28*28, 512),\n", " nn.ReLU(),\n", " nn.Linear(512, 512),\n", " nn.ReLU(),\n", " nn.Linear(512, 10)\n", " )\n", "\n", " def forward(self, x):\n", " x = self.flatten(x)\n", " logits = self.linear_relu_stack(x)\n", " return logits\n", "\n", "model = NeuralNetwork().to(device)\n", "print(model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "--------------\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 模型参数优化\n", "\n", "为了训练模型,需要定义 [损失函数](https://pytorch.org/docs/stable/nn.html#loss-functions) 和 [优化器](optim)。" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [], "source": [ "loss_fn = nn.CrossEntropyLoss()\n", "optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在单个训练循环中,模型对训练数据集进行预测(批量反馈给它),并反向传播预测误差以调整模型的参数。" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def train(dataloader, model, loss_fn, optimizer):\n", " size = len(dataloader.dataset)\n", " model.train()\n", " for batch, (X, y) in enumerate(dataloader):\n", " X, y = X.to(device), y.to(device)\n", "\n", " # 计算预测误差\n", " pred = model(X)\n", " loss = loss_fn(pred, y)\n", "\n", " # 后向传播\n", " optimizer.zero_grad()\n", " loss.backward()\n", " optimizer.step()\n", "\n", " if batch % 100 == 0:\n", " loss, current = loss.item(), batch * len(X)\n", " print(f\"loss: {loss:>7f} [{current:>5d}/{size:>5d}]\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "还根据测试数据集检查模型的性能,以确保它正在学习。" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def test(dataloader, model, loss_fn):\n", " size = len(dataloader.dataset)\n", " num_batches = len(dataloader)\n", " model.eval()\n", " test_loss, correct = 0, 0\n", " with torch.no_grad():\n", " for X, y in dataloader:\n", " X, y = X.to(device), y.to(device)\n", " pred = model(X)\n", " test_loss += loss_fn(pred, y).item()\n", " correct += (pred.argmax(1) == y).type(torch.float).sum().item()\n", " test_loss /= num_batches\n", " correct /= size\n", " print(f\"Test Error: \\n Accuracy: \"\n", " \"{(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \\n\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "训练过程是在几个迭代(*epoch*)中进行的。在每个时期,模型学习参数以做出更好的预测。打印模型在每个时期的精度和损失;希望在每个时期看到精度的提高和损失的减少。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "epochs = 5\n", "for t in range(epochs):\n", " print(f\"Epoch {t+1}\\n-------------------------------\")\n", " train(train_dataloader, model, loss_fn, optimizer)\n", " test(test_dataloader, model, loss_fn)\n", "print(\"Done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 保存模型\n", "\n", "保存模型的常见方法是序列化内部状态字典(包含模型参数)。" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "保存 PyTorch 模型状态到 model.pth\n" ] } ], "source": [ "torch.save(model.state_dict(), \"model.pth\")\n", "print(\"保存 PyTorch 模型状态到 model.pth\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 加载模型\n", "\n", "加载模型的过程包括重新创建模型结构和将状态字典加载到其中。" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = NeuralNetwork()\n", "model.load_state_dict(torch.load(\"model.pth\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "此模型现在可以用来进行预测。" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Predicted: \"Ankle boot\", Actual: \"Ankle boot\"\n" ] } ], "source": [ "classes = [\n", " \"T-shirt/top\",\n", " \"Trouser\",\n", " \"Pullover\",\n", " \"Dress\",\n", " \"Coat\",\n", " \"Sandal\",\n", " \"Shirt\",\n", " \"Sneaker\",\n", " \"Bag\",\n", " \"Ankle boot\",\n", "]\n", "\n", "model.eval()\n", "x, y = test_data[0][0], test_data[0][1]\n", "with torch.no_grad():\n", " pred = model(x)\n", " predicted, actual = classes[pred[0].argmax(0)], classes[y]\n", " print(f'Predicted: \"{predicted}\", Actual: \"{actual}\"')" ] } ], "metadata": { "interpreter": { "hash": "7a45eadec1f9f49b0fdfd1bc7d360ac982412448ce738fa321afc640e3212175" }, "kernelspec": { "display_name": "Python 3.10.4 ('torchx')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 0 }