{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 调试\n",
    "\n",
    "通常在创作变换的过程中，我们的代码并不完全正确。在这种情况下，可能需要进行一些调试。关键是 backwards 工作：首先，检查调用生成的 module 的结果，以证明或否定正确性。然后，检查和调试生成的代码。然后，调试导致生成代码的变换过程。\n",
    "\n",
    "## 变换创作中的常见陷阱\n",
    "\n",
    "不确定的 {class}`set` 迭代顺序。在 Python 中，设置的数据类型是无序的。例如，使用 {class}`set` 来包含节点等对象的集合可能会导致意外的不确定性。一个例子是迭代一组节点，将它们插入到图中。因为设置的数据类型是无序的，输出程序中运算的顺序将是不确定的，并且可以在程序调用之间更改。推荐的替代方法是使用 {class}`dict` 数据类型，这是 Python 3.7（以及 cPython 3.6）开始按照[插入顺序](https://mail.python.org/pipermail/python-dev/2017-December/151283.html)排序。通过将要重复数据删除的值存储在 {class}`dict` 的键中，{class}`dict` 可以等价地用于 {class}`set`。\n",
    "\n",
    "## 检查 module 的正确性\n",
    "\n",
    "因为大多数深度学习 module 的输出都是由浮点 {class}`torch.Tensor` 实例组成，检查两个 {class}`torch.nn.Module` 结果之间的等价性不像做简单的相等性检查那样直接。为了激发这个想法，举个例子（RuntimeError：有多个值的张量的布尔值不明确）："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "ename": "RuntimeError",
     "evalue": "Boolean value of Tensor with more than one value is ambiguous",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mRuntimeError\u001b[0m                              Traceback (most recent call last)",
      "\u001b[1;32m/media/pc/data/4tb/lxw/home/lxw/hub/torch-book/doc/tutorial/fx/Debugging.ipynb Cell 2\u001b[0m in \u001b[0;36m<cell line: 21>\u001b[0;34m()\u001b[0m\n\u001b[1;32m     <a href='vscode-notebook-cell://ssh-remote%2Bxin/media/pc/data/4tb/lxw/home/lxw/hub/torch-book/doc/tutorial/fx/Debugging.ipynb#W2sdnNjb2RlLXJlbW90ZQ%3D%3D?line=16'>17</a>\u001b[0m transformed_resnet18 \u001b[39m=\u001b[39m transform(resnet18)\n\u001b[1;32m     <a href='vscode-notebook-cell://ssh-remote%2Bxin/media/pc/data/4tb/lxw/home/lxw/hub/torch-book/doc/tutorial/fx/Debugging.ipynb#W2sdnNjb2RlLXJlbW90ZQ%3D%3D?line=18'>19</a>\u001b[0m input_image \u001b[39m=\u001b[39m torch\u001b[39m.\u001b[39mrandn(\u001b[39m5\u001b[39m, \u001b[39m3\u001b[39m, \u001b[39m224\u001b[39m, \u001b[39m224\u001b[39m)\n\u001b[0;32m---> <a href='vscode-notebook-cell://ssh-remote%2Bxin/media/pc/data/4tb/lxw/home/lxw/hub/torch-book/doc/tutorial/fx/Debugging.ipynb#W2sdnNjb2RlLXJlbW90ZQ%3D%3D?line=20'>21</a>\u001b[0m \u001b[39massert\u001b[39;00m resnet18(input_image) \u001b[39m==\u001b[39m transformed_resnet18(input_image)\n",
      "\u001b[0;31mRuntimeError\u001b[0m: Boolean value of Tensor with more than one value is ambiguous"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "import torch.fx\n",
    "import torchvision.models as models\n",
    "\n",
    "\n",
    "def transform(m : torch.nn.Module) -> torch.nn.Module:\n",
    "    gm = torch.fx.symbolic_trace(m)\n",
    "\n",
    "    # Imagine we're doing some transforms here\n",
    "    # <...>\n",
    "\n",
    "    gm.recompile()\n",
    "\n",
    "    return gm\n",
    "\n",
    "resnet18 = models.resnet18()\n",
    "transformed_resnet18 = transform(resnet18)\n",
    "\n",
    "input_image = torch.randn(5, 3, 224, 224)\n",
    "\n",
    "assert resnet18(input_image) == transformed_resnet18(input_image)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在这里，尝试用 `==` 运算符检查两个深度学习模型的值是否相等。然而，由于运算符返回的是张量而不是 `bool` 值的问题，而且由于浮点值的比较应该使用误差边界（或 epsilon）来解释浮点运算的[非交换性](https://floating-point-gui.de/errors/comparison/)，这两个问题都没有很好地定义。可以使用 {func}`torch.allclose`，它会考虑到相对和绝对公差阈值的近似比较："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert torch.allclose(resnet18(input_image), transformed_resnet18(input_image))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与参考实现相比，这是工具箱中检查变换模块行为是否如期望的那样的第一个工具。\n",
    "\n",
    "## 调试生成的代码\n",
    "\n",
    "因为 FX 在 {class}`torch.fx.GraphModule` 上生成 {func}`forward` 函数，所以使用传统的调试技术（如 `print` 语句或 `pdb`）就不那么直接了。幸运的是，有几种技术可以用来调试生成的代码。\n",
    "\n",
    "### 使用 `pdb`\n",
    "\n",
    "调用 `pdb` 进入正在运行的程序。尽管表示 {class}`torch.fx.Graph` 的代码不在任何源文件中，但是当调用 `forward` 传递时，仍然可以使用 `pdb` 手动进入它。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--Return--\n",
      "None\n",
      "> \u001b[0;32m/tmp/ipykernel_2297333/4158250709.py\u001b[0m(21)\u001b[0;36m<cell line: 21>\u001b[0;34m()\u001b[0m\n",
      "\u001b[0;32m     19 \u001b[0;31m\u001b[0;31m# interactive `pdb` prompt. We can use the `step` or `s` command to\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0m\u001b[0;32m     20 \u001b[0;31m\u001b[0;31m# step into the execution of the next line\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0m\u001b[0;32m---> 21 \u001b[0;31m\u001b[0;32mimport\u001b[0m \u001b[0mpdb\u001b[0m\u001b[0;34m;\u001b[0m \u001b[0mpdb\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mset_trace\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0m\u001b[0;32m     22 \u001b[0;31m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0m\u001b[0;32m     23 \u001b[0;31m\u001b[0mmy_module_transformed\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput_value\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from torch import fx\n",
    "import torchvision.models as models\n",
    "\n",
    "def my_pass(inp: torch.nn.Module, tracer_class : type = fx.Tracer) -> torch.nn.Module:\n",
    "    graph = tracer_class().trace(inp)\n",
    "    # Transformation logic here\n",
    "    # <...>\n",
    "\n",
    "    # Return new Module\n",
    "    return fx.GraphModule(inp, graph)\n",
    "\n",
    "my_module = models.resnet18()\n",
    "my_module_transformed = my_pass(my_module)\n",
    "\n",
    "input_value = torch.randn(5, 3, 224, 224)\n",
    "\n",
    "# When this line is executed at runtime, we will be dropped into an\n",
    "# interactive `pdb` prompt. We can use the `step` or `s` command to\n",
    "# step into the execution of the next line\n",
    "import pdb; pdb.set_trace()\n",
    "\n",
    "my_module_transformed(input_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 打印生成代码\n",
    "\n",
    "如果您想要多次运行相同的代码，那么使用 `pdb` 逐步找到正确的代码可能有点乏味。在这种情况下，一种方法是简单地将生成的 `forward` 传递复制粘贴到代码中，并从那里检查它。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "# Assume that `traced` is a GraphModule that has undergone some\n",
    "# number of transforms\n",
    "\n",
    "# Copy this code for later\n",
    "print(traced)\n",
    "# Print the code generated from symbolic tracing. This outputs:\n",
    "\"\"\"\n",
    "def forward(self, y):\n",
    "    x = self.x\n",
    "    add_1 = x + y;  x = y = None\n",
    "    return add_1\n",
    "\"\"\"\n",
    "\n",
    "# Subclass the original Module\n",
    "class SubclassM(M):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "\n",
    "    # Paste the generated `forward` function (the one we printed and\n",
    "    # copied above) here\n",
    "    def forward(self, y):\n",
    "        x = self.x\n",
    "        add_1 = x + y;  x = y = None\n",
    "        return add_1\n",
    "\n",
    "# Create an instance of the original, untraced Module. Then, create an\n",
    "# instance of the Module with the copied `forward` function. We can\n",
    "# now compare the output of both the original and the traced version.\n",
    "pre_trace = M()\n",
    "post_trace = SubclassM()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 使用 {func}`~torch.fx.GraphModule.to_folder` 函数\n",
    "\n",
    "{func}`~torch.fx.GraphModule.to_folder` 是 {class}`~torch.fx.GraphModule` 中的方法，它允许你将生成的 FX 代码转储到文件夹中。尽管像打印生成的代码那样，将 `forward` 传递复制到代码中通常就足够了，但是使用 {func}`~torch.fx.GraphModule.to_folder` 检查模块和参数可能更容易。\n",
    "\n",
    "```python\n",
    "m = symbolic_trace(M())\n",
    "m.to_folder(\"foo\", \"Bar\")\n",
    "from foo import Bar\n",
    "y = Bar()\n",
    "```\n",
    "\n",
    "在运行上面的示例之后，可以查看 `foo/module.py` 中的代码，并根据需要修改它（例如添加 `print` 语句或使用 `pdb`)，以调试生成的代码。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 调试变换\n",
    "\n",
    "既然已经确定了变换正在创建不正确的代码，现在是调试变换本身的时候了。\n",
    "\n",
    "```python\n",
    "# Sample Module\n",
    "class M(torch.nn.Module):\n",
    "    def forward(self, x, y):\n",
    "        return x + y\n",
    "\n",
    "# Create an instance of `M`\n",
    "m = M()\n",
    "\n",
    "# Symbolically trace an instance of `M` (returns a GraphModule). In\n",
    "# this example, we'll only be discussing how to inspect a\n",
    "# GraphModule, so we aren't showing any sample transforms for the\n",
    "# sake of brevity.\n",
    "traced = symbolic_trace(m)\n",
    "\n",
    "# Print the code produced by tracing the module.\n",
    "print(traced)\n",
    "# The generated `forward` function is:\n",
    "\"\"\"\n",
    "def forward(self, x, y):\n",
    "    add = x + y;  x = y = None\n",
    "    return add\n",
    "\"\"\"\n",
    "\n",
    "# Print the internal Graph.\n",
    "print(traced.graph)\n",
    "# This print-out returns:\n",
    "\"\"\"\n",
    "graph():\n",
    "    %x : [#users=1] = placeholder[target=x]\n",
    "    %y : [#users=1] = placeholder[target=y]\n",
    "    %add : [#users=1] = call_function[target=operator.add](args = (%x, %y), kwargs = {})\n",
    "    return add\n",
    "\"\"\"\n",
    "\n",
    "# Print a tabular representation of the internal Graph.\n",
    "traced.graph.print_tabular()\n",
    "# This gives us:\n",
    "\"\"\"\n",
    "opcode         name    target                   args    kwargs\n",
    "-------------  ------  -----------------------  ------  --------\n",
    "placeholder    x       x                        ()      {}\n",
    "placeholder    y       y                        ()      {}\n",
    "call_function  add     <built-in function add>  (x, y)  {}\n",
    "output         output  output                   (add,)  {}\n",
    "\"\"\"\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用上面的实用函数，可以在应用变换之前和之后比较跟踪的 {class}`torch.nn.Module`。\n",
    "\n",
    "抛开上面的例子，考虑下面的代码：\n",
    "\n",
    "```python\n",
    "# Sample user-defined function\n",
    "def transform_graph(module: torch.nn.Module, tracer_class : type = fx.Tracer) -> torch.nn.Module:\n",
    "    # Get the Graph from our traced Module\n",
    "    g = tracer_class().trace(module)\n",
    "\n",
    "    \"\"\"\n",
    "    Transformations on `g` go here\n",
    "    \"\"\"\n",
    "\n",
    "    return fx.GraphModule(module, g)\n",
    "\n",
    "# Transform the Graph\n",
    "transformed = transform_graph(traced)\n",
    "\n",
    "# Print the new code after our transforms. Check to see if it was\n",
    "# what we expected\n",
    "print(transformed)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用上面的例子，假设对 `print(tracing)` 的调用告诉我们变换中有一个错误。希望使用调试器找到哪里出了问题。可以通过中断 `transform_graph(已跟踪)，然后按s“进入”对transform_graph(已跟踪)的调用来查看转换过程中发生了什么。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.10.4 ('tvmx': conda)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.4"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "e579259ee6098e2b9319de590d145b4b096774fe457bdf04260e3ba5c171e887"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}