{
 "cells": [
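  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# THOP: PyTorch OpCounter\n",
    "\n",
    "Reference: [pytorch-OpCounter](https://github.com/Lyken17/pytorch-OpCounter)\n",
    "\n",
    "## Key concepts\n",
    "\n",
    "- FLOPS (Floating Point Operations Per Second): the number of floating-point operations executed per second, a measure of hardware speed.\n",
    "- FLOPs (Floating Point Operations): the number of floating-point operations, a measure of a model's computational complexity that is often used as an indirect proxy for the speed of a neural network. FLOPS and FLOPs are frequently confused.\n",
    "- MACs (Multiply-Accumulate Operations): the number of multiply-accumulate operations (`a <- a + (b x c)`), which is also often confused with FLOPs. One MAC consists of one multiplication and one addition, so it corresponds to roughly 2 FLOPs, and FLOPs are usually about twice the MACs.\n",
    "\n",
    "Real-world computations are more complicated, however. Consider a matrix-vector product, where `A` is a matrix of shape `(m, n)` and `B` is a vector of shape `(n, 1)`.\n",
    "\n",
    "```python\n",
    "for i in range(m):\n",
    "    for j in range(n):\n",
    "        C[i] += A[i][j] * B[j] # one mul-add\n",
    "```\n",
    "\n",
    "This takes `mn` MACs and `2mn` FLOPs. Such an implementation is slow, though, and parallelization is necessary to speed it up:\n",
    "\n",
    "```python\n",
    "for i in range(m):\n",
    "    # the n multiplications are independent, so they can run in parallel\n",
    "    d = [A[i][j] * B[j] for j in range(n)]  # n muls\n",
    "    C[i] = sum(d)  # n adds\n",
    "```\n",
    "\n",
    "In this version the multiplications and additions are no longer fused, so the number of MACs is no longer `mn`.\n",
    "\n",
    "When comparing MACs or FLOPs, the number should be independent of the implementation and as general as possible. THOP therefore counts only multiplications and ignores all other operations.\n",
    "\n",
    "```{note}\n",
    "FLOPs are approximately twice the number of multiplications.\n",
    "```"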
   ]
  },
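  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The matrix-vector example above can be checked with a small, self-contained sketch. The helper name `matvec_with_counts` is hypothetical and not part of THOP; it counts one MAC per inner-loop step and derives FLOPs with the 2x rule of thumb.\n",
    "\n",
    "```python\n",
    "def matvec_with_counts(A, B):\n",
    "    # Multiply an (m, n) matrix A by a length-n vector B, counting MACs.\n",
    "    m, n = len(A), len(B)\n",
    "    C = [0.0] * m\n",
    "    macs = 0\n",
    "    for i in range(m):\n",
    "        for j in range(n):\n",
    "            C[i] += A[i][j] * B[j]  # one multiply-accumulate\n",
    "            macs += 1\n",
    "    flops = 2 * macs  # rule of thumb: 1 MAC is about 2 FLOPs\n",
    "    return C, macs, flops\n",
    "\n",
    "\n",
    "A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # m = 2, n = 3\n",
    "B = [1.0, 0.0, -1.0]\n",
    "C, macs, flops = matvec_with_counts(A, B)\n",
    "print(C, macs, flops)  # [-2.0, -2.0] 6 12\n",
    "```\n",
    "\n",
    "With `m = 2` and `n = 3` the count is `mn = 6` MACs and `2mn = 12` FLOPs, as stated above."
   ]
  },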
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.\n",
      "[INFO] Register count_normalization() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.\n",
      "[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.\n",
      "[INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool2d'>.\n",
      "[INFO] Register zero_ops() for <class 'torch.nn.modules.container.Sequential'>.\n",
      "[INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.\n",
      "[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from torchvision.models import resnet50\n",
    "from thop import profile\n",
    "\n",
    "model = resnet50()\n",
    "input = torch.randn(1, 3, 224, 224)\n",
    "macs, params = profile(model, inputs=(input, ))"
   ]
  },
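  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining rules for third-party modules\n",
    "\n",
    "\n",
    "```python\n",
    "import torch\n",
    "from torch import nn\n",
    "from thop import profile\n",
    "\n",
    "class YourModule(nn.Module):\n",
    "    ...  # your definition\n",
    "\n",
    "def count_your_model(model, x, y):\n",
    "    # model: the module being counted; x: tuple of its inputs; y: its output\n",
    "    ...  # your rule here\n",
    "\n",
    "model = YourModule()  # or any model that contains YourModule\n",
    "input = torch.randn(1, 3, 224, 224)\n",
    "macs, params = profile(model, inputs=(input, ),\n",
    "                        custom_ops={YourModule: count_your_model})\n",
    "```"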
   ]
  },
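  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To give a sense of what a rule passed through `custom_ops` typically has to compute, here is a minimal sketch that is independent of THOP. The function names `linear_macs` and `conv2d_macs` are hypothetical, biases are ignored, and the formulas are the usual per-output-element counts rather than THOP's own code.\n",
    "\n",
    "```python\n",
    "def linear_macs(in_features, out_features, batch_size=1):\n",
    "    # Each output element needs in_features multiply-accumulates.\n",
    "    return batch_size * out_features * in_features\n",
    "\n",
    "\n",
    "def conv2d_macs(in_channels, out_channels, kernel_size, out_h, out_w, groups=1):\n",
    "    # Each output element needs (in_channels / groups) * k_h * k_w multiply-accumulates.\n",
    "    k_h, k_w = kernel_size\n",
    "    per_output = (in_channels // groups) * k_h * k_w\n",
    "    return out_channels * out_h * out_w * per_output\n",
    "\n",
    "\n",
    "# A fully connected layer mapping 2048 features to 1000 classes.\n",
    "print(linear_macs(2048, 1000))  # 2048000\n",
    "# A 7x7 convolution from 3 to 64 channels with a 112x112 output map.\n",
    "print(conv2d_macs(3, 64, (7, 7), 112, 112))  # 118013952\n",
    "```\n",
    "\n",
    "A custom rule would derive the same kind of quantity from the module's attributes and the shape of its output `y`."
   ]
  },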
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Improving output readability\n",
    "\n",
    "\n",
    "Call `thop.clever_format` to print the results in a more readable format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MACs:  4.134G\n",
      "Params:  25.557M\n"
     ]
    }
   ],
   "source": [
    "from thop import clever_format\n",
    "\n",
    "macs, params = clever_format([macs, params], \"%.3f\")\n",
    "print(\"MACs: \", macs)\n",
    "print(\"Params: \", params)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Benchmarks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dataclasses import dataclass, asdict\n",
    "\n",
    "@dataclass\n",
    "class Info:\n",
    "    params: int # 参数量\n",
    "    macs: int"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/pc/.local/lib/python3.8/site-packages/torchvision/models/googlenet.py:77: FutureWarning: The default weight initialization of GoogleNet will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.\n",
      "  warnings.warn('The default weight initialization of GoogleNet will be changed in future releases of '\n",
      "/home/pc/.local/lib/python3.8/site-packages/torchvision/models/inception.py:80: FutureWarning: The default weight initialization of inception_v3 will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.\n",
      "  warnings.warn('The default weight initialization of inception_v3 will be changed in future releases of '\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from torchvision import models\n",
    "from thop.profile import profile\n",
    "\n",
    "model_names = sorted(\n",
    "    name\n",
    "    for name in models.__dict__\n",
    "    if name.islower()\n",
    "    and not name.startswith(\"__\")  # and \"inception\" in name\n",
    "    and callable(models.__dict__[name])\n",
    ")\n",
    "\n",
    "# print(\"%s | %s | %s\" % (\"Model\", \"Params(M)\", \"FLOPs(G)\"))\n",
    "# print(\"---|---|---\")\n",
    "\n",
    "device = \"cpu\"\n",
    "if torch.cuda.is_available():\n",
    "    device = \"cuda\"\n",
    "\n",
    "bunch = {}\n",
    "for name in model_names:\n",
    "    model = models.__dict__[name]().to(device)\n",
    "    dsize = (1, 3, 224, 224)\n",
    "    if \"inception\" in name:\n",
    "        dsize = (1, 3, 299, 299)\n",
    "    inputs = torch.randn(dsize).to(device)\n",
    "    total_ops, total_params = profile(model, (inputs,), verbose=False)\n",
    "    bunch[name] = asdict(Info(total_params / (2 ** 20), total_ops / (2 ** 30)))\n",
    "    # print(\n",
    "    #     \"%s | %.2f | %.2f\" % (name, total_params / (1000 ** 2), total_ops / (1000 ** 3))\n",
    "    # )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Params(M)</th>\n",
       "      <th>MACs(G)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>alexnet</th>\n",
       "      <td>58.27</td>\n",
       "      <td>0.67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>densenet121</th>\n",
       "      <td>7.61</td>\n",
       "      <td>2.70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>densenet161</th>\n",
       "      <td>27.35</td>\n",
       "      <td>7.31</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>densenet169</th>\n",
       "      <td>13.49</td>\n",
       "      <td>3.20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>densenet201</th>\n",
       "      <td>19.09</td>\n",
       "      <td>4.09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>googlenet</th>\n",
       "      <td>6.32</td>\n",
       "      <td>1.41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>inception_v3</th>\n",
       "      <td>22.73</td>\n",
       "      <td>5.35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mnasnet0_5</th>\n",
       "      <td>2.12</td>\n",
       "      <td>0.11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mnasnet0_75</th>\n",
       "      <td>3.02</td>\n",
       "      <td>0.22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mnasnet1_0</th>\n",
       "      <td>4.18</td>\n",
       "      <td>0.31</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mnasnet1_3</th>\n",
       "      <td>5.99</td>\n",
       "      <td>0.52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mobilenet_v2</th>\n",
       "      <td>3.34</td>\n",
       "      <td>0.30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mobilenet_v3_large</th>\n",
       "      <td>5.23</td>\n",
       "      <td>0.22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mobilenet_v3_small</th>\n",
       "      <td>2.43</td>\n",
       "      <td>0.06</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>resnet101</th>\n",
       "      <td>42.49</td>\n",
       "      <td>7.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>resnet152</th>\n",
       "      <td>57.40</td>\n",
       "      <td>10.81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>resnet18</th>\n",
       "      <td>11.15</td>\n",
       "      <td>1.70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>resnet34</th>\n",
       "      <td>20.79</td>\n",
       "      <td>3.43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>resnet50</th>\n",
       "      <td>24.37</td>\n",
       "      <td>3.85</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>resnext101_32x8d</th>\n",
       "      <td>84.68</td>\n",
       "      <td>15.40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>resnext50_32x4d</th>\n",
       "      <td>23.87</td>\n",
       "      <td>3.99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>shufflenet_v2_x0_5</th>\n",
       "      <td>1.30</td>\n",
       "      <td>0.04</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>shufflenet_v2_x1_0</th>\n",
       "      <td>2.17</td>\n",
       "      <td>0.14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>shufflenet_v2_x1_5</th>\n",
       "      <td>3.34</td>\n",
       "      <td>0.29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>shufflenet_v2_x2_0</th>\n",
       "      <td>7.05</td>\n",
       "      <td>0.56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>squeezenet1_0</th>\n",
       "      <td>1.19</td>\n",
       "      <td>0.76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>squeezenet1_1</th>\n",
       "      <td>1.18</td>\n",
       "      <td>0.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg11</th>\n",
       "      <td>126.71</td>\n",
       "      <td>7.09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg11_bn</th>\n",
       "      <td>126.71</td>\n",
       "      <td>7.11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg13</th>\n",
       "      <td>126.88</td>\n",
       "      <td>10.53</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg13_bn</th>\n",
       "      <td>126.89</td>\n",
       "      <td>10.58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg16</th>\n",
       "      <td>131.95</td>\n",
       "      <td>14.41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg16_bn</th>\n",
       "      <td>131.96</td>\n",
       "      <td>14.46</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg19</th>\n",
       "      <td>137.01</td>\n",
       "      <td>18.28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>vgg19_bn</th>\n",
       "      <td>137.02</td>\n",
       "      <td>18.34</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>wide_resnet101_2</th>\n",
       "      <td>121.01</td>\n",
       "      <td>21.27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>wide_resnet50_2</th>\n",
       "      <td>65.69</td>\n",
       "      <td>10.67</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                    Params(M)  MACs(G)\n",
       "alexnet                 58.27     0.67\n",
       "densenet121              7.61     2.70\n",
       "densenet161             27.35     7.31\n",
       "densenet169             13.49     3.20\n",
       "densenet201             19.09     4.09\n",
       "googlenet                6.32     1.41\n",
       "inception_v3            22.73     5.35\n",
       "mnasnet0_5               2.12     0.11\n",
       "mnasnet0_75              3.02     0.22\n",
       "mnasnet1_0               4.18     0.31\n",
       "mnasnet1_3               5.99     0.52\n",
       "mobilenet_v2             3.34     0.30\n",
       "mobilenet_v3_large       5.23     0.22\n",
       "mobilenet_v3_small       2.43     0.06\n",
       "resnet101               42.49     7.33\n",
       "resnet152               57.40    10.81\n",
       "resnet18                11.15     1.70\n",
       "resnet34                20.79     3.43\n",
       "resnet50                24.37     3.85\n",
       "resnext101_32x8d        84.68    15.40\n",
       "resnext50_32x4d         23.87     3.99\n",
       "shufflenet_v2_x0_5       1.30     0.04\n",
       "shufflenet_v2_x1_0       2.17     0.14\n",
       "shufflenet_v2_x1_5       3.34     0.29\n",
       "shufflenet_v2_x2_0       7.05     0.56\n",
       "squeezenet1_0            1.19     0.76\n",
       "squeezenet1_1            1.18     0.33\n",
       "vgg11                  126.71     7.09\n",
       "vgg11_bn               126.71     7.11\n",
       "vgg13                  126.88    10.53\n",
       "vgg13_bn               126.89    10.58\n",
       "vgg16                  131.95    14.41\n",
       "vgg16_bn               131.96    14.46\n",
       "vgg19                  137.01    18.28\n",
       "vgg19_bn               137.02    18.34\n",
       "wide_resnet101_2       121.01    21.27\n",
       "wide_resnet50_2         65.69    10.67"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.DataFrame(bunch).T\n",
    "df.columns = [\"Params(M)\", \"MACs(G)\"]\n",
    "df.round(2)"
   ]
  },
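  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The table values differ from the `clever_format` output earlier in this notebook because the benchmark loop divides by binary units (`2 ** 20` and `2 ** 30`), whereas the 4.134G and 25.557M printed for `resnet50` are consistent with decimal units. As a quick check of that reading, which is an inference from the numbers shown here rather than a statement from the THOP documentation:\n",
    "\n",
    "$$\n",
    "\\frac{4.134 \\times 10^{9}}{2^{30}} \\approx 3.85, \\qquad \\frac{25.557 \\times 10^{6}}{2^{20}} \\approx 24.37\n",
    "$$\n",
    "\n",
    "Both results match the `resnet50` row, so the `Params(M)` and `MACs(G)` columns are best read as multiples of $2^{20}$ and $2^{30}$."
   ]
  },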
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.8.10 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}