FTVMQnnLegalize()

Source: tvm/python/tvm/relay/qnn/transform.py & tvm/src/relay/qnn/pass/legalize.cc

import testing
from tvm.relay.qnn.op import register_qnn_legalize

register_qnn_legalize?
Signature: register_qnn_legalize(op_name, legal_op=None, level=10)
Docstring:
Register legal transformation function for a QNN op.

This helps QNN match hardware intrinsics better and is run before
canonicalization.

Parameters
----------
op_name : str
    The name of the operator

legal_op: function (attrs: Attrs, inputs: List[Expr]) -> new_expr: Expr
    The function for transforming an expr to another expr.

level : int
    The priority level
File:      /media/pc/data/lxw/ai/tvm/python/tvm/relay/qnn/op/op.py
Type:      function
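
A minimal sketch of the registration mechanics (the op choice, function name, and body here are illustrative assumptions; level=11 is used so the example does not clash with TVM's built-in level-10 registrations, and note that in-tree legalization functions receive the argument types as a third parameter, in addition to the attrs and inputs listed above):

from tvm.relay.qnn.op import register_qnn_legalize

# Used as a decorator, register_qnn_legalize attaches the function to the op
# under the FTVMQnnLegalize attribute at the given priority level.
@register_qnn_legalize("qnn.dense", level=11)
def my_qnn_dense_legalize(attrs, inputs, types):
    # Returning None keeps the original op; return a new relay.Expr to rewrite it.
    return None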
from tvm.relay.qnn.transform import Legalize

Legalize?
Signature: Legalize()
Docstring:
Legalizes QNN ops. As opposed to Relay Legalize, this one legalizes only QNN ops. One can
register a transformation/legalization function for an op by using the FTVMQnnLegalize attr_name
for FTVMLegalize op attribute. The isolation of QNN and Relay Legalize gives us separation of
concerns, leading to a better software practice. The legalization can be configured to happen
per target. An example of this type of legalization is shown below.

Examples
________

Suppose the original graph is as follows

        data(u8)  weight(u8)
            |       |
            |       |
           qnn.conv2d (int32)
               |
               |
            nn.relu (int32)

Now, we know that Intel Cascade Lake has VNNI instructions to speedup convolution. However, it
only works on u8 x i8 inputs. So, here, we can use QNN Legalize to transform the above graph as
follows

        data(u8)  weight(u8)
           |          |
           |          |
           |     requantize(i8)
           |        |
           |        |
           qnn.conv2d (int32)
               |
               |
             nn.relu (int32)

In this legalization, since we have isolated legalization for QNN ops, it will only trigger the
transformation for qnn.conv2d (and not nn.relu). This pass can be followed by CanonicalizeOps to
further lower the qnn.requantize and qnn.conv2d into an expr containing only Relay ops.

Returns
-------
ret : tvm.transform.Pass
    The registered pass that legalizes QNN ops.
File:      /media/pc/data/lxw/ai/tvm/python/tvm/relay/qnn/transform.py
Type:      function
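
To exercise the docstring's example, here is a hedged sketch that builds the u8 conv graph and runs Legalize under a Cascade Lake target (shapes and quantization parameters are made-up placeholders):

import tvm
from tvm import relay

# The example graph: u8 data and weight -> qnn.conv2d -> nn.relu.
data = relay.var("data", shape=(1, 64, 56, 56), dtype="uint8")
weight = relay.var("weight", shape=(64, 64, 3, 3), dtype="uint8")
conv = relay.qnn.op.conv2d(
    data,
    weight,
    input_zero_point=relay.const(0, "int32"),
    kernel_zero_point=relay.const(0, "int32"),
    input_scale=relay.const(1.0, "float32"),
    kernel_scale=relay.const(1.0, "float32"),
    kernel_size=(3, 3),
    channels=64,
    padding=(1, 1),
    out_dtype="int32",
)
mod = tvm.IRModule.from_expr(relay.nn.relu(conv))

# Legalization is configured per target: the Intel-specific rewrite only
# fires when the current target supports fast int8 (e.g. VNNI).
with tvm.target.Target("llvm -mcpu=cascadelake"):
    mod = relay.qnn.transform.Legalize()(mod)
print(mod)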

This code defines a function named Legalize whose job is to legalize QNN ops. Unlike Relay Legalize, this function legalizes only QNN ops. One can register a transformation/legalization function for an op by using the FTVMQnnLegalize attr_name for the FTVMLegalize op attribute. Isolating QNN Legalize from Relay Legalize gives us separation of concerns, leading to better software practice. The legalization can be configured per target.

Suppose the original graph is as follows:

        data(u8)  weight(u8)
            |       |
            |       |
           qnn.conv2d (int32)
               |
               |
            nn.relu (int32)

Now, we know that Intel Cascade Lake has VNNI instructions to speed up convolution. However, they only work on u8 x i8 inputs. So, here, we can use QNN Legalize to transform the graph above into the following form:

        data(u8)  weight(u8)
           |          |
           |          |
           |     requantize(i8)
           |        |
           |        |
           qnn.conv2d (int32)
               |
               |
             nn.relu (int32)

In this legalization, since legalization is isolated to QNN ops, only qnn.conv2d (and not nn.relu) triggers the transformation. This pass can be followed by CanonicalizeOps to further lower qnn.requantize and qnn.conv2d into expressions containing only Relay ops.
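
For illustration, here is a hedged sketch of what such a transformation function might look like, mirroring the diagram by shifting a u8 kernel into i8 (the function name is made up; the real in-tree rule lives in tvm/python/tvm/relay/qnn/op/legalizations.py and is more general, which is also why level=11 is used here, to avoid colliding with that built-in registration):

from tvm import relay
from tvm.relay.qnn.op import register_qnn_legalize
from tvm.relay.qnn.op.legalizations import get_scalar_from_constant

@register_qnn_legalize("qnn.conv2d", level=11)
def requantize_kernel_to_int8(attrs, inputs, types):
    data, kernel, in_zp, kernel_zp, in_scale, kernel_scale = inputs
    if types[1].dtype != "uint8":
        return None  # kernel is already i8; keep the op unchanged
    # Shift the u8 kernel by -128 into the i8 range; the zero point moves by
    # the same amount (this plays the role of the requantize(i8) box above).
    kernel_i8 = relay.cast(
        relay.add(relay.cast(kernel, "int32"), relay.const(-128, "int32")),
        "int8",
    )
    new_kernel_zp = relay.const(get_scalar_from_constant(kernel_zp) - 128, "int32")
    new_attrs = {k: attrs[k] for k in attrs.keys()}
    return relay.qnn.op.conv2d(
        data, kernel_i8, in_zp, new_kernel_zp, in_scale, kernel_scale, **new_attrs
    )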

The function returns a tvm.transform.Pass object that legalizes QNN ops.
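
Putting the two passes together (a sketch that assumes the mod built in the earlier example):

with tvm.target.Target("llvm -mcpu=cascadelake"):
    # Legalize applies the target-specific QNN rewrites first; CanonicalizeOps
    # then lowers the remaining QNN ops (qnn.requantize, qnn.conv2d, ...) into
    # plain Relay ops.
    mod = relay.qnn.transform.Legalize()(mod)
    mod = relay.qnn.transform.CanonicalizeOps()(mod)
print(mod)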