# FTVMQnnLegalize()
Source: tvm/python/tvm/relay/qnn/transform.py & tvm/src/relay/qnn/pass/legalize.cc
import testing
from tvm.relay.qnn.op import register_qnn_legalize
register_qnn_legalize?
Signature: register_qnn_legalize(op_name, legal_op=None, level=10)
Docstring:
Register legal transformation function for a QNN op.
This helps QNN match hardware intrinsics better and is run before
canonicalization.
Parameters
----------
op_name : str
The name of the operator
legal_op: function (attrs: Attrs, inputs: List[Expr]) -> new_expr: Expr
The function for transforming an expr to another expr.
level : int
The priority level
File: /media/pc/data/lxw/ai/tvm/python/tvm/relay/qnn/op/op.py
Type: function
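The registration API above can be understood as a name-to-function table with priority levels. The sketch below is a simplified plain-Python model of that dispatch pattern, not TVM's actual implementation; all names in it are illustrative:

```python
# Minimal model of a per-op legalization registry, in the spirit of
# register_qnn_legalize(op_name, legal_op=None, level=10).
# This is NOT TVM's internal code, only an illustration of the pattern.

_LEGALIZE_TABLE = {}

def register_legalize(op_name, legal_op=None, level=10):
    """Register `legal_op` for `op_name`; also usable as a decorator."""
    def _register(fn):
        # When an op is registered more than once, the higher level wins.
        prev = _LEGALIZE_TABLE.get(op_name)
        if prev is None or level >= prev[1]:
            _LEGALIZE_TABLE[op_name] = (fn, level)
        return fn
    return _register(legal_op) if legal_op else _register

@register_legalize("qnn.conv2d")
def conv2d_legalize(attrs, inputs):
    # A real legalization would return a rewritten Relay expression;
    # here we just tag the op so the rewrite is observable.
    return ("legalized.qnn.conv2d", attrs, inputs)

def legalize(op_name, attrs, inputs):
    """Apply the registered transformation, or return None if absent."""
    entry = _LEGALIZE_TABLE.get(op_name)
    if entry is None:
        return None  # no registration: the op is left untouched
    return entry[0](attrs, inputs)
```

With this model, `legalize("qnn.conv2d", {}, ["data", "weight"])` returns the rewritten tuple, while an op with no registration (e.g. `"nn.relu"`) returns `None` and stays as-is, mirroring how only registered QNN ops are transformed.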
from tvm.relay.qnn.transform import Legalize
Legalize?
Signature: Legalize()
Docstring:
Legalizes QNN ops. As opposed to Relay Legalize, this one legalizes only QNN ops. One can
register a transformation/legalization function for an op by using the FTVMQnnLegalize attr_name
for FTVMLegalize op attribute. The isolation of QNN and Relay Legalize gives us separation of
concerns, leading to a better software practice. The legalization can be configured to happen
per target. An example of this type of legalization is shown below.
Examples
--------
Suppose the original graph is as follows

        data(u8)  weight(u8)
            |       |
            |       |
           qnn.conv2d (int32)
               |
               |
            nn.relu (int32)
Now, we know that Intel Cascade Lake has VNNI instructions to speedup convolution. However, it
only works on u8 x i8 inputs. So, here, we can use QNN Legalize to transform the above graph as
follows
        data(u8)  weight(u8)
            |        |
            |        |
            |   requantize(i8)
            |        |
            |        |
           qnn.conv2d (int32)
               |
               |
            nn.relu (int32)
In this legalization, since we have isolated legalization for QNN ops, it will only trigger the
transformation for qnn.conv2d (and not nn.relu). This pass can be followed by CanonicalizeOps to
further lower the qnn.requantize and qnn.conv2d into an expr containing only Relay ops.
Returns
-------
ret : tvm.transform.Pass
The registered pass that legalizes QNN ops.
File: /media/pc/data/lxw/ai/tvm/python/tvm/relay/qnn/transform.py
Type: function
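The key property the docstring describes is isolation: the pass rewrites only ops that have a QNN legalization registered, leaving everything else (such as `nn.relu`) untouched. The following self-contained sketch models that behavior on a toy expression tree of `("op", child, ...)` tuples; it is an illustration of the idea, not Relay's IR or the actual pass:

```python
# Sketch of isolated QNN legalization: the rewrite visits every call
# node, but only ops with a registered rule are transformed.
# Expressions are modeled as ("op_name", child, ...) tuples.

RULES = {
    # u8 x u8 qnn.conv2d -> insert a requantize on the weight operand,
    # as in the docstring's Cascade Lake / VNNI example.
    "qnn.conv2d": lambda children: (
        "qnn.conv2d", children[0], ("qnn.requantize", children[1])
    ),
}

def legalize_expr(expr):
    """Rewrite the tree bottom-up, applying a rule where one exists."""
    if not isinstance(expr, tuple):
        return expr  # leaf: a tensor name, returned unchanged
    op, *children = expr
    children = [legalize_expr(c) for c in children]
    rule = RULES.get(op)
    return rule(children) if rule else (op, *children)

graph = ("nn.relu", ("qnn.conv2d", "data", "weight"))
legalized = legalize_expr(graph)
# nn.relu is untouched; only qnn.conv2d gains the requantize input.
```

Running this on the docstring's example graph yields `("nn.relu", ("qnn.conv2d", "data", ("qnn.requantize", "weight")))`, matching the second diagram: the conv2d weight path gains a requantize while `nn.relu` is left alone.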
This code defines a function named Legalize, whose purpose is to legalize QNN ops. Unlike Relay Legalize, this function legalizes only QNN ops. A transformation/legalization function can be registered for an op by using the FTVMQnnLegalize attr_name for the FTVMLegalize op attribute. The isolation of QNN and Relay Legalize gives us separation of concerns, leading to better software practice. The legalization can be configured per target.
Suppose the original graph is as follows:

        data(u8)  weight(u8)
            |       |
            |       |
           qnn.conv2d (int32)
               |
               |
            nn.relu (int32)
Now, we know that Intel Cascade Lake has VNNI instructions to speed up convolution. However, they only work on u8 x i8 inputs. So here we can use QNN Legalize to transform the above graph into the following form:

        data(u8)  weight(u8)
            |        |
            |        |
            |   requantize(i8)
            |        |
            |        |
           qnn.conv2d (int32)
               |
               |
            nn.relu (int32)
In this legalization, since legalization is isolated to QNN ops, the transformation is triggered only for qnn.conv2d (and not nn.relu). This pass can be followed by CanonicalizeOps to further lower qnn.requantize and qnn.conv2d into an expression containing only Relay ops.
The function returns a tvm.transform.Pass object that legalizes QNN ops.
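The requantize(i8) step in the example rests on a simple zero-point shift: a u8 tensor stored with zero point z represents exactly the same real values as an i8 tensor with zero point z - 128, because subtracting 128 from both the stored value and the zero point leaves the dequantized quantity scale * (q - zero_point) unchanged. A quick arithmetic check (plain Python, illustrative; the helper name is made up):

```python
# u8 -> i8 conversion of the kind this legalization inserts: shift both
# the stored integer and the zero point by 128. The dequantized value
# scale * (q - zero_point) is identical before and after the shift.

def u8_to_i8(q_u8, zp_u8):
    """Map a u8 quantized value / zero-point pair to the equivalent i8 pair."""
    return q_u8 - 128, zp_u8 - 128

for q in range(256):      # every possible u8 value
    zp = 128              # a typical u8 zero point
    q_i8, zp_i8 = u8_to_i8(q, zp)
    assert -128 <= q_i8 <= 127     # result fits in i8
    assert q - zp == q_i8 - zp_i8  # same dequantized value
```

This is why the weight can be requantized from u8 to i8 without changing the numerics of the convolution, which is exactly what makes the graph eligible for the u8 x i8 VNNI path.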