Hybrid Frontend Language Reference

Overview

The hybrid frontend lets you write preliminary versions of idioms that TVM does not yet officially support.

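Features

Software Emulation

Both software emulation and compilation are supported. To define a function, apply the tvm.te.hybrid.script decorator to mark it as a hybrid function: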

import numpy
import tvm

@tvm.te.hybrid.script
def outer_product(a, b):
    c = output_tensor((100, 99), 'float32')
    for i in range(a.shape[0]):
        for j in range(b.shape[0]):
            c[i, j] = a[i] * b[j]
    return c

# Software emulation: calling the function with numpy arrays runs it as Python.
a = numpy.random.randn(100)
b = numpy.random.randn(99)
c = outer_product(a, b)

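During software emulation, the decorator automatically imports the required keywords and removes them again once emulation finishes, so you do not need to worry about keyword conflicts or namespace pollution.

Every element of the argument list passed for software emulation must be either a Python variable or a numpy numeric type.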
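The import-then-clean-up behavior described above can be pictured with a small pure-Python decorator. This is only a sketch under stated assumptions: emulated_script and the one-entry keyword table are hypothetical names invented for illustration, and nothing here is taken from TVM's implementation.

```python
import functools

# Hypothetical keyword table: one stand-in that returns a zero-filled 1-D list.
# The dtype argument is accepted but ignored in this sketch.
_KEYWORDS = {"output_tensor": lambda shape, dtype: [0.0] * shape[0]}


def emulated_script(func):
    """Make the keywords visible to func for the duration of a single call."""

    @functools.wraps(func)
    def wrapper(*args):
        scope = func.__globals__
        # Remember any user names that the keywords are about to shadow.
        saved = {name: scope[name] for name in _KEYWORDS if name in scope}
        scope.update(_KEYWORDS)
        try:
            return func(*args)
        finally:
            for name in _KEYWORDS:
                scope.pop(name, None)
            scope.update(saved)  # put the shadowed names back

    return wrapper


@emulated_script
def add_one(a):
    c = output_tensor((len(a),), "float32")  # resolved only while emulating
    for i in range(len(a)):
        c[i] = a[i] + 1.0
    return c


result = add_one([1.0, 2.0, 3.0])
leaked = "output_tensor" in add_one.__wrapped__.__globals__
```

After the call, leaked is False: the keyword was available inside the function body but is no longer present in the module namespace.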

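Backend Compilation

Using this function directly is discouraged; prefer the second interface shown further down. The current parsing interface looks like this: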

a = tvm.te.placeholder((100, ), name='a')
b = tvm.te.placeholder((99, ), name='b')
parser = tvm.hybrid.parse(outer_product, [a, b]) # return the parser of this function

If you pass TVM data structures such as Tensor, Var, Expr.*Imm, or tvm.container.Array to the hybrid function, it builds an op node and returns its output tensor(s):

a = tvm.te.placeholder((100, ), name='a')
b = tvm.te.placeholder((99, ), name='b')
c = outer_product(a, b) # return the output tensor(s) of the operator

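You can apply any method available on a TVM OpNode, such as create_schedule, although scheduling support is so far as limited as it is for ExternOpNode. At the very least, the result can be built into an LLVM module.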

Tuning

Continuing the example above, you can tune the code through TVM-like interfaces:

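from tvm import te

i, j = c.op.axis
sch = te.create_schedule(c.op)  # schedule the op that produced c
jo, ji = sch[c].split(j, 4)     # split is a stage method, reached via sch[c]
sch[c].vectorize(ji)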

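Currently, you can use the loop annotations (unroll, parallel, vectorize, and bind), the loop manipulations (split and fuse), and reorder.

Note

This is a preliminary feature, so you are responsible for the correctness of the function after tuning. In particular, take extra care when fusing or reordering imperfect loops.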
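To see why correctness after tuning deserves attention, it can help to write out what splitting the j loop by a factor of 4 means for the iteration space. The following is a plain-Python model, not code that TVM generates; the function names are hypothetical. Because 99 is not a multiple of 4, the split version needs a guard on the final partial tile.

```python
def outer_product_reference(a, b):
    """Original loop nest: one serial loop per axis."""
    c = [[0.0] * len(b) for _ in a]
    for i in range(len(a)):
        for j in range(len(b)):
            c[i][j] = a[i] * b[j]
    return c


def outer_product_split(a, b, factor=4):
    """Same computation with the j loop split into an outer and inner loop."""
    c = [[0.0] * len(b) for _ in a]
    outer_extent = (len(b) + factor - 1) // factor  # ceiling division
    for i in range(len(a)):
        for jo in range(outer_extent):
            for ji in range(factor):  # the loop that would be vectorized
                j = jo * factor + ji
                if j < len(b):  # guard the ragged tail: 99 = 24 * 4 + 3
                    c[i][j] = a[i] * b[j]
    return c


a = [float(i) for i in range(100)]
b = [float(j) for j in range(99)]
same = outer_product_split(a, b) == outer_product_reference(a, b)
```

Dropping the guard would make the last outer iteration index past the end of b, which is the kind of error that is easy to introduce when transforming loops by hand.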

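Loops

In HalideIR, loops come in 4 types: serial, unrolled, parallel, and vectorized.

Here, four keywords annotate the corresponding type of for loop: range (also known as serial), unroll, parallel, and vectorize. Their usage is roughly the same as Python's standard range.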
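Since the annotations only describe how a loop should be executed, one way to reason about them is to model each keyword as an alias of range and check that the function computes the same values either way. The aliases below are an assumption made for illustration, not a statement about how TVM defines these names.

```python
# Assumed stand-ins: each annotation keyword iterates exactly like range().
unroll = parallel = vectorize = range


def scale(a, factor):
    out = [0] * len(a)
    for i in parallel(len(a)):  # iterations are independent of each other
        out[i] = a[i] * factor
    return out


def dot(a, b):
    total = 0
    for i in unroll(len(a)):  # unrolling changes code shape, not the result
        total = total + a[i] * b[i]
    return total


scaled = scale([1, 2, 3], 2)
product = dot([1, 2, 3], [4, 5, 6])
```

The annotation changes how the loop is lowered, while the values computed stay those of the equivalent range loop.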

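Besides the loop types supported in Halide, const_range is supported under certain conditions. Sometimes a tvm.container.Array needs to be passed as an argument, but TVM-HalideIR has no support for converting a tvm.container.Array to an Expr, so only a limited feature is provided: you can access the container either with constants or with the loop variable of a loop annotated as const_range.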

@tvm.te.hybrid.script
def foo(a, b): # b is a tvm.container.Array
    c = output_tensor(a.shape, a.dtype)
    for i in const_range(len(a)): # b is indexed with i, so the loop must be explicitly annotated as const_range
        c[i] = a[i] + b[i]
    return c
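One way to picture why a const_range loop variable counts as a constant is to imagine the loop being expanded ahead of time, so that every container access ends up with a literal index. The helpers below are hypothetical and operate on source strings purely for illustration; they are not part of TVM.

```python
def unroll_const_loop(extent, body_template):
    """Expand a constant-extent loop into one statement per iteration."""
    return [body_template.format(i=i) for i in range(extent)]


def run_unrolled(statements, a, b):
    """Execute the expanded statements against concrete Python lists."""
    env = {"a": a, "b": b, "c": [0] * len(a)}
    for stmt in statements:
        exec(stmt, {}, env)
    return env["c"]


stmts = unroll_const_loop(3, "c[{i}] = a[{i}] + b[{i}]")
print("\n".join(stmts))
values = run_unrolled(stmts, [1, 2, 3], [10, 20, 30])
```

Each generated statement indexes b with a literal, which is the "access by constants" case described above.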

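Variables

All mutable variables are lowered to arrays of size 1. The first store to a variable is treated as its declaration.

Note

Unlike conventional Python, in hybrid script a declared variable can only be used within the scope level in which it was declared.

Note

Currently, only basic-typed variables are supported, i.e. the variable's type must be either float32 or int32.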

for i in range(5):
    s = 0 # declaration, this s will be a 1-array in lowered IR
    for j in range(5):
        s += a[i, j] # do something with s
    b[i] = s # you can still use s in this level
a[0] = s # you CANNOT use s here, even though it is allowed in conventional Python
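The lowering rule can be imitated by hand in ordinary Python: replace the scalar with a one-element list created at the first store, and drop it when the declaring scope ends. This is a sketch of the idea only, with a nested-list input standing in for a 2-D tensor; it does not show TVM's lowered IR.

```python
def row_sums_lowered(a):
    """Row sums with the accumulator written as a 1-element buffer."""
    b = [0] * len(a)
    for i in range(len(a)):
        s = [0]  # first store to s: acts as the declaration
        for j in range(len(a[i])):
            s[0] = s[0] + a[i][j]
        b[i] = s[0]  # still inside the declaring scope, so s is usable
        del s  # model the end of the buffer's lifetime at scope exit
    return b


sums = row_sums_lowered([[1, 2, 3], [4, 5, 6]])
```

Any use of s after the outer loop would fail here with a NameError, mirroring the scope restriction in hybrid script.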

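Attributes

So far, only the shape and dtype attributes of tensors are supported. The shape attribute is essentially a tuple, so you must access it by index. Currently, only constant-index access is supported.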

x = a.shape[2] # OK!
for i in range(3):
    for j in a.shape[i]: # BAD! i is not a constant!
        # do something

Conditional Statement and Expression

if condition1 and condition2 and condition3:
    # do something
else:
    # do something else
# Select
a = b if condition else c

However, the True and False keywords are not supported yet.

Math Intrinsics

So far, the following math intrinsics are supported: log, exp, sigmoid, tanh, power, and popcount. No import is needed; as mentioned under Software Emulation, just use them directly.
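For readers unfamiliar with the less common names in that list, here are reference definitions in plain Python. They state the usual mathematical meaning of sigmoid, power, and popcount and are assumptions for illustration; they are not TVM's implementations and say nothing about its numerical behavior.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def power(x, y):
    """x raised to the power y."""
    return x ** y


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


half = sigmoid(0.0)
eight = power(2, 3)
bits = popcount(0b1011)
```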

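Array Allocation

Under construction; this feature will be supported later.

Use the function call allocate(shape, type, share/local) to declare an array buffer. Basic usage is roughly the same as a normal numpy.array. To allow compilation, access high-dimensional arrays as a[i, j, k] rather than a[i][j][k], even for tvm.container.Array.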
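The difference between the two indexing styles is easiest to see on a buffer that stores its elements in one flat list. The FlatBuffer class below is a hypothetical illustration of row-major tuple indexing, not a TVM type: a[i, j, k] arrives as a single tuple and maps to one offset, whereas a[i][j][k] would require every partial index to return a new sub-array object.

```python
class FlatBuffer:
    """N-dimensional buffer over a flat list, indexed with a full tuple."""

    def __init__(self, shape, fill=0):
        self.shape = tuple(shape)
        size = 1
        for extent in self.shape:
            size *= extent
        self.data = [fill] * size

    def _offset(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("expected one index per dimension")
        offset = 0
        for i, extent in zip(index, self.shape):
            if not 0 <= i < extent:
                raise IndexError("index out of range")
            offset = offset * extent + i  # row-major accumulation
        return offset

    def __getitem__(self, index):
        return self.data[self._offset(index)]

    def __setitem__(self, index, value):
        self.data[self._offset(index)] = value


buf = FlatBuffer((2, 3, 4))
buf[1, 2, 3] = 7
position = buf.data.index(7)  # (1 * 3 + 2) * 4 + 3
```

With this class, buf[1][2][3] raises IndexError because the first partial index does not name a complete element.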

Thread Binding

You can also bind a loop to threads by writing code like this:

for tx in bind("threadIdx.x", 100):
    a[tx] = b[tx]
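Because each value of tx corresponds to a separate thread once the loop is bound, the loop body should give the same result whatever order the iterations run in. The sketch below checks that property for the copy loop above in plain Python; copy_in_order is a hypothetical helper, and the shuffled order merely stands in for an arbitrary thread schedule.

```python
import random


def copy_in_order(b, order):
    """Run the copy body once per index, visiting indices in the given order."""
    a = [0] * len(b)
    for tx in order:  # each tx plays the role of one thread
        a[tx] = b[tx]
    return a


b = list(range(100))
forward = copy_in_order(b, range(100))

shuffled_order = list(range(100))
random.Random(0).shuffle(shuffled_order)  # fixed seed keeps the run repeatable
shuffled = copy_in_order(b, shuffled_order)
```

Both runs produce the same output, which is what makes this body safe to bind; a body that reads a[tx - 1] while writing a[tx] would not pass the same check.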

Assert Statement

The assert statement is supported; use it as you would in standard Python.

assert cond, mesg

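Note

Assert is not a function call. You are encouraged to use assert as shown above, with the condition followed by the message. This form fits both the Python AST and HalideIR.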
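In standard Python, writing assert with call-style parentheses changes its meaning, which is one reason to stick to the statement form. The example below uses hypothetical function names and evaluates the parenthesized pair explicitly instead of asserting on it, so it runs without a syntax warning.

```python
def check_positive(n):
    # Statement form: the condition first, then the message.
    assert n > 0, "n must be positive"
    return n


def parenthesized_form_is_truthy(cond):
    # assert (cond, "msg") would test this two-element tuple, and a
    # non-empty tuple is always truthy, so such an assert can never fail.
    return bool((cond, "msg"))


always_true = parenthesized_form_is_truthy(False)
```

Here always_true is True even though the condition is False, while check_positive(-1) raises AssertionError as intended.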

Keywords

For-loop keywords: serial, range, unroll, parallel, vectorize, bind, const_range
Math keywords: log, exp, sqrt, rsqrt, sigmoid, tanh, power, popcount, round, ceil_div
Allocate keywords: allocate, output_tensor
Data type keywords: uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64
Others: max_num_threads