tvm.te#

Namespace for Tensor Expression Language

Classes:

ComputeOp()

Scalar operation.

ExternOp()

External operation.

HybridOp()

Hybrid operation.

PlaceholderOp()

Placeholder operation.

ScanOp()

Scan operation.

Schedule()

Schedule for all the stages.

SpecializedCondition(conditions)

Specialized condition to enable op specialization.

Stage()

A Stage represents schedule for one operation.

Tensor()

Tensor object; to construct one, see function.Tensor

TensorComputeOp()

Tensor operation.

TensorSlice(tensor, indices)

Auxiliary data structure to enable slicing syntax on tensors.

Functions:

abs(x[, span])

Get absolute value of the input element-wise.

acos(x)

Take acos of input x.

acosh(x)

Take acosh of input x.

add(lhs, rhs[, span])

Generic add operator.

all(*args[, span])

Create a new expression of the intersection of all conditions in the arguments

any(*args[, span])

Create a new expression of the union of all conditions in the arguments

asin(x)

Take asin of input x.

asinh(x)

Take asinh of input x.

atan(x)

Take atan of input x.

atanh(x)

Take atanh of input x.

ceil(x[, span])

Take ceil of float input x.

comm_reducer(fcombine, fidentity[, name])

Create a commutative reducer for reduction.

compute(shape, fcompute[, name, tag, attrs, ...])

Construct a new tensor by computing over the shape domain.

const(value[, dtype, span])

Create a new constant with specified value and dtype

cos(x)

Take cos of input x.

cosh(x)

Take cosh of input x.

create_prim_func(ops[, index_dtype_override])

Create a TensorIR PrimFunc from tensor expression

create_schedule(ops)

Create a schedule for a list of ops

decl_tensor_intrin(op, fcompute[, name, ...])

Declare a tensor intrinsic function.

div(a, b[, span])

Compute a / b as in C/C++ semantics.

erf(x)

Take gauss error function of the input x.

exp(x)

Take exponential of input x.

extern(shape, inputs, fcompute[, name, ...])

Compute several tensors via an extern function.

extern_primfunc(input_tensors, primfunc, ...)

Compute tensors via a schedulable TIR PrimFunc

floor(x[, span])

Take floor of float input x.

floordiv(a, b[, span])

Compute the floordiv of two expressions.

floormod(a, b[, span])

Compute the floormod of two expressions.

fmod(x, y)

Return the remainder of x divided by y with the same sign as x.

gradient(output, inputs[, head])

Perform reverse-mode automatic differentiation.

if_then_else(cond, t, f[, span])

Conditional selection expression.

indexdiv(a, b[, span])

Compute floor(a / b) where a and b are non-negative.

indexmod(a, b[, span])

Compute the remainder of indexdiv.

isfinite(x[, span])

Check if input value is finite.

isinf(x[, span])

Check if input value is infinite.

isnan(x[, span])

Check if input value is NaN.

log(x)

Take log of input x.

log10(x)

Take log10 of input x.

log2(x)

Take log2 of input x.

max(expr, axis[, where, init])

Create a max expression over axis.

max_value(dtype[, span])

maximum value of dtype

min(expr, axis[, where, init])

Create a min expression over axis.

min_value(dtype[, span])

minimum value of dtype

multiply(lhs, rhs[, span])

Generic multiply operator.

nearbyint(x[, span])

Round elements of the array to the nearest integer.

placeholder(shape[, dtype, name])

Construct an empty tensor object.

popcount(x)

Count the number of set bits in input x.

power(x, y[, span])

x power y

reduce_axis(dom[, name, thread_tag, span])

Create a new IterVar for reduction.

round(x[, span])

Round elements of the array to the nearest integer.

rsqrt(x)

Take reciprocal of square root of input x.

scan(init, update, state_placeholder[, ...])

Construct new tensors by scanning over axis.

sigmoid(x)

Quick function to get sigmoid

sin(x)

Take sin of input x.

sinh(x)

Take sinh of input x.

size_var([name, dtype, span])

Create a new variable that represents a tensor shape size, which is non-negative.

sqrt(x)

Take square root of input x.

subtract(lhs, rhs[, span])

Generic subtract operator.

sum(expr, axis[, where, init])

Create a sum expression over axis.

tag_scope(tag)

The operator tag scope.

tan(x)

Take tan of input x.

tanh(x)

Take hyperbolic tanh of input x.

thread_axis([dom, tag, name, span])

Create a new IterVar to represent thread index.

trace(args[, trace_action])

Trace tensor data at the runtime.

trunc(x[, span])

Get truncated value of the input.

truncdiv(a, b[, span])

Compute the truncdiv of two expressions.

truncmod(a, b[, span])

Compute the truncmod of two expressions.

var([name, dtype, span])

Create a new variable with specified name and dtype

class tvm.te.ComputeOp[源代码]#

Scalar operation.

class tvm.te.ExternOp[源代码]#

External operation.

class tvm.te.HybridOp[源代码]#

Hybrid operation.

Attributes:

axis

Represent the IterVar axis, also defined when it is a HybridOp

property axis#

Represent the IterVar axis, also defined when it is a HybridOp

class tvm.te.PlaceholderOp[源代码]#

Placeholder operation.

class tvm.te.ScanOp[源代码]#

Scan operation.

Attributes:

scan_axis

Represent the scan axis, only defined when it is a ScanOp

property scan_axis#

Represent the scan axis, only defined when it is a ScanOp

class tvm.te.Schedule[源代码]#

Schedule for all the stages.

Methods:

cache_read(tensor, scope, readers)

Create a cache read of original tensor for readers.

cache_write(tensor, scope)

Create a cache write of original tensor, before storing into tensor.

create_group(outputs, inputs[, include_inputs])

Create stage group by giving output and input boundary.

normalize()

Build a normalized schedule from the current schedule.

rfactor(tensor, axis[, factor_axis])

Factor a reduction axis in tensor's schedule to be an explicit axis.

cache_read(tensor, scope, readers)[源代码]#

Create a cache read of original tensor for readers.

This will mutate the body of the readers. A new cache stage will be created for the tensor. Call this before doing any split/fuse schedule.

Parameters#

tensorTensor

The tensor to be cached.

scopestr

The scope of the cache.

readerslist of Tensor or Operation

The readers to read the cache.

Returns#

cacheTensor

The created cache tensor.
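
Example#

An illustrative sketch (the name AA is a placeholder): cache the input A into shared memory for the consumer B, then attach the cache stage under B's outer loop.

from tvm import te

n = 1024
A = te.placeholder((n, n), name="A")
B = te.compute((n, n), lambda i, j: A[i, j] + 1.0, name="B")
s = te.create_schedule(B.op)
AA = s.cache_read(A, "shared", [B])   # new stage producing the cached copy of A
s[AA].compute_at(s[B], B.op.axis[0])  # place the cache under B's outer loop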

cache_write(tensor, scope)[源代码]#

Create a cache write of original tensor, before storing into tensor.

This will mutate the body of the tensor. A new cache stage will be created before feeding into the tensor.

This function can be used to support data layout transformation. If there is a split/fuse/reorder on the data-parallel axis of the tensor before cache_write is called, the intermediate cache stores the data in a layout that matches the iteration order of the leaf axes. The data is then transformed back to the original layout in the original tensor. The user can further call compute_inline to inline the original layout and keep the data stored in the transformed layout.

Parameters#

tensorTensor, list or tuple

The tensors to be fed into. All the tensors must be produced by one ComputeOp.

scopestr

The scope of the cache.

Returns#

cacheTensor

The created cache tensor.
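
Example#

A minimal sketch (CC is an illustrative name): compute C into a "local" cache first, then write the cache back under the outer loop of the split output.

from tvm import te

n = 1024
A = te.placeholder((n, n), name="A")
C = te.compute((n, n), lambda i, j: A[i, j] * 2.0, name="C")
s = te.create_schedule(C.op)
CC = s.cache_write(C, "local")                # compute into a local buffer first
xo, xi = s[C].split(C.op.axis[0], factor=32)
s[CC].compute_at(s[C], xo)                    # copy from the cache under the outer loop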

create_group(outputs, inputs, include_inputs=False)[源代码]#

Create stage group by giving output and input boundary.

The operators between outputs and inputs are placed as members of the group. The outputs are included in the group, while the inputs are not.

Parameters#

outputslist of Tensors

The outputs of the group.

inputslist of Tensors

The inputs of the group.

include_inputsboolean, optional

Whether to include input operations in the group if they are used by the outputs.

Returns#

groupStage

A virtual stage that represents the group; the user can use compute_at to move the attachment point of the group.

normalize()[源代码]#

Build a normalized schedule from the current schedule.

Insert the necessary rebase operations to make certain iter vars start from 0. This is needed before bound inference and follow-up steps.

Returns#

schSchedule

The normalized schedule.

rfactor(tensor, axis, factor_axis=0)[源代码]#

Factor a reduction axis in tensor’s schedule to be an explicit axis.

This will create a new stage that generates the new tensor with axis as the first dimension. The tensor's body will be rewritten as a reduction over the factored tensor.

Parameters#

tensorTensor

The tensor to be factored.

axisIterVar

The reduction axis in the schedule to be factored.

factor_axisint

The position where the new axis is placed.

Returns#

tfactorTensor or Array of Tensor

The created factored tensor.
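
Example#

A minimal sketch (BF is an illustrative name): split the reduction axis and factor the inner part out, so partial reductions over ki become an explicit stage that can be scheduled separately.

from tvm import te

n = 1024
A = te.placeholder((n,), name="A")
k = te.reduce_axis((0, n), name="k")
B = te.compute((1,), lambda i: te.sum(A[k], axis=k), name="B")
s = te.create_schedule(B.op)
ko, ki = s[B].split(k, factor=16)
BF = s.rfactor(B, ki)   # new stage computing partial reductions over ki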

class tvm.te.SpecializedCondition(conditions)[源代码]#

Specialized condition to enable op specialization.

Methods:

__init__(conditions)

Create a specialized condition.

current()

Returns the current specialized condition

__init__(conditions)[源代码]#

Create a specialized condition.

Note

Conditions are represented in conjunctive normal form (CNF). Each condition should be a simple expression, e.g., n > 16, m % 8 == 0, etc., where n and m are tvm.Var that represent dimensions in the tensor shape.

Parameters#

conditionsList of tvm.Expr

List of conditions in conjunctive normal form (CNF).

static current()[源代码]#

Returns the current specialized condition
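
Example#

A small sketch, assuming the condition is used as a context manager to scope specialized decisions (the variable names are illustrative):

from tvm import te

n = te.size_var("n")
with te.SpecializedCondition([n > 16]):
    # code in this scope may assume the condition holds
    cond = te.SpecializedCondition.current()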

class tvm.te.Stage[源代码]#

A Stage represents schedule for one operation.

Methods:

bind(ivar, thread_ivar)

Bind ivar to thread index thread_ivar

compute_at(parent, scope)

Attach the stage at parent's scope

compute_inline()

Mark stage as inline

compute_root()

Attach the stage at parent, and mark it as root

double_buffer()

Compute the current stage via double buffering.

env_threads(threads)

Mark threads to be launched at the outer scope of composed op.

fuse(*args)

Fuse multiple consecutive iteration variables into a single iteration variable.

parallel(var)

Parallelize the iteration.

pragma(var, pragma_type[, pragma_value])

Annotate the iteration with pragma

prefetch(tensor, var, offset)

Prefetch the specified variable

reorder(*args)

reorder the arguments in the specified order.

rolling_buffer()

Compute the current stage via rolling buffering.

set_scope(scope)

Set the thread scope of this stage

set_store_predicate(predicate)

Set predicate under which store to the array can be performed.

split(parent[, factor, nparts, ...])

Split the stage either by factor (specifying the inner extent) or by nparts (specifying the number of outer parts).

storage_align(axis, factor, offset)

Set alignment requirement for specific axis

tensorize(var, tensor_intrin)

Tensorize the computation enclosed by var with tensor_intrin

tile(x_parent, y_parent, x_factor, y_factor)

Perform tiling on two dimensions

transform_layout(mapping_function)

Defines the layout transformation for the current stage's tensor.

unroll(var)

Unroll the iteration.

vectorize(var)

Vectorize the iteration.

bind(ivar, thread_ivar)[源代码]#

Bind ivar to thread index thread_ivar

Parameters#

ivarIterVar

The iteration to be bound to the thread.

thread_ivarIterVar

The thread to be bound.
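
Example#

A minimal GPU-style sketch: split the output loop and bind the resulting iteration variables to block and thread indices.

from tvm import te

n = 1024
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
bx, tx = s[B].split(B.op.axis[0], factor=64)
s[B].bind(bx, te.thread_axis("blockIdx.x"))
s[B].bind(tx, te.thread_axis("threadIdx.x"))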

compute_at(parent, scope)[源代码]#

Attach the stage at parent’s scope

Parameters#

parentStage

The parent stage

scopeIterVar

The loop scope to be attached to.
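
Example#

A minimal sketch: compute the intermediate tensor B inside the loop that produces C, instead of as a separate root loop nest.

from tvm import te

n = 1024
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
C = te.compute((n,), lambda i: B[i] * 2.0, name="C")
s = te.create_schedule(C.op)
s[B].compute_at(s[C], C.op.axis[0])   # attach B under C's loop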

compute_inline()[源代码]#

Mark stage as inline

Parameters#

parentStage

The parent stage

compute_root()[源代码]#

Attach the stage at parent, and mark it as root

Parameters#

parentStage

The parent stage

double_buffer()[源代码]#

Compute the current stage via double buffering.

This can only be applied to an intermediate stage. It will double the storage cost of the current stage, and can be useful for hiding load latency.

env_threads(threads)[源代码]#

Mark threads to be launched at the outer scope of composed op.

Parameters#

threadslist of threads

The threads to be launched.

fuse(*args)[源代码]#

Fuse multiple consecutive iteration variables into a single iteration variable.

fused = fuse(…fuse(fuse(args[0], args[1]), args[2]),…, args[-1]). The order is from outer to inner.

Parameters#

argslist of IterVars

IterVars that are consecutive with each other.

Returns#

fusedIterVar

The fused variable of iteration.
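
Example#

A minimal sketch: fuse the two output axes into a single iteration variable, which can then be scheduled as one loop (here, parallelized).

from tvm import te

m, n = 128, 256
A = te.placeholder((m, n), name="A")
B = te.compute((m, n), lambda i, j: A[i, j] + 1.0, name="B")
s = te.create_schedule(B.op)
fused = s[B].fuse(B.op.axis[0], B.op.axis[1])
s[B].parallel(fused)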

parallel(var)[源代码]#

Parallelize the iteration.

Parameters#

varIterVar

The iteration to be parallelized.

pragma(var, pragma_type, pragma_value=None)[源代码]#

Annotate the iteration with pragma

This will translate to a pragma_scope surrounding the corresponding loop generated. Useful to support experimental features and extensions.

Parameters#

varIterVar

The iteration to be annotated

pragma_typestr

The pragma string to be annotated

pragma_valueExpr, optional

The pragma value to pass along the pragma

Note#

Most pragmas are advanced or experimental features and may be subject to change. List of supported pragmas:

  • debug_skip_region

    Force skip the region marked by the axis and turn it into no-op. This is useful for debug purposes.

  • parallel_launch_point

    Specify to launch parallel threads outside the specified iteration loop. By default the threads launch at the point of the parallel construct. This pragma moves the launching point to an even outer scope. The threads are launched once and reused across multiple parallel constructs, as in a BSP-style program.

  • parallel_barrier_when_finish

    Insert a synchronization barrier between working threads after the specified loop iteration finishes.

  • parallel_stride_pattern

    Hint the parallel loop to execute in a strided pattern: for (int i = task_id; i < end; i += num_task).

prefetch(tensor, var, offset)[源代码]#

Prefetch the specified variable

Parameters#

tensorTensor

The tensor to be prefetched

varIterVar

The loop point at which the prefetching is applied

offsetExpr

The number of iterations to be prefetched before actual execution

reorder(*args)[源代码]#

reorder the arguments in the specified order.

Parameters#

argslist of IterVar

The desired order of the iteration variables.

rolling_buffer()[源代码]#

Compute the current stage via rolling buffering.

This can only be applied to an intermediate stage. It will change the storage cost of the current stage.

set_scope(scope)[源代码]#

Set the thread scope of this stage

Parameters#

scopestr

The thread scope of this stage

set_store_predicate(predicate)[源代码]#

Set predicate under which store to the array can be performed.

Use this when there are duplicated threads doing the same store and we only need one of them to do the store.

Parameters#

predicateExpr

The guard condition for the store.

split(parent, factor=None, nparts=None, disable_predication=False)[源代码]#

Split the stage either by factor (specifying the inner extent) or by nparts (specifying the number of outer parts).

Parameters#

parentIterVar

The parent iter var.

factorExpr, optional

The splitting factor

npartsExpr, optional

The number of outer parts.

disable_predicationbool, optional

If enabled, don't create a predicate for guarding the loop. This can be useful when splitting with scalable factors that the schedule writer knows evenly divide the loop bound.

Warning: enabling this feature may result in incorrect code generation if not used carefully.

Returns#

outerIterVar

The outer variable of iteration.

innerIterVar

The inner variable of iteration.
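
Example#

A minimal sketch of the two splitting modes: first by inner factor, then splitting the resulting outer loop by a fixed number of parts.

from tvm import te

n = 1024
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
xo, xi = s[B].split(B.op.axis[0], factor=32)   # inner loop has extent 32
xoo, xoi = s[B].split(xo, nparts=4)            # outer loop has extent 4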

storage_align(axis, factor, offset)[源代码]#

Set alignment requirement for specific axis

This ensures that stride[axis] == k * factor + offset for some k. This is useful for setting a memory layout with a more friendly memory access pattern. For example, we can set the alignment to factor=2, offset=1 to avoid bank conflicts for thread access on the higher dimension in GPU shared memory.

Parameters#

axisIterVar

The axis dimension to be aligned.

factorint

The factor in alignment specification.

offsetint

The offset in the alignment specification.
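
Example#

A sketch under the assumption that A is first cached into shared memory (AA is an illustrative name); the cache stage's first axis is then aligned so its stride satisfies stride == k * factor + offset.

from tvm import te

A = te.placeholder((64, 64), name="A")
C = te.compute((64, 64), lambda i, j: A[i, j] + 1.0, name="C")
s = te.create_schedule(C.op)
AA = s.cache_read(A, "shared", [C])
s[AA].storage_align(AA.op.axis[0], 2, 1)   # pad the stride to reduce bank conflicts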

tensorize(var, tensor_intrin)[源代码]#

Tensorize the computation enclosed by var with tensor_intrin

Parameters#

varIterVar

The iteration boundary of tensorization.

tensor_intrinTensorIntrin

The tensor intrinsic used for computation.

tile(x_parent, y_parent, x_factor, y_factor)[源代码]#

Perform tiling on two dimensions

The final loop order from outermost to innermost is [x_outer, y_outer, x_inner, y_inner].

Parameters#

x_parentIterVar

The original x dimension

y_parentIterVar

The original y dimension

x_factorExpr

The stride factor on x axis

y_factorExpr

The stride factor on y axis

Returns#

x_outerIterVar

Outer axis of x dimension

y_outerIterVar

Outer axis of y dimension

x_innerIterVar

Inner axis of x dimension

y_innerIterVar

Inner axis of y dimension
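
Example#

A minimal sketch: tile the two output axes with 32x32 blocks; the returned order is x_outer, y_outer, x_inner, y_inner.

from tvm import te

A = te.placeholder((1024, 1024), name="A")
B = te.compute((1024, 1024), lambda i, j: A[i, j] + 1.0, name="B")
s = te.create_schedule(B.op)
xo, yo, xi, yi = s[B].tile(B.op.axis[0], B.op.axis[1], x_factor=32, y_factor=32)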

transform_layout(mapping_function)[源代码]#

Defines the layout transformation for the current stage’s tensor.

The map from initial_indices to final_indices must be an invertible affine transformation. This method may be called more than once for a given tensor, in which case each transformation is applied sequentially.

If the stage is a ComputeOp, then the iteration order of the compute stage is rewritten to be a row-major traversal of the tensor, and the new loop iteration variables are returned. For all other stages, the loop iteration order is unmodified, and the return value is None.

Parameters#

mapping_function : Callable[…, List[tvm.tir.PrimExpr]]

A callable that accepts N arguments of type tvm.tir.Var, and outputs a list of PrimExpr. The input arguments represent the location of a value in the current stage’s tensor, using the pre-transformation layout. The return value of the function gives the location of that value in the current stage’s tensor, using the post-transformation layout.

Returns#

new_iter_vars : Optional[List[tvm.tir.IterVar]]

If the stage is a ComputeOp, then the return will be the updated loop iteration variables over the data array, in the same order as the output values from the mapping_function.

Otherwise, the return value is None.

Examples#

# ``A`` is a tensor whose compute definition is in NHWC
# format, and should be transformed into NCHWc format.

s[A].transform_layout(
    lambda n,h,w,c: [n, c//4, h, w, c%4]
)
# ``A`` is a tensor whose compute definition is in an
# arbitrary format, and should be transformed such that
# the last index is split, with the slower-changing index
# of the split placed at the slowest changing dimension.

s[A].transform_layout(
    lambda *indices, i: [i//4, *indices, i%4]
)
# ``B`` is a tensor defined by te.compute to be a copy of
# ``A``, and should be transformed such that ``B``'s layout
# is a transpose of ``A``'s layout.  The loop iteration
# that computes ``B`` will correspond to ``B``'s memory
# layout.

A = te.placeholder([n,m])
B = te.compute(A.shape, lambda i,j: A[i,j])
s = te.create_schedule(B.op)

s[B].transform_layout(lambda i,j: [j,i])

Parameters:

mapping_function (Callable[[...], List[PrimExpr]])

unroll(var)[源代码]#

Unroll the iteration.

Parameters#

varIterVar

The iteration to be unrolled.

vectorize(var)[源代码]#

Vectorize the iteration.

Parameters#

varIterVar

The iteration to be vectorized.
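
Example#

A minimal CPU-style sketch: split the loop so the inner extent matches a vector width, vectorize the inner loop, and parallelize the outer one.

from tvm import te

n = 1024
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] * 2.0, name="B")
s = te.create_schedule(B.op)
xo, xi = s[B].split(B.op.axis[0], factor=8)
s[B].vectorize(xi)
s[B].parallel(xo)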

class tvm.te.Tensor[源代码]#

Tensor object; to construct one, see function.Tensor

Attributes:

axis

Axis of the tensor.

ndim

Dimension of the tensor.

op

The corresponding Operation.

shape

The output shape of the tensor.

value_index

The output value index the tensor corresponds to.

property axis#

Axis of the tensor.

property ndim#

Dimension of the tensor.

property op#

The corresponding Operation.

property shape#

The output shape of the tensor.

property value_index#

The output value index the tensor corresponds to.

class tvm.te.TensorComputeOp[源代码]#

Tensor operation.

class tvm.te.TensorSlice(tensor, indices)[源代码]#

Auxiliary data structure to enable slicing syntax on tensors.

Methods:

asobject()

Convert slice to object.

Attributes:

dtype

Data type of the tensor.

asobject()[源代码]#

Convert slice to object.

property dtype#

Data type of the tensor.

tvm.te.abs(x, span=None)[源代码]#

Get absolute value of the input element-wise.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.acos(x)[源代码]#

Take acos of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.acosh(x)[源代码]#

Take acosh of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.add(lhs, rhs, span=None)[源代码]#

Generic add operator.

Parameters#

lhsobject

The left operand.

rhsobject

The right operand.

spanOptional[Span]

The location of this operator in the source.

Returns#

optvm.Expr

The result Expr of the add operation.

tvm.te.all(*args, span=None)[源代码]#

Create a new expression of the intersection of all conditions in the arguments

Parameters#

argslist

List of symbolic boolean expressions

spanOptional[Span]

The location of this operator in the source code.

Returns#

expr: Expr

Expression

tvm.te.any(*args, span=None)[源代码]#

Create a new expression of the union of all conditions in the arguments

Parameters#

argslist

List of symbolic boolean expressions

spanOptional[Span]

The location of this operator in the source code.

Returns#

expr: Expr

Expression

tvm.te.asin(x)[源代码]#

Take asin of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.asinh(x)[源代码]#

Take asinh of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.atan(x)[源代码]#

Take atan of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.atanh(x)[源代码]#

Take atanh of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.ceil(x, span=None)[源代码]#

Take ceil of float input x.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.comm_reducer(fcombine, fidentity, name='reduce')[源代码]#

Create a commutative reducer for reduction.

Parameters#

fcombinefunction(Expr -> Expr -> Expr)

A binary function which takes two Exprs as input and returns an Expr.

fidentityfunction(str -> Expr)

A function which takes a type string as input and returns a const Expr.

Returns#

reducerfunction

A function which creates a reduce expression over axis. There are two ways to use it:

  1. accept (expr, axis, where) to produce a Reduce Expr on the specified axis;

  2. simply use it with multiple Exprs.

Example#

n = te.var("n")
m = te.var("m")
mysum = te.comm_reducer(lambda x, y: x+y,
    lambda t: tvm.tir.const(0, dtype=t), name="mysum")
A = te.placeholder((n, m), name="A")
k = te.reduce_axis((0, m), name="k")
B = te.compute((n,), lambda i: mysum(A[i, k], axis=k), name="B")

tvm.te.compute(shape, fcompute, name='compute', tag='', attrs=None, varargs_names=None)[源代码]#

Construct a new tensor by computing over the shape domain.

The compute rule is result[axis] = fcompute(axis)

Parameters#

shape: Tuple of Expr

The shape of the tensor

fcompute: lambda function of indices-> value

Specifies the input source expression

name: str, optional

The name hint of the tensor

tag: str, optional

Additional tag information about the compute.

attrs: dict, optional

The additional auxiliary attributes about the compute.

varargs_names: list, optional

The names to use for each of the varargs. If not supplied, the varargs will be called i1, i2, …

Returns#

tensor: Tensor

The created tensor
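
Example#

A minimal sketch of an element-wise addition defined over a symbolic shape.

from tvm import te

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
B = te.placeholder((m, n), name="B")
C = te.compute((m, n), lambda i, j: A[i, j] + B[i, j], name="C")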

tvm.te.const(value, dtype='int32', span=None)[源代码]#

Create a new constant with specified value and dtype

Parameters#

valueUnion[bool, int, float, numpy.ndarray, tvm.nd.NDArray]

The constant value.

dtypestr

The data type

spanOptional[Span]

The location of this variable in the source.

Returns#

constPrimExpr

The result constant expr.

tvm.te.cos(x)[源代码]#

Take cos of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.cosh(x)[源代码]#

Take cosh of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.create_prim_func(ops, index_dtype_override=None)[源代码]#

Create a TensorIR PrimFunc from tensor expression

Parameters#

opsList[Union[_tensor.Tensor, tvm.tir.Var]]

The source expression.

Example#

We define a matmul kernel using the following code:

import tvm
from tvm import te
from tvm.te import create_prim_func
import tvm.script

A = te.placeholder((128, 128), name="A")
B = te.placeholder((128, 128), name="B")
k = te.reduce_axis((0, 128), "k")
C = te.compute((128, 128), lambda x, y: te.sum(A[x, k] * B[y, k], axis=k), name="C")
func = create_prim_func([A, B, C])
print(func.script())

If we want to use the TensorIR schedule to do transformations on such a kernel, we need to use create_prim_func([A, B, C]) to create a schedulable PrimFunc. The generated function looks like:

@T.prim_func
def tir_matmul(a: T.handle, b: T.handle, c: T.handle) -> None:
    A = T.match_buffer(a, (128, 128))
    B = T.match_buffer(b, (128, 128))
    C = T.match_buffer(c, (128, 128))

    for i, j, k in T.grid(128, 128, 128):
        with T.block():
            vi, vj, vk = T.axis.remap("SSR", [i, j, k])
            with T.init():
                C[vi, vj] = 0.0
            C[vi, vj] += A[vi, vk] * B[vj, vk]

Returns#

functir.PrimFunc

The created function.

Return type:

PrimFunc

tvm.te.create_schedule(ops)[源代码]#

Create a schedule for a list of ops

Parameters#

opslist of Operations

The source expression.

Returns#

schschedule.Schedule

The created schedule.
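
Example#

A minimal sketch: create a default schedule for one output op and inspect the lowered loop nest, assuming the classic te schedule plus tvm.lower flow.

import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
print(tvm.lower(s, [A, B], simple_mode=True))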

tvm.te.decl_tensor_intrin(op, fcompute, name='tensor_intrin', binds=None, scalar_params=None, default_buffer_params=None)[源代码]#

Declare a tensor intrinsic function.

Parameters#

op: Operation

The symbolic description of the intrinsic operation

fcompute: lambda function of inputs, outputs-> stmt

Specifies the IR statement to do the computation. See the following note for the function signature of fcompute.

Note

Parameters

Returns

  • stmt (tvm.tir.Stmt, or tuple of three stmts)

  • If a single stmt is returned, it represents the body

  • If a tuple of three stmts is returned, they correspond to body, reduce_init, reduce_update

name: str, optional

The name of the intrinsic.

binds: dict of Tensor to tvm.tir.Buffer, optional

Dictionary that maps the Tensor to a Buffer which specifies the data layout requirements of the function. By default, a new compact buffer is created for each tensor in the argument.

scalar_params: a list of variables used by op, whose values will be passed as scalar_inputs when the tensor intrinsic is called.

default_buffer_params: Optional[dict]

Dictionary of buffer arguments to be passed when constructing a buffer.

Returns#

intrin: TensorIntrin

A TensorIntrin that can be used in tensorize schedule.

tvm.te.div(a, b, span=None)[源代码]#

Compute a / b as in C/C++ semantics.

Parameters#

aPrimExpr

The left hand operand, known to be non-negative.

bPrimExpr

The right hand operand, known to be non-negative.

spanOptional[Span]

The location of this operator in the source.

Returns#

resPrimExpr

The result expression.

Note#

When operands are integers, returns truncdiv(a, b, span).

tvm.te.erf(x)[源代码]#

Take gauss error function of the input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.exp(x)[源代码]#

Take exponential of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.extern(shape, inputs, fcompute, name='extern', dtype=None, in_buffers=None, out_buffers=None, tag='', attrs=None)[源代码]#

Compute several tensors via an extern function.

Parameters#

shape: tuple or list of tuples.

The shape of the outputs.

inputs: list of Tensor

The inputs

fcompute: lambda function of inputs, outputs-> stmt

Specifies the IR statement to do the computation. See the following note for the function signature of fcompute.

Note

Parameters

Returns

  • stmt (tvm.tir.Stmt) - The statement that carries out array computation.

name: str, optional

The name hint of the tensor

dtype: str or list of str, optional

The data types of the outputs; by default dtype will be the same as the inputs.

in_buffers: tvm.tir.Buffer or list of tvm.tir.Buffer, optional

Input buffers.

out_buffers: tvm.tir.Buffer or list of tvm.tir.Buffer, optional

Output buffers.

tag: str, optional

Additional tag information about the compute.

attrs: dict, optional

The additional auxiliary attributes about the compute.

Returns#

tensor: Tensor or list of Tensors

The created tensor or tuple of tensors if it contains multiple outputs.

Example#

In the code below, C is generated by calling external PackedFunc tvm.contrib.cblas.matmul

A = te.placeholder((n, l), name="A")
B = te.placeholder((l, m), name="B")
C = te.extern((n, m), [A, B],
               lambda ins, outs: tvm.tir.call_packed(
                  "tvm.contrib.cblas.matmul",
                    ins[0], ins[1], outs[0], 0, 0), name="C")

tvm.te.extern_primfunc(input_tensors, primfunc, **kwargs)[源代码]#

Compute tensors via a schedulable TIR PrimFunc

Parameters#

input_tensors: list of Tensor

Input tensors that map to the corresponding primfunc input params.

primfunc: PrimFunc

The TIR PrimFunc

Returns#

tensor: Tensor or list of Tensors

The created tensor or tuple of tensors if it contains multiple outputs.

Example#

In the code below, a TVMScript-defined TIR PrimFunc is inlined into a TE ExternOp so that te.create_prim_func can be applied to the combined computation.

A = te.placeholder((128, 128), name="A")
B = te.placeholder((128, 128), name="B")

@T.prim_func
def before_split(a: T.handle, b: T.handle) -> None:
    A = T.match_buffer(a, (128, 128))
    B = T.match_buffer(b, (128, 128))
    for i, j in T.grid(128, 128):
        with T.block("B"):
            vi, vj = T.axis.remap("SS", [i, j])
            B[vi, vj] = A[vi, vj] * 2.0

C = te.extern_primfunc([A, B], before_split)

tvm.te.floor(x, span=None)[源代码]#

Take floor of float input x.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

Parameters:

x (PrimExprWithOp)

tvm.te.floordiv(a, b, span=None)[源代码]#

Compute the floordiv of two expressions.

Parameters#

aPrimExpr

The left hand operand

bPrimExpr

The right hand operand

spanOptional[Span]

The location of this operator in the source.

Returns#

resPrimExpr

The result expression.

tvm.te.floormod(a, b, span=None)[源代码]#

Compute the floormod of two expressions.

Parameters#

aPrimExpr

The left hand operand

bPrimExpr

The right hand operand

spanOptional[Span]

The location of this operator in the source.

Returns#

resPrimExpr

The result expression.

tvm.te.fmod(x, y)[源代码]#

Return the remainder of x divided by y with the same sign as x.

Parameters#

xPrimExpr

Input argument.

yPrimExpr

Input argument.

Returns#

zPrimExpr

The result.

tvm.te.gradient(output, inputs, head=None)[源代码]#

Perform reverse-mode automatic differentiation.

Parameters#

outputTensor

The tensor to differentiate.

inputsList[Tensor]

The list of input tensors to be differentiated wrt.

headTensor

The adjoint of the output, in other words, some tensor, by which the Jacobians will be multiplied. Its shape must be of the form prefix + output.shape. If None is passed, the identity tensor of shape output.shape + output.shape will be used.

Returns#

tensors: List[Tensor]

The result gradient, in the same order as the inputs

Example#

x = te.placeholder((32, 3, 28, 28), name='x')
w1 = te.placeholder((10, 3, 3, 3), name='w1')
w2 = te.placeholder((10, 10, 3, 3), name='w2')
z1 = topi.nn.conv2d(x, w1, 1, 1, 1)
z2 = topi.nn.conv2d(z1, w2, 1, 1, 1)
y = topi.sum(z2)

# produce gradients
[dw1, dw2] = te.gradient(y, [w1, w2])

# produce Jacobians
[jw1, jw2] = te.gradient(z2, [w1, w2])

# produce gradients, the head adjoint for z2 is provided manually
[dw1, dw2] = te.gradient(z2, [w1, w2], topi.full_like(z2, 1.0))

tvm.te.if_then_else(cond, t, f, span=None)[源代码]#

Conditional selection expression.

Parameters#

condPrimExpr

The condition

tPrimExpr

The result expression if cond is true.

fPrimExpr

The result expression if cond is false.

spanOptional[Span]

The location of this operator in the source.

Returns#

resultNode

The result of conditional expression.

Note#

Unlike Select, if_then_else will not execute the branch that does not satisfy the condition. You can use it to guard against out of bound access. Unlike Select, if_then_else cannot be vectorized if some lanes in the vector have different conditions.
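
Example#

A minimal sketch: use if_then_else to guard a shifted read so the access at i == 0 never reads out of bounds.

from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute(
    (n,),
    lambda i: te.if_then_else(i > 0, A[i - 1], 0.0),
    name="B",
)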

tvm.te.indexdiv(a, b, span=None)[源代码]#

Compute floor(a / b) where a and b are non-negative.

Parameters#

aPrimExpr

The left hand operand, known to be non-negative.

bPrimExpr

The right hand operand, known to be non-negative.

spanOptional[Span]

The location of this operator in the source.

Returns#

resPrimExpr

The result expression.

Note#

Use this function to split non-negative indices. This function may take advantage of operands’ non-negativeness.

tvm.te.indexmod(a, b, span=None)[源代码]#

Compute the remainder of indexdiv. a and b are non-negative.

Parameters#

aPrimExpr

The left hand operand, known to be non-negative.

bPrimExpr

The right hand operand, known to be non-negative.

spanOptional[Span]

The location of this operator in the source.

Returns#

resPrimExpr

The result expression.

Note#

Use this function to split non-negative indices. This function may take advantage of operands’ non-negativeness.

tvm.te.isfinite(x, span=None)[源代码]#

Check if input value is finite.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.isinf(x, span=None)[源代码]#

Check if input value is infinite.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.isnan(x, span=None)[源代码]#

Check if input value is NaN.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.log(x)[源代码]#

Take log of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.log10(x)[源代码]#

Take log10 of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.log2(x)[源代码]#

Take log2 of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.max(expr, axis, where=None, init=None, *args)#

Create a max expression over axis.

Parameters#

exprPrimExpr

The source expression.

axisIterVar

The reduction IterVar axis

whereoptional, Expr

Filtering predicate of the reduction.

Returns#

valuePrimExpr

The result value.

Example#

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")

# there are two ways to use this max reducer:
# mode 1, accept (expr, axis, where) to produce an Reduce Expr
# tvm.max represents tvm.te.max or tvm.tir.max.
B = te.compute((m,), lambda i: tvm.max(A[i, k], axis=k), name="B")

# mode 2, simply use it with multiple Exprs:
max_res = tvm.max(m, n)

tvm.te.max_value(dtype, span=None)[源代码]#

maximum value of dtype

Parameters#

dtypestr

The data type.

spanOptional[Span]

The location of this operator in the source code.

Returns#

valuetvm.Expr

The maximum value of dtype.

Return type:

Any

tvm.te.min(expr, axis, where=None, init=None, *args)#

Create a min expression over axis.

Parameters#

exprPrimExpr

The source expression.

axisIterVar

The reduction IterVar axis

whereoptional, Expr

Filtering predicate of the reduction.

Returns#

valuePrimExpr

The result value.

Example#

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")

# there are two ways to use this min reducer:
# mode 1, accept (expr, axis, where) to produce an Reduce Expr
# tvm.min represents tvm.te.min or tvm.tir.min.
B = te.compute((m,), lambda i: tvm.min(A[i, k], axis=k), name="B")

# mode 2, simply use it with multiple Exprs:
min_res = tvm.min(m, n)

tvm.te.min_value(dtype, span=None)[源代码]#

minimum value of dtype

Parameters#

dtypestr

The data type.

spanOptional[Span]

The location of this operator in the source code.

Returns#

valuetvm.Expr

The minimum value of dtype.

tvm.te.multiply(lhs, rhs, span=None)[源代码]#

Generic multiply operator.

Parameters#

lhsobject

The left operand.

rhsobject

The right operand.

spanOptional[Span]

The location of this operator in the source.

Returns#

optvm.Expr

The result Expr of the multiply operation.

tvm.te.nearbyint(x, span=None)[源代码]#

Round elements of the array to the nearest integer. This intrinsic uses llvm.nearbyint instead of llvm.round, which is faster but may produce results different from te.round. Notably, nearbyint rounds according to the current rounding mode, whereas te.round (llvm.round) ignores it. For differences between the two see: https://en.cppreference.com/w/cpp/numeric/math/round https://en.cppreference.com/w/cpp/numeric/math/nearbyint

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.placeholder(shape, dtype=None, name='placeholder')[源代码]#

Construct an empty tensor object.

Parameters#

shape: Tuple of Expr

The shape of the tensor

dtype: str, optional

The data type of the tensor

name: str, optional

The name hint of the tensor

Returns#

tensor: Tensor

The created tensor

tvm.te.popcount(x)[源代码]#

Count the number of set bits in input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.power(x, y, span=None)[源代码]#

x power y

Parameters#

xPrimExpr

Input argument.

yPrimExpr

The exponent

spanOptional[Span]

The location of this operator in the source code.

Returns#

zPrimExpr

The result.

tvm.te.reduce_axis(dom, name='rv', thread_tag='', span=None)[源代码]#

Create a new IterVar for reduction.

Parameters#

domRange

The domain of iteration.

namestr

The name of the variable.

thread_tagOptional[str]

The name of the thread_tag.

spanOptional[Span]

The location of this variable in the source.

Returns#

axisIterVar

An iteration variable representing the value.
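
Example#

A minimal sketch: declare a reduction axis over the second dimension and sum over it.

from tvm import te

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")
B = te.compute((m,), lambda i: te.sum(A[i, k], axis=k), name="B")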

tvm.te.round(x, span=None)[源代码]#

Round elements of the array to the nearest integer.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.rsqrt(x)[源代码]#

Take reciprocal of square root of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.scan(init, update, state_placeholder, inputs=None, name='scan', tag='', attrs=None)[源代码]#

Construct new tensors by scanning over axis.

Parameters#

init: Tensor or list of Tensor

The initial condition of the first init.shape[0] timestamps.

update: Tensor or list of Tensor

The update rule of the scan given by symbolic tensor.

state_placeholder: Tensor or list of Tensor

The placeholder variables used by update.

inputs: Tensor or list of Tensor, optional

The list of inputs to the scan. This is not required, but can help the compiler detect the scan body faster.

name: str, optional

The name hint of the tensor

tag: str, optional

Additional tag information about the compute.

attrs: dict, optional

The additional auxiliary attributes about the compute.

Returns#

tensor: Tensor or list of Tensors

The created tensor or tuple of tensors if it contains multiple outputs.

Example#

# The following code is equivalent to numpy.cumsum
m = te.var("m")
n = te.var("n")
X = te.placeholder((m, n), name="X")
s_state = te.placeholder((m, n))
s_init = te.compute((1, n), lambda _, i: X[0, i])
s_update = te.compute((m, n), lambda t, i: s_state[t-1, i] + X[t, i])
res = tvm.te.scan(s_init, s_update, s_state, X)

tvm.te.sigmoid(x)[源代码]#

Quick function to get sigmoid

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.sin(x)[源代码]#

Take sin of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.sinh(x)[源代码]#

Take sinh of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.size_var(name='size', dtype='int32', span=None)[源代码]#

Create a new variable that represents a tensor shape size, which is non-negative.

Parameters#

namestr

The name

dtypestr

The data type

spanOptional[Span]

The location of this variable in the source.

Returns#

varSizeVar

The result symbolic shape variable.

tvm.te.sqrt(x)[源代码]#

Take square root of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.subtract(lhs, rhs, span=None)[源代码]#

Generic subtract operator.

Parameters#

lhsobject

The left operand.

rhsobject

The right operand.

spanOptional[Span]

The location of this operator in the source.

Returns#

optvm.Expr

The result Expr of the subtract operation.

tvm.te.sum(expr, axis, where=None, init=None, *args)#

Create a sum expression over axis.

Parameters#

exprPrimExpr

The source expression.

axisIterVar

The reduction IterVar axis

whereoptional, Expr

Filtering predicate of the reduction.

Returns#

valuePrimExpr

The result value.

Example#

m = te.var("m")
n = te.var("n")
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")

# there are two ways to use this sum reducer:
# mode 1, accept (expr, axis, where) to produce an Reduce Expr
# tvm.sum represents tvm.te.sum or tvm.tir.sum.
B = te.compute((m,), lambda i: tvm.sum(A[i, k], axis=k), name="B")

# mode 2, simply use it with multiple Exprs:
sum_res = tvm.sum(m, n)

tvm.te.tag_scope(tag)[源代码]#

The operator tag scope.

Parameters#

tag: str

The tag name.

Returns#

tag_scope: TagScope

The tag scope object, which can be used as a decorator or context manager.

Example#

n = te.var('n')
m = te.var('m')
l = te.var('l')
A = te.placeholder((n, l), name='A')
B = te.placeholder((m, l), name='B')
k = te.reduce_axis((0, l), name='k')

with tvm.te.tag_scope(tag='matmul'):
    C = te.compute((n, m), lambda i, j: te.sum(A[i, k] * B[j, k], axis=k))

# or use tag_scope as decorator
@tvm.te.tag_scope(tag="conv")
def compute_relu(data):
    return te.compute(data.shape, lambda *i: tvm.tir.Select(data(*i) < 0, 0.0, data(*i)))

tvm.te.tan(x)[源代码]#

Take tan of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.tanh(x)[源代码]#

Take hyperbolic tanh of input x.

Parameters#

xPrimExpr

Input argument.

Returns#

yPrimExpr

The result.

tvm.te.thread_axis(dom=None, tag='', name='', span=None)[源代码]#

Create a new IterVar to represent thread index.

Parameters#

domRange or str

The domain of iteration. When a str is passed, dom is set to None and the str is used as the tag.

tagstr, optional

The thread tag

namestr, optional

The name of the var.

spanOptional[Span]

The location of this variable in the source.

Returns#

axisIterVar

The thread itervar.
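
Example#

A minimal sketch: when only a string tag is given, the returned IterVar carries that thread tag and can be passed to Stage.bind.

from tvm import te

block_x = te.thread_axis("blockIdx.x")     # dom=None, tag="blockIdx.x"
thread_x = te.thread_axis("threadIdx.x")

n = 4096
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
bx, tx = s[B].split(B.op.axis[0], factor=128)
s[B].bind(bx, block_x)
s[B].bind(tx, thread_x)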

tvm.te.trace(args, trace_action='tvm.default_trace_action')[源代码]#

Trace tensor data at the runtime.

The trace function allows tracing a specific tensor at runtime. The value being traced should come as the last argument. The trace action should be specified; by default tvm.default_trace_action is used.

Parameters#

argslist of Expr or Buffers.

Positional arguments.

trace_actionstr.

The name of the trace action.

Returns#

callPrimExpr

The call expression.

See Also#

tvm.tir.call_packed : Creates packed function.

tvm.te.trunc(x, span=None)[源代码]#

Get truncated value of the input.

The truncated value of the scalar x is the nearest integer i which is closer to zero than x is.

Parameters#

xPrimExpr

Input argument.

spanOptional[Span]

The location of this operator in the source code.

Returns#

yPrimExpr

The result.

tvm.te.truncdiv(a, b, span=None)[源代码]#

Compute the truncdiv of two expressions.

Parameters#

aPrimExpr

The left hand operand

bPrimExpr

The right hand operand

spanOptional[Span]

The location of this operator in the source.

Returns#

resPrimExpr

The result expression.

Note#

This is the default integer division behavior in C.

tvm.te.truncmod(a, b, span=None)[源代码]#

Compute the truncmod of two expressions.

Parameters#

aPrimExpr

The left hand operand

bPrimExpr

The right hand operand

spanOptional[Span]

The location of this operator in the source.

Returns#

resPrimExpr

The result expression.

Note#

This is the default integer division behavior in C.

tvm.te.var(name='tindex', dtype='int32', span=None)[源代码]#

Create a new variable with specified name and dtype

Parameters#

namestr

The name

dtypestr

The data type

spanOptional[Span]

The location of this variable in the source.

Returns#

varVar

The result symbolic variable.

tvm.te.hybrid#

Hybrid Programming APIs of TVM Python Package.

This package maps a subset of Python to HalideIR so that: 1. Users can write preliminary versions of computation patterns that are not supported yet and verify them across real execution and Python semantic emulation. 2. So far, it is a text format dedicated to HalideIR Phase 0. Refer to tvm.lower for more details. A larger ambition of this module is to support all levels of HalideIR.

Classes:

HybridModule([src, name])

The usage of a Hybrid Module is very similar to a conventional TVM module, but a conventional TVM module requires a function body which is already fully lowered.

Functions:

_pruned_source(func)

Prune source code's extra leading spaces

build(sch, inputs, outputs[, name])

Dump the current schedule to hybrid module

decorate(func, fwrapped)

A wrapper call of the decorator package that defers wrapping to call time.

script(pyfunc)

Decorate a python function as hybrid script.

source_to_op(src, args, symbols, closure_vars)

Another level of wrapper

class tvm.te.hybrid.HybridModule(src=None, name=None)[源代码]#

The usage of a Hybrid Module is very similar to a conventional TVM module, but a conventional TVM module requires a function body which is already fully lowered. This contradicts the fact that a Hybrid Module is originally a text format for Phase 0 HalideIR. Thus, a totally separate module is defined.

Methods:

__init__([src, name])

The constructor of this hybrid module

load(path)

Load the module from a python file

__init__(src=None, name=None)[源代码]#

The constructor of this hybrid module

Parameters#

srcstr

The source code of this module

namestr

The name of this module

load(path)[源代码]#

Load the module from a python file

Parameters#

pathstr

Path to the given python file

tvm.te.hybrid._pruned_source(func)[源代码]#

Prune source code’s extra leading spaces

tvm.te.hybrid.build(sch, inputs, outputs, name='hybrid_func')[源代码]#

Dump the current schedule to hybrid module

Parameters#

sch: tvm.te.Schedule

The schedule to be dumped

inputs: An array of Tensors or Vars

The inputs of the function body

outputs: An array of Tensors

The outputs of the function body

Returns#

module: HybridModule

The build result is wrapped in a HybridModule. The usage of a HybridModule is roughly the same as that of normal TVM-built modules.

tvm.te.hybrid.decorate(func, fwrapped)[源代码]#

A wrapper call of the decorator package that defers wrapping to call time.

Parameters#

funcfunction

The original function

fwrappedfunction

The wrapped function

tvm.te.hybrid.script(pyfunc)[源代码]#

Decorate a python function as hybrid script.

The hybrid function supports emulation mode and parsing to the internal language IR.

Returns#

hybrid_funcfunction

A decorated hybrid script function.

tvm.te.hybrid.source_to_op(src, args, symbols, closure_vars)[源代码]#

Another level of wrapper

Parameters#

srcast.node or str

If an ast.node, then directly lower it. If a str, then parse it to ast and lower it.

argslist of Tensors or Vars

The argument list of the function. It is NOT encouraged to write a function without arguments. It is NOT encouraged to write a function with side effects.

symbolslist of str

The symbol list of the global context of the function.

closure_vars: dict

A dict of external name reference captured by this function.

Returns#

reslist of output tensors

The resulting output tensors of the formed OpNode.