tvm.relay.image#

Image network related operators.

Functions:

affine_grid(data[, target_shape])

affine_grid operator that generates a 2D sampling grid.

const(value[, dtype, span])

Create a constant value.

crop_and_resize(data, boxes, box_indices, ...)

Crop input images and resize them.

dilation2d(data, weight[, strides, padding, ...])

Morphological Dilation 2D.

grid_sample(data, grid[, method, layout, ...])

Applies grid sampling to input feature map.

resize1d(data, size[, roi, layout, method, ...])

Image resize1d operator.

resize2d(data, size[, roi, layout, method, ...])

Image resize2d operator.

resize3d(data, size[, roi, layout, method, ...])

Image resize3d operator.

tvm.relay.image.affine_grid(data, target_shape=None)#

affine_grid operator that generates a 2D sampling grid.

This operation is described in https://arxiv.org/pdf/1506.02025.pdf. It generates a uniform sampling grid within the target shape and normalizes it to [-1, 1]. The provided affine transformation is then applied to the sampling grid.

Parameters#

data : tvm.Tensor

3-D with shape [batch, 2, 3]. The affine matrix.

target_shape: list/tuple of two int

Specifies the output shape (H, W).

Returns#

Output : tvm.Tensor

4-D with shape [batch, 2, target_height, target_width]
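A minimal sketch of building an affine_grid expression (the variable name, batch size 4, and the 8x8 target shape are illustrative assumptions, not part of the API):

# Sketch: generate a normalized 2D sampling grid from a batch of 2x3 affine matrices.
from tvm import relay

affine = relay.var("affine", shape=(4, 2, 3), dtype="float32")  # [batch, 2, 3] affine matrices
grid = relay.image.affine_grid(affine, target_shape=(8, 8))     # result shape: [4, 2, 8, 8]
func = relay.Function([affine], grid)
print(func)  # inspect the Relay IR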

tvm.relay.image.const(value, dtype=None, span=None)#

Create a constant value.

Parameters#

value: Union[bool, int, float, numpy.ndarray, tvm.nd.NDArray]

The constant value.

dtype: str, optional

The data type of the resulting constant.

span: Optional[tvm.relay.Span]

Span that points to original source code.

Note#

When dtype is None, we use the following rule:

  • int maps to “int32”

  • float maps to “float32”

  • bool maps to “bool”

  • otherwise, the same default rule as numpy is used.
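A minimal sketch showing the default-dtype rules above (the values are arbitrary):

from tvm import relay
import numpy as np

c_int = relay.const(1)      # int -> "int32"
c_float = relay.const(1.0)  # float -> "float32"
c_bool = relay.const(True)  # bool -> "bool"
c_arr = relay.const(np.zeros((2, 2), dtype="float64"))  # keeps numpy's float64
print(c_int.data.dtype, c_float.data.dtype, c_bool.data.dtype, c_arr.data.dtype)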

tvm.relay.image.crop_and_resize(data, boxes, box_indices, crop_size, layout, method='bilinear', extrapolation_value=0, out_dtype=None)#

Crop input images and resize them.

method indicates the algorithm used to compute the output values; it can be either “bilinear” or “nearest_neighbor”.

Parameters#

data : relay.Expr

The input data to the operator.

boxes : relay.Expr

A 2-D tensor of shape [num_boxes, 4]. Each row of the tensor specifies the coordinates of a box.

box_indices : relay.Expr

A 1-D tensor of shape [num_boxes]; box_indices[i] specifies which image in data the i-th box refers to.

crop_size : Tuple of PrimExpr

The target size to which each box will be resized.

layout : str, optional

Layout of the input.

method : str, optional

Scale method, it can be either “nearest_neighbor” or “bilinear”.

extrapolation_value : float, optional

Value used for extrapolation, when applicable.

out_dtype : str, optional

Type to return. If left None returns the same type as input.

Returns#

result: relay.Expr

The computed result.
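A minimal sketch of constructing a crop_and_resize expression (the shapes, the 14x14 crop size, and the NHWC layout are illustrative assumptions):

from tvm import relay

data = relay.var("data", shape=(2, 32, 32, 3), dtype="float32")    # NHWC images
boxes = relay.var("boxes", shape=(5, 4), dtype="float32")          # [num_boxes, 4] box coordinates
box_indices = relay.var("box_indices", shape=(5,), dtype="int32")  # image index for each box

# Crop the 5 boxes and resize each crop to 14x14 with bilinear interpolation.
out = relay.image.crop_and_resize(
    data, boxes, box_indices, crop_size=(14, 14), layout="NHWC", method="bilinear"
)
func = relay.Function([data, boxes, box_indices], out)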

tvm.relay.image.dilation2d(data, weight, strides=(1, 1), padding=(0, 0), dilations=(1, 1), data_layout='NCHW', kernel_layout='IHW', out_dtype='')#

Morphological Dilation 2D. This operator takes the weight as the dilation kernel and dilates it with data to produce an output. In the default case, where the data_layout is NCHW and kernel_layout is IHW, dilation2d takes in a data Tensor with shape (batch_size, in_channels, height, width), and a weight Tensor with shape (channels, kernel_height, kernel_width) to produce an output Tensor with the following rule:

\[\mbox{out}[b, c, y, x] = \max_{dy, dx} \mbox{data}[b, c, \mbox{strides}[0] * y + dy, \mbox{strides}[1] * x + dx] + \mbox{weight}[c, dy, dx]\]

Padding and dilation are applied to data and weight respectively before the computation. This operator accepts data layout specification. Semantically, the operator will convert the layout to the canonical layout (NCHW for data and IHW for weight) and perform the computation.

Parameters#

data : tvm.relay.Expr

The input data to the operator.

weight : tvm.relay.Expr

The weight expressions.

strides : Optional[Tuple[int]]

The strides of convolution.

padding : Optional[Tuple[int]]

The padding of convolution on both sides of inputs before convolution.

dilations : Optional[Tuple[int]]

Specifies the dilation rate to be used for dilated convolution.

data_layout : Optional[str]

Layout of the input.

kernel_layout : Optional[str]

Layout of the weight.

out_dtype : Optional[str]

Specifies the output data type.

Returns#

result : tvm.relay.Expr

The computed result.
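A minimal sketch of a morphological dilation over a single-channel input (the shapes and the 3x3 structuring element are illustrative assumptions):

from tvm import relay

data = relay.var("data", shape=(1, 1, 28, 28), dtype="float32")  # NCHW input
weight = relay.var("weight", shape=(1, 3, 3), dtype="float32")   # IHW structuring element

# Dilate with stride 1 and padding 1 so the spatial size is preserved.
out = relay.image.dilation2d(
    data, weight, strides=(1, 1), padding=(1, 1), dilations=(1, 1),
    data_layout="NCHW", kernel_layout="IHW"
)
func = relay.Function([data, weight], out)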

tvm.relay.image.grid_sample(data, grid, method='bilinear', layout='NCHW', padding_mode='zeros', align_corners=True)#

Applies grid sampling to input feature map.

Given \(data\) and \(grid\), then for 4-D the output is computed by

\[x_{src} = grid[batch, 0, y_{dst}, x_{dst}] \\ y_{src} = grid[batch, 1, y_{dst}, x_{dst}] \\ output[batch, channel, y_{dst}, x_{dst}] = G(data[batch, channel, y_{src}, x_{src}])\]

\(x_{dst}\), \(y_{dst}\) enumerate all spatial locations in \(output\), and \(G()\) denotes the interpolation function.

The out-boundary points will be padded with zeros if padding_mode is “zeros”, or border pixel value if padding_mode is “border”, or inner pixel value if padding_mode is “reflection”.

The left-top corner (-1, -1) and right-bottom corner (1, 1) in grid will be mapped to (0, 0) and (h - 1, w - 1) of data if align_corners is “True”, or to (-0.5, -0.5) and (h - 0.5, w - 0.5) of data if align_corners is “False”.

The shape of the output will be 4-D (data.shape[0], data.shape[1], grid.shape[2], grid.shape[3]), or 5-D (data.shape[0], data.shape[1], grid.shape[2], grid.shape[3], grid.shape[4]).

The operator assumes that \(grid\) has been normalized to [-1, 1].

grid_sample is often used together with affine_grid, which generates the sampling grids it consumes.

Parameters#

data : tvm.Tensor

4-D with shape [batch, in_channel, in_height, in_width], or 5-D with shape [batch, in_channel, in_depth, in_height, in_width]

grid : tvm.Tensor

4-D with shape [batch, 2, out_height, out_width], or 5-D with shape [batch, 3, out_depth, out_height, out_width]

method : str

The interpolation method; for 4-D input, “nearest”, “bilinear”, and “bicubic” are supported, and for 5-D input, “nearest” and “bilinear” (“trilinear”) are supported.

layout : str

The layout of input data and the output.

padding_mode : str

The padding mode for outside grid values, “zeros”, “border”, “reflection” are supported.

align_corners: bool

Geometrically, we consider the pixels of the input as squares rather than points. If set to “True”, the extrema (“-1” and “1”) are considered as referring to the center points of the input corner pixels. If set to “False”, they are instead considered as referring to the corner points of the input corner pixels, making the sampling more resolution agnostic.

Returns#

Output : tvm.Tensor

4-D with shape [batch, in_channel, out_height, out_width], or 5-D with shape [batch, in_channel, out_depth, out_height, out_width]
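As noted above, affine_grid and grid_sample are typically chained: the grid produced by affine_grid is fed directly to grid_sample. A minimal sketch (the batch size, channel count, and 16x16 output resolution are illustrative assumptions):

from tvm import relay

data = relay.var("data", shape=(1, 3, 32, 32), dtype="float32")  # NCHW feature map
affine = relay.var("affine", shape=(1, 2, 3), dtype="float32")   # [batch, 2, 3] affine matrices

# Normalized sampling grid of shape [1, 2, 16, 16].
grid = relay.image.affine_grid(affine, target_shape=(16, 16))

# Sample the feature map at the grid locations; output shape is [1, 3, 16, 16].
out = relay.image.grid_sample(data, grid, method="bilinear", layout="NCHW")
func = relay.Function([data, affine], out)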

tvm.relay.image.resize1d(data, size, roi=None, layout='NCW', method='linear', coordinate_transformation_mode='half_pixel', rounding_method='', cubic_alpha=-0.5, cubic_exclude=0, extrapolation_value=0.0, out_dtype=None)#

Image resize1d operator.

This operator takes data as input and performs 1D resizing to the given output size. In the default case, where the data_layout is NCW with data of shape (n, c, w), out will have a shape of (n, c, size[0]).

method indicates the algorithm used to compute the output values; it can be one of (“linear”, “nearest_neighbor”, “cubic”).

Parameters#

data : relay.Expr

The input data to the operator.

size: Tuple of Int or Expr

The out size to which the image will be resized.

roi: Tuple of Float or Expr, optional

The region of interest for cropping the input image. Expected to be of size 2, and format [start_w, end_w]. Only used if coordinate_transformation_mode is tf_crop_and_resize.

layout : str, optional

Layout of the input.

method : str, optional

Scale method to be used [nearest_neighbor, linear, cubic].

coordinate_transformation_mode : string, optional

Describes how to transform the coordinate in the resized tensor to the coordinate in the original tensor. Definitions can be found in topi/image/resize.py. [half_pixel, align_corners, asymmetric, pytorch_half_pixel, tf_half_pixel_for_nn, and tf_crop_and_resize].

rounding_method: string, optional

Indicates how to find the “nearest” pixel in the nearest_neighbor method [round, floor, ceil].

cubic_alpha: float

Spline coefficient for cubic interpolation.

cubic_exclude: int

Flag to exclude the exterior of the image during cubic interpolation.

extrapolation_value: float

Fill value to use when roi is outside of the image.

out_dtype : str, optional

Type to return. If left None returns the same type as input.

Returns#

result: relay.Expr

The resized result.
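A minimal sketch of a 1D resize (the NCW shape and the target length 64 are illustrative assumptions):

from tvm import relay

data = relay.var("data", shape=(1, 8, 32), dtype="float32")  # NCW input
# Resize the width dimension from 32 to 64 with linear interpolation.
out = relay.image.resize1d(data, size=(64,), layout="NCW", method="linear")
func = relay.Function([data], out)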

tvm.relay.image.resize2d(data, size, roi=None, layout='NCHW', method='linear', coordinate_transformation_mode='half_pixel', rounding_method='', cubic_alpha=-0.5, cubic_exclude=0, extrapolation_value=0.0, out_dtype=None)#

Image resize2d operator.

This operator takes data as input and performs 2D resizing to the given output size. In the default case, where the data_layout is NCHW with data of shape (n, c, h, w), out will have a shape of (n, c, size[0], size[1]).

method indicates the algorithm used to compute the output values; it can be one of (“linear”, “nearest_neighbor”, “cubic”).

Parameters#

data : relay.Expr

The input data to the operator.

size: Tuple of Int or Expr

The out size to which the image will be resized.

roi: Tuple of Float or Expr, optional

The region of interest for cropping the input image. Expected to be of size 4, and format [start_h, start_w, end_h, end_w]. Only used if coordinate_transformation_mode is tf_crop_and_resize.

layout : str, optional

Layout of the input.

method : str, optional

Scale method to be used [nearest_neighbor, linear, cubic].

coordinate_transformation_mode : string, optional

Describes how to transform the coordinate in the resized tensor to the coordinate in the original tensor. Definitions can be found in topi/image/resize.py. [half_pixel, align_corners, asymmetric, pytorch_half_pixel, tf_half_pixel_for_nn, and tf_crop_and_resize].

rounding_method: string, optional

Indicates how to find the “nearest” pixel in the nearest_neighbor method [round, floor, ceil].

cubic_alpha: float

Spline coefficient for bicubic interpolation.

cubic_exclude: int

Flag to exclude the exterior of the image during bicubic interpolation.

extrapolation_value: float

Fill value to use when roi is outside of the image.

out_dtype : str, optional

Type to return. If left None returns the same type as input.

Returns#

result: relay.Expr

The resized result.
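A minimal sketch of a bilinear 2D upscale (the NCHW shape and the 224x224 target size are illustrative assumptions):

from tvm import relay

data = relay.var("data", shape=(1, 3, 112, 112), dtype="float32")  # NCHW image
# Resize the spatial dimensions to 224x224; "linear" on 2D data is bilinear interpolation.
out = relay.image.resize2d(
    data, size=(224, 224), layout="NCHW", method="linear",
    coordinate_transformation_mode="half_pixel",
)
func = relay.Function([data], out)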

tvm.relay.image.resize3d(data, size, roi=None, layout='NCDHW', method='linear', coordinate_transformation_mode='half_pixel', rounding_method='', cubic_alpha=-0.5, cubic_exclude=0, extrapolation_value=0.0, out_dtype=None)#

Image resize3d operator.

This operator takes data as input and performs 3D resizing to the given output size. In the default case, where the data_layout is NCDHW with data of shape (n, c, d, h, w), out will have a shape of (n, c, size[0], size[1], size[2]).

method indicates the algorithm used to compute the output values; it can be one of (“linear”, “nearest_neighbor”, “cubic”).

Parameters#

data : relay.Expr

The input data to the operator.

size: Tuple of Int or Expr

The out size to which the image will be resized.

roi: Tuple of Float or Expr, optional

The region of interest for cropping the input image. Expected to be of size 6, and format [start_d, start_h, start_w, end_d, end_h, end_w]. Only used if coordinate_transformation_mode is tf_crop_and_resize.

layout : str, optional

Layout of the input.

method : str, optional

Scale method to be used [nearest_neighbor, linear, cubic].

coordinate_transformation_mode : string, optional

Describes how to transform the coordinate in the resized tensor to the coordinate in the original tensor. Definitions can be found in topi/image/resize.py. [half_pixel, align_corners, asymmetric, pytorch_half_pixel, tf_half_pixel_for_nn, and tf_crop_and_resize].

rounding_method: string, optional

Indicates how to find the “nearest” pixel in the nearest_neighbor method [round, floor, ceil].

cubic_alpha: float

Spline coefficient for cubic interpolation.

cubic_exclude: int

Flag to exclude the exterior of the image during cubic interpolation.

extrapolation_value: float

Fill value to use when roi is outside of the image.

out_dtype : str, optional

Type to return. If left None returns the same type as input.

Returns#

result: relay.Expr

The resized result.
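A minimal sketch of a 3D resize over a volume (the NCDHW shape and the target size are illustrative assumptions):

from tvm import relay

data = relay.var("data", shape=(1, 1, 8, 64, 64), dtype="float32")  # NCDHW volume
# Resize depth, height, and width to (16, 128, 128) with trilinear ("linear") interpolation.
out = relay.image.resize3d(data, size=(16, 128, 128), layout="NCDHW", method="linear")
func = relay.Function([data], out)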