# Metaprogramming

Taichi provides metaprogramming infrastructure. Metaprogramming in Taichi has several benefits:

- Enabling the development of dimensionality-independent code, e.g., code that adapts to both 2D and 3D physical simulations.
- Improving runtime performance by moving computations from runtime to compile time.
- Simplifying the development of the Taichi standard library.

##### note

Taichi kernels are **lazily instantiated** and large amounts of computation can be executed at **compile-time**.
Every kernel in Taichi is a template kernel, even if it has no template arguments.

## Template metaprogramming

By using `ti.template()` as an argument type hint, a Taichi field or a Python object can be passed into a kernel. Template programming also enables the code to be reused for fields with different shapes:

```python
import taichi as ti

ti.init()

@ti.kernel
def copy_1D(x: ti.template(), y: ti.template()):
    for i in x:
        y[i] = x[i]

a = ti.field(ti.f32, 4)
b = ti.field(ti.f32, 4)
c = ti.field(ti.f32, 12)
d = ti.field(ti.f32, 12)

# Pass fields a and b as arguments to the kernel `copy_1D`:
copy_1D(a, b)

# Reuse the kernel for fields c and d:
copy_1D(c, d)
```

##### note

If a template parameter is not a Taichi object, it cannot be reassigned inside a Taichi kernel.

##### note

The template parameters are inlined into the generated kernel after compilation.

## Dimensionality-independent programming using grouped indices

Taichi provides the `ti.grouped` syntax, which groups loop indices into a `ti.Vector`.
This enables dimensionality-independent programming, i.e., code that adapts automatically to scenarios of
different dimensionalities:

```python
@ti.kernel
def copy_1D(x: ti.template(), y: ti.template()):
    for i in x:
        y[i] = x[i]

@ti.kernel
def copy_2d(x: ti.template(), y: ti.template()):
    for i, j in x:
        y[i, j] = x[i, j]

@ti.kernel
def copy_3d(x: ti.template(), y: ti.template()):
    for i, j, k in x:
        y[i, j, k] = x[i, j, k]

# Kernels listed above can be unified into one kernel using `ti.grouped`:
@ti.kernel
def copy(x: ti.template(), y: ti.template()):
    for I in ti.grouped(x):
        # I is a vector with the same dimensionality as x
        # If x is 0D, then I = ti.Vector([]), which is equivalent to `None` used in x[I]
        # If x is 1D, then I = ti.Vector([i])
        # If x is 2D, then I = ti.Vector([i, j])
        # If x is 3D, then I = ti.Vector([i, j, k])
        # ...
        y[I] = x[I]
```

## Field metadata

The two attributes **data type** and **shape** of fields can be accessed by `field.dtype` and `field.shape`, in both Taichi-scope and Python-scope:

```python
x = ti.field(dtype=ti.f32, shape=(3, 3))

# Print field metadata in Python-scope
print("Field shape is ", x.shape)
print("Field data type is ", x.dtype)

# Print field metadata in Taichi-scope
@ti.kernel
def print_field_metadata(x: ti.template()):
    print("Field dimensionality is ", len(x.shape))
    for i in ti.static(range(len(x.shape))):
        print("Size along dimension ", i, "is", x.shape[i])
    ti.static_print("Field data type is ", x.dtype)
```

##### note

For sparse fields, the full domain shape will be returned.

## Matrix & vector metadata

For matrices, `matrix.m` and `matrix.n` return the number of columns and rows, respectively.
Vectors are treated as matrices with one column in Taichi, so `vector.n` is the number of elements of the vector.

```python
@ti.kernel
def foo():
    matrix = ti.Matrix([[1, 2], [3, 4], [5, 6]])
    print(matrix.n)  # number of rows: 3
    print(matrix.m)  # number of columns: 2
    vector = ti.Vector([7, 8, 9])
    print(vector.n)  # number of elements: 3
    print(vector.m)  # always equal to 1 for a vector
```

## Compile-time evaluations

Compile-time evaluation allows some computation to be executed when kernels are instantiated. This helps the compiler optimize the code and reduces computational overhead at runtime.

### Static scope

`ti.static` is a function that takes one argument. It hints to the compiler that the argument should be evaluated at compile time.
The scope of the argument of `ti.static` is called static-scope.

### Compile-time branching

- Use `ti.static` for compile-time branching (for those familiar with C++17, this is similar to `if constexpr`):

```python
enable_projection = True

@ti.kernel
def static():
    if ti.static(enable_projection):  # No runtime overhead
        x[0] = 1
```

##### note

One of the two branches of the `static if` will be discarded after compilation.

### Loop unrolling

- Use `ti.static` for forced loop unrolling:

```python
@ti.kernel
def func():
    for i in ti.static(range(4)):
        print(i)

    # The code snippet above is equivalent to:
    print(0)
    print(1)
    print(2)
    print(3)
```

## When to use `ti.static` with for loops

There are two reasons to use `ti.static` with for loops:

- Loop unrolling for improving runtime performance (see Compile-time evaluations).
- Accessing elements of Taichi matrices/vectors. Indices for accessing Taichi fields can be runtime variables, while indices for Taichi matrices/vectors **must be compile-time constants**.

For example, when accessing a vector field `x` with `x[field_index][vector_component_index]`, the `field_index` can be a runtime variable, while the `vector_component_index` must be a compile-time constant:

```python
# Declare a field containing 3 vectors. Each vector contains 8 elements.
x = ti.Vector.field(8, ti.f32, shape=3)

@ti.kernel
def reset():
    for i in x:
        for j in ti.static(range(x.n)):
            # The inner loop must be unrolled since j is an index for accessing a vector
            x[i][j] = 0
```

## Compile-time recursion of `ti.func`

A compile-time recursive function is a function with recursion that can be recursively inlined at compile time. The condition which determines whether to recurse is evaluated at compile time.

You can combine compile-time branching and templates to write compile-time recursive functions.

For example, `sum_from_one_to` is a compile-time recursive function that calculates the sum of numbers from `1` to `n`.

```python
@ti.func
def sum_from_one_to(n: ti.template()) -> ti.i32:
    ret = 0
    if ti.static(n > 0):
        ret = n + sum_from_one_to(n - 1)
    return ret

@ti.kernel
def sum_from_one_to_ten():
    print(sum_from_one_to(10))  # prints 55
```

##### WARNING

Compile-time recursion is not recommended when the recursion is deep: deeper recursion expands into longer code during compilation, which increases compilation time.