template<concepts::TransitionFunction F,
uindex_t n_processing_elements = 1,
uindex_t tile_width = 1024,
uindex_t tile_height = 1024, tdv::single_pass::Strategy< F, n_processing_elements > TDVStrategy = tdv::single_pass::InlineStrategy>
class stencil::tiling::StencilUpdate< F, n_processing_elements, tile_width, tile_height, TDVStrategy >
A grid updater that applies an iterative stencil code to a grid.
This updater applies an iterative stencil code, defined by the template parameter F, to the grid as many times as requested.
- Template Parameters
F | The transition function to apply to input grids. |
n_processing_elements | (Optimization parameter) The number of processing elements (PEs) to implement. Increasing the number of PEs generally improves performance because more iterations are computed in parallel. However, it also increases the resource and area usage of the design, and too many PEs may lower the clock frequency. |
tile_width | (Optimization parameter) The width of the tile that is updated in one pass. For best hardware utilization, this should be a power of two. Increasing the maximal width of a tile may increase the performance of the design by introducing longer steady-states and reducing halo computation overheads. However, it will also increase the logic resource utilization and might lower the clock frequency. |
tile_height | (Optimization parameter) The height of the tile that is updated in one pass. Increasing the maximal height of a tile may increase the performance of the design by introducing longer steady-states and reducing halo computation overheads. However, it will also increase the logic and on-chip memory utilization and might lower the clock frequency. |
TDVStrategy | (Optimization parameter) The precomputation strategy for the time-dependent value system (See guide). |
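The trade-off between tile size, PE count, and halo computation overhead described above can be estimated with a small calculation. The following is only a sketch: `halo_overhead` is a hypothetical helper that is not part of the library, and it assumes that each tile pass also processes a halo of `stencil_radius * n_processing_elements` cells on every side of the tile.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the library): the fraction of cells
// processed in one tile pass that belong to the halo instead of the tile.
// Assumption: the halo is stencil_radius * n_processing_elements cells wide
// on every side of the tile.
inline double halo_overhead(std::uint64_t tile_width, std::uint64_t tile_height,
                            std::uint64_t stencil_radius,
                            std::uint64_t n_processing_elements) {
    assert(tile_width > 0 && tile_height > 0);
    const std::uint64_t halo = stencil_radius * n_processing_elements;
    const std::uint64_t tile_cells = tile_width * tile_height;
    const std::uint64_t padded_cells =
        (tile_width + 2 * halo) * (tile_height + 2 * halo);
    return static_cast<double>(padded_cells - tile_cells) /
           static_cast<double>(padded_cells);
}
```

Under this assumption, `halo_overhead(1024, 1024, 1, 1)` is smaller than `halo_overhead(64, 64, 1, 1)`, and the overhead grows again when `n_processing_elements` is raised while the tile size stays fixed. This is why larger tiles and more PEs tend to be tuned together.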
template<concepts::TransitionFunction F,
uindex_t n_processing_elements = 1,
uindex_t tile_width = 1024,
uindex_t tile_height = 1024, tdv::single_pass::Strategy< F, n_processing_elements > TDVStrategy = tdv::single_pass::InlineStrategy>
Compute a new grid based on the source grid, using the configured transition function.
The computation does not work in-place. Instead, it will allocate two additional grids with the same size as the source grid and use them for a double buffering scheme. Therefore, you are free to reuse the source grid as it will not be altered.
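The double buffering scheme can be illustrated with a plain CPU sketch. This is not the library's implementation: `Grid1D` and `update_double_buffered` are hypothetical names, the grid is one-dimensional, and a fixed 3-point sum stands in for the transition function. It only shows why the source grid stays untouched: the first iteration reads from the source, and all later iterations alternate between two separately allocated buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D stand-in for a grid; the library's grid type is not used.
using Grid1D = std::vector<int>;

// Applies a 3-point sum stencil n_iterations times without modifying source.
// Cells outside of the grid are read as halo_value.
Grid1D update_double_buffered(const Grid1D &source, std::size_t n_iterations,
                              int halo_value) {
    // Two additional buffers with the same size as the source grid.
    Grid1D buffer_a(source.size());
    Grid1D buffer_b(source.size());

    const Grid1D *in = &source; // only the first iteration reads the source
    Grid1D *out = &buffer_a;

    for (std::size_t i = 0; i < n_iterations; ++i) {
        for (std::size_t c = 0; c < source.size(); ++c) {
            const int left = (c > 0) ? (*in)[c - 1] : halo_value;
            const int right =
                (c + 1 < source.size()) ? (*in)[c + 1] : halo_value;
            (*out)[c] = left + (*in)[c] + right;
        }
        // Swap roles: the buffer just written becomes the next input.
        in = out;
        out = (out == &buffer_a) ? &buffer_b : &buffer_a;
    }
    return *in; // a copy of the source if n_iterations == 0
}
```

Because the source is only ever read, the caller can pass the same grid to several updates or inspect it after the call.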
If Params::blocking is set to true, this method will block until the computation is complete. Otherwise, it will return as soon as all kernels are submitted.