Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations
Rawat, P. S., Vaidya, M., Sukumaran-Rajam, A., Ravishankar, M., Grover, V., Rountev, A., Pouchet, L., and Sadayappan, P.
Proceedings of the IEEE
2018
Stencil computations arise in a number of computational domains. They exhibit significant data parallelism and are thus well suited for execution on graphical processing units (GPUs), but can be memory-bandwidth limited unless temporal locality is utilized via tiling. This paper describes how effective tiled code can be generated for GPUs from a domain-specific language (DSL) for stencils. Experimental results demonstrate the benefits of such a domain-specific optimization approach over state-of-the-art general-purpose compiler optimizations.