Introduction
The latest version of OMPi extends GPU support by adding the ability to offload computations to any GPU through OpenCL. OMPi analyzes the OpenMP target directives appearing in a user application and generates kernel source files written in the OpenCL C language; these are then compiled into OpenCL kernels and linked with an OpenCL library that provides OpenMP facilities during kernel execution. OpenCL support is provided by the new OMPi opencl module.
Requirements
The requirements for the opencl module to be activated are the following:
- The standard OpenCL header and development packages must be installed (usually named opencl-headers, opencl-dev or ocl-icd-devel).
- A GPU that supports OpenCL version ≥ 1.2; you can always double-check this by running the clinfo utility, if it is installed on your system.
- The corresponding vendor drivers and/or relevant libraries must be installed (we cannot provide help on how to install them as this differs from vendor to vendor; Intel drivers most probably can be found in the intel-neo repository; for AMD you need the amdgpu and/or ROCm drivers; NVIDIA includes the OpenCL drivers/libraries in their CUDA SDK).
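The version requirement can also be checked programmatically. The following is a minimal sketch, not part of OMPi: it assumes you already have the device version string in hand, in the form the OpenCL specification defines for CL_DEVICE_VERSION ("OpenCL", a space, major.minor, a space, vendor-specific text), which is also what clinfo typically reports as the device version. The function name opencl_version_ok is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if a device version string such as "OpenCL 1.2 CUDA"
   reports at least OpenCL 1.2, and 0 otherwise (including strings
   that do not follow the "OpenCL <major>.<minor> ..." layout). */
int opencl_version_ok(const char *version)
{
    int major = 0, minor = 0;

    assert(version != NULL);
    if (sscanf(version, "OpenCL %d.%d", &major, &minor) != 2)
        return 0;                      /* unexpected format */
    return major > 1 || (major == 1 && minor >= 2);
}
```

For example, opencl_version_ok("OpenCL 3.0 NEO") yields 1 and opencl_version_ok("OpenCL 1.1 Mesa") yields 0. Note that this checks the device version only, not the separate OpenCL C version string.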
Installing OMPi with OpenCL support
You do not need any special preparations; simply deploy OMPi on your system as usual:
meson setup build --prefix=<install-dir>
cd build/
meson compile
meson install
If all the requirements listed above are met, the installation process will automatically detect the GPU(s) and build the opencl module; if something is missing, the module won’t be included. You can verify the correct installation of the opencl module by executing:
ompiconf --devvinfo
which lists all the identified modules/devices, along with their numeric device IDs. If the above command fails to show an OpenCL-capable GPU, double check that all requirements are met and ensure you have installed OMPi correctly.
Quick start
To offload to a GPU through OpenCL, you need to compile your applications with a --devs=opencl argument:
ompicc --devs=opencl app.c
The compiler will produce the main application executable (a.out) and a number of OpenCL kernel files, one for each target construct in the OpenMP application.
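As an illustration of what app.c might contain, here is a minimal sketch using standard OpenMP offloading syntax. It has not been verified against the opencl module; the function names saxpy_offload and check_saxpy are hypothetical, and a main function that calls check_saxpy() is assumed and not shown. The single target construct here would correspond to one generated kernel file.

```c
#include <assert.h>

/* Computes y[i] = a * x[i] + y[i] inside one combined target construct.
   float is used because some GPUs may lack double support. */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    assert(n >= 0);

    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical self-check: fills two small arrays, runs the offloaded
   loop, and returns the number of elements with an unexpected value. */
int check_saxpy(void)
{
    enum { N = 1024 };
    float x[N], y[N];
    int errors = 0;

    for (int i = 0; i < N; i++) {
        x[i] = (float)i;
        y[i] = 1.0f;
    }
    saxpy_offload(N, 2.0f, x, y);

    /* 2*i + 1 is exactly representable in float for these values of i */
    for (int i = 0; i < N; i++)
        if (y[i] != 2.0f * (float)i + 1.0f)
            errors++;
    return errors;
}
```

If the file is built with a compiler that has OpenMP disabled, the pragma is ignored and the loop simply runs on the host, which makes it easy to compare results against the offloaded run.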
Limitations
Currently, the opencl module has a number of limitations:
- Only the datatypes supported by OpenCL C can be used; some GPUs may lack double support.
- Only static loop scheduling is supported in target teams distribute parallel for directives.
- Stand-alone parallel constructs are not handled yet; the parallel construct must be part of a combined target, teams and/or distribute construct.
- OpenCL ≥ 1.2 is required; in addition the OpenCL C driver/compiler must allow __global declarations at file scope; while this requirement is not part of the OpenCL 1.2 standard (it is even optional in OpenCL 2.0), all major vendors support it.
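To make the first three limitations concrete, the sketch below shows a loop written to stay within them: float data, a single combined construct, and an explicit schedule(static) clause. It uses standard OpenMP syntax only and is an illustration, not a tested OMPi example; the function name scale_offload is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Scales v in place. The form follows the limitations listed above:
   - float data, since double may be unavailable on some GPUs;
   - one combined construct instead of a stand-alone parallel region;
   - schedule(static), the only schedule stated to be supported.

   Forms the text says are not handled yet (shown as comments only):
     #pragma omp target
     #pragma omp parallel for          stand-alone parallel in a target
     ... schedule(dynamic) ...         non-static loop schedule        */
void scale_offload(int n, float factor, float *v)
{
    assert(n >= 0 && v != NULL);

    #pragma omp target teams distribute parallel for schedule(static) \
            map(tofrom: v[0:n])
    for (int i = 0; i < n; i++)
        v[i] *= factor;
}
```

Calling scale_offload(4, 0.5f, v) on an array holding 1, 2, 3, 4 is expected to leave 0.5, 1, 1.5, 2 in v, whether the loop runs on the device or falls back to the host.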
Notes
The OMPi opencl module has been tested with the following GPUs:
- Integrated Intel GPUs (UHD 770, Iris Xe)
- AMD Radeon GPUs (R9 285, RX 550)
- NVIDIA GPUs (GeForce GT730, Tesla P40, Ampere A2)