OMPi -- a portable OpenMP C compiler
Copyright since 2001, University of Ioannina,
Dept. of Computer Science & Engineering
==================================================================


Setting up and using the remote offloading mechanism of OMPi
------------------------------------------------------------
                     The OMPi team



In a cluster environment, OMPi supports offloading code to devices
installed on nodes other than the one the application executes on,
i.e. it can offload kernels to remote devices. Because MPI is used
as the communication mechanism, OMPi requires a special setup in
order to provide remote offloading facilities.


--
1. Installation
--

In most situations, the only requirement is that an MPI installation
already exists. OMPi has been tested successfully with OpenMPI, MPICH
as well as Intel MPI installations.

In order to build OMPi with the remote offloading mechanism enabled,
you must create a very simple configuration file named
".ompi_remote_devices" in your home directory, where you list all
the remote nodes you want to use as well as the devices they contain.
To assist with the creation of the configuration file, run the

     utilities/remote_offload_create_config.sh

script; it generates a sample file which you can modify according
to the properties of your cluster.

Once you have filled in the configuration file, the installation
process can be automated through the "remote_offload_setup.sh"
helper script; it connects remotely to each cluster node and tries
to build any required modules for that node's local devices.
Execute the script, passing your usual installation options, e.g.:

     ./remote_offload_setup.sh --prefix="$HOME/opt/ompi"

To see all available script options, execute:

     ./remote_offload_setup.sh --help

Nothing else is needed on your part; the script not only configures
OMPi but also builds the compiler and the runtime libraries, so OMPi
is ready to use.


--
2. Usage
--

If built correctly, the mechanism works as if the device calls were
local. The programmer does not need to do anything other than run
the executable through the MPI installation. With OpenMPI, running
the executable directly (./a.out) might work, but the following
should work universally:

     mpirun -n 1 ./a.out

All devices are numbered sequentially, but OMPi also provides runtime
facilities to access remote devices by node or by type.

For additional information refer to our webpage
(https://paragroup.cse.uoi.gr/wpsite/software/ompi/documentation/)
and feel free to contact us if you encounter any problems.
