I'm confused! Where can I get help?

Try the mailing list or look for gentryx/heller on #ste||ar on Freenode.

Can LibGeoDecomp help me parallelize my code?

That depends on your code. The short answer: if your code is a custom simulation or an iterative algorithm, then the answer is most likely yes.

The long answer is that LibGeoDecomp focuses on stencil codes: space- and time-discrete simulations in which the simulation space is decomposed according to a regular grid and time is split into discrete global time steps. Nevertheless, it has limited support for irregular spatial decompositions, too.

Stencil codes are a well-studied subject. Your research is irrelevant. Or is it?

Yes, stencil codes are not a new topic. But that is the point, really. There are a lot of papers out there which follow the pattern "We took stencil code FOO and implemented it with record-breaking performance on architecture BAR", where FOO mostly refers to Jacobi or LBM while BAR will most likely be a GPU or a large-scale multi-core cluster. People write temporal and spatial blocking code, domain decompositions, parallel IO... they do this again and again. And why? Because there are currently few options for reusing implementations. This is why we believe that the time for a generic solution has come.

Doesn't software XY already do what your code is aiming at?

There surely are a number of packages which appear at first glance to provide similar functionality. Some are DSLs, some are libraries. But then again, we believe LibGeoDecomp has some key advantages. Please look here for a short review of how it relates to the competition. Here is a list of similar packages:

  • Physis is probably the package which is most similar to LibGeoDecomp. It's being developed by Naoya Maruyama and is based on a C-like DSL. It is currently better suited for running multi-GPU stencil codes than LibGeoDecomp. LibGeoDecomp has the upper hand on other architectures (e.g. Blue Gene/Q or Intel Xeon Phi) or if the simulation model is not a plain stencil code (e.g. particle-in-cell codes).
  • Patus is a code generator geared at stencil codes, with optimizations for various architectures. Similar to Physis it comes with a DSL which users need to adopt to specify their kernels. This is different from the stance LibGeoDecomp takes: LibGeoDecomp has been designed to work with kernels written in C++ to ease adoption of legacy codes. But kernels can also be written in C and (with some restrictions) Fortran. Patus is limited to single nodes while Physis and LibGeoDecomp run just fine on clusters and even supercomputers.
  • Cactus is a problem solving environment written in C, C++ and Fortran. It comes with a huge library of existing kernels (called Thorns) that allow users to quickly put together new simulations. It's been proven to scale on most major supercomputers. Use it if one of its Thorns is a good match for your simulation model. Use LibGeoDecomp if you need to support accelerators (GPUs or Xeon Phi) or if your model is complex and can be formulated more elegantly in C++. Depending on the use case LibGeoDecomp may also run a tad faster (for instance because it supports cache blocking). We are currently working on interfacing Cactus and LibGeoDecomp. Kurt Kanzenbach has written a generator for interface code, which already allows us to run some Thorns within LibGeoDecomp. The code might even allow integration of LibGeoDecomp as a special Driver Thorn within Cactus.

What about Fortran?

LibGeoDecomp makes heavy use of C++ class templates, so making it work with Fortran might not appear to be straightforward. Indeed it takes some thinking, but in the end it's not really complicated. Here is a quick example.

Help! My cells contain invalid data. Your library must be broken!

Common pitfall! Maybe. Maybe not. There are, however, two common mistakes which you might want to check for first. In my experience most codes fall victim to one or the other. Don't worry though, they're easily fixed.

  1. When writing an Initializer, the main function to implement is grid(GridBase *grid). Some folks will then go ahead and only initialize those cells which are of interest to them (e.g. in an implementation of Conway's Game of Life they might set only a couple of cells which are marked as alive). This may work sometimes, for instance when the cells in grid are freshly initialized by their default constructor. But in other cases it'll break: if you call run() in your Simulator twice, it will re-initialize its grid with the help of the Initializer. If the Initializer doesn't reset all the cells, stale data from the previous run survives and bad things will happen. The reason why we don't re-initialize all cells by default is simple: performance. Some codes would suffer from this. Solution: grid has a member boundingBox(). This yields a CoordBox object which you can use to traverse all coordinates of to-be-initialized cells. Please visit this code for an example of proper initialization.
  2. When writing the update function of your cell, it's tempting to read its members directly. But LibGeoDecomp always keeps two (logical) grids to prevent concurrent updates from getting in each other's way: you'll need to read from one grid and write to the other. That means that you can only safely read from the new grid (where this is currently pointing) if you have already written those members in the current iteration. For reasons similar to the point above, we do not copy old values to the new grid by default: it's slow and in fact most applications can do without. Solution: You should rather read the members of the last time step's cells, which can be accessed through the neighborhood object, and only read those members from this which have already been written by the current invocation of update(). The 3D Jacobi code in our repo demonstrates that a default copy from the last time step is indeed not required, as it will overwrite its only data member in any case. If everything else fails: *this = hood[Coord()]; might help.

Do you use a DSL?

Yes and no. While the newer SoA code may be considered an embedded DSL (embedded into C++), the main part of the library is really a library with a 2-way callback. In a way it's similar to Qt. Qt handles the event loop and will call back your code when it's appropriate. So does LibGeoDecomp. Only that we don't do GUIs, but simulations. Our callback works like this: you start a simulation by calling run() on a Simulator object. The Simulator will then call back your model (update() in your cell class). Your cell may then in turn call back the library via a so-called neighborhood object. The neighborhood object really is a proxy object by which user code may get access to neighboring cells of the last time step. For more details take a look at the examples.

So much for the callback. The reasons why we don't use a full DSL with an external compiler are simple:

  • Porting existing code towards LibGeoDecomp is easier if you don't have to reimplement it in some arcane DSL.
  • Debugging is much more convenient if you can step through the actual code you wrote, not some illegible machine-generated gibberish.
  • It's less complicated and pulls in fewer dependencies. This is a key feature when building on some highly specialized architecture (think Cray or the Fujitsu line of vector machines).
  • As our benchmarks show: our library approach yields competitive performance. We just don't need a DSL. Our template-approach allows us to do all necessary transformations.

What is the interface between library and user code?

Quite simple, really: you call us, we call you, you call us. A bit more verbose:

  1. You instantiate a Simulator object with your simulation model as a template parameter. The model describes the data a single grid cell holds and the computations it performs in each time step.
  2. After all IO objects (an Initializer to set up initial conditions, Writers for output, and optionally Steerers for modifying the simulation at runtime) have been added, you call run() on the Simulator.
  3. The Simulator will then call back your code's update(): once for each cell and timestep.
  4. You in turn may call back the library via a neighborhood object provided to you during update() to retrieve the state of your cell and the neighboring cells in the last time step.

More details can be found in the examples. I highly recommend taking a look at the implementations of Jacobi and Game of Life.

Who is developing LibGeoDecomp?

Most development is done at the FAU in Erlangen, Germany. We try to keep an updated list here.

Where can I report bugs?

Please see our issue tracker on Bitbucket.

What prerequisites are required for running/compiling LibGeoDecomp?

This depends on which of the library's features you wish to use, but for a basic build you'll only need:

Additionally you may wish to provide:

  • CUDA toolkit for running on NVIDIA graphics cards,
  • an MPI installation (we mostly use Open MPI) for cluster support.

If you don't just want to use the library, but also to modify it, then the build system may want to auto-regenerate LibGeoDecomp's internal MPI datatypes. This requires:

Some of the examples come with a Qt GUI and may use a camera for input. For those you will need:

Which compiler works with LibGeoDecomp?

LibGeoDecomp is written in standard C++ so it should work out of the box with all C++ compilers. But because the world is big and bad (and so is the C++ standard) things are not that easy. We test LibGeoDecomp regularly with the following compilers:

  • Intel C++ Compiler (icpc) 10.1
  • Intel C++ Compiler (icpc) 11.1
  • Intel C++ Compiler (icpc) 12.1
  • Intel C++ Compiler (icpc) 13.1
  • GCC 4.2.2
  • GCC 4.3.4
  • GCC 4.4.7
  • GCC 4.5.3
  • GCC 4.6.3
  • GCC 4.7.0
  • GCC 4.7.1
  • GCC 4.7.3
  • GCC 4.8.1
  • Clang (currently trunk only, tested revision #190022)

Which version of MPI/package XY should I use?

Theoretically LibGeoDecomp shouldn't be picky regarding the version of a given package. However, here is a list of versions that we regularly test for compatibility:

Package    Minimum Version  Tested Versions
CMake      2.6.4            2.6.4, 2.8.2, 2.8.5, 2.8.8, 2.8.9, 2.8.11
Open MPI   -                1.4.2, 1.5.3, 1.5.5, 1.6.0, 1.6.5
Intel MPI  -                4.0.1, 4.0.3
Boost      -                1.37, 1.44, 1.47, 1.48, 1.49, 1.53
Qt         4.0              4.8.1, 4.8.2, 4.8.5
OpenCV     -                2.3.1a, 2.4.5

How do I build LibGeoDecomp with Intel's C++ compiler?

With CMake you can select the C++ compiler via CMAKE_CXX_COMPILER. Additionally I'd suggest setting some compiler flags to tone down the warnings caused by some Boost libraries:

cmake -DCMAKE_CXX_COMPILER=icpc -DADDITIONAL_COMPILE_FLAGS="-Wall -Wno-sign-compare -wd383 -wd981 -wd1418" ../..

Build recipe for JUQUEEN?

JUQUEEN is the IBM Blue Gene/Q at Forschungszentrum Jülich. At the time of this writing, it is the fastest European system. Building on JUQUEEN is slightly tricky, as you'll want to use xlc++, IBM's C++ compiler, which still has trouble with templates. And pretty much everything in LibGeoDecomp is a template. Also, no interactive jobs are available on this machine, which means you can't run the unit tests on your build machine.

module load boost
module load cmake
cmake -DUNITEXEC=echo -DMPIEXEC=echo -DCMAKE_C_COMPILER=xlc -DCMAKE_CXX_COMPILER=mpixlcxx -DADDITIONAL_COMPILE_FLAGS="-I/bgsys/local/boost/1.47.0" -DBoost_NO_BOOST_CMAKE=true -DBOOST_INCLUDEDIR="/bgsys/local/boost/1.47.0" -DLIB_LINKAGE_TYPE=STATIC ../../

How to build on SuperMUC?

You can't pull external HG repositories on SuperMUC and there is no build of Boost for the PGI and Intel compilers. You'll thus have to scp your checkout of LibGeoDecomp to one of the login nodes. And I suggest using GCC for compilation. In essence, the default build flags are absolutely fine for this machine:

module load cmake
module load boost/1.47_gcc
cd libgeodecomp/
BUILD_DIR=build/`uname -ms | sed s/\ /-/g`
mkdir -p $BUILD_DIR
cd $BUILD_DIR
cmake ../../
make -j10

What if CMake doesn't find my Boost installation?

If you've installed Boost in a non-standard location (e.g. your home), then you can always set BOOST_ROOT. This is how we build on Tsubame 2.0:

cmake -DBOOST_ROOT=/home/usr5/A2401222/boost_1_50_0/ -DCMAKE_CXX_COMPILER=icpc -DADDITIONAL_COMPILE_FLAGS="-Wall -Wno-sign-compare -wd383 -wd981 -wd1418" ../../

LibGeoDecomp doesn't build on my host, what should I do?

Easy: send us your CMake output and make output! Please also tell us which system (hardware vendor, operating system) you are trying to build on. CMake should do the right thing to discover your environment. But given the variety of target platforms, I wouldn't be surprised if it fails on some systems. We'll be happy to fix that! Below is a script which captures the relevant output in the files cmake.log and make.log. Run it from your build directory.

cmake ../.. 2>&1 | tee cmake.log
make 2>&1 | tee make.log

Why doesn't CMake honor the compile flags I've specified?

Under certain conditions the value of a build variable and CMake's argument parsing may interfere. In this case you should add :STRING to the definition. Here you see how to set a certain machine architecture, handy if you're building on a 64-bit capable machine whose toolchain defaults to building 32-bit code.

cmake -DADDITIONAL_COMPILE_FLAGS:STRING="-march=core2 -m64" ../../

Please send your questions and inquiries to our mailing list.

last modified: Tue Apr 15 23:03:47 2014 +0200