Q: I'm confused! Where can I get help?

A: Try the mailing list or look for gentryx/heller on #ste||ar on Freenode.

Q: Can LibGeoDecomp help me parallelize my code?

A: That depends on your code. The short answer: if your code is a custom simulation or an iterative algorithm, then the answer is most likely yes.

The long answer is that LibGeoDecomp focuses on stencil codes: space- and time-discrete simulations in which the simulation space is decomposed according to a regular grid and time is split into discrete global time steps. Nevertheless, it has limited support for irregular spatial decompositions, too.
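To make the term concrete, here is a minimal sketch of a stencil code in plain C++, independent of LibGeoDecomp: a 1D, three-point averaging step on a regular grid. The function name jacobiStep and the constant boundary cells are assumptions chosen for illustration, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One global time step of a 1D three-point stencil: every interior cell
// becomes the average of itself and its two neighbors in the OLD grid.
// Boundary cells are kept constant (an arbitrary choice for this sketch).
std::vector<double> jacobiStep(const std::vector<double>& oldGrid)
{
    assert(oldGrid.size() >= 2);
    std::vector<double> newGrid(oldGrid);

    for (std::size_t i = 1; i + 1 < oldGrid.size(); ++i) {
        newGrid[i] = (oldGrid[i - 1] + oldGrid[i] + oldGrid[i + 1]) / 3.0;
    }

    return newGrid;
}
```

Calling jacobiStep repeatedly on its own result advances the simulation by one discrete time step per call. Each new cell depends only on a fixed neighborhood in the old grid, which is the property that makes such codes easy to decompose spatially.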

Q: Where can I report bugs?

A: Please see our issue tracker on GitHub.

Q: What's the procedure to submit patches?

A: Please file a pull request on GitHub (LibGeoDecomp or LibFlatArray). We typically require a green light from CircleCI and/or Travis and adherence to the coding style.

Q: How do I use class XY?

A: Take a look at the unit tests! Tests for src/foo/xy.h are commonly found in src/foo/test/unit/xytest.h. If the class uses parallelism, e.g. MPI, then look for src/foo/test/parallel_*/bartest.h.

Comments in our source are not few because we're lazy, but because we've -- quite painfully -- learned that the only thing worse than undocumented code is code whose documentation doesn't match its internal workings. Given the fact that, over the course of the years, most parts of the library have been rewritten multiple times, I estimate the likelihood of the source and its documentation diverging at about 1.0.

Instead our comments focus on explaining why some part of the code is not written in a straightforward manner -- while our code strives to be self-explaining: speaking names, short methods, etc. The tests document how a class' interface is meant to be used and how the code works.

Q: Help! My cells contain invalid data. Your library must be broken!

A: Maybe. Maybe not. There are, however, two common pitfalls which you might want to check first. In my experience most codes fall victim to one or the other. Don't worry though, they're easily fixed.

  1. When writing an Initializer, the main function to implement is grid(GridBase *grid). Some folks will then go ahead and only initialize those cells which are of interest to them (e.g. in an implementation of Conway's Game of Life they might set only a couple of cells which are marked as alive). This may work sometimes, for instance when the cells in grid are freshly initialized by their default constructor. But in other cases it'll break: if you call run() on your Simulator twice, it will re-initialize its grid with the help of the Initializer. If the Initializer doesn't overwrite all cells, stale data from the previous run survives. The reason why we don't re-initialize all cells by default is simple: performance. Some codes would suffer from this. Solution: grid has a member boundingBox(). This will yield a CoordBox object which you can use to traverse all coordinates of to-be-initialized cells. Please visit this code for an example of proper initialization.
  2. When writing the update function of your cell, it's tempting to read its members directly. But LibGeoDecomp always keeps two (logical) grids to prevent concurrent updates from getting in each other's way: you'll need to read from one grid and write to the other. That means that you can only safely read from the new grid (where this is currently pointing) if you have already written those members in the current iteration. For reasons similar to the point above, we do not copy old values to the new grid by default: it's slow and in fact most applications can do without. Solution: You should rather read the members of the last time step's cells, which can be accessed through the neighborhood object, and only read those members from this which have already been written by the current invocation of update(). The 3D Jacobi code in our repo demonstrates that a default copy from the last time step is indeed not required, as it will overwrite its only data member in any case. If everything else fails: *this = hood[Coord()]; might help.
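Both pitfalls can be illustrated with a small, self-contained sketch. It does not use LibGeoDecomp's API: ToyCell, ToyHood, initGrid and step are hypothetical stand-ins, and the 1D grid with clamped borders is an assumption made to keep the example short. The point is the pattern: the initializer overwrites every cell, and update() reads exclusively from the old grid via the neighborhood object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical user model with two members.
struct ToyCell
{
    double temp = 0.0;
    int id = 0;

    template<typename Hood>
    void update(const Hood& hood)
    {
        // Read only from the last time step (via hood), write to *this.
        temp = (hood[-1].temp + hood[0].temp + hood[1].temp) / 3.0;
        // Members that do not change must be carried over explicitly,
        // because the new grid is not pre-filled with old values.
        id = hood[0].id;
    }
};

// Hypothetical neighborhood proxy: read-only view of the old grid,
// addressed relative to the current cell, clamped at the borders.
class ToyHood
{
public:
    ToyHood(const std::vector<ToyCell>& oldGrid, std::size_t center) :
        oldGrid(oldGrid), center(center)
    {}

    const ToyCell& operator[](int offset) const
    {
        long index = static_cast<long>(center) + offset;
        long last = static_cast<long>(oldGrid.size()) - 1;
        if (index < 0) index = 0;
        if (index > last) index = last;
        return oldGrid[static_cast<std::size_t>(index)];
    }

private:
    const std::vector<ToyCell>& oldGrid;
    std::size_t center;
};

// Pitfall 1: overwrite EVERY cell, not just the interesting ones.
void initGrid(std::vector<ToyCell>* grid)
{
    for (std::size_t i = 0; i < grid->size(); ++i) {
        ToyCell cell;
        cell.id = static_cast<int>(i);
        cell.temp = (i == grid->size() / 2) ? 9.0 : 0.0;
        (*grid)[i] = cell;
    }
}

// Pitfall 2: old grid is read-only, new grid is write-only.
void step(const std::vector<ToyCell>& oldGrid, std::vector<ToyCell>* newGrid)
{
    assert(oldGrid.size() == newGrid->size());
    for (std::size_t i = 0; i < oldGrid.size(); ++i) {
        (*newGrid)[i].update(ToyHood(oldGrid, i));
    }
}
```

If update() skipped the assignment to id, the new cell would keep whatever value happened to be stored in the new grid, which is the kind of invalid data this question is about.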

Q: Who is developing LibGeoDecomp?

A: LibGeoDecomp is a project of the Ste||ar Group, a joint initiative of researchers working on scalable computing solutions. Main development is done at the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Germany. Project lead is Andreas Schäfer. We try to keep a current list of team members here.

Q: Isn't LibGeoDecomp just another software layer, which will ultimately make the software run slower?

A: It is a popular conception to view software as a stack of layers, where each layer functions by interacting with its surrounding layers -- and in turn adds overhead. LibGeoDecomp could be viewed as such a layer, which sits between the user's model and the runtime (e.g. MPI and/or CUDA). The assumption that this automatically adds overhead is wrong though. First of all, most of the integration of the model with the library happens at compile time, so no method call overheads or similar remain. Secondly, LibGeoDecomp's functionality goes beyond simply wrapping certain interfaces and data structures. By inversion of control it is able to transform algorithms and data structures to achieve an optimal match between hard- and software. In most cases the result will be a code with better performance than a native but baseline manual implementation. See our publications for measurements.
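The compile-time integration mentioned above can be sketched in a few lines of generic C++. This is not LibGeoDecomp code: Doubler and sweep are hypothetical names. The sketch only shows the general technique of passing the model as a template parameter, so that the call to update() is bound statically instead of through a virtual function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical user model: no base class, no virtual functions.
struct Doubler
{
    int value = 1;

    void update(const Doubler& old)
    {
        value = 2 * old.value;
    }
};

// Hypothetical "library" routine. The model type is a template parameter,
// so the call to update() is resolved at compile time and can be inlined;
// there is no virtual dispatch in the inner loop.
template<typename Cell>
void sweep(const std::vector<Cell>& oldGrid, std::vector<Cell>* newGrid)
{
    assert(oldGrid.size() == newGrid->size());
    for (std::size_t i = 0; i < oldGrid.size(); ++i) {
        (*newGrid)[i].update(oldGrid[i]);
    }
}
```

Because sweep<Doubler> is instantiated for the concrete model, the compiler sees the loop and the kernel together and is free to inline and vectorize them as one unit.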

Q: Stencil codes are a well studied subject. Your research is irrelevant. Or is it?

A: Yes, stencil codes are not a new topic. But that is the point, really. There are a lot of papers out there which follow the pattern "We took stencil code FOO and implemented it with record-breaking performance on architecture BAR", where FOO mostly refers to Jacobi or LBM while BAR will most likely be a GPU or a large-scale multi-core cluster. People write temporal and spatial blocking code, domain decompositions, parallel IO... they do this again and again. And why? Because there are currently few options for reusing implementations. This is why we believe that the time for a generic solution has come.

Besides: LibGeoDecomp is no longer just a stencil code library: we can do particle-in-cell, meshfree, n-body...

Q: Doesn't software XY already do what your code is aiming at?

A: There surely are a number of packages which appear at first glance to provide similar functionality. Some are DSLs, some are libraries. But then again, we believe LibGeoDecomp has some key advantages. Please look here for a short review of how it relates to the competition. Here is a list of similar packages:

  • Physis is probably the package which is most similar to LibGeoDecomp. It's being developed by Naoya Maruyama and is based on a C-like DSL. It is currently better suited for running multi-GPU stencil codes than LibGeoDecomp. LibGeoDecomp has the upper hand on other architectures (e.g. Blue Gene/Q or Intel Xeon Phi) or if the simulation model is not a plain stencil code (e.g. particle-in-cell codes).
  • Patus is a code generator geared at stencil codes, with optimizations for various architectures. Similar to Physis it comes with a DSL which users need to adopt to specify their kernels. This is different from the stance LibGeoDecomp takes: LibGeoDecomp has been designed to work with kernels written in C++ to ease adoption of legacy codes. But kernels can also be written in C and (with some restrictions) Fortran. Patus is limited to single nodes while Physis and LibGeoDecomp run just fine on clusters and even supercomputers.
  • Cactus is a problem solving environment written in C, C++ and Fortran. It comes with a huge library of existing kernels (called Thorns) that allow users to quickly put together new simulations. It's been proven to scale on most major supercomputers. Use it if one of its Thorns is a good match for your simulation model. Use LibGeoDecomp if you need to support accelerators (GPUs or Xeon Phi) or if your model is complex and can be formulated more elegantly in C++. Depending on the use case LibGeoDecomp may also run a tad faster (for instance because it supports cache blocking). We are currently working on interfacing Cactus and LibGeoDecomp. Kurt Kanzenbach has written a generator for interface code, which already allows us to run some Thorns within LibGeoDecomp. The code might even allow integration of LibGeoDecomp as a special Driver Thorn within Cactus.

Q: Do you use a DSL?

A: Yes and no. While the newer SoA code may be considered an embedded DSL (embedded into C++), the main part of the library is really a library with a 2-way callback. In a way it's similar to Qt. Qt handles the event loop and will call back your code when it's appropriate. So does LibGeoDecomp. Only that we don't do GUIs, but simulations. Our callback works like this: you start a simulation by calling run() on a Simulator object. The Simulator will then call back your model (update() in your cell class). Your cell may then in turn call back the library via a so-called neighborhood object. The neighborhood object really is a proxy object by which user code may get access to neighboring cells of the last time step. For more details take a look at the examples.

So much for the callback. The reasons why we don't use a full DSL with an external compiler are simple:

  • Porting existing code towards LibGeoDecomp is easier if you don't have to reimplement it in some arcane DSL.
  • Debugging is much more convenient if you can step through the actual code you wrote, not some illegible machine-generated gibberish.
  • It's less complicated and pulls in fewer dependencies. This is a key feature when building on some highly specialized architecture (think Cray or the Fujitsu line of vector machines).
  • As our benchmarks show: our library approach yields competitive performance. We just don't need a DSL. Our template-approach allows us to do all necessary transformations.

Q: For how long will LibGeoDecomp live? Can I trust it to be maintained?

A: It is hard to predict the future, so I won't try. LibGeoDecomp has been around since 2006, which makes it older than CUDA (2007) and the latest release of PVM (2009, does anyone even remember PVM?) and only slightly younger than the first Intel dual core processor (Paxville DP, 2005). Today LibGeoDecomp is standing stronger than ever, so we have no reason to abandon it. Even if we did: its code is well hackable and maintainable, and its license is a permissive, liberal open source license. Others could take over, if need be.

Q: Is LibGeoDecomp tested?

A: Yes, it surely is. Twice a day our custom CI scripts test more than 20 different build configurations on a variety of architectures, ranging from Intel x86-64 servers to NVIDIA GPUs and (yes, we do that) an IBM Cell Blade. We run functional tests (unit tests and integration tests) and performance tests.

Q: How hackable is LibGeoDecomp?

A: A common question is this: How hard is it for a third party developer to add features to LibGeoDecomp? There is probably no globally accepted metric for how well extendable a code is, so let me just list a couple of facts:

Q: What prerequisites are required for running/compiling LibGeoDecomp?

A: This depends on which of the library's features you wish to use. Most dependencies are optional. For a basic build you'll only need:

Additionally you may wish to provide:

  • CUDA toolkit for running on NVIDIA graphics cards,
  • an MPI installation (we mostly use Open MPI) for cluster support.

If you don't just want to use the library, but also to modify it, then the build system may want to auto-regenerate LibGeoDecomp's internal MPI datatypes. This requires:

Some of the examples come with a Qt GUI and may use a camera for input. For those you will need:

Q: Which compiler works with LibGeoDecomp?

A: LibGeoDecomp is written in standard C++ so it should work out of the box with all C++ compilers. But because the world is big and bad (and so is the C++ standard) things are not that easy. We test LibGeoDecomp regularly with the following compilers:

  • Intel C++ Compiler (icpc) 10.1
  • Intel C++ Compiler (icpc) 11.1
  • Intel C++ Compiler (icpc) 12.1
  • Intel C++ Compiler (icpc) 13.1
  • GCC 4.2.2
  • GCC 4.3.4
  • GCC 4.4.7
  • GCC 4.5.3
  • GCC 4.6.3
  • GCC 4.7.0
  • GCC 4.7.1
  • GCC 4.7.3
  • GCC 4.8.1
  • GCC 4.9.0
  • Clang 3.5.0
  • Clang 3.7.0

Incompatibilities have been found with Clang 3.4.2 and Clang 3.6.0. Both struggle with the expression templates in LibFlatArray and thus generate vectorized code that yields wrong results.

Q: Which version of MPI/package XY should I use?

A: Theoretically LibGeoDecomp shouldn't be picky regarding the version of a given package. However, here is a list of versions that we regularly test for compatibility.

Package    Minimum Version  Tested Versions
CMake      2.6.4            2.6.4, 2.8.2, 2.8.5, 2.8.8, 2.8.9, 2.8.11, 3.0.2
CUDA       4.0              5.5.22, 6.0.37, 6.5.14, 7.0.18
Open MPI   -                1.4.2, 1.5.3, 1.5.5, 1.6.0, 1.6.5, 1.8.3
Intel MPI  -                4.0.1, 4.0.3
Boost      -                1.37, 1.44, 1.47, 1.48, 1.49, 1.53, 1.56
Qt         4.0              4.8.1, 4.8.2, 4.8.5, 4.8.6
OpenCV     -                2.3.1a, 2.4.5, 2.4.9

Q: How do I build LibGeoDecomp with Intel's C++ compiler?

A: With CMake you can select the C++ compiler via CMAKE_CXX_COMPILER. Additionally I'd suggest setting some compiler flags to tone down the warnings caused by some Boost libraries:

cmake -DCMAKE_CXX_COMPILER=icpc -DADDITIONAL_COMPILE_FLAGS="-Wall -Wno-sign-compare -wd383 -wd981 -wd1418" ../../

Q: Build recipe for JUQUEEN?

A: JUQUEEN is the IBM Blue Gene/Q at Forschungszentrum Jülich. At the time of this writing, it is the fastest European system. Building on JUQUEEN is slightly tricky, as you'll want to use xlc++, IBM's C++ compiler, which still has trouble with templates. And pretty much everything in LibGeoDecomp is a template. Also, no interactive jobs are available on this machine, which means you can't run the unit tests on your build machine.

module load boost
module load cmake
cmake -DUNITEXEC=echo -DMPIEXEC=echo -DCMAKE_C_COMPILER=xlc -DCMAKE_CXX_COMPILER=mpixlcxx -DADDITIONAL_COMPILE_FLAGS="-I/bgsys/local/boost/1.47.0" -DBoost_NO_BOOST_CMAKE=true -DBOOST_INCLUDEDIR="/bgsys/local/boost/1.47.0" -DLIB_LINKAGE_TYPE=STATIC -DWITH_QT=false ../../

Q: How to build on NERSC's Edison?

A: Building on Edison is straightforward. I suggest silencing the Intel C++ compiler's warning 2304, as the stock Boost install on Edison has wrongly marked some c-tors as explicit. You should also override MPIEXEC so that the build doesn't try to run MPI-based unit tests on the build machine:

module load PrgEnv-intel
module load cmake
module load boost
module load mercurial
git clone https://github.com/STEllAR-GROUP/libgeodecomp.git
cd libgeodecomp/
mkdir build
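cd build
# Assumed configure step (not given in the original recipe): silence icpc warning 2304, keep MPI tests off the build node
cmake -DMPIEXEC=echo -DADDITIONAL_COMPILE_FLAGS="-wd2304" ../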
make -j10
make -j10 check

Q: How to build on SuperMUC?

A: You can't pull external git repositories on SuperMUC and there is no build of Boost for the PGI and Intel compilers. You'll thus have to scp your checkout of LibGeoDecomp to one of the login nodes. And I suggest using GCC for compilation. In essence, the default build flags are absolutely fine for this machine:

module load cmake
module load boost/1.47_gcc
cd libgeodecomp/
BUILD_DIR=build/`uname -ms | sed s/\ /-/g`
mkdir -p $BUILD_DIR
cd $BUILD_DIR
cmake ../../
make -j10

Q: Building on Tsubame 2.5?

A: LibGeoDecomp will build on Tsubame almost out of the box. Since firewall rules are strict, you may not be able to directly clone our GitHub repository, but will rather have to copy it over from another machine.

cd libgeodecomp/
BUILD_DIR=build/`uname -ms | sed s/\ /-/g`
mkdir -p $BUILD_DIR
cd $BUILD_DIR
cmake -DWITH_BOOST_ASIO=false -DWITH_BOOST_MPI=false -DWITH_TYPEMAPS=false ../../
make -j10
make check

Q: Build instructions for Edison (Cray XC30)?

A: Here is how to build with GCC:

module load PrgEnv-gnu
module load cmake
module load boost
cd libgeodecomp/
BUILD_DIR=build/`uname -ms | sed s/\ /-/g`
mkdir -p $BUILD_DIR
cd $BUILD_DIR
cmake ../../
make -j10

Minor changes for Intel's compiler ICPC:

module load PrgEnv-intel
module load cmake
module load boost
cd libgeodecomp/
BUILD_DIR=build/`uname -ms | sed s/\ /-/g`
mkdir -p $BUILD_DIR
cd $BUILD_DIR
cmake ../../
make -j10

Q: What if CMake doesn't find my Boost installation?

A: If you've installed Boost in a non-standard location (e.g. your home), then you can always set BOOST_ROOT. This is how we build on Tsubame 2.0:

cmake -DBOOST_ROOT=/home/usr5/A2401222/boost_1_50_0/ -DCMAKE_CXX_COMPILER=icpc -DADDITIONAL_COMPILE_FLAGS="-Wall -Wno-sign-compare -wd383 -wd981 -wd1418" ../../

Q: LibGeoDecomp doesn't build on my host, what should I do?

A: Easy: send us your CMake output and make output! Which system (hardware vendor, operating system) are you trying to build on? CMake should do the right thing to discover your environment. But given the variety of target platforms, I wouldn't be surprised if it fails on some systems. We'll be happy to fix that! Below is a script which captures the relevant output in the files cmake.log and make.log. Run it from your build directory.

cmake ../.. 2>&1 | tee cmake.log
make 2>&1 | tee make.log

Q: Why doesn't CMake honor the compile flags I've specified?

A: Under certain conditions the value of a build variable and CMake's argument parsing may interfere. In this case you should add :STRING to the definition. Here you see how to set a certain machine architecture, handy if you're building on a 64-bit capable machine whose toolchain defaults to building 32-bit code.

cmake -DADDITIONAL_COMPILE_FLAGS:STRING="-march=core2 -m64" ../../

Q: What about Fortran?

A: LibGeoDecomp makes heavy use of C++ class templates, so making it work with Fortran might not appear to be straightforward. Indeed it takes some thinking, but in the end it's not really complicated. Here is a quick example.

Q: What is the interface between library and user code?

A: Quite simple, really: you call us, we call you, you call us. A bit more verbose:

  1. You instantiate a Simulator object with your simulation model as a template parameter. The model describes the data a single grid cell holds and the computations it performs in each time step.
  2. After all IO objects (an Initializer to set up initial conditions, Writers for output, and optionally Steerers for modifying the simulation at runtime) have been added, you call run() on the Simulator.
  3. The Simulator will then call back your code's update(): once for each cell and timestep.
  4. You in turn may call back the library via a neighborhood object provided to you during update() to retrieve the state of your cell and the neighboring cells in the last time step.
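The four steps above can be sketched with a toy, self-contained imitation of this control flow. None of the names below belong to LibGeoDecomp: MiniSimulator, RingHood and ShiftCell are hypothetical, the grid is 1D with periodic boundaries, and the initial grid is passed to the constructor instead of being produced by an Initializer. The sketch only mirrors the "you call us, we call you, you call us" pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical neighborhood proxy (step 4): read-only access to the
// previous time step, relative to the current cell, periodic borders.
template<typename Cell>
class RingHood
{
public:
    RingHood(const std::vector<Cell>& oldGrid, std::size_t center) :
        oldGrid(oldGrid), center(center)
    {}

    const Cell& operator[](int offset) const
    {
        long size = static_cast<long>(oldGrid.size());
        long index = (static_cast<long>(center) + offset % size + size) % size;
        return oldGrid[static_cast<std::size_t>(index)];
    }

private:
    const std::vector<Cell>& oldGrid;
    std::size_t center;
};

// Hypothetical simulator (steps 1 to 3): the model is a template parameter.
template<typename Cell>
class MiniSimulator
{
public:
    typedef std::function<void(const std::vector<Cell>&, unsigned)> Writer;

    explicit MiniSimulator(const std::vector<Cell>& initialGrid) :
        oldGrid(initialGrid), newGrid(initialGrid)
    {
        assert(!initialGrid.empty());
    }

    void addWriter(Writer writer)
    {
        writers.push_back(writer);
    }

    void run(unsigned maxSteps)
    {
        for (unsigned step = 0; step < maxSteps; ++step) {
            // "We call you": once per cell and time step.
            for (std::size_t i = 0; i < oldGrid.size(); ++i) {
                newGrid[i].update(RingHood<Cell>(oldGrid, i));
            }
            std::swap(oldGrid, newGrid);
            for (auto& writer : writers) {
                writer(oldGrid, step + 1);
            }
        }
    }

    const std::vector<Cell>& grid() const
    {
        return oldGrid;
    }

private:
    std::vector<Cell> oldGrid, newGrid;
    std::vector<Writer> writers;
};

// Hypothetical user model: every cell takes over its left neighbor's value.
struct ShiftCell
{
    int value = 0;

    template<typename Hood>
    void update(const Hood& hood)
    {
        value = hood[-1].value;  // "you call us": query the last time step
    }
};
```

A caller would fill a std::vector<ShiftCell>, construct a MiniSimulator<ShiftCell> from it, optionally register a writer via addWriter(), and then call run(). Each call to run(n) shifts the values n cells to the right around the ring.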

More details can be found in the examples. I highly recommend taking a look at the implementations of Jacobi and Game of Life.

Please send your questions and inquiries to our mailing list.

last modified: Sun Dec 20 08:44:09 2015 +0100