Developers' Guide

This page contains details on LibGeoDecomp's internals. We don't have a great deal of documentation online, as most of the team works at FAU. However, here are some starting points.

  • The main repository is hosted on GitHub. You can also report bugs there.
  • The API documentation is available here.

LibGeoDecomp is written in 1TBS (one true brace style) and mostly follows the coding style of Qt and KDE. Compliance is important to keep the codebase consistent and readable.
  • Pass primitive types (int, double, char, size_t) by value, all others as const references. Exception: if a function is meant to modify its parameters, then pass those as pointers.
  • Use const TYPE& var instead of TYPE const & var.
  • Don't repeat yourself! When writing comments, please focus on the WHY and not the WHAT. What the code does should be obvious from the method name and/or the code itself. But if your code is doing something complex or counterintuitive, a short explanation is helpful.
  • Sometimes you'll need to bind template arguments of a frequently used type in a typedef, e.g. typedef typename CELL_TYPE::Topology Topology (see the example below).
  • Header guards are all caps, delimited by underscores. They are simply the path of the given file within the source repository, e.g. #ifndef LIBGEODECOMP_MISC_GRID_H for misc/grid.h.
  • Layout:
    • Indentation: 4 spaces, no tabs
    • No indentation for namespaces
    • No trailing whitespace
  • Naming conventions
    • Use CamelCase for classes, typenames, functions, variables. All types are capitalized, all functions and variables start with lower case characters.
    • ALL_CONSTANTS are all caps, delimited by underscores.
    • Please don't shorten variable names to crptcAbvtns or single letters. No one can remember what k2, alpha, xx and such mean, not even you. Bytes are cheap these days. Why not use descriptiveNames?
#include <algorithm>
#include <cstddef>

namespace LibGeoDecomp {

template<typename CARGO>
class Container
{
public:
    friend class ThatOtherClass;
    typedef typename CARGO::Topology Topology;
    static const int DIM = Topology::DIM;

    inline Container(std::size_t size, const CARGO& defaultValue)
    {
        myData = new CARGO[size];
        for (std::size_t i = 0; i < size; ++i) {
            myData[i] = defaultValue;
        }
    }

    ~Container()
    {
        // myData was allocated with new[], so it has to be released with delete[]
        delete[] myData;
    }

    inline void setFront(const CARGO& newFront)
    {
        myData[0] = newFront;
    }

    /**
     * Will exchange the front element with the value newFront points to.
     * newFront is passed as a pointer because it gets modified.
     */
    inline void swapFront(CARGO *newFront)
    {
        std::swap(myData[0], *newFront);
    }

    inline CARGO *data()
    {
        return myData;
    }

private:
    CARGO *myData;
};

}
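
The header guard rule from the list above can also be expressed as a small function. The following is only a sketch: headerGuard() is a hypothetical helper, not part of LibGeoDecomp, and it assumes that every character of the path which is not a letter or digit is mapped to an underscore. The text above only confirms the result for misc/grid.h.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of LibGeoDecomp): derives the header guard
// for a file from its path relative to the source directory.
inline std::string headerGuard(const std::string& relativePath)
{
    assert(!relativePath.empty());

    std::string guard = "LIBGEODECOMP_";
    for (std::size_t i = 0; i < relativePath.size(); ++i) {
        unsigned char character = static_cast<unsigned char>(relativePath[i]);
        if (std::isalnum(character)) {
            guard += static_cast<char>(std::toupper(character));
        } else {
            // assumption: path separators and the dot before the file
            // extension both become underscores
            guard += '_';
        }
    }

    return guard;
}
```

For misc/grid.h this yields LIBGEODECOMP_MISC_GRID_H, which matches the example given above.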

Lines of Code

The C++ stats include libgeodecomp/src only; the build system includes all CMake files and the TypemapGenerator. All stats exclude generated files.

[Plot: lines of code vs. revisions]

Revisions vs. Time

[Plot: revisions vs. date]

Fixmes

How often does the term fixme occur in the code? (Fewer would be better.)

[Plot: occurrences of the term FIXME vs. revisions (lower is better)]

[Plot: number of undocumented classes vs. revisions (lower is better)]

Performance plots typically use seconds (s, the fewer the better) or Giga Lattice Updates Per Second (GLUPS, the more the better) as their unit of measure. They contain multiple curves which compare implementations based on LibGeoDecomp with manual implementations. The names of the curves indicate which kind of implementation was used:

  • Metal names (bronze, silver, gold, platinum) are assigned to implementations which use LibGeoDecomp, with bronze being the least optimized implementation and platinum representing the highest degree of optimization.
  • Spices (vanilla, pepper) are used for manual, mostly C-style implementations. They mimic the codes developers would need to write if they had no access to LibGeoDecomp. Vanilla is typically a very basic code, while pepper may contain optimizations.

Please mind the different scales in the plots. Some benchmarks were introduced at different times to the various architectures, which is why the horizontal scales differ. And of course the vertical scales differ, too, as the different hardware architectures yield different levels of performance.
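
To relate the two units to each other: a lattice update is one update of a single grid cell, so GLUPS can be derived from a measured run time as (cells per time step × time steps) / (seconds × 10^9). The following is a minimal sketch of that arithmetic. The function name and its parameters are made up for illustration and are not taken from the benchmark code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration: converts a measured run time into
// Giga Lattice Updates Per Second (GLUPS).
inline double gigaLatticeUpdatesPerSecond(
    std::size_t cellsPerStep,
    std::size_t timeSteps,
    double seconds)
{
    assert(seconds > 0.0);

    // every cell is updated once per time step
    double latticeUpdates =
        static_cast<double>(cellsPerStep) * static_cast<double>(timeSteps);

    return latticeUpdates / seconds * 1e-9;
}
```

For example, updating one million cells for 1000 time steps in 2 seconds corresponds to 0.5 GLUPS.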

1. Applications

This category contains multiple benchmarks which are identical to, or at least good representatives of, the kernels found in actual simulation codes. They were chosen so that they allow comparison with competing implementations.

Lattice Boltzmann Method (NVIDIA Tesla K20c)

[Plot: performance of a 3D LBM iteration on an NVIDIA Tesla K20c GPU (higher is better)]

Lattice Boltzmann Method (NVIDIA Tesla C2050)

[Plot: performance of a 3D LBM iteration on an NVIDIA Tesla C2050 GPU (higher is better)]

Lattice Boltzmann Method (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of a 3D LBM iteration on a Sandy Bridge-EP CPU (higher is better)]

Lattice Boltzmann Method (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of a 3D LBM iteration on a Sandy Bridge CPU (higher is better)]

Lattice Boltzmann Method (Intel Xeon X5650, Nehalem)

[Plot: performance of a 3D LBM iteration on a Nehalem CPU (higher is better)]

Reverse Time Migration (NVIDIA Tesla K20c)

[Plot: performance of a 3D RTM iteration on an NVIDIA Tesla K20c GPU (higher is better)]

Reverse Time Migration (NVIDIA Tesla C2050)

[Plot: performance of a 3D RTM iteration on an NVIDIA Tesla C2050 GPU (higher is better)]

Jacobi 3D, CPU (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of a 3D Jacobi iteration on a Sandy Bridge-EP CPU (higher is better)]

Jacobi 3D, CPU (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of a 3D Jacobi iteration on a Sandy Bridge CPU (higher is better)]

Jacobi 3D, CPU (Intel Xeon X5650, Nehalem)

[Plot: performance of a 3D Jacobi iteration on a Nehalem CPU (higher is better)]

2. Geometry Subsystem

The geometry subsystem is responsible for the domain decomposition, i.e. which parts of the grid are assigned to which node, how can we determine halos, and how can we traverse the coordinates? The primary purpose of the benchmarks in this category is to discover involuntary performance degradations.
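
To make these questions concrete, here is a deliberately simple sketch which splits the rows of a grid into contiguous stripes, one per node, and expands each stripe by a halo. It does not use LibGeoDecomp's geometry classes. RowRange, ownedRows() and rowsWithHalo() are hypothetical names, and the striping scheme is only an assumption chosen for brevity.

```cpp
#include <algorithm>
#include <cassert>

// Half-open interval [begin, end) of grid rows.
struct RowRange
{
    int begin;
    int end;
};

// Rows owned by the given node if gridRows rows are split into contiguous
// stripes. The first (gridRows % numNodes) nodes receive one extra row.
inline RowRange ownedRows(int gridRows, int numNodes, int node)
{
    assert(gridRows >= 0);
    assert(numNodes > 0);
    assert(node >= 0 && node < numNodes);

    int baseLength = gridRows / numNodes;
    int remainder = gridRows % numNodes;
    int begin = node * baseLength + std::min(node, remainder);
    int length = baseLength + (node < remainder ? 1 : 0);

    RowRange range = { begin, begin + length };
    return range;
}

// Owned rows expanded by haloWidth rows on both sides. The result is
// clipped to the grid, so the outermost stripes have a halo on one side only.
inline RowRange rowsWithHalo(int gridRows, int numNodes, int node, int haloWidth)
{
    assert(haloWidth >= 0);

    RowRange owned = ownedRows(gridRows, numNodes, node);
    RowRange expanded = {
        std::max(0, owned.begin - haloWidth),
        std::min(gridRows, owned.end + haloWidth)
    };
    return expanded;
}
```

With 10 rows on 3 nodes, node 1 owns rows [4, 7) and, for a halo width of 1, additionally needs rows 3 and 7 from its neighbors.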

Coordinate Enumeration (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of Region::Iterator and Coord<3>::operator+=() on a Sandy Bridge-EP CPU (lower is better)]

Coordinate Enumeration (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of Region::Iterator and Coord<3>::operator+=() on a Sandy Bridge CPU (lower is better)]

Coordinate Enumeration (Intel Xeon X5650, Nehalem)

[Plot: performance of Region::Iterator and Coord<3>::operator+=() on a Nehalem CPU (lower is better)]

Region Count (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of the count operation of a Region object on a Sandy Bridge-EP CPU (lower is better)]

Region Count (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of the count operation of a Region object on a Sandy Bridge CPU (lower is better)]

Region Count (Intel Xeon X5650, Nehalem)

[Plot: performance of the count operation of a Region object on a Nehalem CPU (lower is better)]

Region Insert (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of the insert operation on a Region object on a Sandy Bridge-EP CPU (lower is better)]

Region Insert (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of the insert operation on a Region object on a Sandy Bridge CPU (lower is better)]

Region Insert (Intel Xeon X5650, Nehalem)

[Plot: performance of the insert operation on a Region object on a Nehalem CPU (lower is better)]

Region Intersect (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of the intersect operation on a Region object on a Sandy Bridge-EP CPU (lower is better)]

Region Intersect (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of the intersect operation on a Region object on a Sandy Bridge CPU (lower is better)]

Region Intersect (Intel Xeon X5650, Nehalem)

[Plot: performance of the intersect operation on a Region object on a Nehalem CPU (lower is better)]

3. MPI

Some of our components naturally rely on MPI. This set of benchmarks evaluates how well they perform.

PatchLink<Grid<MySimpleCell> > (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of a PatchLink<Grid<MySimpleCell> > on a Sandy Bridge-EP CPU (higher is better)]

PatchLink<Grid<MySimpleCell> > (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of a PatchLink<Grid<MySimpleCell> > on a Sandy Bridge CPU (higher is better)]

PatchLink<Grid<MySimpleCell> > (Intel Xeon X5650, Nehalem)

[Plot: performance of a PatchLink<Grid<MySimpleCell> > on a Nehalem CPU (lower is better)]

PatchLink<Grid<TestCell<3> > > (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of a PatchLink<Grid<TestCell<3> > > on a Sandy Bridge-EP CPU (lower is better)]

PatchLink<Grid<TestCell<3> > > (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of a PatchLink<Grid<TestCell<3> > > on a Sandy Bridge CPU (lower is better)]

PatchLink<Grid<TestCell<3> > > (Intel Xeon X5650, Nehalem)

[Plot: performance of a PatchLink<Grid<TestCell<3> > > on a Nehalem CPU (lower is better)]

CollectingWriter<MySimpleCell> (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of a CollectingWriter<MySimpleCell> on a Sandy Bridge-EP CPU (higher is better)]

CollectingWriter<MySimpleCell> (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of a CollectingWriter<MySimpleCell> on a Sandy Bridge CPU (higher is better)]

CollectingWriter<MySimpleCell> (Intel Xeon X5650, Nehalem)

[Plot: performance of a CollectingWriter<MySimpleCell> on a Nehalem CPU (higher is better)]

CollectingWriter<TestCell<3> > (Intel Xeon E5-2670, Sandy Bridge-EP)

[Plot: performance of a CollectingWriter<TestCell<3> > on a Sandy Bridge-EP CPU (higher is better)]

CollectingWriter<TestCell<3> > (Intel Core i7-2600K, Sandy Bridge)

[Plot: performance of a CollectingWriter<TestCell<3> > on a Sandy Bridge CPU (higher is better)]

CollectingWriter<TestCell<3> > (Intel Xeon X5650, Nehalem)

[Plot: performance of a CollectingWriter<TestCell<3> > on a Nehalem CPU (higher is better)]

last modified: Thu Oct 23 01:38:48 2014 +0200