Code Structure
The core software is organized into three parts: the Belle II Analysis Software Framework basf2, which contains the Belle II-specific code; the externals, which contain third-party code on which basf2 depends; and the tools, which contain scripts for software installation and configuration.
Basf2
The Belle II-specific code is partitioned into about 40 packages, such as the base-level framework, one package for each detector component, the track reconstruction code, and the post-reconstruction analysis tools. Each package is managed by one or two librarians.
The code is written in C++, with the header and source files residing in include and src subdirectories, respectively. By default, one shared library is created per package and is installed in a top-level lib directory that is included in the user’s library path. The build system treats the package’s contents in pre-defined subdirectories as follows.
Modules The code is compiled into a shared library and installed in a top-level module directory, so that it can be dynamically loaded by basf2.
Tools C++ code is compiled into an executable and installed in a top-level bin directory that is included in the user’s path. Executable scripts, usually written in Python, are symlinked to this directory.
Dataobjects These classes define the organization of the data that can be stored in output files. The code is linked into a shared library with the _dataobjects suffix.
Scripts Python scripts are installed in a directory that is included in the Python path.
Data All files are symlinked to a top-level data folder.
Tests Unit and script tests (see “Basf2 Development Infrastructure and Procedures”).
Validation Scripts and reference histograms for validation plots (see “Basf2 Development Infrastructure and Procedures”).
Examples Example scripts that illustrate features of the package.
Users of basf2 usually work with centrally installed versions of basf2. At many sites, they are provided on CVMFS [3]. Users may also install pre-compiled binaries at a central location on their local systems with the b2install-release tool. If no pre-compiled version is available for their operating system, the tool compiles the requested version from source.
Externals
We require a few basic packages to be installed on a system, like a compiler, make, wget, tar, and git. The tool b2install-prepare checks whether these prerequisites are fulfilled and installs the missing packages if desired. All other third-party code on which we rely is bundled in the externals installation. It includes basic tools like GCC, Python 3, and bzip2, to avoid requiring a system-wide installation of specific versions at all sites, as well as HEP-specific software like ROOT [4], Geant4 [5], and EvtGen [6]. Some packages, like LLVM or Valgrind, are optional and not included in the compilation of the externals by default. The number of external products has grown over time to about 60, supplemented by 90 Python packages.
The instructions and scripts to build the externals are stored in a git repository. We use a makefile with specific commands for the download, compilation, and installation of each of the external packages. Copies of the upstream source packages are kept on a Belle II web server so that they remain available if the original source disappears and to provide redundancy if it is temporarily unavailable. The integrity of the downloaded files is checked using their SHA-256 digests.
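The kind of integrity check performed can be illustrated with a short Python sketch; the file name and digest below are placeholders, and the actual logic lives in the externals makefile rather than in this form.

    import hashlib

    def verify_download(path, expected_sha256):
        """Compare the SHA-256 digest of a downloaded source tarball with the expected value."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            # Read in chunks so that large source tarballs do not have to fit into memory.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected_sha256:
            raise RuntimeError(f"Checksum mismatch for {path}")

    # Hypothetical usage with placeholder values:
    # verify_download("root-6.14.04.tar.gz", "0123...abcd")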
The libraries, executables, and include files of all external packages are collected in the common directories lib, bin, and include, respectively, so that each of them can be referenced with a single path. For the external software that we might want to include in debugging efforts, such as ROOT or Geant4, we build a version with debug information to supplement the optimized version.
The compilation of the externals takes multiple hours and is not very convenient for users. Moreover, some users experience problems because of specific configurations of their systems. These problems and the related support effort are avoided by providing pre-compiled binary versions. We use Docker to compile the externals on several supported systems: Scientific Linux 6, Red Hat Enterprise Linux 7, Ubuntu 14.04, and the Ubuntu versions from 16.04 to 18.04. The b2install-externals tool conveniently downloads and unpacks the selected version of the pre-built externals.
Because the absolute path of an externals installation is arbitrary, we have invested significant effort to make the externals location-independent. Initial studies of moving from the custom makefile to Spack [7] have been carried out, with the aim of profiting from community solutions for the installation of typical HEP software stacks, but relocatability of the build products remains an issue.
Tools
The tools are a collection of shell and Python scripts for the installation and setup of the externals and basf2. The tools themselves are set up by sourcing the script b2setup. This script identifies the type of shell and then sources the corresponding sh- or csh-type setup shell script. This script, in turn, adds the tools directory to the PATH and PYTHONPATH environment variables, sets Belle II-specific environment variables, defines functions for the setup or configuration of further software components, and checks whether a newer version of the tools is available. A pre-defined set of directories is searched for files containing site-specific configurations. The Belle II-specific environment variables have the prefix BELLE2 and contain information like repository locations and access methods, software installation paths, and software configuration options.
Installation of externals and basf2 releases is handled by the shell scripts b2install-externals and b2install-release, respectively. Usually, they download and unpack the version-specific tarball of pre-compiled binaries for the given operating system. If no binary is available, the source code is checked out and compiled. Each version of the externals and basf2 releases is installed in a separate directory named after the version. For the compilation of the externals, we rely on the presence of a few basic tools, like make or tar, and development libraries with header files. Our tools contain a script that checks that these dependencies are fulfilled and, if necessary, installs the missing ones.
The command b2setup sets up the environment for a specified version of a basf2 release. It automatically sets up the externals version that is tied to this release, identified by the content of the .externals file in the release directory. An externals version can be set up independently of a basf2 release with the b2setup-externals command. The version-dependent setup of the externals is managed by the script externals.py in the externals directory. Externals and basf2 releases can be compiled in optimized or debug mode using GCC. In addition, basf2 supports compilation with the Clang or Intel compilers. These options can be selected with the b2code-option and b2code-option-externals commands. A distinct subdirectory is used for each option's libraries and executables. The commands that change the environment of the current shell are implemented as functions for sh-type shells and as aliases for csh-type shells.
The tools also support the setup of an environment for the development of basf2 code. The b2code-create command clones the basf2 git repository and checks out the master branch. The environment is set up by executing the b2setup command without arguments in the working directory. If a developer wants to modify one package and take the rest from a centrally installed release, the b2code-create command can be used with the version of the selected release as an additional argument that is stored in the file .release. The sparse checkout feature of git is used to get a working directory without checked-out code. Packages can then be checked out individually with the b2code-package-add command. The b2setup command sets up the environment for the local working directory and the centrally installed release. Further tools for the support of the development work are described in “Basf2 Development Infrastructure and Procedures”.
To make it easier for users to set up an environment for the development of post-reconstruction analysis code and to encourage them to store it in a git repository, the tools provide the b2analysis-create command. This requires a basf2 release version as one of the arguments and creates a working directory attached to a git repository on a central Belle II server. The basf2 release version is stored in a .analysis file and used by the b2setup command for the setup of the environment. The b2analysis-get command provides a convenient way to get a clone of an existing analysis repository and set up the build system.
The tools are designed to set up different versions of basf2 and externals and thus must be independent of them. For this reason, all binary code is placed in the externals. GCC and Python were originally embedded in the tools to avoid duplicating them in multiple externals versions, but this proved difficult to manage during updates. One of the prime challenges that we overcame in the development of the tools was to cope with the different shell types and various user environment settings.
Basf2 Development Infrastructure and Procedures
The basf2 code is maintained in a git repository. We use Bitbucket Server [8] to manage pull requests. This provides us with the ability to review and discuss code changes in pull requests before they are merged to the main development branch in the git repository. Compared to the previous workflow based on subversion, it helps the authors to improve the quality of their code and allows the reviewers to get a broader view of the software. We exploit the integration with the Jira [9] ticketing system for tracking and planning the development work.
Developers obtain a local copy of the code with the b2code-create tool (see “Tools”). The build system is based on SCons [10] because, compared to the HEP-standard CMake, the build process is a one-step procedure and the build configuration is written in Python, a language already adopted for the basf2 configuration steering files (see “Python Interface”). The time SCons needs to determine the dependencies before starting the build is reduced by optimizations, such as bypassing the check for changes of the externals. Developers and users usually do not have to provide explicit guidance to the build system; they only have to place their code in the proper subdirectories. However, if the code depends on a set of linked libraries, the developer lists them in the associated, typically three-line, SConscript file.
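As an illustration, such a package-level SConscript might look like the following sketch; the library names and the $ROOT_LIBS construction variable are assumptions chosen for the example rather than a verbatim file from the repository.

    # SConscript (sketch): declare the libraries this package links against.
    Import('env')

    # Hypothetical library list; the actual names depend on the package's dependencies.
    env['LIBS'] = ['framework', 'mdst_dataobjects', '$ROOT_LIBS']

    Return('env')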
We implement access control for git commits to the master branch using a hook script on the Bitbucket server. Librarians, identified by their user names in a .librarians file in the package directory, can directly commit code to their package. They can grant this permission to others by adding them to a .authors file. All Belle II members are permitted to commit code to any package in feature or bugfix branches. The merging of these branches to the master via pull requests must be approved by the librarians of the affected packages. Initially, when subversion was used for version control, direct commits to the master were the only supported workflow, but after the migration to git, pull requests are the recommended and more common way of contributing.
We have established coding conventions to achieve some conformity of the code. Because most of them cannot be enforced technically, we rely on developers and reviewers to follow them. We do enforce a certain style to emphasize that the code belongs to the collaboration and not to the individual developer. The AStyle tool [11] is used for C++ code, and pep8 [12] and autopep8 [13] for Python code. Some developers feel strongly about code formatting, so we make it easy to follow the rules and reduce their frustration by providing the b2code-style-check tool to print style violations and the b2code-style-fix tool to fix them automatically. The style conformity is checked by the Bitbucket server hook upon push to the central repository. The hook also rejects files larger than 1 MB to prevent uncontrolled growth of the repository size. To provide feedback to developers as early as possible and to avoid annoying rejections when commits are pushed to the central repository, we implement the checks of access rights, style, and file size in a hook for commits to the local git repository.
To facilitate test-driven development, unit tests can be implemented in each package using Google Test [14]. These are executed with the b2test-units command. Test steering files in all packages can be run with the b2test-scripts command. It compares the output to a reference file and complains if they differ or if the execution fails. The unit and steering file tests are executed by the Bamboo [15] build service whenever changes are pushed to the central repository. Branches can only be merged to the master if all tests succeed.
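A script test is an ordinary steering file placed in a package's tests directory; a minimal sketch could look as follows, where the fixed random seed keeps the log output reproducible so that it can be compared to the reference file.

    import basf2

    # Fix the random seed so that the log output is reproducible.
    basf2.set_random_seed("some-fixed-seed")

    main = basf2.create_path()
    main.add_module('EventInfoSetter', evtNumList=[5])
    main.add_module('EventInfoPrinter')
    basf2.process(main)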
The tests are also executed by a Buildbot [16] continuous integration system that compiles the code with the GCC, Clang, and Intel compilers and informs the authors of commits about new errors or warnings. Once a day, the Buildbot runs Cppcheck, a geometry overlap check, Doxygen and Sphinx [17] documentation generation, and a Valgrind memory check. The results are displayed on a web page, and the librarians are informed by email about issues in their packages. A detailed history of issues is stored in a MySQL database with a web interface that also shows the evolution of the execution time, output size, and memory usage of a typical job.
Higher level quality control is provided by the validation framework. It executes scripts in a package’s validation subdirectory to generate simulated data files and produce plots from them. The validation framework then spawns a web server to display the plots in comparison with a reference as well as results from previous validation runs. A software quality shifter checks the validation plots produced each night for regressions and informs the relevant individual(s) if necessary.
As a regular motivation for the librarians to review the changes in their package, we generate monthly builds. For a monthly build, we require all librarians to agree on a common commit on the master branch. They signal their agreement using the b2code-package-tag command to create a git tag of the agreed-upon common commit, with a tag name composed of the package name and a version number. The command asks for a summary of changes that is then used as the tag message and included in the announcement of the monthly build. The procedure of checking the agreement, building the code, and sending the announcement is fully automated with the Buildbot.
An extensive manual validation, including the production of much larger data samples, is done before releasing a major official version of basf2. Based on these major versions, minor or patch releases that require less or no validation effort are made. In addition, light basf2 releases containing only the packages required to analyze mini DST (mDST, see “Event Data Model”) data can be made by the analysis tools group convener. This allows for a faster release cycle of the analysis tools. Each release is triggered by pushing a tag to the central repository. The build process on multiple systems and the installation on CVMFS are then automated.
In maintaining or modifying the development infrastructure and procedures, we aim to keep the threshold for using and contributing to the software as low as possible and, at the same time, strengthen the mindset of a common collaborative project and raise awareness of code quality issues. This includes principles like early feedback and not bothering developers with tasks that can be done by a computer. For example, the tools complain about style-rule violations already on commits to the local git repository and offer programmed corrections. In this way, users and developers can focus on the development of their code and use their time more efficiently.
Modules, Parameters, and Paths
The data from the Belle II detector, or simulations thereof, are organized into a set of variable-duration runs, each containing a sequence of independent events. An event records the measurements of the by-products of an electron–positron collision or a cosmic ray passage. A set of runs with similar hardware state and operational characteristics is classified as an experiment. Belle II uses unsigned integers to identify each experiment, run, and event.
The basf2 framework executes a series of dynamically loaded modules to process a collection of events. The selection of modules, their configuration, and their order of execution are defined via a Python interface (see “Python Interface”).
A module is written in C++ or Python and derived from a Module base class that defines the following interface methods (a minimal Python example is shown after the list).
initialize() called before the processing of events to initialize the module.
beginRun() called each time before a sequence of events of a new run is processed, e.g., to initialize run-dependent data structures like monitoring histograms.
event() called for each processed event.
endRun() called each time after a sequence of events of the same run is processed, e.g., to collect run-summary information.
terminate() called after the processing of all events.
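As a minimal sketch, a Python module implementing this interface could look as follows; the counting logic is purely illustrative.

    import basf2

    class EventCounter(basf2.Module):
        """Illustrative module that counts the processed events per run."""

        def initialize(self):
            basf2.B2INFO("EventCounter: event processing starts")

        def beginRun(self):
            # Reset run-dependent bookkeeping, e.g. for monitoring histograms.
            self.events_in_run = 0

        def event(self):
            self.events_in_run += 1

        def endRun(self):
            basf2.B2INFO(f"EventCounter: {self.events_in_run} events processed in this run")

        def terminate(self):
            basf2.B2INFO("EventCounter: done")

Such a module is added to a processing path like any other module, as described below.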
Flags can be set in the constructor of a module to indicate, for example, that it is capable of running in parallel-processing mode (see “Parallel Processing”). The constructor sets a module description and defines module parameters that can be displayed on the terminal with the command basf2 -m.
A module parameter is a property whose value (or list of values) can be set by the user at runtime via the Python interface to tailor the module’s execution. Each parameter has a name, a description, and an optional default value.
The sequence in which the modules are executed is stored in an instance of the Path class. An integer result value that is set in a module’s event() method can be used for a conditional branching to another path. The processing of events is initiated by calling the process() method with one path as argument. The framework checks that there is exactly one module that sets the event numbers. It also collects information about the number of module calls and their execution time. This information can be printed after the event processing or saved in a ROOT file.
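The following sketch shows how a path is assembled and processed from the Python side; EventInfoSetter and its parameters are taken from the standard distribution, and the statistics printout at the end corresponds to the execution-time information mentioned above.

    import basf2

    # Create a module instance and set its parameters explicitly;
    # alternatively, parameters can be passed as keyword arguments to add_module().
    event_info = basf2.register_module('EventInfoSetter')
    event_info.param('evtNumList', [100])
    event_info.param('runList', [1])
    event_info.param('expList', [0])

    # Assemble the path and process the events.
    main = basf2.create_path()
    main.add_module(event_info)
    basf2.process(main)

    # Print the number of calls and execution time of each module.
    print(basf2.statistics)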
Log messages are managed by the framework and can be passed to different destinations, like the terminal or a text file, via connector classes. Methods for five levels of log messages are provided.
Fatal for situations where the program execution cannot be continued.
Error for things that went wrong and must be fixed. If an error happens during initialization, event processing is not started.
Warning for potential problems that should not be ignored and should be accepted only if understood.
Info for informational messages that are relevant to the user.
Debug for everything else, intended solely to provide useful detailed information for developers. An integer debug level is used to control the verbosity.
The log and debug levels can be set globally, per package, or per module.
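From a steering file, these thresholds are controlled through the logging object; the following is a minimal sketch of the global and per-module settings, assuming the standard basf2 Python interface.

    import basf2

    # Globally show only warnings and above ...
    basf2.logging.log_level = basf2.LogLevel.WARNING

    # ... but enable detailed debug output for a single module.
    geometry = basf2.register_module('Geometry')
    geometry.set_log_level(basf2.LogLevel.DEBUG)
    geometry.set_debug_level(100)

    # Messages issued from user code use the same levels.
    basf2.B2INFO("suppressed by the global WARNING threshold")
    basf2.B2WARNING("this message is shown")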
Data Store and I/O
Data Store
Modules exchange data via the Data Store that provides a globally accessible interface to mutable objects or arrays of objects. Objects (or arrays of objects) are identified by a name that, by default, corresponds to the class name. By convention, arrays are named by appending an “s” to the class name. Users may choose a different name to allow for different objects of the same type simultaneously. Objects in the Data Store can have either permanent or event-level durability. In the latter case, the framework clears them before the next event is processed. Client code can add objects to the Data Store, but not remove them.
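Within a Python module, Data Store content can be read through the PyStoreObj and PyStoreArray wrappers, as in this sketch; EventMetaData and Tracks are the default names of classes described under “Event Data Model” below.

    import basf2
    from ROOT import Belle2

    class DataStoreInspector(basf2.Module):
        """Illustrative module that reads an object and an array from the Data Store."""

        def event(self):
            # A single object, identified by its default name (the class name).
            event_meta = Belle2.PyStoreObj('EventMetaData')
            # An array of objects, named by convention with a trailing "s".
            tracks = Belle2.PyStoreArray('Tracks')
            basf2.B2INFO(f"event {event_meta.obj().getEvent()} "
                         f"contains {tracks.getEntries()} tracks")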
Within one event, two distinct arrays of objects in the Data Store can have weighted many-to-many relations between their elements. For example, a higher-level object might have relations to all lower-level objects that were used to create it. Each relation carries a real-valued weight that can be used to attach quantitative information, such as the fraction that a lower-level object contributed to the higher-level one. The relationship information is stored in a separate object; no direct pointers appear in the related objects. This allows us to strip parts of the event data without affecting data integrity: if one side of a relationship is removed, the whole relation is dropped. The relations are implemented by placing a RelationArray in the Data Store that records the names of the arrays it relates, as well as the indices and weights of the related entries. As the Data Store permits only appending entries to an array, the indices are preserved. The name of the relations object is formed by placing “To” between the names of the related arrays.
The interface to objects in the Data Store is implemented in the templated classes StoreObjPtr for single objects and StoreArray for arrays of objects, both derived from the common StoreAccessorBase class. They are constructed with the name identifying the objects; without any argument, the default name is used. Access to the objects is type-safe and transparent to the event-by-event changes of the Data Store content. To make the access efficient, the StoreAccessorBase translates the name on first access to a pointer to a DataStoreEntry object in the Data Store. The DataStoreEntry object is valid for the lifetime of the job and contains a pointer to the currently valid object, which is automatically updated by the Data Store. Access to an object in the Data Store thus requires an expensive string search only on the first access, and then a quick double dereferencing of a pointer on subsequent accesses.
The usage of relations is simplified by deriving the objects in a Data Store array from RelationsObject. It provides methods to directly ask an object for its relations to, from, or with (ignoring the direction) other objects. Non-persistent data members of RelationsObject and helper classes are used to make the relations lookup fast by avoiding regeneration of information that was obtained earlier.
We provide an interface to filter, update, or rebuild relations when some elements are removed from the Data Store. It is possible to copy whole or partial arrays in the Data Store; in this case, new relations between the original and copied arrays are created and, optionally, the existing relations of the original array are copied as well.
I/O
We use ROOT for persistency. This implies that all objects in the Data Store must have a valid ROOT dictionary. The RootOutputModule writes the content of the Data Store with permanent and event durability to a file with two separate TTrees, with a branch for each Data Store entry. The selection of branches, the file name, and some tree configurations can be specified using module parameters. The corresponding module for reading ROOT files is the RootInputModule.
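In a steering file, these modules are registered under the names RootInput and RootOutput and configured via their parameters, as in the following sketch; the branch selection shown is a hypothetical example.

    import basf2

    main = basf2.create_path()

    # Read events from an existing ROOT file ...
    main.add_module('RootInput', inputFileName='events.root')

    # ... possibly followed by reconstruction or analysis modules ...

    # ... and write a reduced selection of Data Store branches to a new file.
    main.add_module('RootOutput',
                    outputFileName='events_skimmed.root',
                    branchNames=['EventMetaData', 'Tracks', 'TrackFitResults'])

    basf2.process(main)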
The RootOutputModule writes an additional object named FileMetaData to the permanent-durability tree of each output file. It contains a logical file name, the number of events, information about the covered experiment/run/event range, the steering file content, and information about the file creation. The file metadata also contains a list of the logical file names of the input files, called parents, if any.
This information is used for the index file feature. A RootInputModule can be asked to load, in addition to the input file, its ancestors up to a generational level given as a parameter. A file catalog in XML format, created by the RootOutputModule, is consulted to translate logical to physical file names for the ancestor files. The unique event identifier is then used to locate and load the desired event. With the index file feature, one can produce a file containing only EventMetaData objects (see next section) of selected events, and then use this as the input file in a subsequent job to access the selected events in its parents. File-reading performance is not optimal, however, since the usual structure of TTrees in ROOT files is not designed for sparse event reading. The index file feature can also be used to add objects to an existing file without copying its full content, or to access lower-level information of individual events for display or debugging purposes.
The Belle II data-acquisition system uses a custom output format with a sequence of serialized ROOT objects to limit the loss of events in case of malfunctions. The files in this format are transient; they are converted to standard ROOT files for permanent storage.
Event Data Model
The Data Store implementation makes no assumption about the event data model. It can be chosen flexibly to match specific requirements. In basf2, the full event data model is defined dynamically by the creation of objects in the Data Store by the executed modules. The only mandatory component is the EventMetaData object. It uniquely identifies an event by its event, run, and experiment numbers and a production identifier to distinguish simulated events with the same event, run, and experiment numbers. The other data members store the time when the event was recorded or created, an error flag indicating problems in data taking, an optional weight for simulated events, and the logical file name of the parent file for the index file feature.
The format of the raw data is defined by the detector readout. Unpacker modules for each detector component convert the raw data to digit objects. In the case of simulation, the digit objects are created by digitizer modules from energy depositions that are generated by Geant4 and stored as detector-specific SimHits. The use of a common base class for SimHits allows for a common framework to add energy depositions from simulated machine-induced background to those of simulated physics signal processes. This is called background mixing.
The output of the reconstruction consists mainly of detector-specific objects. In contrast, the RecoTrack class is used to manage the pattern recognition and track fitting across multiple detectors. It allows us to add hits to a track candidate and is interfaced to GENFIT [18, 19] for the determination of track parameters.
The subset of reconstruction dataobjects used in physics analyses, called mini data summary table (mDST), is explicitly defined in the steering file function add_mdst_output (a usage sketch is shown after the list below). It consists of the following classes:
Track the object representing a reconstructed trajectory of a charged particle, containing references to track fit results for multiple mass hypotheses and a quality indicator that can be used to suppress fake tracks.
TrackFitResult the result of a track fit for a given particle hypothesis, consisting of five helix parameters, their covariance matrix, a fit p-value, and the pattern of layers with hits in the vertex detector and drift chamber.
V0 candidate of a \(K^0_S\) or \(\varLambda \) decay or of a converted photon, with references to the pair of positively and negatively charged daughter tracks and track fit results. The vertex fit result is not stored as it can be reconstituted at analysis level.
PIDLikelihood the object that stores, for a charged particle identified by the related track, the likelihoods for being an electron, muon, pion, kaon, proton or deuteron from each detector providing particle identification information.
ECLCluster reconstructed cluster in the electromagnetic calorimeter, containing the energy and position measurements and their correlations, along with shower-shape variables; a relation is recorded if the cluster is matched to an extrapolated track.
KLMCluster reconstructed cluster in the \(K^0_L\) and muon (KLM) detector, providing a position measurement and momentum estimate with uncertainties; a relation is recorded if the cluster is matched to an extrapolated track.
KlId candidate for a \(K^0_L\) meson, providing particle identification information via the weights of its relations to KLM and/or ECL clusters.
TRGSummary information about level 1 trigger decisions before and after prescaling, stored in bit patterns.
SoftwareTriggerResult the decision of the high-level trigger, implemented as a map of trigger names to trigger results.
MCParticle the information about a simulated particle (in case of simulated data), containing the momentum, production and decay vertex, relations to mother and daughter particles, and information about traversed detector components; relations are created if simulated particles are reconstructed as tracks or clusters.
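A steering file that writes mDST output would call the function roughly as follows; the import location and the filename keyword are assumptions made for the purpose of this sketch.

    import basf2
    from mdst import add_mdst_output  # assumed import location of the helper function

    main = basf2.create_path()
    main.add_module('RootInput', inputFileName='reconstructed_events.root')

    # Write only the mDST subset of the reconstruction dataobjects listed above.
    add_mdst_output(main, filename='events.mdst.root')  # keyword name is an assumption

    basf2.process(main)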
The average size of an mDST event is a critical performance parameter for the storage specification and for the I/O-bound analysis turnaround time. Therefore, the mDST content is strictly limited to information that is required by general physics analyses. In particular, no raw data information is stored. For detailed detector or reconstruction-algorithm performance studies, as well as for calibration tasks, a dedicated format, called cDST (calibration data summary table), is provided.