Execution of the SimSET Monte Carlo PET/SPECT Simulator in the Condor Distributed Computing Environment
SimSET is a package for simulation of emission tomography data sets. Condor is a popular distributed computing environment. Simple C/C++ applications and shell scripts are presented which allow the execution of SimSET on the Condor environment. This is accomplished without any modification to SimSET by executing multiple instances and using its combinebin utility. This enables research facilities without dedicated parallel computing systems to utilize the idle cycles of desktop workstations to greatly reduce the run times of their SimSET simulations. The necessary steps to implement this approach in other environments are presented along with sample results.
Key words: SimSET, Monte Carlo, Condor, emission tomography simulation, computer simulation, positron emission tomography (PET), single photon emission computed tomography (SPECT), high-performance computing, distributed computing, grid computing, open source, computers in medicine
Software simulators have been used extensively in the field of nuclear medicine since the 1990s. These simulators, designed to emulate the object being imaged, the nuclear physics, and the detector system, may be based on analytic or Monte Carlo models. They can be applied to study a number of different topics including system design, acquisition protocols, and reconstruction techniques. Readers interested in learning more about positron emission tomography (PET) and single photon emission computed tomography (SPECT) simulators are referred to [1,2] for a review of different simulation software packages and their applications.
The work described in this paper is based on the SimSET simulator [3]. It was selected for its wide acceptance and relatively extensive feature set. SimSET, which “uses Monte Carlo techniques to model the physical processes and instrumentation used in emission imaging”, was initially released in 1993 [3]. It was developed, and is still maintained and updated, by the University of Washington Imaging Research Laboratory. SimSET can model both SPECT and PET, and it models the important physical phenomena, including photoelectric absorption, Compton scattering, coherent scattering, photon noncollinearity, and positron range. It supports a variety of collimator and detector designs and already includes the attenuation properties of many common materials. SimSET and its source code can be downloaded from [3].
The computational complexity of simulators can often limit their utility. This becomes clear when trying to simulate realistic system configurations with high-resolution input data sets. It is not uncommon for simulations to run for a week or two on standard hardware. Much effort has been spent investigating techniques to reduce simulation run times. The most notable efforts in accelerating the simulation process have been applied to the SimSET simulator. Importance sampling techniques such as stratification, forced detection, non-absorption, and weight windows [8,9] can lead to an acceleration by a factor of at least 2 and up to more than 10 [2].
Even with acceleration techniques, the time required to run a simulation may be too long for simulators to be used in a practical manner in research laboratories. Running simulators in parallel environments is a promising way to further decrease the run time. To the authors’ knowledge, there are currently no PET/SPECT simulators designed to take advantage of more than a single processor. At least one attempt has been made to modify SimSET for execution in a parallel environment [10]. The chosen approach, however, may limit its acceptance in other research facilities because of its complexity and the technical skills required for proper deployment. The approach runs a modified version of SimSET on the NetSolve distributed environment, which increases maintenance because of both the code modification and the deployment throughout the grid.
In this paper, the authors introduce an alternative parallel implementation of SimSET based on the Condor high-throughput computing distributed environment [11]. This implementation requires no modification of SimSET, decreasing maintenance and simplifying installation. Distribution throughout the cluster can be automated using Condor’s file transfer mechanisms, and Condor’s architecture means no dedicated computing resources are required.
Condor, developed at the University of Wisconsin-Madison, provides “a job queuing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management” for execution of serial (e.g., batch) or parallel (MPI, PVM, or server-client) applications [11]. Because of its design, it can run effectively on a dedicated cluster, or it can rely on the unused resources of idle workstations, similar to SETI@Home [12,13] or Folding@Home [14]. It supports heterogeneous environments consisting of multiple machine architectures and operating systems. It can run with or without shared file systems, and it supports checkpointing of specially compiled applications (resuming execution of an application on the same or another computer after the original computer loses its idle status, crashes, or reboots). Condor has been widely accepted and was recently reported to be running on over 88,000 hosts worldwide [15].
Individual jobs or batches of jobs are submitted to Condor by providing a simple text file containing job resource requirements and Condor settings. Condor uses this information to allocate appropriate resources (assign jobs to computers that meet the hardware and software requirements), and to determine Condor-specific job settings dealing with job priority, file transfer, error handling, submitter notification, and emulation of the local execution environment on remote machines. Jobs continue to be monitored during execution, allowing graceful error handling and job migration as available resources change. For more information on Condor, see [11,16–19].
Several steps were required before SimSET could be executed successfully on Condor: a wrapper for SimSET to match its interface to that required by Condor, a script to set up and launch a SimSET run on Condor, and an application to finalize the results. Only the script needs to be modified to run SimSET on another Condor cluster.
By convention, the first parameter passed to any C/C++ application (argv[0]) is the invocation name of the program that is being run. SimSET is a single-executable, multiple-utility package that uses this first parameter to determine which utility the operator intended to run. The utility names that are invoked when running SimSET (e.g., phg and combinebin) are simply symbolic links to the SimSET application. For example, if the invocation name is phg, then SimSET runs the photon history generator utility.
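The dispatch-on-invocation-name pattern described above can be illustrated with a short shell sketch. This is not SimSET's code; the function name utility_for_name and the description printed for combinebin are assumptions made for illustration, and only the utility names phg and combinebin come from the text.

```shell
# Illustrative sketch: pick a utility from the name a program was invoked as.
# utility_for_name is a hypothetical helper, not part of SimSET.
utility_for_name() {
    # Strip leading directory components so that /usr/bin/phg becomes phg.
    name=$(basename "$1")
    case "$name" in
        phg)        echo "photon history generator" ;;
        combinebin) echo "histogram combiner" ;;
        *)          echo "unknown utility: $name"; return 1 ;;
    esac
}
```

In a real script the argument would be "$0". If the invocation name is replaced by something the table does not recognize, the lookup fails, which is the situation the next paragraph describes for Condor.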
When executing applications using Condor, the invocation name is replaced by the name of a Condor executable, so SimSET is unable to determine which utility the operator intended to run. To address this issue, a simple C wrapper was written. This wrapper, named simset_wrapper.c, can be found in Appendix 1. It simply removes the first argument and runs the remaining arguments as a command. For example, running the command “simset_wrapper phg phg.param” will cause the command “phg phg.param” to be executed. When SimSET is run through this wrapper, only the wrapper’s own invocation name is replaced by Condor, so the unavoidable replacement of the first parameter no longer interferes with the execution of SimSET.
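The wrapper in Appendix 1 is written in C. The same idea can be sketched in shell, purely as an analogue: discard the first word of the argument list and run the rest as a command. The function name run_without_first_arg is hypothetical.

```shell
# Shell analogue of the wrapper idea; the paper's actual wrapper is a C program.
# run_without_first_arg is a hypothetical name used only in this sketch.
run_without_first_arg() {
    # "$1" plays the role of the (possibly replaced) invocation name.
    # Discard it so the remaining words form the real command line.
    shift
    # Run the remaining words; the command now sees its own name first.
    "$@"
}
```

For example, `run_without_first_arg any_name phg phg.param` would run `phg phg.param`, regardless of what the first word has been replaced with.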
It was determined that the simplest way to run the serial SimSET application in parallel would be to run it multiple times, combining the results at the end using the combinebin utility. Since the events recorded in the histograms are independent of each other (one decay of a radioisotope does not affect another), this is an acceptable solution. A simulation of a large number of events can be accomplished by running multiple simulations of an appropriately smaller number of events and then combining their results using the combinebin utility. This approach, excluding the execution of combinebin, provides an essentially linear decrease in execution time for SimSET as the number of instances grows.
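Choosing the "appropriately smaller" event count is simple arithmetic. The sketch below divides a total event count across instances, rounding up so that the combined total is never below the target. How the authors chose their per-instance counts is not stated, so the function name events_per_instance and the round-up policy are assumptions.

```shell
# Sketch: events each instance should simulate so that N instances together
# cover at least the requested total. events_per_instance is hypothetical.
events_per_instance() {
    total=$1
    instances=$2
    if [ "$instances" -le 0 ]; then
        echo "instances must be positive" >&2
        return 1
    fi
    # Integer ceiling division: (total + instances - 1) / instances.
    echo $(( (total + instances - 1) / instances ))
}
```

For instance, 100000000 events split over 100 instances gives 1000000 events per instance. The resulting value would then be entered in the phg parameter file used for each run.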
An sh script, simset_batch.sh, found in Appendix 2, was designed to create the necessary Condor job files for executing multiple instances of SimSET. It takes as arguments the histogram name, the phg parameter file name, and the number of instances to run. For example, running “sh simset_batch.sh tutor1 phg.param 100” will cause the command “phg phg.param” to be executed 100 times. The tutor1.weight, tutor1.weight2, and tutor1.count files produced by each run will be combined at the end to create a single tutor1.weight, tutor1.weight2, and tutor1.count. The output of each run of phg will be saved in its own subdirectory for later reference.
For SimSET to run correctly using simset_batch.sh, relative path names to preexisting files in the SimSET parameter files will need to be changed. During execution, the working directory will be “./data/nodeX”. This means a reference such as “bin.param” will need to be changed to “../../bin.param”.
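The path adjustment can be expressed as a small shell helper. This is only a sketch of the rule stated above; the function name node_relative_path is hypothetical, and leaving absolute paths untouched is an assumption.

```shell
# Sketch: rewrite a path so it still resolves from ./data/nodeX.
# node_relative_path is a hypothetical helper for illustration.
node_relative_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;        # absolute paths are left alone
        *)  printf '%s\n' "../../$1" ;;  # climb out of ./data/nodeX
    esac
}
```

Calling `node_relative_path bin.param` prints `../../bin.param`, matching the example in the text.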
Lines 5–23 of simset_batch.sh remove any data from previous runs. Lines 24–47 create the Condor job file that will be used to run SimSET the appropriate number of times. See [15] for information on modifying the Condor parameters in this section for appropriate execution in your environment. At the very least, appropriate paths will need to be inserted. Specification of requirements may be necessary if SimSET is expected to produce large histograms. File transfer will need to be set up appropriately if working in a nonshared file system environment, and macros might need to be used for specification of executables and arguments in a multi-architecture, multi-operating system environment.
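To give a sense of what such a job file might contain, the sketch below prints a minimal Condor submit description that queues several runs of phg through the wrapper, each in its own node directory. It is not the content of Appendix 2: the function name write_phg_submit, the output file names, and all paths are hypothetical, and the node directories are assumed to exist before submission.

```shell
# Sketch only: print a Condor submit description for N runs of phg.
# write_phg_submit and every file name below are hypothetical.
write_phg_submit() {
    wrapper=$1      # absolute path to the compiled wrapper
    param=$2        # phg parameter file, as seen from each node directory
    instances=$3    # number of SimSET instances to queue
    # \$(Process) is escaped so the shell leaves it for Condor, which
    # replaces it with 0, 1, ... for the successive queued jobs.
    cat <<EOF
universe = vanilla
executable = $wrapper
arguments = phg $param
initialdir = data/node\$(Process)
output = phg.out
error = phg.err
log = phg.log
queue $instances
EOF
}
```

The printed text could be saved to a file and passed to condor_submit. In a nonshared file system environment, file transfer settings would need to be added as the paragraph above notes.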
Lines 48–68 create the Condor job responsible for combining the results from the different SimSET runs. This section may need to be appropriately modified to match your environment. simset_batch.sh concludes with lines 69–70 which are responsible for submitting the previously defined jobs for execution on Condor.
The C++ application simset_finalize.cpp, found in Appendix 3, is responsible for combining the results of the different SimSET runs. Other than the path to the SimSET combinebin utility, this application should never need changing. simset_finalize.cpp knows how many SimSET instances have been launched by Condor, and it loops through these looking for ones that have completed. As they complete, simset_finalize.cpp takes the results and uses combinebin to sum them. Please note that there are bugs in the combinebin portion of SimSET version 22.214.171.124. Contact SimSET support or the authors of this paper for the updates.
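The polling loop described above can be sketched in shell. This is not the Appendix 3 program: the function name combine_completed_runs, the POLL_SECONDS variable, and the use of a marker file to detect completion are all assumptions, since the text does not say how completion is detected. Because the combinebin command line is not described here, the combine step is passed in as a command name rather than spelled out.

```shell
# Sketch: visit node directories until each has finished, combining as we go.
# combine_completed_runs, the marker file, and POLL_SECONDS are hypothetical.
combine_completed_runs() {
    base=$1      # directory holding node0, node1, ...
    count=$2     # number of instances that were launched
    marker=$3    # file whose presence is assumed to mean "run finished"
    combine=$4   # command called once per finished node directory
    pending=$count
    combined=" " # space-separated list of node indices already handled
    while [ "$pending" -gt 0 ]; do
        i=0
        while [ "$i" -lt "$count" ]; do
            case "$combined" in
                *" $i "*) ;;  # this node was combined on an earlier pass
                *)
                    if [ -e "$base/node$i/$marker" ]; then
                        "$combine" "$base/node$i"
                        combined="$combined$i "
                        pending=$((pending - 1))
                    fi
                    ;;
            esac
            i=$((i + 1))
        done
        # Wait before the next pass only if some runs are still outstanding.
        if [ "$pending" -gt 0 ]; then
            sleep "${POLL_SECONDS:-5}"
        fi
    done
}
```

Combining results as each run finishes, rather than waiting for all of them, means the combination work overlaps with the remaining simulations.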
RESULTS AND DISCUSSION
Running SimSET in a distributed manner on the Condor system reduced run time considerably. These savings will become more significant as the number of events being simulated increases and as the simulations being run begin to model modern scanners more realistically.
Due to the way SimSET distributes decay events throughout the phantom, care needs to be exercised when deciding how many events to simulate and how many instances of SimSET to run in parallel. This becomes more important as the resolution of phantoms increases, the number of instances of SimSET running in parallel increases, and as the number of events being simulated decreases. See the empirical results section of [10] for a detailed discussion of this problem.
Significant decreases in the runtime of SimSET can be achieved by parallelization. This can be accomplished by running SimSET on a Condor cluster as described in this paper. This implementation required no changes to the SimSET source code and can be easily implemented in other environments. The only change that should be required is modification of the Condor job definitions specified in the script simset_batch.sh. This is a particularly attractive option for smaller research groups since Condor can utilize the unused CPU cycles in idle desktop workstations and does not require an expensive dedicated cluster.
The authors would like to thank Dr. Robert Harrison and the University of Washington Imaging Research Laboratory for supporting SimSET, and the University of Wisconsin-Madison computer science department for supporting Condor. They would also like to thank Research Computing at the Rochester Institute of Technology and its director Dr. Gurcharan S. Khanna for providing and supporting the computational resources used.