Search Results

Showing 1-20 of 233 results
  1. Article

    Interactive Performance Visualization and Analysis of Execution Traces for Pattern-Based Parallel Programming

    We introduce the design and implementation of a performance visualization system for high-level programming of heterogeneous parallel systems. The...

    August Ernstsson, Elin Frankell, Christoph Kessler in International Journal of Parallel Programming
    01 October 2025 Open access
  2. Article

    Analysis of Model Parallelism for AI Applications on a 64-core RV64 Server CPU

    Massive Data Parallel workloads, driven by inference on large ML models, are pushing hardware vendors to develop efficient and cost-effective...

    Giulio Malenza, Adriano Marques Garcia, ... Marco Aldinucci in International Journal of Parallel Programming
    30 June 2025
  3. Article

    Simulation-Based Parameter Optimization for Self-adaptive HPL on Parallel Systems

    Computational benchmarks are essential for dependable systems, applications, and technologies across multiple domains. However, traditional...

    Cassiano Rista, Marcelo Teixeira, Mauro Fonseca in International Journal of Parallel Programming
    16 June 2025
  4. Article

    Generating Sparse Matrices for Large-Scale Spectral Clustering on a Single GPU

    Spectral clustering has many fundamental advantages over k-means clustering, but comes at much higher time complexity and memory requirements mainly...

    Guanlin He, Stéphane Vialle, Marc Baboulin in International Journal of Parallel Programming
    26 May 2025
  5. Article

    Celerity-RSim: Porting Light Propagation Simulation to Accelerator Clusters Using a High-Level API

    Time-of-Flight (ToF) camera systems are increasingly capable of analyzing larger 3D spaces and providing more detailed and precise results. To...

    Peter Thoman, Philipp Gschwandtner, ... Thomas Fahringer in International Journal of Parallel Programming
    24 March 2025 Open access
  6. Article

    Advancing Interactive Parallelization: iCetus

    Despite advancements in parallelization tools, optimizing scientific applications remains a complex and time-consuming task due to the iterative...

    Parinaz Barakhshan, Rudolf Eigenmann in International Journal of Parallel Programming
    25 February 2025 Open access
  7. Article

    pi-par: A Dependently-Typed Parallel Language with Algorithmic Skeletons

    Algorithmic skeletons are an effective, pattern-based approach for parallelising software. However, despite implementations for a range of languages...

    Christopher Brown, Adam D. Barwell in International Journal of Parallel Programming
    25 February 2025 Open access
  8. Article

    Automatic Heterogeneous Runtime Using Signal Processing Domain-Specific and Parallel Patterns

    Parallel and signal processing patterns for large-scale radio data applications have been captured with a new domain-specific language (DSL),...

    Yaseen Zaidi, Simon Winberg in International Journal of Parallel Programming
    25 February 2025 Open access
  9. Article

    Parallelizing RNA-Seq Analysis with BioSkel: A FastFlow Based Prototype

    Over the past decade, the widespread adoption of RNA-seq methodology for transcript-level monitoring has resulted in a surge of biological data...

    Valentin Beauvais, Nicolò Tonci, ... Sébastien Limet in International Journal of Parallel Programming
    21 February 2025
  10. Article

    Fast Parallel CPU-GPU Approximate Spectral Clustering for Transcriptomics Data

    Spectral clustering algorithms have been used in various research domains to discover structure and patterns in data. However, high computational and...

    Stefan Branković, Lazar Smiljković, ... Marko Mišić in International Journal of Parallel Programming
    30 January 2025
  11. Article

    DyG-DPCD: A Distributed Parallel Community Detection Algorithm for Large-Scale Dynamic Graphs

    Dynamic (Temporal) graphs capture the valuable evolution of real-world systems, from the continuously evolving patterns of social interactions and...

    Naw Safrin Sattar, Khaled Z. Ibrahim, ... Shaikh Arifuzzaman in International Journal of Parallel Programming
    19 November 2024
  12. Article

    PragFormer: Data-Driven Parallel Source Code Classification with Transformers

    Multi-core shared memory architectures have become ubiquitous in computing hardware nowadays. As a result, there is a growing need to fully utilize...

    Re’em Harel, Tal Kadosh, ... Gal Oren in International Journal of Parallel Programming
    28 October 2024 Open access
  13. Article

    Optimizing Three-Dimensional Stencil-Operations on Heterogeneous Computing Environments

    Complex algorithms and enormous data sets require parallel execution of programs to attain results in a reasonable amount of time. Both aspects are...

    Nina Herrmann, Justus Dieckmann, Herbert Kuchen in International Journal of Parallel Programming
    21 June 2024 Open access
  14. Article

    High-Level Programming of FPGA-Accelerated Systems with Parallel Patterns

    As a result of frequency and power limitations, multi-core processors and accelerators are becoming more and more prevalent in today’s systems. To...

    Björn Birath, August Ernstsson, ... Christoph Kessler in International Journal of Parallel Programming
    27 May 2024 Open access
  15. Article

    A Hybrid Machine Learning Model for Code Optimization

    The complexity of programming modern heterogeneous systems raises huge challenges. Over the past two decades, researchers have aimed to alleviate...

    Yacine Hakimi, Riyadh Baghdadi, Yacine Challal in International Journal of Parallel Programming
    22 September 2023
  16. Article

    Calculation of Distributed-Order Fractional Derivative on Tensor Cores-Enabled GPU

    Due to an increased computational complexity of calculating the values of the distributed-order Caputo fractional derivative compared to the...

    10 July 2023
  17. Article

    Distributed Calculations with Algorithmic Skeletons for Heterogeneous Computing Environments

    Contemporary HPC hardware typically provides several levels of parallelism, e.g. multiple nodes, each having multiple cores (possibly with...

    Nina Herrmann, Herbert Kuchen in International Journal of Parallel Programming
    07 January 2023 Open access
  18. Article

    Portable C++ Code that can Look and Feel Like Fortran Code with Yet Another Kernel Launcher (YAKL)

    This paper introduces the Yet Another Kernel Launcher (YAKL) C++ portability library, which strives to enable user-level code with the look and feel...

    Matthew Norman, Isaac Lyngaas, ... Mark Berrill in International Journal of Parallel Programming
    08 December 2022 Open access
  19. Article

    Generic Exact Combinatorial Search at HPC Scale

    Exact combinatorial search is essential to a wide range of important applications, and there are many large problems that need to be solved quickly....

    Ruairidh MacGregor, Blair Archibald, Phil Trinder in International Journal of Parallel Programming
    07 December 2022 Open access
  20. Article

    Assessing Application Efficiency and Performance Portability in Single-Source Programming for Heterogeneous Parallel Systems

    We analyze the performance portability of the skeleton-based, single-source multi-backend high-level programming framework SkePU across multiple...

    August Ernstsson, Dalvan Griebler, Christoph Kessler in International Journal of Parallel Programming
    06 December 2022 Open access