Towards a universal and portable assembly code size reduction: a case study of RISC-V ISA. Jianfeng Liu, Wangrong Gao, Ting Wang. Regular Paper. 17 May 2024.
CPTF–a new heuristic based branch and bound algorithm for workflow scheduling in heterogeneous distributed computing systems. D. Sirisha, S. Sambhu Prasad. Regular Paper. 15 May 2024.
O2ath: an OpenMP offloading toolkit for the sunway heterogeneous manycore platform. Haoran Lin, Lifeng Yan, Weiguo Liu. Regular Paper. 03 May 2024.
Editorial for the special issue on heterogenous computing. Shanjiang Tang, Yusen Li. Editorial. 23 April 2024. Pages: 113 - 114.
Opencl-pytorch: an OpenCL-based extension of PyTorch. Yicheng Sui, Yufei Sun, Yuzhi Zhang. Regular Paper. 08 April 2024.
Fast and accurate novelty detection for large surveillance video. Shanjiang Tang, Ziyi Wang, Jian Xiao. Regular Paper. 04 April 2024. Pages: 130 - 149.
AIbench: a tool for benchmarking Huawei ascend AI processors. Yang Xiao, Zeke Wang. Regular Paper. 02 May 2024. Pages: 115 - 129.
Deep convolutional encoder–decoder networks based on ensemble learning for semantic segmentation of high-resolution aerial imagery. Huming Zhu, Chendi Liu, Biao Hou. Regular Paper. 14 March 2024.
LS-HTC: an HTC system for large-scale jobs. Juncheng Hu, Xilong Che, Yuhan Shao. Regular Paper. 11 March 2024.
A heterogeneous 3-D stacked PIM accelerator for GCN-based recommender systems. Xinyang Shen, Yu Huang, Hai Jin. Regular Paper. Open access. 28 February 2024. Pages: 150 - 163.
Special issue of HPCChina 2023. Yunquan Zhang, Guangming Tan, Liang Yuan. Editorial. 22 February 2024. Pages: 1 - 2.
oclCUB: an OpenCL parallel computing library for deep learning operators. Changqing Shi, Yufei Sun, Yuzhi Zhang. Regular Paper. 16 February 2024.
A security JPEG image system accelerated by NEON technology based on FT-2000/4. Yu Hu, Ziteng Li, Lei Wang. Regular Paper. 22 January 2024.
thSORT: an efficient parallel sorting algorithm on multi-core DSPs. Mouzhi Yang, Peng Zhang, Chun Huang. Regular Paper. 19 January 2024.
Adaptive key partitioning in distributed stream processing. Gang Liu, Zeting Wang, Rui Mao. Regular Paper. Open access. 12 January 2024. Pages: 164 - 178.
swCUDA: Auto parallel code translation framework from CUDA to ATHREAD for new generation sunway supercomputer. Maoxue Yu, Guanghao Ma, Zhiqiang Wei. Regular Paper. Open access. 11 January 2024.
DCU-CHK: checkpointing for large-scale CPU-DCU heterogeneous computing systems. Jie Jia, Xinyuan Lin, Yi Liu. Regular Paper. 07 January 2024.
HiRM: Hierarchical resource management for earth system models on many-core clusters. Zhewen Xu, Xiaohui Wei, Sicong Li. Regular Paper. 05 January 2024.
Extending OP2 framework to support portable parallel programming of complex applications. Zongjing Chen, Kangjin Huang, Ming Li. Regular Paper. 07 December 2023.
Leveraging simulation of high performance computing systems with node simulation using architecture simulator. Fang Lin, Yi Liu, Xueyan Gai. Regular Paper. 13 November 2023. Pages: 442 - 464.
OneGraph: a cross-architecture framework for large-scale graph computing on GPUs based on oneAPI. Shiyang Li, Jingyu Zhu, Xuqiang Wang. Regular Paper. 09 November 2023. Pages: 179 - 191.
BSPADMM: block splitting proximal ADMM for sparse representation with strong scalability. Yidong Chen, Jingshan Pan, Zhonghua Lu. Regular Paper. 07 October 2023. Pages: 3 - 16.
Conflict-aware workload co-execution on SX-aurora TSUBASA. Riku Nunokawa, Yoichi Shimomura, Hiroyuki Takizawa. Regular Paper. Open access. 05 October 2023.
FILL: a heterogeneous resource scheduling system addressing the low throughput problem in GROMACS. Yueyuan Zhou, ZiYi Ren, Guangming Tan. Regular Paper. 23 September 2023. Pages: 17 - 31.
ConvDarts: a fast and exact convolutional algorithm selector for deep learning frameworks. Lu Bai, Weixing Ji, Wanyi Zhu. Regular Paper. 20 September 2023. Pages: 32 - 44.
An efficient cloud-based elastic RDMA protocol for HPC applications. Hang Cao, Cheng Xu, Xiantao Zhang. Regular Paper. 15 September 2023. Pages: 45 - 53.
Uncovering the performance bottleneck of modern HPC processor with static code analyzer: a case study on Kunpeng 920. Shaojie Tan, Qingcai Jiang, Hong An. Regular Paper. 15 September 2023.
Mixed-precision block incomplete sparse approximate preconditioner on Tensor core. Haoyuan Zhang, Wenpeng Ma, Zhonghua Lu. Regular Paper. 13 September 2023. Pages: 54 - 67.
Optimization of the parallel semi-Lagrangian scheme to overlap computation with communication based on grouping levels in YHGSM. Dazheng Liu, Wenjuan Liu, Jianping Wu. Regular Paper. 10 September 2023. Pages: 68 - 77.
High performance dilated convolutions on multi-core DSPs. Yang Wang, Qinglin Wang, Jie Liu. Regular Paper. 09 September 2023. Pages: 78 - 93.
Quantitative evaluation of deep learning frameworks in heterogeneous computing environment. Zhengxian Lu, Chengkun Du, Fei Yang. Regular Paper. 08 September 2023. Pages: 94 - 111.
A performance evaluation method of queuing theory based on Cosmos cross-chain platform. Ou Wu, Binbin Huang, Haoming Li. Regular Paper. 18 August 2023. Pages: 465 - 485.
SI on parallel system and algorithm optimization. Liang Yuan, Junmin Xiao. Editorial. 16 August 2023. Pages: 229 - 230.
ddRingAllreduce: a high-precision RingAllreduce algorithm. Xiaojun Lei, Tongxiang Gu, Xiaowen Xu. Regular Paper. 05 July 2023. Pages: 245 - 257.
MT-office: parallel password recovery program for office on domestic heterogeneous multi-core processor. Yongtao Luo, Bo Yang, Chunye Gong. Review Paper. 04 July 2023. Pages: 231 - 244.
FPGA-based acceleration architecture for Apache Spark operators. Yuanwei Sun, Haikun Liu, Yu Zhang. Regular Paper. 23 June 2023. Pages: 192 - 205.
Editorial for the special issue on architecture, algorithms and applications of high performance sparse matrix computations. Weifeng Liu, Guangming Tan, Xiaowen Xu. Editorial. 19 June 2023. Pages: 99 - 101.
TileSpTRSV: a tiled algorithm for parallel sparse triangular solve on GPUs. Zhengyang Lu, Weifeng Liu. Regular Paper. 12 June 2023. Pages: 129 - 143.
Processor power forecasting through model sample analysis and clustering. Kexing Zhou, Yong Dong, Zhixin Ou. Regular Paper. 12 June 2023. Pages: 258 - 276.
FSGraph: fast and scalable implementation of graph traversal on GPUs. Yuan Zhang, Huawei Cao, Xuejun An. Regular Paper. 31 May 2023. Pages: 277 - 291.
Compressed data direct computing for Chinese dataset on DCU. Yani Liu, Feng Zhang, Xiaoyong Du. Regular Paper. 30 May 2023. Pages: 206 - 220.
FASS-pruner: customizing a fine-grained CNN accelerator-aware pruning framework via intra-filter splitting and inter-filter shuffling. Xiaohui Wei, Xinyang Zheng, Hengshan Yue. Regular Paper. 26 May 2023. Pages: 292 - 303.
ArkGPU: enabling applications’ high-goodput co-location execution on multitasking GPUs. Jie Lou, Yiming Sun, Ninghui Sun. Regular Paper. 24 May 2023. Pages: 304 - 321.
Improved parallel matrix multiplication using Strassen and Urdhvatiryagbhyam method. Y. R. Annie Bessant, J. Grace Jency, Binay Kumar Pandey. Regular Paper. 24 May 2023. Pages: 102 - 115.
Adapting combined tiling to stencil optimizations on sunway processor. Biao Sun, Mingzhen Li, Depei Qian. Regular Paper. 17 May 2023. Pages: 322 - 333.
A large-scale heterogeneous computing framework for non-uniform sampling two-dimensional convolution applications. Yu Lu, Ce Yu, Gang Zheng. Regular Paper. 11 May 2023. Pages: 221 - 239.
Sgap: towards efficient sparse tensor algebra compilation for GPU. Genghan Zhang, Yuetong Zhao, Yu Wang. Regular Paper. 08 May 2023. Pages: 210 - 227.
Carbon Emissions Reduction of Neural Network by Discrete Rank Pruning. Songwen Pei, Jie Luo, Mingsong Chen. Regular Paper. 04 May 2023. Pages: 334 - 346.
ScalaQC: a scalability optimization framework for full-state quantum simulation on CPU+GPU heterogeneous clusters. Chenyang Jiao, Zhikai Qin, Li Shen. Regular Paper. 04 May 2023.
Altruistic user-oriented task allocation techniques for mobile crowdsensing. Moirangthem Goldie Meitei, Ningrinla Marchang. Regular Paper. 27 April 2023.