2024

    • Jinhu Jiang, Chaoyi Liang, Rongchao Dong, Zhaohui Yang, Zhongjun Zhou, Wenwen Wang, Pen-Chung Yew, Weihua Zhang. A System-Level Dynamic Binary Translator Using Automatically-Learned Translation Rules. IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2024, Distinguished Paper Award, CCF B)   

2023

    • Yijing Song, Huasheng Dai, Jinhu Jiang and Weihua Zhang. Multikernel: Operating System Solution to Generalized Functional Safety. Security and Safety (S&S) 2023; 2: 2023007.

    • Chenyang Jiao, Weihua Zhang, Li Shen. Communication Optimizations for State-vector Quantum Simulator on CPU+GPU Clusters. To appear in Proceedings of the 52nd International Conference on Parallel Processing (ICPP 2023, CCF B)

    • Weihua Zhang, Chuanlei Zhao, Lu Peng, Yuzhe Lin, Fengzhe Zhang, Yunping Lu. Boosting Performance and QoS for Concurrent GPU B+trees by Combining-based Synchronization. Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2023, CCF A)

2020

    • Jinhu Jiang, Rongchao Dong, Zhongjun Zhou, Changheng Song, Wenwen Wang, Pen-Chung Yew, Weihua Zhang. More with Less — Deriving More Translation Rules with Less Training Data for DBTs Using Parameterization. Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture (Open Source, MICRO 2020, CCF A)

    • Sui Chen, Lei Liu, Weihua Zhang, and Lu Peng. Architectural Support for NVRAM Persistence in GPUs. IEEE Transactions on Parallel and Distributed Systems (TPDS 2020, CCF A)

    • Weihua Zhang, Zhaofeng Yan, Yuzhe Lin, Chuanlei Zhao, Lu Peng. A High Throughput B+tree for SIMD architectures. IEEE Transactions on Parallel and Distributed Systems (TPDS 2020, CCF A)

2019

    • Changheng Song, Wenwen Wang, Pen-Chung Yew, Weihua Zhang. Unleashing the Power of Learning: An Enhanced Learning-based Approach for Dynamic Binary Translation. Proceedings of the 2019 USENIX Annual Technical Conference (ATC 2019, CCF A)

    • Zhaofeng Yan, Yuzhe Lin, Lu Peng, Weihua Zhang. Harmonia: A High Throughput B+ Tree for GPUs. Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (Open Source, PPoPP 2019, CCF A)

2018

    • Weihua Zhang, Xin Wang, Shiyu Ji, Ziyun Wei, Zhaoguo Wang, Haibo Chen. Scaling Concurrent Index Structures under Contention Using HTM. IEEE Transactions on Parallel and Distributed Systems (TPDS 2018, CCF A)

    • Shaoming Chen, Lu Peng, Samuel Irving, Zhou Zhao, Weihua Zhang and Ashok Srivastava. qSwitch: Dynamical Off-Chip Bandwidth Allocation between Local and Remote Accesses. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2018, CCF A)

    • Samuel Irving, Bin Li, Shaoming Chen, Lu Peng, Weihua Zhang, Lide Duan. Computer comparisons in the presence of performance variation. Frontiers of Computer Science (FCS 2018, CCF B)

2017

    • Weihua Zhang, Xiaofeng Ji, Yunping Lu, Haojun Wang, Haibo Chen, Pen-Chung Yew. Prophet: A Parallel Instruction-Oriented Many-Core Simulator. IEEE Transactions on Parallel and Distributed Systems (TPDS 2017, CCF A)

    • Weihua Zhang, Xiaofeng Ji, Bo Song, Shiqiang Yu, Haibo Chen, Pen-Chung Yew, Tao Li, Wenyun Zhao. VarCatcher: A Framework for Tackling Performance Variability of Parallel Workloads on Multi-core. IEEE Transactions on Parallel and Distributed Systems (TPDS 2017, CCF A)

    • Xin Wang, Weihua Zhang, Zhaoguo Wang, Ziyun Wei, Haibo Chen, Wenyun Zhao. Eunomia: A Scalable, Contention-Conscious HTM-Friendly B+Tree. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2017, CCF A)

2016

    • Yunping Lu, Xin Wang, Weihua Zhang, Haibo Chen, Lu Peng, Wenyun Zhao. Performance Analysis of Multimedia Retrieval Workloads Running on Multicore. IEEE Transactions on Parallel and Distributed Systems (TPDS 2016, CCF A)

    • Weihua Zhang, Shiqiang Yu, Haojun Wang, Zhuofang Dai, Haibo Chen. Hardware Support for Concurrent Detection of Multiple Concurrency Bugs on Fused CPU-GPU Architectures. IEEE Transactions on Computers (TC 2016, CCF A)

    • Weihua Zhang, Haojun Wang, Yunping Lu, Haibo Chen and Wenyun Zhao. A Loosely-Coupled Full-System Multicore Simulation Framework. IEEE Transactions on Parallel and Distributed Systems (JPDC 2016, CCF B)

    • Xin Wang, Xiaofeng Ji, Yunping Lu, Yi Li, Weihua Zhang, Wenyun Zhao. Understanding the Architectural Characteristics of EDA Algorithms. The 45th International Conference on Parallel Processing (ICPP 2016, CCF B)

    • Yang Yu, Tianyang Lei, Weihua Zhang, Haibo Chen, Binyu Zang. Performance Analysis and Optimization of Full Garbage Collection in a Production JVM. The 12th Annual International Conference on Virtual Execution Environments (VEE 2016, CCF B)

2015

    • Wenjie Chen, Jin Yu, Weihua Zhang, Linhua Jiang, Guanhua Zhang, and Zhilei Chai. Parallel Implementation of Dense Optical Flow Computation on Many-Core Processor. 15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2015, CCF C)

    • Yunping Lu, Xin Wang, Weihua Zhang and Wenyun Zhao. Characterizing MultiMedia Retrieval Applications. The 44th International Conference on Parallel Processing (ICPP 2015, Best Paper Award, CCF B)

    • Ying Zhang, Samuel Irving, Lu Peng, Xin Fu, David Koppelman, Weihua Zhang, Jesse Ardonne. Design space exploration for device and architectural heterogeneity in chip-multiprocessors. Microprocessors and Microsystems (MICPRO 2015, CCF C)

    • Weihua Zhang, Jiaxin Li, Yi Li, Haibo Chen. Multilevel Phase Analysis. ACM Transactions on Embedded Computing Systems (TECS 2015, CCF B)

2014

    • Zhuofang Dai, Zheng Zhang, Haojun Wang, Yi Li and Weihua Zhang. Parallelized Race Detection Based on GPU Architecture. 2014 Annual Conference of Advanced Computer Architecture (ACA 2014, Best Paper Award)

    • Zhuofang Dai, Haojun Wang, Weihua Zhang, Haibo Chen and Binyu Zang. Hydra: Efficient Detection of Multiple Concurrency Bugs on Fused CPU-GPU Architecture. The 43rd International Conference on Parallel Processing (ICPP 2014, CCF B)

    • Haojun Wang, Qinghao Min, Weihua Zhang. RPSim: A Rapid Prototyping Full-system Simulator for SoC Software Development. The 9th IEEE International Conference on Networking, Architecture and Storage (NAS 2014)

    • Chien-Chih Chen, Yin-Chi Peng, Cheng-Fen Chen, Wei-Shan Wu, Qinghao Min, Pen-Chung Yew, Weihua Zhang, Tien-Fu Chen. DAPs: Dynamic Adjustment and Partial Sampling for Multithreaded/Multicore Simulation. Design Automation Conference (DAC 2014, CCF A)

2013

    • Chen Dai, Chao Lv, Jiaxin Li, Weihua Zhang. Understanding Architectural Characteristics of Multimedia Retrieval Workloads. The ACM SIGMETRICS 2013 (SIGMETRICS 2013, CCF B)

    • Jiaxin Li, Weihua Zhang, Haibo Chen and Binyu Zang. Multi-level Phase Analysis for Sampling Simulation. Design, Automation and Test in Europe Conference and Exhibition (DATE 2013, CCF B)

2012

    • Peng Chen, Donglei Yang, Weihua Zhang, Yi Li, Haibo Chen and Binyu Zang. Adaptive Pipeline Parallelism for Image Feature Extraction Algorithms. In the 41st International Conference on Parallel Processing (ICPP 2012, CCF B)

    • Zhenman Fang, Jiaxin Li, Weihua Zhang, Yi Li, Haibo Chen, Binyu Zang. Improving Dynamic Prediction Accuracy Through Multi-level Phase Analysis. In Proceedings of the ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2012, CCF B)

    • Zhenman Fang, Qinghao Min, Keyong Zhou, Yi Lu, Yibin Hu, Weihua Zhang, Haibo Chen, Jian Li, Binyu Zang. Transformer: A Functional-Driven Cycle-Accurate Multicore Simulator. The 49th Design Automation Conference (DAC 2012, CCF A)

    • Feiwen Zhu, Peng Chen, Donglei Yang, Weihua Zhang, Haibo Chen, Binyu Zang. A GPU-based High-throughput Image Retrieval Algorithm. The Fifth Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2012)

    • Yuan Zhang, Min Yang, Bo Zhou, Zhemin Yang, Weihua Zhang, Binyu Zang. Swift: A Register-based JIT Compiler for Embedded JVMs. The 8th Annual International Conference on Virtual Execution Environments (VEE 2012, CCF B)

2011

    • Zhaoguo Wang, Ran Liu, Yufei Chen, Xi Wu, Haibo Chen, Weihua Zhang, Binyu Zang. COREMU: a Scalable and Portable Parallel Full-system Emulator. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2011, CCF A)

    • Donglei Yang, Lili Liu, Feiwen Zhu, and Weihua Zhang. A parallel analysis on scale invariant feature transform (SIFT) algorithm. The 9th International Symposium on Advanced Parallel Processing Technologies (APPT 2011)

    • Zhenman Fang, Donglei Yang, Weihua Zhang, Haibo Chen, Binyu Zang. A Comprehensive Analysis and Parallelization of an Image Retrieval Algorithm. IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2011, CCF C)

2009

    • Weihua Zhang, Qiang Yan, Binyu Zang, Pen-Chung Yew. Hierarchical Phase Analysis for Sampling Simulations. The 18th International Conference on Parallel Architectures and Compilation Techniques (PACT 2009, CCF B)

    • Weihua Zhang, Lili Liu, Chen Zhang, Hongjiang Zhang, Binyu Zang and Chuanqi Zhu. Optimizing Techniques for Saturated Arithmetic with First-Order Linear Recurrence. The 24th Annual ACM Symposium on Applied Computing (SAC 2009)

    • Shengkai Zhu, Zhiwei Xiao, Haibo Chen, Rong Chen, Weihua Zhang and Binyu Zang. Evaluating SPLASH-2 benchmarks using Hadoop MapReduce. The 8th International Conference on Advanced Parallel Processing Technologies (APPT 2009)

2007

    • Qin Wang, Junpu Chen, Weihua Zhang and Binyu Zang. Optimizing Software Cache Performance of Packet Processing Applications. In Proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2007, CCF B)

    • Weihua Zhang, Tao Bao, Binyu Zang and Chuanqi Zhu. Optimizing Bandwidth Constraint through Register Interconnection for Stream Processors. The 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007, CCF B)

2006

    • Weihua Zhang, Xinglong Qian, Ye Wang, Binyu Zang and Chuanqi Zhu. Optimizing Compiler for Shared-Memory Multiple SIMD Architecture. In Proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2006, CCF B)

    • Weihua Zhang, Tao Bao, Binyu Zang and Chuanqi Zhu. Data Pipeline Optimization for Shared Memory Multiple-SIMD Architecture. The 19th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2006)