2023
• Weihua Zhang, Chuanlei Zhao, Lu Peng, Yuzhe Lin, Fengzhe Zhang, Yunping Lu. Boosting Performance and QoS for Concurrent GPU B+trees by Combining-based Synchronization. (Open Source, PPoPP, CCF A)
2020
• Jinhu Jiang, Rongchao Dong, Zhongjun Zhou, Changheng Song, Wenwen Wang, Pen-Chung Yew, Weihua Zhang. More with Less — Deriving More Translation Rules with Less Training Data for DBTs Using Parameterization. (Open Source, MICRO, CCF A)
• Sui Chen, Lei Liu, Weihua Zhang, and Lu Peng. Architectural Support for NVRAM Persistence in GPUs. (TPDS, CCF A)
• Weihua Zhang, Zhaofeng Yan, Yuzhe Lin, Chuanlei Zhao, Lu Peng. A High Throughput B+tree for SIMD architectures. (TPDS, CCF A)
2019
• Changheng Song, Wenwen Wang, Pen-Chung Yew, Weihua Zhang. Unleashing the Power of Learning: An Enhanced Learning-based Approach for Dynamic Binary Translation. (ATC, CCF A)
• Zhaofeng Yan, Yuzhe Lin, Lu Peng, Weihua Zhang. Harmonia: A High Throughput B+ Tree for GPUs. (Open Source, PPoPP, CCF A)
2018
• Weihua Zhang, Xin Wang, Shiyu Ji, Ziyun Wei, Zhaoguo Wang, Haibo Chen. Scaling Concurrent Index Structures under Contention Using HTM. (TPDS, CCF A)
• Shaoming Chen, Lu Peng, Samuel Irving, Zhou Zhao, Weihua Zhang and Ashok Srivastava. qSwitch: Dynamical Off-Chip Bandwidth Allocation between Local and Remote Accesses. (TCAD, CCF A)
• Samuel Irving, Bin Li, Shaoming Chen, Lu Peng, Weihua Zhang, Lide Duan. Computer comparisons in the presence of performance variation. (FCS, CCF C)
2017
| TPDS | Prophet: A Parallel Instruction-Oriented Many-Core Simulator | Weihua Zhang, Xiaofeng Ji, Yunping Lu, Haojun Wang, Haibo Chen, Pen-Chung Yew | IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 28, Issue: 10, Oct 1 2017 |
|
| TPDS | VarCatcher: A Framework for Tackling Performance Variability of Parallel Workloads on Multi-core | Weihua Zhang, Xiaofeng Ji, Bo Song, Shiqiang Yu, Haibo Chen, Pen-Chung Yew, Tao Li, Wenyun Zhao | IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 28, Issue: 4, April 1 2017 |
|
| PPoPP | Eunomia: Scaling Concurrent Search Trees under Contention Using HTM | Xin Wang, Weihua Zhang, Zhaoguo Wang, Ziyun Wei, Haibo Chen, Wenyun Zhao | The 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2017). |
|
2016
| TPDS | Performance Analysis of Multimedia Retrieval Workloads Running on Multicore | Yunping Lu, Xin Wang, Weihua Zhang, Haibo Chen, Lu Peng, Wenyun Zhao | IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 27, Nov 2016 |
|
| TC | Hardware Support for Concurrent Detection of Multiple Concurrency Bugs on Fused CPU-GPU Architectures | Weihua Zhang, Shiqiang Yu, Haojun Wang, Zhuofang Dai, Haibo Chen | IEEE Transactions on Computers (TC) Volume: 65, No. 10, October 2016 |
|
| TPDS | A Loosely-Coupled Full-System Multicore Simulation Framework | Weihua Zhang, Haojun Wang, Yunping Lu, Haibo Chen and Wenyun Zhao | IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 27, Issue: 6, June 1 2016 |
|
| ICPP | Understanding the Architectural Characteristics of EDA Algorithms | Xin Wang, Xiaofeng Ji, Yunping Lu, Yi Li, Weijia Zhou, Weihua Zhang, Wenyun Zhao | The 45th International Conference on Parallel Processing (ICPP) |
|
| JPDC | Parallelizing Image Feature Extraction Algorithms on Multi-core Platforms | Yunping Lu, Yi Li, Bo Song, Weihua Zhang, Haibo Chen, Lu Peng | Journal of Parallel and Distributed Computing (JPDC) Volume: 92, May 2016 |
|
| VEE | Performance Analysis and Optimization of Full Garbage Collection in a Production JVM | Yang Yu, Tianyang Lei, Weihua Zhang, Haibo Chen, Binyu Zang | The 12th Annual International Conference on Virtual Execution Environments (VEE 2016) |
|
2015
| ICPP | Characterizing MultiMedia Retrieval Applications | Yunping Lu, Xin Wang, Weihua Zhang, Yi Li and Wenyun Zhao | The 44th International Conference on Parallel Processing (ICPP, Best Paper Award) |
|
| TECS | Multi-level Phase Analysis | Weihua Zhang, Jiaxin Li, Yi Li, Haibo Chen | ACM Transactions on Embedded Computing Systems (TECS) Volume: 14, Issue: 2, March 2015 |
|
2014
| ACA | Parallelized Race Detection Based on GPU Architecture | Zhuofang Dai, Zheng Zhang, Haojun Wang, Yi Li and Weihua Zhang | 2014 Annual Conference of Advanced Computer Architecture (ACA 2014, Best Paper Award) |
|
| ICPP | Hydra: Efficient Detection of Multiple Concurrency Bugs on Fused CPU-GPU Architecture | Zhuofang Dai, Haojun Wang, Weihua Zhang, Haibo Chen and Binyu Zang | The 43rd International Conference on Parallel Processing (ICPP) |
|
| NAS | RPSim: A Rapid Prototyping Full-system Simulator for SoC Software Development | Haojun Wang, Qinghao Min, Weihua Zhang | The 9th IEEE International Conference on Networking, Architecture and Storage (NAS) |
|
| DAC | DAPs: Dynamic Adjustment and Partial Sampling for Multithreaded/Multicore Simulation | Chien-Chih Chen, Yin-Chi Peng, Cheng-Fen Chen, Wei-Shan Wu, Qinghao Min, Pen-Chung Yew, Weihua Zhang, Tien-Fu Chen | Design Automation Conference (DAC), San Francisco, June 1 – 5, 2014 |
|
2013
| SIGMETRICS | Understanding Architectural Characteristics of Multimedia Retrieval Workloads | Chen Dai, Chao Lv, Jiaxin Li, Weihua Zhang | The ACM SIGMETRICS 2013 (POSTER), PA, June 17 – 21, 2013 |
|
| DATE | Multi-level Phase Analysis for Sampling Simulation | Jiaxin Li, Weihua Zhang, Haibo Chen and Binyu Zang | Design, Automation & Test in Europe Conference & Exhibition (DATE 2013). Grenoble, France, March, 2013 |
|
2012
| ICPP | Adaptive Pipeline Parallelism for Image Feature Extraction Algorithms | Peng Chen, Donglei Yang, Weihua Zhang, Yi Li, Haibo Chen and Binyu Zang | In the 41st International Conference on Parallel Processing (ICPP 2012). PA, USA, September, 2012 |
|
| LCTES | Improving Dynamic Prediction Accuracy Through Multi-level Phase Analysis | Zhenman Fang, Jiaxin Li, Weihua Zhang, Yi Li, Haibo Chen, Binyu Zang | In proceedings of the ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2012) |
|
| DAC | Transformer: A Functional-Driven Cycle-Accurate Multicore Simulator | Zhenman Fang, Qinghao Min, Keyong Zhou, Yi Lu, Yibin Hu, Weihua Zhang, Haibo Chen, Jian Li, Binyu Zang | The 49th Design Automation Conference (DAC 2012) San Francisco, USA, June, 2012 |
|
| GPGPU | A GPU-based High-throughput Image Retrieval Algorithm | Feiwen Zhu, Peng Chen, Donglei Yang, Weihua Zhang, Haibo Chen, Binyu Zang | The Fifth Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 5) collocated with ASPLOS 2012 |
|
| VEE | Swift: A Register-based JIT Compiler for Embedded JVMs | Yuan Zhang, Min Yang, Bo Zhou, Zhemin Yang, Weihua Zhang, Binyu Zang | The 8th Annual International Conference on Virtual Execution Environments (VEE 2012). London, United Kingdom |
|
2011
| PPoPP | COREMU: a Scalable and Portable Parallel Full-system Emulator | Zhaoguo Wang, Ran Liu, Yufei Chen, Xi Wu, Haibo Chen, Weihua Zhang, Binyu Zang | ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2011). San Antonio, USA, February, 2011 |
|
| APPT | A parallel analysis on scale invariant feature transform (SIFT) algorithm | Donglei Yang, Lili Liu, Feiwen Zhu, and Weihua Zhang | The 9th International Symposium on Advanced Parallel Processing Technologies (APPT 2011). Shanghai, China |
|
| ISPASS | A Comprehensive Analysis and Parallelization of an Image Retrieval Algorithm | Zhenman Fang, Donglei Yang, Weihua Zhang, Haibo Chen, Binyu Zang | IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2011). Austin TX, USA, April, 2011 |
|
2009
| PACT | Hierarchical Phase Analysis for Sampling Simulations | Weihua Zhang, Qiang Yan, Binyu Zang, Pen-Chung Yew | The 18th International Conference on Parallel Architectures and Compilation Techniques (PACT 2009), POSTER |
|
| SAC | Optimizing Techniques for Saturated Arithmetic with First-Order Linear Recurrence | Weihua Zhang, Lili Liu, Chen Zhang, Hongjiang Zhang, Binyu Zang and Chuanqi Zhu | The 24th Annual ACM Symposium on Applied Computing (SAC 2009) Programming Language Track. Honolulu, Hawaii, USA |
|
| APPT | Evaluating SPLASH-2 benchmarks using Hadoop MapReduce | Shengkai Zhu, Zhiwei Xiao, Haibo Chen, Rong Chen, Weihua Zhang and Binyu Zang | The 8th International Conference on Advanced Parallel Processing Technologies (APPT 2009). Rapperswil, Switzerland. August, 2009 |
|
2007
| LCTES | Optimizing Software Cache Performance of Packet Processing Applications | Qin Wang, Junpu Chen, Weihua Zhang and Binyu Zang | In proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2007) |
|
| PACT | Optimizing Bandwidth Constraint through Register Interconnection for Stream Processors | Weihua Zhang, Tao Bao, Binyu Zang and Chuanqi Zhu | The 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), Poster, Brasov, Romania |
|
2006
| LCTES | Optimizing Compiler for Shared-Memory Multiple SIMD Architecture | Weihua Zhang, Xinglong Qian, Ye Wang, Binyu Zang and Chuanqi Zhu | In proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2006) |
|
| LCPC | Data Pipeline Optimization for Shared Memory Multiple-SIMD Architecture | Weihua Zhang, Tao Bao, Binyu Zang and Chuanqi Zhu | The 19th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2006) |
|