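Professor, Software School, Fudan University
Parallel Processing Institute
Fudan University
Biography
I am a professor at the Parallel Processing Institute, Fudan University. My research interests include computer architecture, software debugging, and compiler optimization. I received my Ph.D. from the Department of Computer Science at Fudan University in 2007.
I am currently recruiting students with a strong background in computer architecture and systems-related areas. If you are interested in working with me, please send me an email.
Research Interests
Computer architecture, software debugging, and compiler optimization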
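Education
· 2007 Ph.D., Fudan University
· 2000 M.S., Jiangnan Institute of Computing Technology
· 1997 B.S., PLA Information Engineering University
Awards
· 2015 ICPP Best Paper
· 2014 ACA Best Paper
· 2007 China Computer Federation (CCF) Outstanding Doctoral Dissertation Award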
Email
zhangweihua at fudan dot edu dot cn
Office
Phone: (86 21) 51355358
Fax: (86 21) 51355358
Address
Room 2015, Interdisciplinary Building No. 2, Fudan University, 2005 Songhu Road, Shanghai
Professional Service
· Vice PC Chair: NAS 2013
· Publicity Chair: COSMIC 2013 (International Workshop on Code Optimization for Multi and many Cores)
· PC Member (Conference): ICPP 2016, NPC 2016, ACA 2016, ICPP 2015, I-SPAN 2014, HPC China 2015, HPC China 2014, ACA 2014, HPCC 2013, ICSNC 2013, ICSNC 2012
· PC Member (Workshop): LSPP (Workshop on Large-Scale Parallel Processing) 2012, EMS (International Workshop on Embedded Multicore Systems) 2011
Courses
· Advanced Computer Architecture, Fall 2010, 2011
· Computer Architecture, Spring 2007, 2008, 2009, 2010, 2011, 2012
· Principles of Compilers, Fall 2007, 2008
Publications
2020
|
MICRO |
More with Less — Deriving More Translation Rules with Less Training Data for DBTs Using Parameterization
|
Jinhu Jiang, Rongchao Dong, Zhongjun Zhou, Changheng Song, Wenwen Wang, Pen-Chung Yew, Weihua Zhang |
To appear in IEEE/ACM International Symposium on Microarchitecture (MICRO) |
|
|
TPDS |
Architectural Support for NVRAM Persistence in GPUs
|
Sui Chen, Lei Liu, Weihua Zhang, Lu Peng |
IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 31, Issue: 5, May 1 2020 |
|
|
TPDS |
A High Throughput B+tree for SIMD architectures
|
Weihua Zhang, Zhaofeng Yan, Yuzhe Lin, Chuanlei Zhao, Lu Peng |
IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 31, Issue: 3, March 1 2020 |
|
2019
|
ATC |
Unleashing the Power of Learning: An Enhanced Learning-based Approach for Dynamic Binary Translation
|
Changheng Song, Wenwen Wang, Pen-Chung Yew, Weihua Zhang |
The 2019 USENIX Annual Technical Conference (ATC’19) |
|
|
PPoPP |
Harmonia: A High Throughput B+ Tree for GPUs
|
Zhaofeng Yan, Yuzhe Lin, Lu Peng, Weihua Zhang |
The 24th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2019) |
|
2018
|
FCS |
Computer comparisons in the presence of performance variation |
Samuel Irving, Bin Li, Shaoming Chen, Lu Peng, Weihua Zhang, Lide Duan |
Frontiers of Computer Science (FCS) |
|
|
TPDS |
Scaling Concurrent Index Structures under Contention Using HTM |
Weihua Zhang, Xin Wang, Shiyu Ji, Ziyun Wei, Zhaoguo Wang, Haibo Chen |
IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 29, Issue: 8, Aug 1 2018
|
|
|
TCAD |
qSwitch: Dynamical Off-Chip Bandwidth Allocation between Local and Remote Accesses |
Shaoming Chen, Lu Peng, Samuel Irving, Zhou Zhao, Weihua Zhang and Ashok Srivastava |
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Volume: 37, Issue: 1, Jan. 2018 |
|
2017
|
TPDS |
Prophet: A Parallel Instruction-Oriented Many-Core Simulator |
Weihua Zhang, Xiaofeng Ji, Yunping Lu, Haojun Wang, Haibo Chen, Pen-Chung Yew |
IEEE Transactions on Parallel and Distributed Systems (TPDS) |
|
|
TPDS |
VarCatcher: A Framework for Tackling Performance Variability of Parallel Workloads on Multi-core |
Weihua Zhang, Xiaofeng Ji, Bo Song, Shiqiang Yu, Haibo Chen, Pen-Chung Yew, Tao Li, Wenyun Zhao |
IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 28, Issue: 4, April 1 2017 |
|
|
PPoPP |
Eunomia: Scaling Concurrent Search Trees under Contention Using HTM |
Xin Wang, Weihua Zhang, Zhaoguo Wang, Ziyun Wei, Haibo Chen, Wenyun Zhao |
The 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2017). |
|
2016
|
TPDS |
Performance Analysis of Multimedia Retrieval Workloads Running on Multicore |
Yunping Lu, Xin Wang, Weihua Zhang, Haibo Chen, Lu Peng, Wenyun Zhao |
IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 27, Nov 2016 |
|
|
TC |
Hardware Support for Concurrent Detection of Multiple Concurrency Bugs on Fused CPU-GPU Architectures |
Weihua Zhang, Shiqiang Yu, Haojun Wang, Zhuofang Dai, Haibo Chen |
IEEE Transactions on Computers (TC) Volume: 65, No. 10, October 2016 |
|
|
TPDS |
A Loosely-Coupled Full-System Multicore Simulation Framework |
Weihua Zhang, Haojun Wang, Yunping Lu, Haibo Chen and Wenyun Zhao |
IEEE Transactions on Parallel and Distributed Systems (TPDS) Volume: 27, Issue: 6, June 1 2016 |
|
|
ICPP |
Understanding the Architectural Characteristics of EDA Algorithms |
Xin Wang, Xiaofeng Ji, Yunping Lu, Yi Li, Weijia Zhou, Weihua Zhang, Wenyun Zhao |
The 45th International Conference on Parallel Processing (ICPP) |
|
|
JPDC |
Parallelizing Image Feature Extraction Algorithms on Multi-core Platforms |
Yunping Lu, Yi Li, Bo Song, Weihua Zhang, Haibo Chen, Lu Peng |
Journal of Parallel and Distributed Computing (JPDC) Volume: 92, May 2016 |
|
|
VEE |
Performance Analysis and Optimization of Full Garbage Collection in a Production JVM |
Yang Yu, Tianyang Lei, Weihua Zhang, Haibo Chen, Binyu Zang |
The 12th Annual International Conference on Virtual Execution Environments (VEE 2016) |
|
2015
|
ICA3PP |
Parallel Implementation of Dense Optical Flow Computation on Many-Core Processor |
Wenjie Chen, Jin Yu, Weihua Zhang, Linhua Jiang, Guanhua Zhang, and Zhilei Chai |
15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2015) |
|
|
ICPP |
Characterizing MultiMedia Retrieval Applications |
Yunping Lu, Xin Wang, Weihua Zhang, Yi Li and Wenyun Zhao |
The 44th International Conference on Parallel Processing (ICPP, Best Paper Award) |
|
|
TECS |
Multi-level Phase Analysis |
Weihua Zhang, Jiaxin Li, Yi Li, Haibo Chen |
ACM Transactions on Embedded Computing Systems (TECS) Volume: 14 Issue: 2, March 2015 |
|
2014
|
ACA |
Parallelized Race Detection Based on GPU Architecture |
Zhuofang Dai, Zheng Zhang, Haojun Wang, Yi Li and Weihua Zhang |
2014 Annual Conference of Advanced Computer Architecture (ACA 2014, Best Paper Award) |
|
|
ICPP |
Hydra: Efficient Detection of Multiple Concurrency Bugs on Fused CPU-GPU Architecture |
Zhuofang Dai, Haojun Wang, Weihua Zhang, Haibo Chen and Binyu Zang |
The 43rd International Conference on Parallel Processing (ICPP) |
|
|
NAS |
RPSim: A Rapid Prototyping Full-system Simulator for SoC Software Development |
Haojun Wang, Qinghao Min, Weihua Zhang |
The 9th IEEE International Conference on Networking, Architecture and Storage (NAS) |
|
|
DAC |
DAPs: Dynamic Adjustment and Partial Sampling for Multithreaded/Multicore Simulation |
Chien-Chih Chen, Yin-Chi Peng, Cheng-Fen Chen, Wei-Shan Wu, Qinghao Min, Pen-Chung Yew, Weihua Zhang, Tien-Fu Chen |
Design Automation Conference (DAC), San Francisco, June 1 – 5, 2014 |
|
2013
|
SIGMETRICS |
Understanding Architectural Characteristics of Multimedia Retrieval Workloads |
Chen Dai, Chao Lv, Jiaxin Li, Weihua Zhang |
The ACM SIGMETRICS 2013 (POSTER), PA, June 17 – 21, 2013 |
|
|
DATE |
Multi-level Phase Analysis for Sampling Simulation |
Jiaxin Li, Weihua Zhang, Haibo Chen and Binyu Zang |
Design, Automation & Test in Europe Conference & Exhibition (DATE 2013). Grenoble, France, March, 2013 |
|
2012
|
ICPP |
Adaptive Pipeline Parallelism for Image Feature Extraction Algorithms |
Peng Chen, Donglei Yang, Weihua Zhang, Yi Li, Haibo Chen and Binyu Zang |
In the 41st International Conference on Parallel Processing (ICPP 2012). PA, USA, September, 2012 |
|
|
LCTES |
Improving Dynamic Prediction Accuracy Through Multi-level Phase Analysis |
Zhenman Fang, Jiaxin Li, Weihua Zhang, Yi Li, Haibo Chen, Binyu Zang |
In proceedings of the ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2012) |
|
|
DAC |
Transformer: A Functional-Driven Cycle-Accurate Multicore Simulator |
Zhenman Fang, Qinghao Min, Keyong Zhou, Yi Lu, Yibin Hu, Weihua Zhang, Haibo Chen, Jian Li, Binyu Zang |
The 49th Design Automation Conference (DAC 2012) San Francisco, USA, June, 2012 |
|
|
GPGPU |
A GPU-based High-throughput Image Retrieval Algorithm |
Feiwen Zhu, Peng Chen, Donglei Yang, Weihua Zhang, Haibo Chen, Binyu Zang |
The Fifth Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 5) collocated with ASPLOS 2012 |
|
|
VEE |
Swift: A Register-based JIT Compiler for Embedded JVMs |
Yuan Zhang, Min Yang, Bo Zhou, Zhemin Yang, Weihua Zhang, Binyu Zang |
The 8th Annual International Conference on Virtual Execution Environments (VEE 2012). London, United Kingdom |
|
2011
|
PPoPP |
COREMU: a Scalable and Portable Parallel Full-system Emulator |
Zhaoguo Wang, Ran Liu, Yufei Chen, Xi Wu, Haibo Chen, Weihua Zhang, Binyu Zang |
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2011). San Antonio, USA, February, 2011 |
|
|
APPT |
A parallel analysis on scale invariant feature transform (SIFT) algorithm |
Donglei Yang, Lili Liu, Feiwen Zhu, and Weihua Zhang |
The 9th International Symposium on Advanced Parallel Processing Technologies (APPT 2011). Shanghai, China |
|
|
ISPASS |
A Comprehensive Analysis and Parallelization of an Image Retrieval Algorithm |
Zhenman Fang, Donglei Yang, Weihua Zhang, Haibo Chen, Binyu Zang |
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2011). Austin TX, USA, April, 2011 |
|
2009
|
PACT |
Hierarchical Phase Analysis for Sampling Simulations |
Weihua Zhang, Qiang Yan, Binyu Zang, Pen-Chung Yew |
The 18th International Conference on Parallel Architectures and Compilation Techniques (PACT 2009), POSTER |
|
|
SAC |
Optimizing Techniques for Saturated Arithmetic with First-Order Linear Recurrence |
Weihua Zhang, Lili Liu, Chen Zhang, Hongjiang Zhang, Binyu Zang and Chuanqi Zhu |
The 24th Annual ACM Symposium on Applied Computing (SAC 2009) Programming Language Track. Honolulu, Hawaii, USA |
|
|
APPT |
Evaluating SPLASH-2 benchmarks using Hadoop MapReduce |
Shengkai Zhu, Zhiwei Xiao, Haibo Chen, Rong Chen, Weihua Zhang and Binyu Zang |
The 8th International Conference on Advanced Parallel Processing Technologies (APPT 2009). Rapperswil, Switzerland. August, 2009 |
|
2007
|
LCTES |
Optimizing Software Cache Performance of Packet Processing Applications |
Qin Wang, Junpu Chen, Weihua Zhang and Binyu Zang |
In proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2007) |
|
|
PACT |
Optimizing Bandwidth Constraint through Register Interconnection for Stream Processors |
Weihua Zhang, Tao Bao, Binyu Zang and Chuanqi Zhu |
The 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), Poster, Brasov, Romania |
|
2006
|
LCTES |
Optimizing Compiler for Shared-Memory Multiple SIMD Architecture |
Weihua Zhang, Xinglong Qian, Ye Wang, Binyu Zang and Chuanqi Zhu |
In proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2006) |
|
|
LCPC |
Data Pipeline Optimization for Shared Memory Multiple-SIMD Architecture |
Weihua Zhang, Tao Bao, Binyu Zang and Chuanqi Zhu |
The 19th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2006) |
|