Architectural Implications of Multiple Clock Domains, Multiple Voltage Domains and Dynamic Voltage-Frequency Scaling (MCD/MVD/DVFS)
Ran Ginosar
VLSI Systems Research Center, EE & CS Departments
Technion-Israel Institute of Technology, Haifa, Israel
Summary: (PDF)
Large multi-processor chips (aka CMP or MPSoC) may exploit the benefits of multiple clocks, multiple frequencies, multiple voltages, and their dynamic scaling. The tutorial discusses what these technologies are and how they may affect the architecture.
Outline
1. Introduction and problem formulation
2. Classifications: Types of multiple clock domains, types of multiple voltage domains, types of dynamic scaling of voltage, frequency and their combinations
3. Multiple clock domains
a. Clock distribution in multi-clock-domain chips
b. Limitations: clock power and delay variations
c. Solutions: Decomposed clocks
d. Synchronization of mutually-asynchronous clock domains
i. Metastability and synchronization failures: Theory, measurement, simulation, circuits and layout, temperature and voltage effects, failure probabilities and MTBF
ii. Synchronizers for mutually-asynchronous clock domains: circuits, methodologies, performance, other applications
e. Formal and informal verification of synchronization circuits
f. Synchronization of (same frequency) multi-synchronous clock domains in CMP: Adaptive synchronizers, self compensation for drifting phase. Predictive synchronizers in CMP.
g. Synchronization over long on-chip interconnects.
h. Synchronization in Networks-on-Chip.
i. Architectural implications of multiple clocks
4. Multiple voltage domains
a. Definitions
b. Sources of multiple voltages
c. Level shifting
d. Combining voltage and clock domains
i. Layout constraints
ii. Varying clock frequency with voltage
e. Architectural implications of multiple voltage domains
5. Dynamic Voltage and Frequency Scaling
a. Voltage scaling versus voltage and frequency scaling
b. Global versus multi-domain scaling
c. Methods of DVFS
d. Architectural implications of DVFS
6. Conclusions and Summary
Lecturer's biography
Ran Ginosar received his BSc from the Technion and his PhD from Princeton
University. After conducting research at AT&T Bell Laboratories, he joined
the Technion, where he is now an Associate Professor in the Electrical Engineering
and Computer Science departments, and he heads the VLSI Systems Research Center.
Professor Ginosar has been a visiting Associate Professor with the University
of Utah and co-initiated the Asynchronous Architecture Research Project at Intel
(Oregon) during a two-year sabbatical. He has co-founded a number of VLSI companies.
He has published numerous papers and patents on VLSI. His research interests
include VLSI architecture, asynchronous logic and synchronization.
Power Management Solutions for Computer Systems and Datacenters
Karthick Rajamani, Charles Lefurgy, Soraya Ghiasi, Juan Rubio, Heather Hanson
and Tom Keller
Power-Aware Systems, IBM Austin Research Lab
Abstract
Power and cooling are at the heart of many hurdles for the continued growth
of the computer industry. Diminishing returns for CMOS technology scaling, a mismatch between growth in computational density
to meet ever-growing demands and
growth in cooling capabilities, and the growing cost of facilities provisioning
for this demand have all come together
to create a near-crisis situation. Aspects of the problem have attained enough
visibility that not just customers, but
governments and environmental agencies have begun to demand new initiatives and
solutions.
This tutorial will discuss the chief characteristics of power and cooling problems
and how they dictate the
nature of solutions that are required. The focus will be on the process of developing
robust solutions for power
and cooling problems, with discussion of relevant technologies and mechanisms
employed. We will cover emerging
industrial solutions - how they address different aspects of the problem and
what technical ideas they exploit. Emerging
technologies in cooling and power distribution that can aid solutions will also
be discussed. The tutorial will also
present the increasing scope of solutions moving from the individual system
to the datacenter. We'll then close with
our view on emerging solutions for the larger context, addressing the link between
facilities provisioning and IT
management and the value of exploiting other rapidly growing management solutions
such as virtualization.
Outline: (PDF)
Introduction and Background
Motivation
o Statistics
o Changing environment
Considerations for the design of power management solutions
o What is the problem?
o Multi-dimensional requirements
o Understanding the System – addressing variability
Power Management Concepts
Goals for power management solutions – concepts and examples
o Basic goals
o Advanced functions
o Future solutions
Implementation of power management solutions
o Sensors and actuators
o Feedback-driven management
o Model-assisted methods
Industry state-of-the-art
Industry state-of-the-art
New technologies for power distribution and cooling
Datacenter and IT Management
Facilities management
o Anatomy of a datacenter
o Opportunities and technologies for improving efficiency
IT management
o Current generation solutions
o Problems and potential solutions
Case study of integrated management
Virtualization
o New problems
o Enabling efficient usage and sophisticated solutions
Other Open Issues
Challenges in orchestrating power management solutions
Questions
Feedback
High-Speed Network Architectures for Clusters: Designs and Trends
D. K. Panda (The Ohio State University) and P. Balaji (Argonne National Laboratory)
Abstract
High-speed network architectures such as InfiniBand (IB) and 10-Gigabit Ethernet
(10GE) are generating a lot of excitement towards building next generation High-End
Computing (HEC) systems. This tutorial will provide an in-depth look at this
emerging trend and examine the suitability of these network architectures for
prime-time HEC. It will start with a brief overview of some of the latest network
architectures including IB, 10GE, Myrinet 10G (which is a recent addition to
the 10GE family), and the ConnectX architecture which uniformly deals with both
the IB and 10GE families, together with their architectural features. An overview
of the emerging software stack which encapsulates some of the architectures
in a unified manner will be presented. Hardware/software solutions for different
networks and the market trends will be highlighted. Challenges in designing
different kinds of systems using these standards on multi-core platforms for
performance, scalability, portability and reliability will be covered. Specifically,
case studies and experiences in designing HPC clusters (with MPI-1, MPI-2 and
Sockets programming models), Parallel File Systems, Networked File Systems (NFS),
Storage Protocols, Multitier Datacenters, and Virtualization schemes will be
presented together with the associated performance numbers and comparisons.
The tutorial is organized along the following topics: (PDF)
1. What are IB and 10GE?
* TCP vs. User-level communication protocols
* Requirements (communication, I/O, performance, cost, RAS) from the perspective of designing next generation high-end systems and scalable data centers
2. Short Overview of InfiniBand Architecture
* Architecture and Basic Components
* Communication and I/O Operations
* Transport Layer/Services and Reliability
* Advanced Features (Keys, Protection Domains, Partitioning, Virtual Lanes, QoS Mechanisms and Multicast)
* Software Transport Interfaces and Management Services
* InfiniBand 1.2 specification and highlights
3. Short Overview of 10-Gigabit Ethernet
* Architecture and Basic Components
* Communication Operations
* Advanced Features
* Software Interfaces and Management Services
4. Convergence between InfiniBand and 10-Gigabit Ethernet through the OpenFabrics stack
* Software Interfaces and Management Services
+ Lower-level primitives
+ Upper-level (MPI, SDP, IPoIB, SRP, iSER, uDAPL, kDAPL)
* Subnet Management
* Unified Connection Management Support through RDMA CM
5. Overview of IB and 10GE Products (hardware and software), Time-frames, and Market Trends
* Vendors, Switches, and Host Channel Adapters
* Overview of ConnectX architecture
* Pointers to IB and 10GE installations
6. Designing High-end Systems with IB and 10GE: Research Challenges, Case Studies and Performance Evaluation
* High-end clusters with MPI-1 and MPI-2
* Storage and File Systems (PVFS, Lustre, NFS over RDMA, pNFS)
* Multi-tier Datacenters
* Virtualization Support (Xen-IB)
7. Conclusions, Final Q&A, and Discussion
About Prof. Panda: Dhabaleswar K. (DK) Panda is a Professor of Computer Science
at the Ohio State University. He obtained his Ph.D. in computer engineering
from the University of Southern California. His research interests include parallel
computer architecture, high performance computing, communication protocols, file systems,
network-based computing, and Quality of Service. He has published over 225 papers
in major journals and international conferences related to these research areas.
Dr. Panda and his research group members have been doing extensive research
on modern networking technologies including InfiniBand and 10GigE/iWARP. His
research group is currently collaborating with National Laboratories and leading
InfiniBand and 10GigE/iWARP companies on designing various subsystems of next
generation high-end systems. The MVAPICH/MVAPICH2 (High Performance MPI over
InfiniBand and iWARP) open-source software packages, developed by his research
group (http://mvapich.cse.ohio-state.edu),
are currently being used by more than 580 organizations worldwide (in 42 countries).
This software has enabled several InfiniBand clusters (including the 3rd ranked
one) to get into the latest TOP500 ranking. These software packages are also
available with the OpenFabrics stack for network vendors (InfiniBand
and iWARP), server vendors and Linux distributors. Dr. Panda's research is supported
by funding from the US National Science Foundation, the US Department of Energy, and
several industry partners including Intel, Cisco, SUN, Mellanox, NetApp and Linux Networx.
He is an IEEE Fellow and a member of ACM.
About Dr. Balaji: Pavan Balaji holds a joint appointment as a post-doctoral
researcher at the Argonne National Laboratory and as a fellow of the Computation
Institute at the University of Chicago. Dr. Balaji received his Ph.D. from
the Computer Science and Engineering department at the Ohio State University.
His research interests include high-speed interconnects, efficient protocol
stacks, parallel programming models and middleware, and job scheduling and resource
management. He has more than 25 publications in these areas. Dr. Balaji has
also served as a Program Committee Member and Technical Referee for numerous
international conferences and journals (ICPP, HiPC, ICCCN, TPDS, TC, JPDC).
He has delivered multiple talks
at international conferences and as an invited speaker at various research institutions.
He has also been a tutorial co-presenter at
Supercomputing '05, CCGrid '07, Cluster '07 and Supercomputing '07. He is a
member of the IEEE and ACM. More details about Dr. Balaji, including a comprehensive
CV, are available at http://www.mcs.anl.gov/~balaji.