Architectural Implications of Multiple Clock Domains, Multiple Voltage Domains and Dynamic Voltage-Frequency Scaling (MCD/MVD/DVFS)

Ran Ginosar
VLSI Systems Research Center, EE & CS Departments
Technion-Israel Institute of Technology, Haifa, Israel

Summary: (PDF)

Large multi-processor chips (aka CMP or MPSoC) may exploit the benefits of multiple clocks, multiple frequencies, multiple voltages, and their dynamic scaling. The tutorial discusses what these technologies are and how they may affect the architecture.

 

Outline

1.     Introduction and problem formulation

2.     Classifications: Types of multiple clock domains, types of multiple voltage domains, types of dynamic scaling of voltage, frequency and their combinations

3.     Multiple clock domains

a.      Clock distribution in multi-clock-domain chips

b.     Limitations: clock power and delay variations

c.      Solutions: Decomposed clocks

d.     Synchronization of mutually-asynchronous clock domains

        i.     Metastability and synchronization failures: Theory, measurement, simulation, circuits and layout, temperature and voltage effects, failure probabilities and MTBF

        ii.    Synchronizers for mutually-asynchronous clock domains: circuits, methodologies, performance, other applications

e.      Formal and informal verification of synchronization circuits

f.       Synchronization of (same-frequency) multi-synchronous clock domains in CMP: Adaptive synchronizers, self-compensation for drifting phase. Predictive synchronizers in CMP.

g.      Synchronization over long on-chip interconnects.

h.      Synchronization in Networks-on-Chip.

i.       Architectural implications of multiple clocks

4.     Multiple voltage domains

a.      Definitions

b.     Sources of multiple voltages

c.      Level shifting

d.     Combining voltage and clock domains

        i.     Layout constraints

        ii.    Varying clock frequency with voltage

e.      Architectural implications of multiple voltage domains

5.     Dynamic Voltage and Frequency Scaling

a.      Voltage scaling versus voltage and frequency scaling

b.     Global versus multi-domain scaling

c.      Methods of DVFS

d.     Architectural implications of DVFS

6.     Conclusions and Summary
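
Two quantities named in the outline lend themselves to a quick numerical illustration: synchronizer MTBF (item 3.d.i) and the power effect of scaling voltage together with frequency (item 5.a). The Python sketch below is not taken from the tutorial; it assumes the widely used first-order models MTBF = exp(S/tau) / (T_W * f_C * f_D) and P_dyn = C_eff * V^2 * f. All function names, parameter names, and numbers are hypothetical placeholders chosen for illustration.

```python
import math


def synchronizer_mtbf(settle_s, tau_s, window_s, clock_hz, data_hz):
    """First-order synchronizer MTBF in seconds (assumed textbook model).

    settle_s : time allowed for metastability to resolve (S)
    tau_s    : resolution time constant of the flip-flop (tau)
    window_s : metastability window (T_W)
    clock_hz : sampling clock frequency (f_C)
    data_hz  : rate of asynchronous data transitions (f_D)
    """
    return math.exp(settle_s / tau_s) / (window_s * clock_hz * data_hz)


def dynamic_power(c_eff_f, vdd_v, freq_hz):
    """First-order CMOS dynamic power in watts: C_eff * V^2 * f."""
    return c_eff_f * vdd_v ** 2 * freq_hz


# Illustrative numbers only; real values depend on process and design.
mtbf_s = synchronizer_mtbf(1e-9, 10e-12, 20e-12, 1e9, 100e6)

# Under this model, scaling frequency alone reduces power linearly,
# while scaling voltage along with it reduces power roughly cubically.
base = dynamic_power(1e-9, 1.0, 1e9)
freq_only = dynamic_power(1e-9, 1.0, 0.8e9) / base
volt_and_freq = dynamic_power(1e-9, 0.8, 0.8e9) / base
```

Under the assumed MTBF model, each additional tau of settling time multiplies the MTBF by e, so modest increases in settling time change the failure rate by orders of magnitude.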

Lecturer's biography
Ran Ginosar received his BSc from the Technion and his PhD from Princeton University. After conducting research at AT&T Bell Laboratories, he joined the Technion, where he is now an Associate Professor in the Electrical Engineering and Computer Science departments and heads the VLSI Systems Research Center. Professor Ginosar has been a visiting Associate Professor with the University of Utah and co-initiated the Asynchronous Architecture Research Project at Intel (Oregon) during a two-year sabbatical. He has co-founded a number of VLSI companies. He has published numerous papers and patents on VLSI. His research interests include VLSI architecture, asynchronous logic and synchronization.


Power Management Solutions for Computer Systems and Datacenters

Karthick Rajamani, Charles Lefurgy, Soraya Ghiasi, Juan Rubio, Heather Hanson and Tom Keller
Power-Aware Systems, IBM Austin Research Lab

Abstract
Power and cooling are at the heart of many hurdles for the continued growth of the computer industry. Diminishing returns from CMOS technology scaling, a mismatch between the growth in computational density needed to meet ever-growing demands and the growth in cooling capabilities, and the growing cost of provisioning facilities for this demand have all come together to create a near-crisis situation. Aspects of the problem have attained enough visibility that not just customers, but also governments and environmental agencies have begun to demand new initiatives and solutions. This tutorial will discuss the chief characteristics of power and cooling problems and how they dictate the nature of the solutions that are required. The focus will be on the process of developing robust solutions for power and cooling problems, with discussion of the relevant technologies and mechanisms employed. We will cover emerging industrial solutions: how they address different aspects of the problem and what technical ideas they exploit. Emerging technologies in cooling and power distribution that can aid solutions will also be discussed. The tutorial will also present the increasing scope of solutions, moving from the individual system to the datacenter. We will then close with our view on emerging solutions for the larger context, addressing the link between facilities provisioning and IT management and the value of exploiting other rapidly growing management solutions such as virtualization.

Outline: (PDF)

Introduction and Background

Motivation

o Statistics
o Changing environment
Considerations for the design of power management solutions
o What is the problem?
o Multi-dimensional requirements
o Understanding the System – addressing variability

 

Power Management Concepts

Goals for power management solutions – concepts and examples

o Basic goals
o Advanced functions
o Future solutions
Implementation of power management solutions
o Sensors and actuators
o Feedback-driven management
o Model-assisted methods
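
The sensors, actuators, and feedback-driven management items above can be pictured with a small control loop. The Python sketch below is an assumption made for illustration, not a method from the tutorial: a hypothetical proportional controller reads a measured power value (the sensor) and nudges the processor frequency (the actuator) toward a power budget. The function name, gain, and frequency limits are invented placeholders.

```python
def next_frequency_mhz(freq_mhz, measured_w, budget_w,
                       gain_mhz_per_w=10.0, fmin_mhz=1000.0, fmax_mhz=4000.0):
    """One step of a hypothetical proportional power-capping controller.

    Lowers the frequency when measured power exceeds the budget, raises it
    when there is headroom, and clamps the result to the allowed range.
    """
    error_w = budget_w - measured_w              # positive means headroom
    proposed = freq_mhz + gain_mhz_per_w * error_w
    return max(fmin_mhz, min(fmax_mhz, proposed))


# Illustrative readings only.
over_budget = next_frequency_mhz(3000.0, measured_w=120.0, budget_w=100.0)
under_budget = next_frequency_mhz(3000.0, measured_w=80.0, budget_w=100.0)
```

A real controller would also need to account for sensor noise and actuation delay; this sketch only shows the shape of the loop.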

 

Industry state-of-the-art

Industry state-of-the-art
New technologies for power distribution and cooling

 

Datacenter and IT Management

Facilities management

o Anatomy of a datacenter
o Opportunities and technologies for improving efficiency
IT management
o Current generation solutions
o Problems and potential solutions

Case study of integrated management

Virtualization
o New problems
o Enabling efficient usage and sophisticated solutions

 

Other Open Issues

Challenges in orchestrating power management solutions
Questions
Feedback


High-Speed Network Architectures for Clusters: Designs and Trends

D. K. Panda (The Ohio State University) and P. Balaji (Argonne National Laboratory)

Abstract
High-speed network architectures such as InfiniBand (IB) and 10-Gigabit Ethernet (10GE) are generating a lot of excitement towards building next generation High-End Computing (HEC) systems. This tutorial will provide an in-depth look at this emerging trend and examine the suitability of these network architectures for prime-time HEC. It will start with a brief overview of some of the latest network architectures including IB, 10GE, Myrinet 10G (which is a recent addition to the 10GE family), and the ConnectX architecture which uniformly deals with both the IB and 10GE families, together with their architectural features. An overview of the emerging software stack which encapsulates some of the architectures in a unified manner will be presented. Hardware/software solutions for different networks and the market trends will be highlighted. Challenges in designing different kinds of systems using these standards on multi-core platforms for performance, scalability, portability and reliability will be covered. Specifically, case studies and experiences in designing HPC clusters (with MPI-1, MPI-2 and Sockets programming models), Parallel File Systems, Networked File Systems (NFS), Storage Protocols, Multitier Datacenters, and Virtualization schemes will be presented together with the associated performance numbers and comparisons.

The tutorial is organized along the following topics: (PDF)

1. What are IB and 10GE?

* TCP vs. User-level communication protocols

* Requirements (communication, I/O, performance, cost, RAS) from the perspective of designing next generation high-end systems and scalable data centers

2. Short Overview of InfiniBand Architecture

* Architecture and Basic Components

* Communication and I/O Operations

* Transport Layer/Services and Reliability

* Advanced Features (Keys, Protection Domains, Partitioning, Virtual Lanes, QoS Mechanisms and Multicast)

* Software Transport Interfaces and Management Services

* InfiniBand 1.2 specification and highlights

3. Short Overview of 10-Gigabit Ethernet

* Architecture and Basic Components

* Communication Operations

* Advanced Features

* Software Interfaces and Management Services

4. Convergence between InfiniBand and 10-Gigabit Ethernet through the OpenFabrics stack

* Software Interfaces and Management Services

+ Lower-level primitives
+ Upper-level (MPI, SDP, IPoIB, SRP, iSER, uDAPL, kDAPL)

* Subnet Management

* Unified Connection Management Support through RDMA CM

5. Overview of IB and 10GE Products (hardware and software), Time-frames, and Market Trends

* Vendors, Switches, and Host Channel Adapters

* Overview of ConnectX architecture

* Pointers to IB and 10GE installations

6. Designing High-end Systems with IB and 10GE: Research Challenges, Case Studies and Performance Evaluation

* High-end clusters with MPI-1 and MPI-2

* Storage and File Systems (PVFS, Lustre, NFS over RDMA, pNFS)

* Multi-tier Datacenters

* Virtualization Support (Xen-IB)

7. Conclusions, Final Q&A, and Discussion


About Prof. Panda: Dhabaleswar K. (DK) Panda is a Professor of Computer Science at the Ohio State University. He obtained his Ph.D. in computer engineering from the University of Southern California. His research interests include parallel computer architecture, high performance computing, communication protocols, file systems, network-based computing, and Quality of Service. He has published over 225 papers in major journals and international conferences related to these research areas. Dr. Panda and his research group members have been doing extensive research on modern networking technologies including InfiniBand and 10GigE/iWARP. His research group is currently collaborating with National Laboratories and leading InfiniBand and 10GigE/iWARP companies on designing various subsystems of next generation high-end systems. The MVAPICH/MVAPICH2 (High Performance MPI over InfiniBand and iWARP) open-source software packages, developed by his research group (http://mvapich.cse.ohio-state.edu), are currently being used by more than 580 organizations worldwide (in 42 countries). This software has enabled several InfiniBand clusters (including the 3rd ranked one) to get into the latest TOP500 ranking. These software packages are also available with the OpenFabrics stack for network vendors (InfiniBand and iWARP), server vendors and Linux distributors. Dr. Panda's research is supported by funding from the US National Science Foundation, the US Department of Energy, and several industry partners including Intel, Cisco, SUN, Mellanox, NetApp and Linux Networx. He is an IEEE Fellow and a member of ACM.

About Dr. Balaji: Pavan Balaji holds a joint appointment as a post-doctoral researcher at the Argonne National Laboratory and as a fellow of the Computation Institute at the University of Chicago. Dr. Balaji received his Ph.D. from the Computer Science and Engineering department at the Ohio State University. His research interests include high-speed interconnects, efficient protocol stacks, parallel programming models and middleware, and job scheduling and resource management. He has more than 25 publications in these areas. Dr. Balaji has also served as a Program Committee Member and Technical Referee for numerous international conferences and journals (ICPP, HiPC, ICCCN, TPDS, TC, JPDC). He has delivered multiple talks at international conferences and as an invited speaker at various research institutions. He has also been a tutorial co-presenter at Supercomputing '05, CCGrid '07, Cluster '07 and Supercomputing '07. He is a member of the IEEE and ACM. More details about Dr. Balaji, including a comprehensive CV, are available at (http://www.mcs.anl.gov/~balaji).