Keynotes, Supported by State Key Lab of Computer Architecture (ICT, CAS)

Monday, February 25th (8:30am-9:50am)

Finding Meaning in Big Data

Kevin Nowka
Director, IBM Research - Austin
Member, IBM Academy of Technology

Abstract: Increasingly, Big Data computing is being applied to some of the world’s most difficult and most urgent problems. This talk will describe the nature of Big Data computing and present technology that has been developed to allow Big Data computing systems to address these challenges. It will detail trends in data characteristics and data sources. Growth in data volumes and sources has driven advances in indexing massive data collections and in supporting sophisticated analytics. Finally, methods of raising the level of interaction between Big Data analytics systems and their users enable broader adoption of analytics on Big Data. This talk will show how IBM’s Watson analytics system typifies this changing model of interaction.

Biography: Dr. Kevin Nowka is the director of IBM Research – Austin, one of IBM's 12 global research laboratories. He leads a team of scientists and engineers working on power-efficient systems and datacenters, system modeling, workload-optimized computing systems, high-speed and power-efficient VLSI circuits, productivity-enhancing design automation tools, and the measurement and modeling of the complex interactions between the design and the manufacture of integrated circuits. Dr. Nowka received a PhD in Electrical Engineering from Stanford University in 1995. He has more than 70 issued patents and has published over 70 technical papers on high-performance and low-power circuits, processor design, and technology issues. He is an IBM Research Master Inventor and a member of the IBM Academy of Technology.

Tuesday, February 26th (8:30am-9:50am)

Download slides here.

Antisocial Parallelism:
Avoiding, Hiding and Managing Communication

Katherine Yelick
University of California at Berkeley and Lawrence Berkeley National Laboratory

Abstract: Future computing system designs will be constrained by power density and total system energy, and will require new programming models and implementation strategies. Data movement in the memory system and interconnect will dominate running time and energy costs, making communication cost reduction the primary optimization criterion for compilers and programmers. Communication cost can be divided into latency costs, which are incurred per communication event, and bandwidth costs, which grow with total communication volume. The trends show growing gaps for both of these relative to computation, with the additional problem that communication congestion can conspire to worsen both in practice.

In this talk I will describe some of the main techniques for reducing the impact of communication, starting with latency hiding techniques, including the use of one-sided communication in Partitioned Global Address Space languages. I will describe some of the performance benefits from overlapped and pipelined communication, but also note cases where there is “too much of a good thing” that causes congestion in network internals. I will also discuss some of the open problems that arise from increasingly hierarchical computing systems, with multiple levels of memory spaces and communication layers.

Bandwidth reduction often requires more substantial algorithmic transformations, although some techniques, such as loop tiling, are well known. These can be applied as hand-optimizations, through code generation strategies in autotuned libraries, or as fully automatic compiler transformations. Less obvious techniques for communication avoidance have arisen in the so-called “2.5D” parallel algorithms, which I will describe more generally as “.5D” algorithms. These ideas are applicable to many domains, from scientific computations to database operations. In addition to having provable optimality properties, these algorithms also perform well on large-scale parallel machines. I will end by describing some recent work that lays the foundation for automating transformations to produce communication-optimal code for arbitrary loop nests.
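As an illustration only (not taken from the talk), the split of communication cost into a per-event latency term and a volume-dependent bandwidth term can be sketched with a simple linear model. The function name `comm_time` and the values chosen for `alpha` (seconds per message) and `beta` (seconds per byte) are assumptions made for this example, not measurements of any real system.

```python
def comm_time(num_messages, total_bytes, alpha=1e-6, beta=1e-9):
    """Estimate communication time in seconds under a linear cost model.

    alpha: assumed latency cost per communication event (hypothetical value)
    beta:  assumed bandwidth cost per byte moved (hypothetical value)
    """
    latency_cost = num_messages * alpha   # paid once per communication event
    bandwidth_cost = total_bytes * beta   # grows with total volume
    return latency_cost + bandwidth_cost


# Send the same 8 MB either as a million 8-byte messages or as one message.
total_bytes = 8 * 1_000_000
many_small = comm_time(num_messages=1_000_000, total_bytes=total_bytes)
one_large = comm_time(num_messages=1, total_bytes=total_bytes)

print(f"many small messages: {many_small:.6f} s")
print(f"one large message:   {one_large:.6f} s")
```

Under these assumed constants the volume is identical in both cases, so the difference comes entirely from the latency term. This is why aggregating or overlapping messages targets latency, while algorithmic changes are needed to reduce the bandwidth term.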

Biography: Katherine Yelick is the Associate Laboratory Director for Computing Sciences at Lawrence Berkeley National Laboratory. She is also a Professor of Electrical Engineering and Computer Sciences at the University of California at Berkeley. She co-invented the UPC and Titanium languages as well as techniques for self-tuning sparse matrix kernels, and has published over 100 technical papers. She earned her Ph.D. in EECS from MIT and has been a professor at UC Berkeley since 1991, with a joint appointment at LBNL since 1996. She has received multiple research and teaching awards, is an ACM Fellow, and serves on numerous advisory committees, including the California Council on Science and Technology and the National Academies Computer Science and Telecommunications Board.