The ASPIRE faculty are Krste Asanovic (Director), Elad Alon, Jonathan Bachrach, Jim Demmel, Armando Fox, Kurt Keutzer, Borivoje Nikolic, David Patterson, Koushik Sen, and John Wawrzynek.
Decades of exponential improvements in transistor scaling are coming to an end, bringing an abrupt shift in how future systems will be architected and programmed. No longer can computer architects or software stacks rely on the previously relentless improvements in transistor technology to support ever more complex architectures or ever more layers of programming abstraction. Demanding new applications, from Natural User Interfaces on mobile devices to Big-Data analytics in Warehouse-Scale Computers, will only be possible with radical energy-efficiency innovations above the transistor. The move to parallel programming and multicore architectures was the first step in the drive towards greater efficiency, but general-purpose parallel architectures provided only a one-time gain. We must now go beyond parallelism to maintain future growth in computational capability.
The ASPIRE project is a new five-year collaborative effort attacking the challenges of a new era, where pervasive application-specific specialization of software and hardware is the only path to computational efficiency. Applications must be restructured to use the most energy-efficient algorithms, and the resulting code must be highly tuned to the underlying hardware. Even though power constraints limit how many transistors can be active, continuing increases in transistor budgets allow architects to consider providing highly specialized accelerators for dramatic improvements in performance and energy efficiency for key computations. To save power at the circuit level, fine-grained clock and power gating are essential together with lower supply voltages, requiring architectures to be resilient to the errors introduced when operating with reduced margins. The rapid pace of evolution of complicated systems-on-a-chip (SoCs), both for mobile devices and datacenter server racks, requires rapid co-design and verification of hardware with software to avoid missing market windows. In short, the computing industry must shift from the traditional model where transistor-scaling-driven performance improvements supported horizontal layered architectures, to a new post-scaling world where vertical applications-to-hardware co-design is the key to improved efficiency.
ASPIRE brings together researchers across the stack, including applications, algorithms, compilers, architecture, and circuits. Our goal is to allow future applications to be developed rapidly but run with utmost efficiency by automatically exploiting specialization cutting across all layers of the software and hardware stack.
Our research agenda is structured into six components:
- Pattern-Based Application Rearchitecting: In our earlier Par Lab project, we developed a methodology to break down all applications into their fundamental underlying computational and structural patterns. Expressing problems using the higher-level abstraction of patterns leaves us freedom to develop efficient algorithms and implementations using a wide range of techniques. We believe the patterns will also provide a natural framework within which to develop and program our new specialized hardware architectures.
- Communication-Avoiding Algorithms: Communication, whether moving data between levels of a memory hierarchy or between processors over a network, is often many times more costly in time or energy than arithmetic, and this cost ratio is growing exponentially. This divergence inspired us to re-examine fundamental algorithms to answer three questions: Are there lower bounds on communication? Do conventional algorithms attain them? If not, are there new algorithms that do? We have systematically proven new lower bounds that most conventional algorithms do not attain, and found new algorithms that do attain them. These new algorithms are asymptotically faster in both theory and practice, and some achieve perfect strong scaling in time and energy. As applications are usually constrained by memory accesses and inter-processor communication, basing our specializer stacks on communication-optimal algorithms allows us to move towards provably optimal system implementations.
- Software Specializers using SEJITS (Selective Embedded Just-in-Time Specialization): Our pattern-based approach and new communication-avoiding algorithms create a new, larger design space of algorithms and code, where many variations are possible. The need for heterogeneous and specialized cores also greatly expands the hardware design space. To explore this combined hardware and software design space efficiently, we need software that is easy to retarget. Our Asp SEJITS infrastructure supports the creation of software specializers, which are very small pattern-specific languages and compilers designed to express and optimize a single pattern, such as structured grid computations or graph traversal. Our initial prototypes are successful at delivering source- and performance-portable pattern implementations, beating existing DSLs as well as human-optimized code.
- Architectures exploiting Pattern-Based Specialization and Heterogeneity: Achievable efficiency is ultimately constrained by hardware architecture, which is itself constrained by flexibility demands. Conventional CPU and GPU designs are very flexible but cannot scale to provide the efficiency that future applications need. Fixed-function hardware is very efficient, but has high development cost and is inflexible. We are exploring a new architectural style, ESP (Ensemble of Specialized Processors), where we replace conventional processors with a symbiotic ensemble of specialized processing engines that are individually optimized for a particular pattern of computation but that collectively retain the coverage and flexibility required to support a wide range of applications.
- Resiliency and Power Control: At the circuit level, exploiting future increases in transistor count without increasing total power dissipation will require extensive support for low-voltage operation and fine-grained power gating. Optimal energy efficiency is achieved by operating at low supply voltages with low voltage margins, thus exposing the design to elevated error rates. We are developing new resilient circuit and architecture techniques to tolerate circuit errors at the architecture level, and new power conversion and management techniques.
- Agile Hardware Development: We have been developing a new hardware-description language, Chisel (Constructing Hardware in a Scala Embedded Language), to form the basis of a new “Agile Hardware” development methodology, adapting ideas from the Agile Software movement to more rapidly produce efficient working hardware. Chisel uses the power of modern programming languages to help build powerful hardware specializers that generate customized circuits according to high-level parameters and constraints. Chisel can also automatically generate high-performance FPGA-based emulations to allow deep hardware-software co-tuning, thereby improving the quality of results and dramatically reducing the design iteration time for complex SoCs.
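To make the communication-avoiding idea in the agenda above concrete, here is a minimal sketch that counts words moved between slow and fast memory for two ways of multiplying n-by-n matrices. It is an illustration only, not one of the ASPIRE algorithms: the two-level memory model, the function names `naive_matmul` and `tiled_matmul`, and the tile size `b` are all assumptions made for this example. The model assumes the naive version keeps no operands in fast memory, while the tiled version can hold three b-by-b tiles at once.

```python
import math
import random


def naive_matmul(A, B):
    """Row-times-column product that counts slow-memory traffic.

    Model assumption: fast memory is too small to reuse operands,
    so every read of A or B goes to slow memory.
    """
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    words = 0
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += A[i][k] * B[k][j]
                words += 2  # one word of A and one word of B
            C[i][j] = acc
            words += 1  # write C[i][j] back to slow memory
    return C, words


def tiled_matmul(A, B, b):
    """Blocked product that reuses b-by-b tiles held in fast memory.

    Model assumption: fast memory holds three b-by-b tiles (A, B, C),
    and n is a multiple of b.
    """
    n = len(A)
    assert n % b == 0, "tile size must divide the matrix size"
    C = [[0.0] * n for _ in range(n)]
    words = 0
    for i0 in range(0, n, b):
        for j0 in range(0, n, b):
            # The C tile stays in fast memory for the whole k0 loop.
            for k0 in range(0, n, b):
                words += 2 * b * b  # load one tile of A and one of B
                for i in range(i0, i0 + b):
                    for j in range(j0, j0 + b):
                        acc = C[i][j]
                        for k in range(k0, k0 + b):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
            words += b * b  # write the finished C tile back
    return C, words


if __name__ == "__main__":
    random.seed(0)
    n, b = 16, 4
    A = [[random.random() for _ in range(n)] for _ in range(n)]
    B = [[random.random() for _ in range(n)] for _ in range(n)]
    C_naive, w_naive = naive_matmul(A, B)
    C_tiled, w_tiled = tiled_matmul(A, B, b)
    same = all(
        math.isclose(C_naive[i][j], C_tiled[i][j])
        for i in range(n)
        for j in range(n)
    )
    print("results agree:", same)
    print("words moved (naive):", w_naive)
    print("words moved (tiled):", w_tiled)
```

Under this model the naive version moves 2n^3 + n^2 words and the tiled version moves 2n^3/b + n^2, so for n = 16 and b = 4 the counts are 8448 and 2304. Both versions perform the same arithmetic; only the data movement changes. Choosing b so that three tiles just fit in a fast memory of M words makes the traffic scale as n^3/sqrt(M), which matches the known asymptotic lower bound for classical matrix multiplication.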
ASPIRE faculty, students, and staff share an open-plan collaborative research space designed to encourage interactions across all areas of the lab. We hold twice-yearly off-site research retreats where we interact with our federal and industrial research sponsors on projects of mutual interest.