Many distributed workloads in today’s data centers are writ-
ten in managed languages such as Java or Ruby. Examples
include big data frameworks such as Hadoop, data stores
such as Cassandra, and applications such as the SOLR search
engine. These workloads typically run across many indepen-
dent language runtime systems on different nodes.
This setup represents a source of inefficiency, as these
language runtime systems are unaware of each other. For
example, they may perform Garbage Collection at times that
are reasonable locally but not for the distributed application
as a whole.
We address these problems by introducing the concept of a
Holistic Runtime System that makes runtime-level decisions
for the entire distributed application rather than locally.
We then present Taurus, a Holistic Runtime System
prototype. Taurus is a JVM drop-in replacement, requires al-
most no configuration and can run unmodified off-the-shelf
Java applications. Taurus enforces user-defined coordination
policies and provides a DSL for writing these policies.
By applying Taurus to Garbage Collection, we demon-
strate the potential of such a system and use it to explore
coordination strategies for the runtime systems of real-world
distributed applications, to improve application performance
and address tail latencies in latency-sensitive workloads.