TS News

News

Seminar 2015-06-10; Stratos Idreos; Curious and self-designing systems: towards easy to use data systems tailored for exploration
How far away are we from a future where a data management system sits in the critical path of everything we do? Already today we need to go through a data system to perform several basic tasks, e.g., to pay at the grocery store, to book a flight, to find out where our friends are, and even to get coffee. Businesses and sciences increasingly recognize the value of storing and analyzing vast amounts of data. Beyond the expected growth in data-driven businesses and scientific scenarios over the next few years, in this talk we also envision a future where data becomes readily available and its power can be harnessed by everyone. What both scenarios have in common is a need for new kinds of data systems that are tailored for data exploration, are easy to use, and can quickly absorb and adjust to new data and access patterns on the fly. We will discuss this vision and some of our recent efforts towards self-designing systems as well as "curious" systems tailored for automated exploration.

Seminar 2015-06-04; Onur Mutlu; Rethinking Memory System Design for Data-Intensive Computing
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy-efficiency, and reliability significantly more costly with conventional techniques.

Seminar 2015-05-19; James Zheng; Physically Informed Runtime Verification for Cyber Physical Systems
Cyber-physical systems (CPS) integrate computation with physical processes. CPS have gained popularity in both industry and the research community, and are represented by many varied mission-critical applications. Debugging CPS is important, but the intertwining of the cyber and physical worlds makes it very difficult. Formal methods, simulation, and testing are not sufficient to guarantee the required correctness. Runtime Verification (RV) provides a perfect complement. However, the state of the art in RV lacks either efficiency or expressiveness, and very few RV technologies are specifically designed for CPS. In this talk, I discuss a toolset that brings formal methods (e.g., temporal logic and timed automata) and physical models (through real-time simulation) into CPS runtime verification. The toolset is evaluated through increasingly complex real CPS applications of smart agent systems.

Seminar 2015-05-15; Matthew Grosvenor; Atomic Broadcast for the Rack-Scale Computer
Atomic broadcast is a powerful primitive for implementing agreement systems. The way in which atomic broadcast is implemented depends on the underlying communications infrastructure. Multiprocessors can assume the presence of special-purpose, low-latency and highly reliable interconnects, giving rise to systems that operate in just a few CPU cycles. Across machine boundaries, by contrast, the communication infrastructure is typically general purpose, higher latency and less reliable. In these situations more complex software approaches such as Paxos, Raft and ZooKeeper's Zab are used. As a consequence, atomic broadcast between machines is slow and scales poorly. The rack-scale computer (RSC) falls somewhere between these two worlds. Although it is constructed out of unreliable sub-components, we would like to be able to treat the machine as if it were a single unit. Our work is motivated by this apparent contradiction, and by the observation that treating the rack as a single machine provides us with an opportunity to build custom interconnects using general-purpose components. In this talk I will discuss Exo, a fast and efficient network architecture and protocol for atomic broadcasts at the rack scale. Exo builds upon the well-established theory of token-ring based atomic broadcast protocols. The Exo protocol is accelerated using a specialised low-latency network architecture and a custom hardware acceleration engine. The network is constructed from commodity Ethernet networking components and the acceleration engine is programmed into commodity FPGA-enabled network cards. Exo is a work in progress. At the time of writing, the Exo protocol is running in our lab on a small test cluster of 15 nodes. Currently the system is capable of a sustained rate of over 2 million messages per second and of coping with transmission faults (bit errors), arbitrary partitions and failing nodes.
We expect that over the coming months this will mature into a fully featured system, capable of operating several orders of magnitude faster, and at a scale at least an order of magnitude greater, than existing systems.

Seminar 2015-05-15; Matthew Grosvenor; Queues don't matter when you can Jump them!
In this talk I will be discussing our recent system called QJump. QJump is a simple and immediately deployable approach to controlling network interference in datacenter networks. Network interference occurs when congestion from throughput-intensive applications causes queueing that delays traffic from latency-sensitive applications. To mitigate network interference, QJump applies Internet QoS-inspired techniques to datacenter applications. Each application is assigned to a latency sensitivity level (or class). Packets from higher levels are rate-limited in the end host, but once allowed into the network can “jump-the-queue” over packets from lower levels. In settings with known node counts and link speeds, QJump can support service levels ranging from strictly bounded latency (but with low rate) through to line-rate throughput (but with high latency variance). We have implemented QJump as a Linux Traffic Control module. QJump achieves bounded latency and reduces in-network interference by up to
