|
Seminar 2015-06-04; Onur Mutlu;
Rethinking Memory System Design for Data-Intensive
Computing
|
|
The memory system is a fundamental performance and energy
bottleneck in almost all computing systems. Recent system
design, application, and technology trends that require
more capacity, bandwidth, efficiency, and predictability
out of the memory system make it an even more important
system bottleneck. At the same time, DRAM and flash
technologies are experiencing difficult technology scaling
challenges that make the maintenance and enhancement of
their capacity, energy-efficiency, and reliability
significantly more costly with conventional techniques.
|
|
Seminar 2015-05-19; James
Zheng on Physically Informed Runtime Verification
for Cyber Physical Systems
|
|
Cyber-physical systems (CPS) are an integration of
computation with physical processes. CPS have gained
popularity both in industry and the research community
and are represented by many varied mission critical
applications. Debugging CPS is important, but the
intertwining of the cyber and physical worlds makes it
very difficult. Formal methods, simulation, and testing
are not sufficient to guarantee the required correctness.
Runtime Verification (RV) provides a perfect
complement. However, the state of the art in RV lacks
either efficiency or expressiveness, and very few RV
technologies are specifically designed for CPS. In this
talk, I discuss a toolset that brings formal methods
(e.g., temporal logic and timed automata) and physical
models (through real-time simulation) into CPS runtime
verification. The toolset is evaluated through
increasingly complex real CPS applications of smart agent
systems.
|
|
Seminar 2015-05-15; Matthew Grosvenor;
Atomic Broadcast for the Rack-Scale Computer
|
|
Atomic Broadcast is a powerful primitive for implementing
agreement systems. The way in which atomic broadcast is
implemented depends on the underlying communications
infrastructure. Multiprocessors can assume the presence of
special-purpose, low-latency and highly reliable
interconnects, giving rise to systems that operate in just a
few CPU cycles. Across machine boundaries, by contrast, the
communication infrastructure is typically general purpose,
higher latency and less reliable. In these situations more
complex software approaches such as Paxos, Raft and
ZooKeeper's Zab are used. As a consequence, atomic broadcast
between machines is slow and scales poorly. The rack-scale
computer (RSC) falls somewhere between these two worlds.
Although constructed out of unreliable sub-components, we
would like to be able to treat the machine as if it were a
single unit. Our work is motivated by this apparent
contradiction, and by the observation that treating the rack
as a single machine provides us with an opportunity to build
custom interconnects using general-purpose components. In
this talk I will discuss Exo, a fast and efficient network
architecture and protocol for atomic broadcasts at the rack
scale. Exo builds upon the well established theory of
token-ring based atomic broadcast protocols. The Exo
protocol is accelerated using a specialised low latency
network architecture and a custom hardware acceleration
engine. The network is constructed from commodity Ethernet
networking components and the acceleration engine is
programmed into commodity FPGA enabled network cards. Exo
is a work in progress. At the time of writing, the Exo protocol
is running in our lab on a small test cluster of 15 nodes.
Currently the system sustains a rate of over
2 million messages per second and copes with transmission
faults (bit errors), arbitrary partitions and failing
nodes. We expect that over the coming months this will
mature into a fully featured system, capable of operating
several orders of magnitude faster, and with at least an
order of magnitude greater scale than existing systems.
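
The token-ring idea that the abstract refers to can be illustrated with a small sketch. This is not the Exo protocol: it is a minimal model that assumes reliable delivery and no node failures, and the names `Node` and `run_ring` are hypothetical. A single token carries a sequence counter, only the token holder may broadcast, and so every node delivers the same messages in the same order.

```python
from collections import deque


class Node:
    """One ring member (hypothetical model, not part of Exo)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.outbox = deque()  # messages waiting for the token
        self.delivered = []    # (seq, sender, message) in delivery order


def run_ring(nodes, rounds):
    """Pass a token around the ring; only the holder may broadcast.

    Assumes reliable delivery and no failures, which a real protocol
    cannot assume. Returns the number of messages sequenced.
    """
    next_seq = 0  # sequence counter carried by the token
    for _ in range(rounds):
        for holder in nodes:  # the token visits nodes in ring order
            while holder.outbox:
                message = holder.outbox.popleft()
                stamped = (next_seq, holder.node_id, message)
                next_seq += 1
                for node in nodes:  # broadcast to every node, sender included
                    node.delivered.append(stamped)
    return next_seq


ring = [Node(i) for i in range(3)]
ring[2].outbox.append("c1")
ring[0].outbox.extend(["a1", "a2"])
run_ring(ring, rounds=1)
# Every node now holds an identical, totally ordered delivery log.
```

Because the sequence counter travels with the token, ordering needs no separate agreement round; the cost is that a node must wait for the token before it can send.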
|
|
Seminar 2015-05-15; Matthew Grosvenor;
Queues don't matter when you can Jump them!
|
|
In this talk I will be discussing our recent system called
QJump. QJump is a simple and immediately deployable
approach to controlling network interference in datacenter
networks. Network interference occurs when congestion from
throughput-intensive applications causes queueing that
delays traffic from latency-sensitive applications. To
mitigate network interference, QJump applies Internet
QoS-inspired techniques to datacenter applications. Each
application is assigned to a latency sensitivity level (or
class). Packets from higher levels are rate-limited in the
end host, but once allowed into the network can
“jump-the-queue” over packets from lower
levels. In settings with known node counts and link speeds,
QJump can support service levels ranging from strictly
bounded latency (but with low rate) through to line-rate
throughput (but with high latency variance). We have
implemented QJump as a Linux Traffic Control module. QJump
achieves bounded latency and reduces in-network
interference by up to
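
The two mechanisms named in the abstract, rate limiting at the end host and queue jumping in the network, can be sketched as follows. This is an illustrative model only, not the QJump Linux Traffic Control module: the class names `HostLimiter` and `PrioritySwitchQueue`, the per-epoch packet budgets, and the integer level numbering (a higher number means more latency sensitive) are all assumptions made for the example.

```python
import heapq
import itertools


class HostLimiter:
    """Per-level packet budget per epoch at the end host (hypothetical)."""

    def __init__(self, packets_per_epoch):
        # packets_per_epoch maps level -> packets allowed in one epoch
        self.budget = dict(packets_per_epoch)
        self.used = {level: 0 for level in self.budget}

    def new_epoch(self):
        """Reset every level's usage at the start of an epoch."""
        for level in self.used:
            self.used[level] = 0

    def try_send(self, level):
        """Return True if this level may still send in the current epoch."""
        if self.used[level] < self.budget[level]:
            self.used[level] += 1
            return True
        return False


class PrioritySwitchQueue:
    """Strict-priority queue: higher levels are served before lower ones."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # keeps FIFO order within a level

    def enqueue(self, level, packet):
        # Negate the level so the highest level is popped first.
        heapq.heappush(self._heap, (-level, next(self._arrival), packet))

    def dequeue(self):
        _, _, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)


limiter = HostLimiter({7: 1, 0: 100})  # level 7: tight budget, level 0: loose
queue = PrioritySwitchQueue()
for level, packet in [(0, "bulk-1"), (0, "bulk-2"), (7, "rpc")]:
    if limiter.try_send(level):
        queue.enqueue(level, packet)
first_out = queue.dequeue()  # the level-7 packet leaves before the bulk packets
```

The trade-off matches the one the abstract describes: a level that jumps the queue gets a small sending budget, while a level with a generous budget waits behind higher levels.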
|
|
Seminar 2015-05-12; John
Grundy on The Future of Software Engineering in
Australia
|
|
Professor John
Grundy is Dean of the School of Software and
Electrical Engineering at Swinburne University of
Technology. He is also Director of the Swinburne
University Centre for Computing and Engineering
Software Systems (SUCCESS). His teaching is mostly in
the area of team projects, software requirements and
design, software processes, distributed systems, and
programming. His research areas include software tools
and techniques, software architecture, model-driven
software engineering, visual languages, software
security engineering, service-based and component-based
systems and user interfaces. John will be giving a talk
on The Future of Software Engineering in Australia. In
this talk John will outline what he considers to be key
issues for software engineering research and practice
in Australia. He will highlight some key example areas
we are working in to address these in our research
group.
|