

Andrew Chien,  University of Chicago

Zero-carbon Cloud an Update and Challenges for Datacenters as Supply-following Loads

Hyperscale cloud datacenters are the fastest-growing consumers of power in most of the world. Zero-carbon cloud proposes using these networks of datacenters as a geographically distributed, supply-following load to utilize excess renewable generation (e.g., stranded power). Recent developments suggest that Zero-carbon cloud is at a breakout point, with datacenters beginning to be deployed as supply-following loads.

Supply-following datacenters raise a wealth of interesting research questions in cloud and energy systems: resource management and scheduling under variable, stochastic capacity; how much cloud load is compatible with supply-following, and whether it can be shaped; how to couple mutually distrustful power-grid and cloud schedulers; how to clear power markets with stochastic bids; and how to manage power grids in the presence of such bids.
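As a concrete illustration of scheduling against variable, stochastic capacity, here is a toy supply-following scheduler (our sketch, not ZCCloud's actual mechanism): deferrable batch jobs are greedily packed into the hours with the most stranded power.

```python
# Toy sketch of a supply-following load: batch jobs run only during
# hours with excess renewable generation (hypothetical numbers).

def schedule(surplus_mw, jobs_mw):
    """Greedily pack deferrable jobs into hours with stranded power.

    surplus_mw: list of excess renewable capacity per hour (MW).
    jobs_mw: list of job power demands; each job runs for one hour.
    Returns a list mapping each hour to the jobs placed there.
    """
    placement = [[] for _ in surplus_mw]
    remaining = list(surplus_mw)
    for job in sorted(jobs_mw, reverse=True):   # largest jobs first
        # place the job in the hour with the most remaining surplus
        hour = max(range(len(remaining)), key=lambda h: remaining[h])
        if remaining[hour] >= job:              # run only on stranded power
            remaining[hour] -= job
            placement[hour].append(job)
        # otherwise the job stays deferred: capacity is stochastic
    return placement

# Example: surplus peaks at midday, so the 30 MW job lands there.
hours = [0, 10, 50, 20]          # excess renewable MW per hour
jobs = [30, 15, 8]               # deferrable job demands in MW
plan = schedule(hours, jobs)
```

The example keeps the grid/cloud interaction deliberately one-sided; the coupling and market-clearing questions above are exactly what make the real problem hard.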

Shan Lu, University of Chicago

Tackling Performance Problems in Database-Backed Web Applications

Web developers face the demanding task of designing informative web pages while keeping page-load times low. This task has become increasingly challenging as most web content is now generated by processing ever-growing amounts of user data stored in back-end databases.

This talk will first discuss an empirical study about hundreds of performance problems in popular web applications written in Ruby on Rails and then present program analysis, code refactoring, and IDE tools that we have built to tackle these performance problems. These tools can automatically identify and fix inefficient data-processing code in web applications, with thousands of problems already identified; help developers understand the data-processing cost behind web page elements; and help developers explore different performance-enhancing web-page designs.
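To illustrate the kind of inefficient data-processing code such tools target, here is a minimal sketch (ours, not the authors' tooling) of the classic N+1 query pattern and its batched refactoring, using Python's sqlite3 in place of Rails:

```python
# Sketch of an N+1 query loop vs. a single batched query (our toy
# example with sqlite3, standing in for ORM-generated Rails code).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts(id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1,'ann'),(2,'bob');
    INSERT INTO posts VALUES (1,1,'a'),(2,1,'b'),(3,2,'c');
""")

def post_counts_slow(conn):
    # N+1 pattern: one extra query per user, as an ORM loop often generates
    counts = {}
    for uid, name in conn.execute("SELECT id, name FROM users"):
        n = conn.execute("SELECT COUNT(*) FROM posts WHERE user_id=?",
                         (uid,)).fetchone()[0]
        counts[name] = n
    return counts

def post_counts_fast(conn):
    # Refactored: one aggregated query lets the database do all the work
    return dict(conn.execute("""
        SELECT u.name, COUNT(p.id) FROM users u
        LEFT JOIN posts p ON p.user_id = u.id GROUP BY u.id
    """))
```

Both functions return the same answer; the difference is one round trip versus N+1, which dominates page-load time once tables grow.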

Rujia Wang, Illinois Institute of Technology

A Cacheable ORAM Interface for Efficient I/O Accesses

Oblivious RAM (ORAM) is a security primitive that prevents access-pattern leakage. By adding redundant memory accesses, ORAM breaks locality and prevents attackers from learning patterns in the access sequence. With a growing address space to be protected, we need to extend the protection boundary to the storage device. However, this would cause a huge performance loss if each memory access had to traverse the entire memory hierarchy. In this talk, I’ll present a new ORAM interface that is capable of protecting large data sets without frequent accesses to the I/O storage devices. We propose H-ORAM, a novel hybrid ORAM primitive that addresses the massive performance degradation incurred when user data overflows to storage. H-ORAM consists of a batch scheduling scheme that enhances memory bandwidth usage and a cacheable ORAM interface that returns data without waiting for long I/O accesses. We evaluate H-ORAM on a Linux-based machine with an HDD backend, and the experimental results show that H-ORAM outperforms the state-of-the-art Path ORAM by 20x.
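To make the underlying idea concrete, here is a deliberately naive linear-scan ORAM (a textbook toy, not H-ORAM): every logical access touches every physical slot, so the observable trace is independent of which address is accessed.

```python
# Toy linear-scan ORAM: O(N) physical accesses per logical operation,
# trading performance for a trace that leaks nothing about the address.
class LinearScanORAM:
    def __init__(self, size):
        self.store = [0] * size
        self.trace = []                 # physical addresses touched

    def access(self, op, addr, value=None):
        result = None
        for i in range(len(self.store)):    # scan everything, always
            self.trace.append(i)
            if i == addr:
                result = self.store[i]
                if op == "write":
                    self.store[i] = value
        return result

oram = LinearScanORAM(8)
oram.access("write", 3, 42)
t1 = list(oram.trace); oram.trace.clear()
assert oram.access("read", 3) == 42
t2 = list(oram.trace)
# t1 == t2: reads and writes to any address produce the same trace.
```

Practical schemes such as Path ORAM reduce the overhead from O(N) to O(log N) per access; H-ORAM's contribution, per the abstract, is keeping that cost off the slow I/O path.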

Qi Zhu, Northwestern University

Beyond Functionality: Tackling Timing and Security Challenges in the Design of Intelligent Engineering Systems

Intelligent engineering systems, such as autonomous vehicles, industrial robots, smart buildings and infrastructures, wearable devices and medical systems, have shown great economic and societal promise in recent years. The rapid development of sensing, data processing and analysis, control and communication methods brings new intelligent functionality and propels system advancement. However, the design and implementation of these systems are facing tremendous challenges beyond the traditional scope of functionality, in particular on system timing and security. In many cases, violation of timing and security requirements may in fact lead to incorrect behavior and system failures. In this talk, I will discuss timing and security challenges in the design of intelligent engineering systems, with examples from automotive electronic systems and vehicular networks. I will briefly introduce our work in tackling these challenges with design automation techniques, including 1) a timing-driven and contract-based software synthesis framework that automatically explores the large software design space and addresses timing-related metrics such as schedulability, security, performance, extensibility, reliability and fault tolerance; and 2) a cross-layer modeling, simulation, synthesis and verification framework for connected vehicle applications.

Anthony Kougkas, Illinois Institute of Technology

Hermes: A distributed multi-tiered buffering platform

Modern supercomputer designs employ newly emerging hardware technologies such as High-Bandwidth Memory (HBM), Non-Volatile RAM (NVRAM), Solid-State Drives (SSDs), and dedicated buffering nodes (e.g., burst buffers) to alleviate the performance gap between main memory and the remote disk-based PFS. This creates a heterogeneous layered memory and storage hierarchy, which we call the Deep Memory and Storage Hierarchy (DMSH). However, as multiple layers of storage are added to HPC systems, the complexity of data movement among the layers increases significantly. Additionally, each layer of the DMSH is an independent system that requires expertise to manage, and the lack of automated data movement between tiers is a significant burden currently left to the users. In this talk, we present Hermes, a new, heterogeneous-aware, multi-tiered, dynamic, and distributed I/O buffering system. Hermes enables, manages, and supervises I/O buffering in the DMSH, accelerating applications' I/O accesses by transparently buffering data across its tiers. Data can be moved through the hierarchy effortlessly, giving applications a capable, scalable, and reliable middleware for navigating the I/O challenges of the exascale era. The Hermes project is an NSF-funded collaboration between the Illinois Institute of Technology and The HDF Group. Additional partners include Lawrence Berkeley National Lab, Argonne National Lab, and Sandia National Labs, as well as other academic institutions.
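As a rough illustration of tiered buffering (our sketch, not Hermes internals), consider placing data blocks in the fastest tier that still has free capacity, spilling down the hierarchy, with the PFS as the final fallback:

```python
# Toy DMSH placement: fastest-tier-first with spill-down (all names and
# capacities here are hypothetical).
TIERS = [("RAM", 2), ("NVMe", 3), ("BurstBuffer", 4)]  # (name, capacity)

def place(blocks):
    """Assign each named block to the fastest tier that has room."""
    free = {name: cap for name, cap in TIERS}
    location = {}
    for b in blocks:
        for name, _ in TIERS:               # tiers ordered fastest first
            if free[name] > 0:
                free[name] -= 1
                location[b] = name
                break
        else:
            location[b] = "PFS"             # hierarchy full: fall back to disk
    return location

loc = place([f"blk{i}" for i in range(10)])
```

A real buffering system must also handle eviction, prefetching, and concurrent writers; this sketch only captures the fastest-first placement intuition.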

Bogdan Nicolae, Argonne National Laboratory

VeloC: Very Low Overhead Checkpoint-Restart

This talk introduces VeloC, a multi-level checkpoint/restart runtime for high-performance computing applications that takes advantage of modern, complex heterogeneous storage hierarchies to deliver high performance and scalability without sacrificing ease of use and flexibility. First, the talk introduces the need for checkpointing at exascale and the associated challenges. Next, it highlights the key features of VeloC: (1) it exposes a simple application-level API to checkpoint and restart HPC applications, based on either protecting application data structures directly or managing application-defined checkpoint files; (2) it hides the complexity of interacting with the storage hierarchy (burst buffers, etc.) of current and future HPC systems; (3) it has a modular design that offers flexibility in choosing resilience strategies and the mode of operation (synchronous or asynchronous), while being highly customizable with additional post-processing modules. The talk continues by highlighting a series of results with VeloC in the context of large-scale ECP applications. Finally, it explores uses of checkpoint/restart beyond resilience and highlights several promising directions for future work.
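To sketch the asynchronous mode of operation (with a hypothetical API, not VeloC's actual interface), a checkpoint can be persisted quickly to a fast local tier and then drained to the parallel file system in the background:

```python
# Sketch of asynchronous multi-level checkpointing: a fast blocking write
# to the local tier, then a background flush to the (simulated) PFS.
import pathlib
import shutil
import tempfile
import threading

class AsyncCheckpointer:
    def __init__(self, local_dir, pfs_dir):
        self.local = pathlib.Path(local_dir)
        self.pfs = pathlib.Path(pfs_dir)
        self._flushes = []

    def checkpoint(self, name, data):
        # Phase 1 (blocking, fast): persist to the local tier
        path = self.local / name
        path.write_bytes(data)
        # Phase 2 (asynchronous): drain to the PFS without stalling the app
        t = threading.Thread(target=shutil.copy, args=(path, self.pfs / name))
        t.start()
        self._flushes.append(t)

    def wait(self):                      # e.g., before job exit
        for t in self._flushes:
            t.join()

local, pfs = tempfile.mkdtemp(), tempfile.mkdtemp()
ckpt = AsyncCheckpointer(local, pfs)
ckpt.checkpoint("iter0042.dat", b"application state")
ckpt.wait()
restored = (pathlib.Path(pfs) / "iter0042.dat").read_bytes()
```

A production runtime additionally handles partner replication, erasure coding, and restart-version selection; the two-phase write above is just the latency-hiding core.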

Corey Adams, Argonne National Laboratory

Convolutional Neural Networks for Sparse Scientific Data

Many scientific domains have been applying machine learning techniques to improve and accelerate the analysis of complex data. Convolutional neural networks are a go-to method for analysis of scientific data using supervised learning techniques. Unfortunately, in many scientific domains the data sets do not map perfectly onto the RGB images used in most machine learning research datasets like ImageNet. In this talk, I’ll present some of the work at Argonne’s Leadership Computing Facility using alternative implementations of convolutional neural networks for sparse data. These techniques can accelerate training and inference times by more than an order of magnitude on appropriate scientific data; however, the implementations are not as well optimized as standard convolutions on 2D images. I will also discuss some of the optimizations being developed to further accelerate deep convolutional neural networks for sparse data.
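A toy example (ours, not ALCF's implementation) shows why sparse convolutions pay off: the convolution is evaluated only at active sites, so cost scales with the number of nonzero pixels rather than the grid size.

```python
# Submanifold-style sparse 2D convolution over a coordinate list:
# outputs are produced only at input-active sites.
def sparse_conv2d(points, kernel):
    """points: {(y, x): value} for nonzero pixels.
    kernel: {(dy, dx): weight}."""
    out = {}
    for (y, x) in points:                        # iterate active sites only
        acc = 0.0
        for (dy, dx), w in kernel.items():
            acc += w * points.get((y + dy, x + dx), 0.0)
        out[(y, x)] = acc
    return out

# A 1000x1000 image with 3 hits costs 3 * |kernel| multiplies, not 10^6 * 9.
hits = {(10, 10): 1.0, (10, 11): 2.0, (500, 500): 3.0}
box = {(dy, dx): 1.0 for dy in (-1, 0, 1) for dx in (-1, 0, 1)}  # 3x3 sum
out = sparse_conv2d(hits, box)
```

Detector data in physics experiments often has well under 1% occupancy, which is where this asymptotic win translates into the order-of-magnitude speedups mentioned above.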

Josh San Miguel, University of Wisconsin

New Paradigms for Designing Energy-Efficient Intermittent Processor Architectures

The number of smart, network-enabled devices has grown exponentially, already surpassing the world's population. An IoT future is fast approaching where everything (whether it be a credit card or even a ring on one's finger) is capable of measuring and processing information. Powering such arbitrary and tiny computers is among the toughest technology challenges today. Researchers have turned to energy-harvesting systems, collecting energy from ambient sources (e.g., solar, thermal, RF, Wi-Fi) in order to power devices for short periods of time. The key challenge for programs now is that computation is intermittent: programs may have to stop at any point and back up their current progress, unable to resume until the device has harvested sufficient energy from the environment. In this talk, I will first present the EH Model, an analytical model for early design space exploration that generalizes and characterizes the complex, unconventional trade-offs that arise in intermittent systems. Second, I will describe a new computing paradigm: computational skimming via the What's Next processor, which introduces techniques for approximate subword pipelining/vectorization and fundamentally decouples the checkpoint location from the recovery location upon a power outage. Finally, I will briefly introduce race-to-expiry, an intermittent system that characterizes the expiry dates of data and aligns them with power outages to minimize backup costs.
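The intermittent-execution pattern described above can be sketched as follows (a toy model, not the EH Model or the What's Next processor): progress is backed up to a checkpoint after each unit of work, so computation survives simulated power outages.

```python
# Toy intermittent execution: each "boot" gets a limited energy budget,
# and progress persists across outages via an explicit checkpoint.
def run_intermittent(work_items, energy_per_boot, checkpoint):
    """Process work_items across power cycles, resuming from `checkpoint`.
    energy_per_boot: iterable yielding how many items each charge allows."""
    boots = 0
    for budget in energy_per_boot:
        boots += 1
        i = checkpoint["next"]                    # restore saved progress
        while budget > 0 and i < len(work_items):
            checkpoint["result"] += work_items[i]  # one unit of work
            i += 1
            budget -= 1
            checkpoint["next"] = i                # back up before power dies
        if checkpoint["next"] >= len(work_items):
            return boots
    return boots

ckpt = {"next": 0, "result": 0}
boots = run_intermittent(list(range(10)), [3, 1, 4, 5], ckpt)
```

The cost of writing `checkpoint` after every item is exactly the backup overhead that techniques like race-to-expiry aim to minimize by aligning backups with outages.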

Baihu Qian and Bashuman Deb, Amazon Web Services

Routing in the Cloud: The Design of AWS Transit Gateway

With more and more companies moving to the cloud, complex network architectures are being implemented to meet customers’ business, regulatory, and compliance needs. This drives demand for network services with advanced functionality, simplified management, and high performance. This talk provides an overview of the design of AWS Transit Gateway, a service that enables customers to connect thousands of networks together, and looks into how high performance, scalability, availability, and low cost are achieved.
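One core building block of any such service is a route table with longest-prefix-match semantics; here is a minimal sketch (our illustration, not Transit Gateway's implementation):

```python
# Toy route table: attachments (VPCs, VPNs) are reached via the most
# specific matching prefix, as in ordinary IP routing.
import ipaddress

class RouteTable:
    def __init__(self):
        self.routes = []                       # (network, attachment)

    def add(self, cidr, attachment):
        self.routes.append((ipaddress.ip_network(cidr), attachment))

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        matches = [(net, att) for net, att in self.routes if ip in net]
        if not matches:
            return None
        # longest prefix (most specific route) wins
        return max(matches, key=lambda m: m[0].prefixlen)[1]

rt = RouteTable()
rt.add("10.0.0.0/8", "vpn-attachment")
rt.add("10.1.0.0/16", "vpc-A")
```

Scaling this lookup to thousands of attached networks at line rate, with high availability, is where the engineering challenges described in the talk begin.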

Ian Kash, University of Illinois at Chicago

On the cluster admission problem for cloud computing

Cloud computing providers must handle heterogeneous customer workloads for resources such as (virtual) CPU or GPU cores. This is particularly challenging if customers who are already running a job on a cluster scale their resource usage up and down over time. The provider therefore has to continuously decide whether she can add additional workloads to a given cluster or whether doing so would impact existing workloads' ability to scale. Currently, this is often done using simple threshold policies that reserve large parts of each cluster, which leads to low average cluster utilization. I’ll discuss a proposal to use more sophisticated Bayesian policies and present simulations suggesting that they yield a substantial improvement over the simple threshold policy. I’ll also discuss how this can be improved further with learned or elicited prior information and how to incentivize users to provide this information.
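The simple threshold policy can be sketched as follows (a toy model, not the paper's simulation): arriving workloads are admitted only while committed capacity stays under the threshold, which caps utilization at the threshold itself.

```python
# Toy threshold admission policy: reserve headroom so existing workloads
# can scale up, at the cost of average utilization.
def admit_threshold(cluster_cores, threshold, arrivals):
    """Admit each arriving workload only if total committed cores stay
    within threshold * cluster_cores. Returns (admitted, utilization)."""
    committed, admitted = 0, []
    for cores in arrivals:
        if committed + cores <= threshold * cluster_cores:
            committed += cores
            admitted.append(cores)
        # rejected workloads go to another cluster (not modeled here)
    return admitted, committed / cluster_cores

# Reserving 40% of a 100-core cluster caps utilization at 0.6.
admitted, util = admit_threshold(100, 0.6, [30, 30, 30, 30])
```

A Bayesian policy instead admits based on the predicted probability that future scale-ups will exceed capacity, which can safely push utilization above this fixed cap.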

Cornelia Caragea, University of Illinois at Chicago

Dynamic Deep Multi-modal Fusion for Image Privacy Prediction

With millions of images shared online on social networking sites, effective methods for automatic image privacy prediction are highly needed. In this talk, I will present an approach for dynamically fusing objects, scenes, and image tags derived from Convolutional Neural Networks for accurately predicting the privacy of images shared online. Specifically, our approach identifies the set of most competent modalities on the fly, according to each new target image whose privacy has to be predicted. The approach considers three stages to predict the privacy of a target image: we first identify the neighborhood images that are visually similar and/or have similar sensitive content as the target image. Then, we estimate the competence of the modalities based on the neighborhood images. Finally, we fuse the decisions of the most competent modalities and predict the privacy label for the target image. Experimental results show that our approach predicts the sensitive (or private) content more accurately than models trained on individual modalities (object, scene, and tags) and other prior privacy-prediction work.
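The three-stage approach can be sketched as follows (a simplified toy, not the paper's exact method): score each modality's competence by its accuracy on the target's neighborhood, then fuse the votes of the most competent modalities.

```python
# Toy dynamic fusion: per-target competence estimation over a neighborhood,
# then a majority vote among the top-k modalities (all data hypothetical).
def predict_privacy(neighbors, modality_preds, true_labels, top_k=2):
    """neighbors: indices of images similar to the target.
    modality_preds: {modality: {image_index_or_'target': label}}.
    true_labels: {image_index: label} for the neighborhood images."""
    # Stage 2: competence = accuracy of each modality on the neighborhood
    competence = {}
    for mod, preds in modality_preds.items():
        hits = sum(preds[i] == true_labels[i] for i in neighbors)
        competence[mod] = hits / len(neighbors)
    # Stage 3: majority vote among the top_k most competent modalities
    best = sorted(competence, key=competence.get, reverse=True)[:top_k]
    votes = [modality_preds[m]["target"] for m in best]
    return max(set(votes), key=votes.count), competence

preds = {
    "object": {0: "private", 1: "public", "target": "private"},
    "scene":  {0: "public",  1: "public", "target": "public"},
    "tags":   {0: "private", 1: "public", "target": "private"},
}
labels = {0: "private", 1: "public"}
label, comp = predict_privacy([0, 1], preds, labels)
```

Here the scene modality misclassifies a neighbor, so its vote is excluded and the object and tag modalities decide the target's label.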

Radha Jagadeesan, DePaul University

Abstractions that lie!


We examine the effect of two abstractions on programming: we assume that a) the reads and writes in a program, and b) the instructions in a program, happen in the order specified by the program text. In this talk, we examine the unpleasant results of "information leaks" in these abstractions, leading to the study of relaxed memory models in the first case and the "Spectre" family of attacks in the second. Our approach is to address these issues from the perspective of programming languages, by "extending the functional specification of the programming language to include more details of its performance."

Alexander Rasin, DePaul University

Forget About It: Batched Database Sanitization

Deallocated data in both file systems and database management systems (DBMSes) can be reconstructed from raw storage, making it vulnerable to theft even after deletion. Data erasure (or sanitization) is a well-known process that eliminates this vulnerability, offering users "the right to be forgotten". However, most sanitization work only addresses erasing data (files) at the OS level; due to the inherent complexity of DBMS operation, live DBMS instances cannot be sanitized the same way. In fact, SQLite is the only DBMS that offers "secure delete" functionality. Moreover, efforts toward DBMS sanitization have so far chosen an erase-on-commit approach, which introduces significant I/O overheads and cannot make guarantees about the comprehensiveness of the sanitization.

In this talk, we describe a targeted sanitization method, DBSanitizer, that 1) is DBMS agnostic, 2) supports batched erasure, and 3) offers sanitization guarantees. An evaluation with multiple DBMSes demonstrates that it can be applied to any row-store relational DBMS. We furthermore compare DBSanitizer to the current (and only available) secure-delete solution in SQLite. DBSanitizer shows the feasibility and the advantages of batched sanitization, and is intended to serve as a template for DBMS vendors to begin supporting their own built-in sanitization.
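The operation underlying any such sanitizer can be sketched as follows (our illustration, not DBSanitizer itself): deleted records linger in raw storage until their bytes are physically overwritten, so sanitization rewrites the dead regions in place.

```python
# Toy sanitization primitive: overwrite the on-disk bytes of a logically
# deleted record so it cannot be recovered from raw storage.
import os
import tempfile

def sanitize_region(path, offset, length, passes=1):
    """Overwrite `length` bytes at `offset` with zeros; a batched caller
    would pass many (offset, length) regions and amortize the I/O."""
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(offset)
            f.write(b"\x00" * length)
        f.flush()
        os.fsync(f.fileno())            # force the overwrite to the device

# 'Delete' a record logically, then sanitize its on-disk bytes.
fd, path = tempfile.mkstemp()
os.write(fd, b"row1:alice|row2:bob|")
os.close(fd)
sanitize_region(path, 5, 5)             # wipe the bytes holding 'alice'
with open(path, "rb") as f:
    raw = f.read()
os.remove(path)
```

A real DBMS complicates this picture considerably: deleted values also survive in indexes, logs, and page slack space, which is why file-level erasure alone cannot sanitize a live DBMS.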