ADMS 2018

Ninth International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures

Monday, August 27, 2018, El Pardo I

In conjunction with VLDB 2018, Rio De Janeiro, Brazil

The objective of this one-day workshop is to investigate opportunities in accelerating data management systems and analytics workloads (which include traditional OLTP, data warehousing/OLAP, ETL, Streaming/Real-time, Analytics (including Machine Learning), and HPC/Deep Learning) using processors (e.g., commodity and specialized Multi-core, GPUs, FPGAs, and ASICs), storage systems (e.g., Storage-class Memories like SSDs and Phase-change Memory), and programming models like MapReduce, Spark, CUDA, OpenCL, and OpenACC.

The current data management scenario is characterized by the following trends: traditional OLTP and OLAP/data warehousing systems are being used for increasingly complex workloads (e.g., petabytes of data and complex queries under real-time constraints); applications are becoming far more distributed, often consisting of different data processing components; non-traditional domains such as bio-informatics, social networking, mobile computing, sensor applications, and gaming are generating growing quantities of data of different types; economic and energy constraints are leading to greater consolidation and virtualization of resources; and analyzing vast quantities of complex data is becoming more important than traditional transactional processing.

At the same time, there have been tremendous improvements in CPU and memory technologies. Newer processors offer greater compute and memory capabilities and are optimized for multiple application domains. Commodity systems increasingly use multi-core processors with more than 6 cores per chip, and enterprise-class systems use processors with 8 cores per chip, where each core can execute up to 4 simultaneous threads. Specialized multi-core processors such as GPUs have brought the computational capabilities of supercomputers to cheaper commodity machines. On the storage front, flash-based solid-state devices (SSDs) are becoming smaller in size, cheaper in price, and larger in capacity. Exotic technologies like phase-change memory are on the near-term horizon and can be game-changers in the way data is stored and processed.

Despite these trends, these technologies currently see limited use in the data management domain. Naive usage of multi-core processors or SSDs often leads to an unbalanced system. It is therefore important to evaluate applications in a holistic manner to ensure effective utilization of CPU and memory resources. This workshop aims to understand the impact of modern hardware technologies on accelerating core components of data management workloads. Specifically, the workshop hopes to explore the interplay between overall system design, core algorithms, query optimization strategies, programming approaches, and performance modelling and evaluation, from the perspective of data management applications.

Topics of Interest

The suggested topics of interest include, but are not restricted to:

Hardware and System Issues in Domain-specific Accelerators

New Programming Methodologies for Data Management Problems on Modern Hardware

Query Processing for Hybrid Architectures

Large-scale I/O-intensive (Big Data) Applications

Parallelizing/Accelerating Machine Learning/Deep Learning Workloads

Autonomic Tuning for Data Management Workloads on Hybrid Architectures

Algorithms for Accelerating Multi-modal Multi-tiered Systems

Energy Efficient Software-Hardware Co-design for Data Management Workloads

Parallelizing non-traditional (e.g., graph mining) workloads

Algorithms and Performance Models for modern Storage Sub-systems

Exploitation of specialized ASICs

Novel Applications of Low-Power Processors and FPGAs

Exploitation of Transactional Memory for Database Workloads

Exploitation of Active Technologies (e.g., Active Memory, Active Storage, and Networking)

New Benchmarking Methodologies for Accelerated Workloads

Applications of HPC Techniques for Data Management Workloads

Acceleration in Cloud Environments

Keynote Presentations

Quantum Computing and IBM Q: An Introduction

Carlos Cardonha, IBM Research, Brazil

Abstract: In his keynote speech at the Physics of Computation Conference in 1981, Richard Feynman discussed the challenges involved in the simulation of physical systems; in particular, Feynman suggested that quantum-mechanical devices should be constructed in order to make such tasks tractable, an observation that led to the creation of several areas in science which we now know as Quantum Computing. After decades of intensive research efforts, answers to some of the main engineering challenges have been found, and the construction of quantum computers capable of outperforming classical computers seems not only possible, but achievable in the near future. In this talk, we present the main concepts of quantum computing, some of the main challenges in the area, and potential applications in the near and long term, and give an overview of resources that are currently available for learning about and interacting with quantum computers.

Bio: Carlos Cardonha is a Research Staff Member of the Natural Resources Optimization Group at IBM Research Brazil, with a Ph.D. in Mathematics (T.U. Berlin) and a Bachelor's and a Master's degree in Computer Science (Universidade de São Paulo). His primary research interests are mathematical programming and theoretical computer science, with a focus on the application of techniques in mixed integer linear programming, combinatorial optimization, and algorithm design to real-world and/or operations research problems.

A Comprehensive Study of SIMD Techniques for Data Processing: The Good, the Bad, and the Ugly (Slides)

Shasank Chavan, Oracle

Abstract: Modern CPUs introduced SIMD (Single Instruction, Multiple Data) instructions in the mid 1990s to drastically speed up multimedia applications such as gaming and audio/video processing. It wasn’t until the early 2000s that SIMD instructions started being leveraged in data management systems to vectorize compute-intensive workloads on columnar data. Since then, a plethora of techniques and algorithms utilizing SIMD instructions has been introduced in academia, covering everything from optimizing traditional SQL operators such as scans and joins to accelerating graph, spatial, and text processing. In this talk we’ll advance through a list of top SIMD techniques developed over the years for data management systems, and discuss in detail what’s worked, what hasn’t, and what industry really needs going forward in this age of GPUs, FPGAs, and specialized ASIC data accelerators.

Bio: Shasank Chavan is the Vice President of the In-Memory Technologies group at Oracle. He leads an amazing team of brilliant engineers in the Database organization who develop customer-facing, performance-critical features for an In-Memory Columnar Store which, as Larry Ellison proclaimed, “processes data at ungodly speeds”. His team implements novel SIMD kernels and hardware acceleration technology for blazing fast columnar data processing, optimized data formats and compression technology for efficient in-memory storage, algorithms and techniques for fast in-memory join and aggregation processing, and optimized in-memory data access and storage solutions in general. His team is currently hyper-focused on leveraging emerging hardware technologies to build the next-generation data storage engine that powers the cloud. Shasank earned his BS/MS in Computer Science at the University of California, San Diego. He has accumulated 15+ patents over a span of 20 years working on systems software technology.

Concepts of Coherent Memory Interface

Sumanta Chatterjee, Oracle

Abstract: Use of RDMA has benefited Enterprise Software with low-latency, high-throughput I/O services. However, RDMA primitives pose many challenges for developing distributed protocols. In this talk I will present the Coherent Memory Interface, modeled like a Partitioned Global Address Space (PGAS), to show how this new memory model can be used for developing distributed protocols.

Bio: Sumanta Chatterjee is Vice President of the Oracle Database Virtual OS group. He works in the areas of Distributed Computing, I/O, Concurrency Control, and Memory Management. Sumanta joined Oracle in 1994. He has a BTech from IIT Kanpur and an MS from Texas A&M University.

Workshop Program

Session 1: Morning Keynote (9-10.30 am)

Quantum Computing and IBM Q: An Introduction, Carlos Cardonha, IBM Research, Brazil

Coffee Break (10.30-11 am)

Session 2: Compute Acceleration (11 am - 12.30 pm)

Optimizing Group-By And Aggregation using GPU-CPU Co-Processing, Diego Gomes Tomé, Tim Gubner, Mark Raasveldt, Eyal Rozenberg and Peter Boncz, CWI, Vrije Universiteit, Amsterdam (Slides)

Full Speed Ahead: 3D Spatial Database Acceleration with GPUs, Lucas Villa Real and Bruno Silva, IBM Research, Brazil (Slides)

Low-Latency Transaction Execution on Graphics Processors: Dream or Reality?, Iya Arefyeva, Gabriel Campero Durand, Marcus Pinnecke, David Broneske and Gunter Saake (University of Magdeburg) (Slides)

Column Scan Acceleration in Hybrid CPU-FPGA Systems, Nusrat Lisa, Annett Ungethüm, Dirk Habich, Wolfgang Lehner, Nguyen Duy Anh Tuan and Akash Kumar (TU Dresden) (Slides)

Lunch Break (12.30-2 pm)

Session 3: Memory Acceleration (2-3.30 pm)

Hammer Slide: Work- and CPU-efficient Streaming Window Aggregation, Georgios Theodorakis, Alexandros Koliousis, Peter Pietzuch and Holger Pirk, Imperial College, London (Slides)

Near-Data Filters: Taking Another Brick from the Memory Wall, Diego Gomes Tomé (CWI, Amsterdam), Tiago Rodrigo Kepe, Marco Antonio Zanata Alves and Eduardo Cunha De Almeida (UFPR, Brazil) (Slides)

Hassium: Hardware Assisted Database Synchronization, Hillel Avni and Aharon Avitzur (Huawei) (Slides)

Adaptive Cache Mode Selection for Queries over Raw Data, Tahir Azim (EPFL), Azqa Nadeem (Delft University of Technology), and Anastasia Ailamaki (EPFL) (Slides)

Coffee Break (3.30-4.00 pm)

Session 4: Afternoon Keynotes (4.00-5.30 pm)

A Comprehensive Study of SIMD Techniques for Data Processing: The Good, the Bad, and the Ugly, Shasank Chavan, Oracle (Slides)

Concepts of Coherent Memory Interface, Sumanta Chatterjee, Oracle

Organization

Workshop Co-Chairs

Rajesh Bordawekar, IBM T.J. Watson Research Center

Tirthankar Lahiri, Oracle

For questions regarding the workshop, please send email to contact@adms-conf.org.

Program Committee

Raja Appuswamy, EPFL

Shasank Chavan, Oracle

Christoph Dubach, University of Edinburgh

Markus Dreseler, HPI

Stefan Manegold, CWI

Bingsheng He, NUS

Diego Arroyuelo, Universidad Técnica Federico Santa María

Nikolay Sakharnykh, Nvidia

Carsten Binnig, TU Darmstadt

Kajan Kanagaratnam, IBM Analytics

Bill Howe, University of Washington

Wellington Martins, INF/UFG

Arun Raghavan, Oracle Labs

Ken Salem, University of Waterloo

Rajkumar Sen, Striim Inc.

Man Lung Yiu, Hong Kong Polytechnic University

Important Dates

Paper Submission: Monday, 11 June, 2018 (EXTENDED)

Notification of Acceptance: Friday, 29 June, 2018

Camera-ready Submission: Friday, 20 July, 2018

Workshop Date: Monday, 27 August, 2018

Submission Instructions

Submission Site

All submissions will be handled electronically via EasyChair.

Formatting Guidelines

We will use the same document templates as the VLDB18 conference. You can find them here.
It is the authors' responsibility to ensure that their submissions adhere strictly to the VLDB format detailed here. In particular, it is not allowed to modify the format with the objective of squeezing in more material. Submissions that do not comply with the formatting detailed here will be rejected without review.

As per the VLDB submission guidelines, the paper length for a full paper is limited to 12 pages, excluding bibliography. However, shorter papers (at least 4 pages of content) are encouraged as well.