The objective of this one-day workshop is to investigate opportunities in accelerating data management systems and workloads (which include traditional OLTP, data warehousing/OLAP, ETL, Streaming/Real-time, Business Analytics, and XML/RDF Processing) using processors (e.g., commodity and specialized Multi-core, GPUs, FPGAs, and ASICs), storage systems (e.g., Storage-class Memories like SSDs and Phase-change Memory), and programming models like MapReduce, Spark, CUDA, OpenCL, and OpenACC.
The current data management scenario is characterized by the following trends: traditional OLTP and OLAP/data warehousing systems are being used for increasingly complex workloads (e.g., petabytes of data, complex queries under real-time constraints, etc.); applications are becoming far more distributed, often consisting of different data processing components; non-traditional domains such as bio-informatics, social networking, mobile computing, sensor applications, and gaming are generating growing quantities of data of different types; economic and energy constraints are leading to greater consolidation and virtualization of resources; and analyzing vast quantities of complex data is becoming more important than traditional transactional processing.
At the same time, there have been tremendous improvements in CPU and memory technologies. Newer processors offer greater compute and memory capabilities and are optimized for multiple application domains. Commodity systems increasingly use multi-core processors with more than 6 cores per chip, and enterprise-class systems use processors with 8 cores per chip, where each core can execute up to 4 simultaneous threads. Specialized multi-core processors such as GPUs have brought the computational capabilities of supercomputers to cheaper commodity machines. On the storage front, flash-based solid-state drives (SSDs) are becoming smaller in size, cheaper in price, and larger in capacity. Exotic technologies like phase-change memory are on the near-term horizon and can be game-changers in the way data is stored and processed.
In spite of these trends, there is currently limited use of these technologies in the data management domain. Naive use of multi-core processors or SSDs often leads to unbalanced systems. It is therefore important to evaluate applications in a holistic manner to ensure effective utilization of CPU and memory resources. This workshop aims to understand the impact of modern hardware technologies on accelerating core components of data management workloads. Specifically, the workshop hopes to explore the interplay between overall system design, core algorithms, query optimization strategies, programming approaches, performance modelling and evaluation, etc., from the perspective of data management applications.
The suggested topics of
interest include, but are not restricted to:
- Hardware and System Issues in Domain-specific Accelerators
- New Programming Methodologies for Data Management Problems on Modern Hardware
- Query Processing for Hybrid Architectures
- Large-scale I/O-intensive (Big Data) Applications
- Parallelizing/Accelerating Analytical (e.g., Data Mining) Workloads
- Autonomic Tuning for Data Management Workloads on Hybrid Architectures
- Algorithms for Accelerating Multi-modal Multi-tiered Systems
- Energy Efficient Software-Hardware Co-design for Data Management Workloads
- Parallelizing non-traditional (e.g., graph mining) workloads
- Algorithms and Performance Models for modern Storage Sub-systems
- Exploitation of specialized ASICs
- Novel Applications of Low-Power Processors and FPGAs
- Exploitation of Transactional Memory for Database Workloads
- Exploitation of Active Technologies (e.g., Active Memory, Active Storage, and Networking)
Every year, we choose a theme around which the keynote or panel sessions are organized. This year, the workshop theme is "Interactions of Processor Architecture with Data Management and Analytics".
There will be three keynote presentations at the ADMS workshop. Tim Mattson, Principal Engineer, Intel Parallel Computing Lab, will talk about "Graph analytics with lots of CPU cores". Rick Hetherington, Vice President, Oracle Hardware Development, will talk about "SPARC at Oracle: Vectoring Processor Architecture at the Database". Finally, Haider Rizvi, Distinguished Engineer, IBM Systems and Technology Group, will talk about "IBM Power8: a processor built for big data".
Graph analytics with lots of CPU cores, Tim Mattson, Intel Parallel Computing Lab
Data analytics stresses every facet of a microprocessor: from the details of the cache hierarchy to the instruction set of the processor cores and the network between cores. In this talk, we will explore modern CPU designs through the lens of sparse matrix multiplication, one of the most important primitives for graph analytics. We'll discuss vector instruction sets (such as the Intel® AVX-512 instructions) and how to write portable code to exploit them. Then we'll follow the behavior of sparse matrix multiplication as we move from a single core to multiple cores on a processor, across the memory controller to the memory subsystem, and finally across the network between nodes in a cluster. The result is an end-to-end understanding of how graph analytics (as represented through sparse matrix multiplication) interacts with the features of a modern CPU-based system for data analytics.
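As background for the sparse-matrix view of graph analytics mentioned in the abstract, the sketch below shows a minimal compressed sparse row (CSR) matrix-vector multiply in C. It is an illustrative example of our own, not code from the talk; the irregular, gather-heavy inner loop is the kind of kernel that vector instruction sets such as AVX-512 aim to accelerate.

/*
 * Minimal CSR sparse matrix-vector multiply (SpMV) sketch.
 * Illustrative only: a simplified stand-in for the graph-analytics
 * primitive discussed in the talk, not the speaker's code.
 */
#include <stddef.h>

typedef struct {
    size_t        nrows;
    const size_t *row_ptr;  /* nrows+1 offsets into col_idx/values */
    const size_t *col_idx;  /* column index of each nonzero        */
    const double *values;   /* value of each nonzero               */
} csr_matrix;

/* y = A * x; the inner loop gathers x[col_idx[k]], which is where
 * vector gather instructions (e.g., in AVX-512) can help. */
void spmv_csr(const csr_matrix *A, const double *x, double *y)
{
    for (size_t i = 0; i < A->nrows; ++i) {
        double sum = 0.0;
        for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
            sum += A->values[k] * x[A->col_idx[k]];
        y[i] = sum;
    }
}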
SPARC at Oracle: Vectoring Processor Architecture at the Database, Rick Hetherington, Oracle
SPARC has a long history and has gone through numerous transformations over the past 26 years. During the initial debate of RISC vs. CISC, SPARC was primarily deployed in workstations and technical applications. The name SPARC is an acronym for Scalable Processor ARChitecture, and scalability soon proved to be a key asset, lending itself perfectly to commercial applications, primarily database. Contemporary SPARC processors have their roots in the multicore/multithread (CMT) approach taken by the Niagara processors released in 2004. The target of CMT was simply the database. SPARC has flourished at Oracle, with a 20X performance rise culminating in the set of features known as Software-in-Silicon. This talk will focus on SPARC since the Oracle acquisition of Sun Microsystems in 2010. It will cover the process of developing world-class silicon targeted at the database, as well as a 'clouded' glimpse at what lies ahead.
IBM Power8: a Processor Built for Big Data, Haider Rizvi, IBM
IBM's POWER processors are the workhorses for all of IBM's Unix servers. IBM Unix servers lead the industry in capabilities and performance, especially for data-intensive workloads such as databases, app servers, etc. The current generation of POWER8-based systems has been designed especially for big data needs, with large caches, high memory bandwidth, and the ability to extend the processor's capabilities with CAPI (Coherent Accelerator Processor Interface) accelerators. In this presentation, I'll talk about the design decisions that enable big data analytics on these servers, and the optimization efforts that deliver industry-leading performance on data-intensive workloads such as data warehousing with DB2 BLU, Watson, Spark, etc. I'll also talk about the use of SIMD (single-instruction multiple-data) capabilities, exploiting the larger number of cores and hardware threads available on the POWER processors.
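To make the SIMD point concrete, here is a minimal, hypothetical sketch in C of a data-parallel column scan, the kind of kernel that SIMD units accelerate in analytic database engines. It is our illustrative example under that assumption, not code from DB2 BLU or the POWER8 toolchain.

/*
 * Branch-free range predicate over a column of 32-bit integers.
 * Illustrative only: a simple loop that compilers can auto-vectorize
 * with SIMD compare instructions on many cores/threads in parallel.
 */
#include <stddef.h>
#include <stdint.h>

/* Count values in [lo, hi]; each iteration is independent, so the loop
 * maps naturally to SIMD lanes and to partitioning across threads. */
size_t count_in_range(const int32_t *col, size_t n, int32_t lo, int32_t hi)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        count += (col[i] >= lo) & (col[i] <= hi);
    return count;
}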
Session 1: 9.00-10.30 am
- 9.00 am: Welcome Comments
- 9.00-10.00 am: "Graph analytics with lots of CPU cores", Tim Mattson, Intel Parallel Computing Lab (Keynote Presentation) (slides)
- 10.00-10.25 am: "The Operator Variant Selection Problem on Heterogeneous Hardware", Viktor Rosenfeld, Max Heimel, Christoph Viebig and Volker Markl, TU Berlin (slides)
10.30-11.00 am: Break
Session 2: 11.00 am-12.30 pm
- 11.00-11.25 am: "Towards Dynamic Green-Sizing for Database Servers", Mustafa Korkmaz, Alexey Karyakin, Martin Karsten and Kenneth Salem, University of Waterloo (slides)
- 11.30-11.55 am: "Dynamic and Transparent Data Tiering for In-Memory Databases in Mixed Workload Environments", Carsten Meyer, Martin Boissier, Adrian Michaud, Jan Ole Vollmer, Ken Taylor, David Schwalb, Matthias Uflacker and Kurt Roedszus, HPI and EMC Corp. (slides)
- 12.00-12.25 pm: "Towards Adaptive Resource Allocation for Database Workloads", Cong Guo and Martin Karsten, University of Waterloo (slides)
12.30-2.00 pm: Lunch
Session 3: 2.00-3.30 pm
- 2.00-2.25 pm: "nvm malloc: Memory Allocation for NVRAM", David Schwalb, Tim Berning, Martin Faust, Markus Dreseler and Hasso Plattner, HPI and SAP (slides)
- 2.30-3.30 pm: "SPARC at Oracle: Vectoring Processor Architecture at the Database", Rick Hetherington, Oracle (Keynote Presentation)
3.30-4.00 pm: Break
Session 4: 4.00-5.30 pm
- 4.00-4.25 pm: "Optimizing GPU-accelerated Group-By and Aggregation", Tomas Karnagel, Rene Mueller and Guy Lohman, IBM Almaden Research Center (slides)
- 4.30-5.30 pm: "IBM Power8: a Processor Built for Big Data", Haider Rizvi, IBM Systems and Technology Group (Keynote Presentation)
Workshop Co-Chairs
For questions regarding the workshop, please send email to contact@adms-conf.org.
Program Committee
- Reza Azimi, Huawei
- Nipun Agarwal, Oracle Labs
- Robert Halstead, University of California, Riverside
- Rashed Bhatti, IBM Almaden Research
- Christoph Dubach, University of Edinburgh
- Franz Faerber, SAP
- Arno Jacobsen, University of Toronto
- Hyojun Kim, Datos IO, Inc
- Thomas Kissinger, TU Dresden
- Qiong Luo, HKUST
- Stefan Manegold, CWI
- Sina Merji, IBM Toronto
- Duane Merrill, Nvidia
- Rupesh Nasre, IIT Madras
- Mohammad Sadoghi, IBM Watson Research
- Nadathur Satish, Intel
- Sayantan Sur, Intel
- Sudhakar Yalamanchili, Georgia Tech
- Pinar Tozun, IBM Almaden Research
- Jianting Zhang, CUNY
Important Dates
- Paper Submission: Monday, June 15, 2015, 11.59 pm PST
- Notification of Acceptance: Wednesday, July 1, 2015
- Camera-ready Submission: Friday, July 17, 2015
- Workshop Date: Monday, August 31, 2015
The workshop proceedings will be published by VLDB and indexed via DBLP.
Submission Site
All submissions will be handled electronically via EasyChair.
Formatting Guidelines
We will use the same document templates as the VLDB15 conference. You can find them here.
It is the authors' responsibility to ensure that their submissions adhere strictly to the VLDB format detailed here. In particular, modifying the format with the objective of squeezing in more material is not allowed. Submissions that do not comply with these formatting guidelines will be rejected without review.
The paper length for a full paper is limited to 12 pages.