ADMS 2015
Sixth International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures

 
Monday, August 31, 2015
 
In conjunction with VLDB 2015
Kohala Coast, Hawai‘i
 
 
 
Workshop Overview

The objective of this one-day workshop is to investigate opportunities in accelerating data management systems and workloads (which include traditional OLTP, data warehousing/OLAP, ETL, Streaming/Real-time, Business Analytics, and XML/RDF Processing) using processors (e.g., commodity and specialized Multi-core, GPUs, FPGAs, and ASICs), storage systems (e.g., Storage-class Memories like SSDs and Phase-change Memory), and programming models like MapReduce, Spark, CUDA, OpenCL, and OpenACC.

The current data management scenario is characterized by the following trends: traditional OLTP and OLAP/data warehousing systems are being used for increasingly complex workloads (e.g., petabytes of data and complex queries under real-time constraints); applications are becoming far more distributed, often consisting of different data processing components; non-traditional domains such as bio-informatics, social networking, mobile computing, sensor applications, and gaming are generating growing quantities of data of different types; economic and energy constraints are leading to greater consolidation and virtualization of resources; and analyzing vast quantities of complex data is becoming more important than traditional transactional processing.

At the same time, there have been tremendous improvements in CPU and memory technologies. Newer processors offer greater compute and memory capabilities and are optimized for multiple application domains. Commodity systems are increasingly using multi-core processors with more than 6 cores per chip, and enterprise-class systems are using processors with 8 cores per chip, where each core can execute up to 4 simultaneous threads. Specialized multi-core processors such as GPUs have brought the computational capabilities of supercomputers to cheaper commodity machines. On the storage front, flash-based solid state devices (SSDs) are becoming smaller in size, cheaper in price, and larger in capacity. Exotic technologies like phase-change memory are on the near-term horizon and can be game-changers in the way data is stored and processed.

Despite these trends, these technologies currently see limited use in the data management domain. Naive usage of multi-core processors or SSDs often leads to an unbalanced system. It is therefore important to evaluate applications in a holistic manner to ensure effective utilization of CPU and memory resources. This workshop aims to understand the impact of modern hardware technologies on accelerating core components of data management workloads. Specifically, the workshop hopes to explore the interplay between overall system design, core algorithms, query optimization strategies, programming approaches, and performance modelling and evaluation, from the perspective of data management applications.

Topics of Interest

The suggested topics of interest include, but are not restricted to:

  • Hardware and System Issues in Domain-specific Accelerators
  • New Programming Methodologies for Data Management Problems on Modern Hardware
  • Query Processing for Hybrid Architectures
  • Large-scale I/O-intensive (Big Data) Applications
  • Parallelizing/Accelerating Analytical (e.g., Data Mining) Workloads
  • Autonomic Tuning for Data Management Workloads on Hybrid Architectures
  • Algorithms for Accelerating Multi-modal Multi-tiered Systems
  • Energy Efficient Software-Hardware Co-design for Data Management Workloads
  • Parallelizing Non-traditional (e.g., Graph Mining) Workloads
  • Algorithms and Performance Models for Modern Storage Sub-systems
  • Exploitation of specialized ASICs
  • Novel Applications of Low-Power Processors and FPGAs
  • Exploitation of Transactional Memory for Database Workloads
  • Exploitation of Active Technologies (e.g., Active Memory, Active Storage, and Networking)

Every year, we choose a theme around which the keynote or panel sessions are organized. This year, the workshop theme is "Interactions of Processor Architecture with Data Management and Analytics".

Keynote Presentations
There will be three keynote presentations at the ADMS workshop. Tim Mattson, Principal Engineer, Intel Parallel Computing Lab, will talk about "Graph analytics with lots of CPU cores". Rick Hetherington, Vice President, Oracle Hardware Development, will talk about "SPARC at Oracle: Vectoring Processor Architecture at the Database". Finally, Haider Rizvi, Distinguished Engineer, IBM Systems and Technology Group, will talk about "IBM Power8: a Processor Built for Big Data".

Graph analytics with lots of CPU cores, Tim Mattson, Intel Parallel Computing Lab

Data analytics stresses every facet of a microprocessor, from the details of the cache hierarchy and the instruction set of the processor cores to the network between cores. In this talk, we will explore modern CPU designs through the lens of sparse matrix multiplication, one of the most important primitives for graph analytics. We’ll discuss vector instruction sets (such as the Intel® AVX-512 instructions) and how to write portable code to exploit them. Then we’ll follow the behavior of the sparse matrix multiplication as we move away from a single core to multiple cores on a processor, across the memory controller to the memory subsystem, and finally the network between nodes in a cluster. The result is an end-to-end understanding of how graph analytics (as represented through sparse matrix multiplication) interacts with the features of a modern CPU-based system for data analytics.

SPARC at Oracle: Vectoring Processor Architecture at the Database, Rick Hetherington, Oracle

SPARC has a long history and has gone through numerous transformations over the past 26 years. As part of the initial debate of RISC vs. CISC, SPARC was primarily deployed in workstations and technical applications. The name SPARC is an acronym for Scalable Processor ARChitecture, and its scalability soon proved to be a key asset, lending itself perfectly to commercial applications, primarily databases. Contemporary SPARC processors have their roots in the multicore/multithread (CMT) approach taken by the Niagara processors released in 2004. The target of CMT was simply databases. SPARC has flourished at Oracle, with a 20x rise in performance culminating in the set of features known as Software-in-Silicon. This talk will focus on SPARC since the Oracle acquisition of Sun Microsystems in 2010. It will cover the process of developing world-class silicon targeted at databases, as well as a 'clouded' glimpse at what lies ahead.

IBM Power8: a Processor Built for Big Data, Haider Rizvi, IBM

IBM's POWER processors are the workhorses for all of IBM's Unix servers. IBM Unix servers lead the industry in capabilities and performance, especially for data-intensive workloads such as databases and app servers. The current generation of POWER8-based systems has been designed especially for big data needs, with large caches, high memory bandwidth, and the ability to enhance the processor's capabilities with CAPI (Coherent Accelerator Processor Interface) accelerators. In this presentation, I'll talk about the design decisions that enable big data analytics on these servers, and the optimization efforts that deliver industry-leading performance on data-intensive workloads such as data warehousing with DB2 BLU, Watson, and Spark. I'll also talk about the use of SIMD (single-instruction multiple-data) capabilities, exploiting the larger number of cores and hardware threads available on the Power processors.

Workshop Program (9.00 am-5.30 pm, Room 3)

Session 1: 9.00-10.30 am

9.00 am: Welcome Comments

  • 9.00-10.00 am: "Graph analytics with lots of CPU cores", Tim Mattson, Intel Parallel Computing Lab (Keynote Presentation)
  • 10.00-10.25 am: "The Operator Variant Selection Problem on Heterogeneous Hardware", Viktor Rosenfeld, Max Heimel, Christoph Viebig and Volker Markl, TU Berlin

10.30-11.00 am: Break


Session 2: 11.00 am-12.30 pm

  • 11.00-11.25 am: "Towards Dynamic Green-Sizing for Database Servers", Mustafa Korkmaz, Alexey Karyakin, Martin Karsten and Kenneth Salem, University of Waterloo
  • 11.30-11.55 am: "Dynamic and Transparent Data Tiering for In-Memory Databases in Mixed Workload Environments", Carsten Meyer, Martin Boissier, Adrian Michaud, Jan Ole Vollmer, Ken Taylor, David Schwalb, Matthias Uflacker and Kurt Roedszus, HPI and EMC Corp.
  • 12.00-12.25 pm: "Towards Adaptive Resource Allocation for Database Workloads", Cong Guo and Martin Karsten, University of Waterloo

12.30-2.00 pm: Lunch


Session 3: 2.00-3.30 pm

  • 2.00-2.25 pm: "nvm malloc: Memory Allocation for NVRAM", David Schwalb, Tim Berning, Martin Faust, Markus Dreseler and Hasso Plattner, HPI and SAP
  • 2.30-3.30 pm: "SPARC at Oracle: Vectoring Processor Architecture at the Database", Rick Hetherington, Oracle (Keynote Presentation)

3.30-4.00 pm: Break


Session 4: 4.00-5.30 pm

  • 4.00-4.25 pm: "Optimizing GPU-accelerated Group-By and Aggregation", Tomas Karnagel, Rene Mueller and Guy Lohman, IBM Almaden Research Center
  • 4.30-5.30 pm: "IBM Power8: a Processor Built for Big Data", Haider Rizvi, IBM Systems and Technology Group (Keynote Presentation)

Organization

Workshop Co-Chairs

For questions regarding the workshop, please send email to contact@adms-conf.org.

Program Committee

  • Reza Azimi, Huawei
  • Nipun Agarwal, Oracle Labs
  • Robert Halstead, University of California, Riverside
  • Rashed Bhatti, IBM Almaden Research
  • Christoph Dubach, University of Edinburgh
  • Franz Faerber, SAP
  • Arno Jacobsen, University of Toronto
  • Hyojun Kim, Datos IO, Inc
  • Thomas Kissinger, TU Dresden
  • Qiong Luo, HKUST
  • Stefan Manegold, CWI
  • Sina Meraji, IBM Toronto
  • Duane Merrill, Nvidia
  • Rupesh Nasre, IIT Madras
  • Mohammad Sadoghi, IBM Watson Research
  • Nadathur Satish, Intel
  • Sayantan Sur, Intel
  • Sudhakar Yalamanchili, Georgia Tech
  • Pinar Tozun, IBM Almaden Research
  • Jianting Zhang, CUNY

Important Dates

  • Paper Submission: Monday, June 15, 2015, 11.59 pm PST
  • Notification of Acceptance: Wednesday, July 1, 2015
  • Camera-ready Submission: Friday, July 17, 2015
  • Workshop Date: Monday, August 31, 2015

Submission Instructions

The workshop proceedings will be published by VLDB and indexed via DBLP.

Submission Site 

All submissions will be handled electronically via EasyChair.

Formatting Guidelines 

We will use the same document templates as the VLDB15 conference. You can find them here.

It is the authors' responsibility to ensure that their submissions adhere strictly to the VLDB format detailed here. In particular, authors may not modify the format in order to squeeze in more material. Submissions that do not comply with the formatting detailed here will be rejected without review.

The paper length for a full paper is limited to 12 pages.