Related Work

As mentioned in Section 1, the main goal of this paper is to present and discuss the concepts surrounding data modeling and data analytics, and their evolution across three representative approaches: operational databases, decision support databases and Big Data technologies. For this survey, we reviewed related work that also explores and compares these approaches from a data modeling or data analytics point of view.

Table 3. Comparison of the approaches from the Data Analytics perspective

| Approaches / Features | Class of Application Domains | Common Operations | Operations | Concrete Languages | Abstraction Level | Technology Support |
|---|---|---|---|---|---|---|
| Operational | OLTP | Read/Write | Select, Insert, Update, Delete, Join, OrderBy, GroupBy | SQL-DML | Logical, Physical | Microsoft SQL Server, Oracle, MySQL, PostgreSQL, IBM DB2 |
| Decision Support | OLAP | Read | Slice, Dice, Drill down, Drill up, Pivot | SQL-DML, MDX, XMLA | Logical, Physical | Microsoft SQL Server, Oracle, MySQL, PostgreSQL, IBM DB2, Microsoft OLAP Provider, Microsoft Analysis Services |
| Big Data | Batch-oriented processing | Read/Write | Map-Reduce, Select, Insert, Update, Delete, Load, Import, Export, OrderBy, GroupBy | Hive QL, Pig Latin | Logical, Physical | Hadoop, Hive, Pig |
| Big Data | Stream processing | Read/Write | Aggregate, Partition, Merge, Join | SQL stream | Logical, Physical | Storm, S4, Spark |
| Big Data | OLTP | Read/Write | Select, Insert, Update, Delete, Batch, Get, OrderBy, GroupBy | CQL, Java, JavaScript | Logical, Physical | Cassandra, HBase |
| Big Data | Interactive ad-hoc queries and analysis | Read | Select, Insert, Update, Delete, OrderBy, GroupBy | SQL-DML | Logical, Physical | Drill |
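The contrast that Table 3 draws between OLTP-style read/write operations and OLAP-style read-only operations can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `enrollment` table, its columns, the sample rows and all function names are hypothetical, and SQLite (through the standard `sqlite3` module) stands in for the operational and decision support systems listed in the table. Slice and drill up are emulated here with plain SQL filtering and grouping rather than with a dedicated OLAP language such as MDX.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical enrollment table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE enrollment ("
        "student TEXT, course TEXT, year INTEGER, grade REAL)"
    )
    rows = [
        ("Ana", "Databases", 2014, 15.0),
        ("Ana", "Algorithms", 2014, 12.0),
        ("Rui", "Databases", 2014, 17.0),
        ("Rui", "Databases", 2015, 18.0),
        ("Eva", "Algorithms", 2015, 14.0),
    ]
    conn.executemany("INSERT INTO enrollment VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def oltp_update_grade(conn, student, course, year, grade):
    """OLTP style: a short read/write transaction that touches one row."""
    key = (student, course, year)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE enrollment SET grade = ? "
            "WHERE student = ? AND course = ? AND year = ?",
            (grade, *key),
        )
    return conn.execute(
        "SELECT grade FROM enrollment "
        "WHERE student = ? AND course = ? AND year = ?",
        key,
    ).fetchone()[0]


def olap_slice(conn, year):
    """Slice: fix the year dimension and aggregate grades per course."""
    return conn.execute(
        "SELECT course, AVG(grade) FROM enrollment "
        "WHERE year = ? GROUP BY course ORDER BY course",
        (year,),
    ).fetchall()


def olap_drill_up(conn):
    """Drill up: aggregate from the (course, year) level to the year level."""
    return conn.execute(
        "SELECT year, COUNT(*), AVG(grade) FROM enrollment "
        "GROUP BY year ORDER BY year"
    ).fetchall()
```

The write path changes a single row inside a transaction, whereas the two analytical functions only read and aggregate many rows along a chosen dimension, which mirrors the Read/Write versus Read distinction in the Common Operations column.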


J.H. ter Bekke provides a comparative study of the Relational, Semantic, ER and Binary data models, based on the results of an examination session. In that session, participants had to create a model of a case study similar to the Academic Management System used in this paper. The purpose was to discover relationships between the modeling approach in use and the quality of the resulting model. This study therefore addresses only the data modeling topic, and more specifically considers only data models associated with the database design process.

Several works highlight the differences between operational databases and data warehouses. For example, R. Hou analyzes operational databases and data warehouses, distinguishing them according to their related theory and technologies, and also identifies common points where combining both systems can bring benefits. C. Thomsen and T.B. Pedersen compare open source ETL tools, OLAP clients and servers, and DBMSs that can be used to build a Business Intelligence (BI) solution.

P. Vassiliadis and T. Sellis conducted a survey that focuses only on OLAP databases and compares various proposals for the logical models behind them. They group the proposals into just two categories, commercial tools and academic efforts, which in turn are subcategorized into relational model extensions and cube-oriented approaches. However, unlike our survey, they do not cover Big Data technologies.

Several papers discuss the state of the art of the types of data stores, technologies and data analytics used in Big Data scenarios; however, they do not compare them with other approaches. More recently, P. Chandarana and M. Vijayalakshmi focused on Big Data analytics frameworks and provided a comparative study according to their suitability.

In summary, none of the works mentioned above provides an analysis as broad as the one in this paper: as far as we know, no other paper simultaneously compares operational databases, decision support databases and Big Data technologies. Instead, these works describe one or two of the approaches more thoroughly.