Big Data Patterns | Design Patterns | Indirect Data Access

Indirect Data Access (Buhler, Erl, Khattak)

How can traditional BI tools access data stored in Big Data storage technologies without having to make separate connections to these technologies?

Problem

Data analysts using traditional Business Intelligence (BI) tools may need access to processed data stored within the Big Data platform. However, the platform's non-relational storage technology makes this difficult, because traditional BI tools typically support connections only to data warehouses.

Solution

Processed data is exported from the Big Data platform to the data warehouse, from where it can be accessed by the existing BI tools without the need to make separate connections.

Application

Processed data in the Big Data platform is converted to the required schema and then exported to the data warehouse using a connector.
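
The schema conversion step can be sketched as follows. This is a minimal illustration, not the pattern's prescribed implementation: the nested order documents, the TARGET_COLUMNS list and the flatten and to_row helpers are all hypothetical names chosen for the example. It shows nested, non-relational records being flattened into fixed-width rows that a relational table could accept.

```python
from typing import Any

# Hypothetical target schema: the column order the warehouse table is assumed to expect.
TARGET_COLUMNS = ["order_id", "customer_name", "customer_city", "total"]


def flatten(doc: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining nested keys with '_'."""
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def to_row(doc: dict[str, Any], columns: list[str] = TARGET_COLUMNS) -> tuple:
    """Project a flattened document onto the target columns; missing fields become None."""
    flat = flatten(doc)
    return tuple(flat.get(col) for col in columns)


# Illustrative documents as they might be held in a NoSQL store (made-up data).
docs = [
    {"order_id": 1, "customer": {"name": "Ada", "city": "London"}, "total": 42.5},
    {"order_id": 2, "customer": {"name": "Lin"}, "total": 10.0},
]
rows = [to_row(d) for d in docs]
```

Fields absent from a document map to None (NULL in the warehouse), which keeps every row the same width regardless of how sparse the source record is.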

Mechanisms

Data Transfer Engine, Productivity Portal, Query Engine, Storage Device, Workflow Engine

A relational data transfer engine is used for exporting the data from the storage device to the data warehouse. Before the data can be exported, however, some formatting or restructuring is normally required, because the storage device holds the data in a non-relational form. The data also needs to be normalized if the Data Denormalization pattern was applied previously.
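
The normalization mentioned above can be sketched as follows. This is a hedged example that assumes denormalized order records with the customer attributes repeated on every record; the field names and the normalize function are hypothetical. The records are split into a customers table and an orders table linked by customer_id.

```python
def normalize(records):
    """Split denormalized order records into (customer_rows, order_rows)."""
    customers = {}  # customer_id -> (customer_id, name, city); first occurrence wins
    orders = []     # (order_id, customer_id, total)
    for rec in records:
        cid = rec["customer_id"]
        customers.setdefault(cid, (cid, rec["customer_name"], rec["customer_city"]))
        orders.append((rec["order_id"], cid, rec["total"]))
    return list(customers.values()), orders


# Made-up denormalized records: customer details are duplicated per order.
records = [
    {"order_id": 1, "customer_id": 7, "customer_name": "Ada",
     "customer_city": "London", "total": 42.5},
    {"order_id": 2, "customer_id": 7, "customer_name": "Ada",
     "customer_city": "London", "total": 10.0},
    {"order_id": 3, "customer_id": 9, "customer_name": "Lin",
     "customer_city": "Oslo", "total": 5.0},
]
customer_rows, order_rows = normalize(records)
```

Each customer now appears once, and the orders reference it by key, which matches the relational layout a data warehouse table would typically expect.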

An indirect approach is adopted by using the data warehouse as an intermediary. The required data is exported from the Big Data platform to the data warehouse. As the BI tool can connect to the data warehouse, the exported data can now be accessed by the BI tool.

A Big Data platform holds a large dataset of processed data in a NoSQL database.
A data warehouse holds enterprise-wide data from multiple OLTP systems.
An analyst uses a BI tool to query the data warehouse and generate some reports.
The analyst needs to create a report based on the processed data in the Big Data platform.
A relational data transfer engine is used to export the dataset to the data warehouse.
The analyst uses the BI tool to query the newly imported processed dataset in the warehouse and retrieves the required data to generate the report.
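
The scenario above can be sketched end to end as follows. This is only an illustration under stated assumptions: an in-memory SQLite database from Python's standard sqlite3 module stands in for the data warehouse, a plain SQL query stands in for the BI tool, and the processed_sales table, its columns and the sample rows are hypothetical.

```python
import sqlite3


def export_to_warehouse(conn: sqlite3.Connection, rows) -> None:
    """Stand-in for the relational data transfer engine: load rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_sales ("
        "region TEXT NOT NULL, product TEXT NOT NULL, units INTEGER NOT NULL)"
    )
    with conn:  # commit on success, roll back if the insert fails
        conn.executemany(
            "INSERT INTO processed_sales (region, product, units) VALUES (?, ?, ?)",
            rows,
        )


def report_units_by_region(conn: sqlite3.Connection):
    """Stand-in for the BI tool: an ordinary SQL query against the warehouse."""
    cur = conn.execute(
        "SELECT region, SUM(units) FROM processed_sales "
        "GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


# Made-up processed dataset, already converted to relational rows.
processed = [("EMEA", "widget", 120), ("EMEA", "gadget", 30), ("APAC", "widget", 75)]

warehouse = sqlite3.connect(":memory:")  # assumed stand-in for the data warehouse
export_to_warehouse(warehouse, processed)
report = report_units_by_region(warehouse)
```

The reporting side only ever talks to the warehouse connection, so the BI tool needs no separate connection to the Big Data platform.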

This pattern is covered in BDSCP Module 11: Advanced Big Data Architecture.