Big Data Patterns | Design Patterns | High Volume Hierarchical Storage


High Volume Hierarchical Storage (Buhler, Erl, Khattak)

How can large amounts of non-relational data that conforms to a nested structure be stored in a scalable manner so that the data retains its internal structure and sub-sections of a data unit can be accessed?

Problem

Traditional database technologies struggle to store very large amounts of semi-structured data in a way that allows individual elements of a data unit to be referenced without post-query processing: either the database cannot parse the structure of the data unit, or it suffers from scalability issues.

Solution

Semi-structured data is stored in a nested form based on a clustered storage technique.

Application

A NoSQL-based Big Data storage technology is used that stores each data unit as a nested document, so that not only can the complete data unit be accessed via a unique key, but individual sections of the data unit can also be accessed.

Mechanisms

A document NoSQL database is used to store nested data. Such a database generally uses a hierarchical format, such as JSON, as its internal data storage format. Apart from providing key-based access, such a storage device can also select sub-units of a data unit, and it offers full create, read, update and delete (CRUD) functionality. The document NoSQL database generally provides API-based access for performing the CRUD operations.
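
The access model described above can be illustrated with a minimal in-memory sketch. It is not the API of any particular document database: the `DocumentStore` class, its method names, and the dotted-path syntax (for example `items.0.qty`) are assumptions made only for this illustration.

```python
import copy


class DocumentStore:
    """Toy in-memory stand-in for a document NoSQL database (hypothetical API)."""

    def __init__(self):
        self._docs = {}  # unique key -> nested document (dicts and lists)

    @staticmethod
    def _step(node, part):
        # Lists are indexed by position, dictionaries by field name.
        return node[int(part)] if isinstance(node, list) else node[part]

    def create(self, key, document):
        if key in self._docs:
            raise KeyError(f"document {key!r} already exists")
        self._docs[key] = copy.deepcopy(document)

    def read(self, key, path=None):
        """Return the whole document, or only the sub-unit at a dotted path."""
        node = self._docs[key]
        if path:
            for part in path.split("."):
                node = self._step(node, part)
        return copy.deepcopy(node)

    def update(self, key, path, value):
        """Replace the sub-unit at a dotted path, leaving the rest untouched."""
        *parents, last = path.split(".")
        node = self._docs[key]
        for part in parents:
            node = self._step(node, part)
        if isinstance(node, list):
            node[int(last)] = copy.deepcopy(value)
        else:
            node[last] = copy.deepcopy(value)

    def delete(self, key):
        del self._docs[key]


store = DocumentStore()
store.create("order-1", {
    "customer": {"name": "Ada", "city": "London"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
})
store.update("order-1", "items.0.qty", 3)

print(store.read("order-1"))              # key-based access to the full unit
print(store.read("order-1", "customer"))  # access to one sub-unit only
```

A production document database would add persistence, indexing, and distribution across a cluster; the sketch only shows how key-based access and sub-unit selection can coexist on the same nested data unit.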

High Volume Hierarchical Storage: A database based on NoSQL technology is used that is capable of storing data in a hierarchical format and understanding the internal structure of the data. Saving data based on a nested structure further enables relational-like storage such that the related child table records can be embedded inside the parent table record.
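
As a rough sketch of the relational-like storage mentioned above, the following shows one way child table records might be embedded inside their parent record before being stored as a single document. The table contents, field names, and the `embed_children` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical relational rows: one parent table and one child table.
orders = [
    {"order_id": 1, "customer": "Ada"},
    {"order_id": 2, "customer": "Grace"},
]
order_lines = [
    {"order_id": 1, "sku": "A1", "qty": 2},
    {"order_id": 1, "sku": "B7", "qty": 1},
    {"order_id": 2, "sku": "C3", "qty": 5},
]


def embed_children(parents, children, key, field):
    """Return one nested document per parent row with its child rows embedded."""
    grouped = defaultdict(list)
    for child in children:
        # Drop the foreign key: the nesting itself records the relationship.
        grouped[child[key]].append({k: v for k, v in child.items() if k != key})
    return [{**parent, field: grouped.get(parent[key], [])} for parent in parents]


documents = embed_children(orders, order_lines, key="order_id", field="lines")
for doc in documents:
    print(doc)
```

Each resulting document carries its own child records, so reading an order and its lines takes a single key lookup rather than a join.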

  1. An XML invoice is saved in a document NoSQL database, which retains the internal structure of the invoice.
  2. The user only requires the purchased items to be returned from the database.
  3. The database successfully returns only the list of the purchased items from the invoice.
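
The three steps above can be sketched with Python's standard `xml.etree.ElementTree` module standing in for the document database. The invoice content, its element and attribute names, and the `purchased_items` helper are assumptions made for illustration; the source does not specify the invoice schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice; the element and attribute names are illustrative only.
INVOICE_XML = """
<invoice id="INV-1001">
  <customer>Ada Lovelace</customer>
  <items>
    <item sku="A1" quantity="2">Notebook</item>
    <item sku="B7" quantity="1">Pen</item>
  </items>
  <total currency="USD">14.50</total>
</invoice>
"""

# Step 1: save the invoice under a unique key, keeping its structure parsed
# rather than storing it as an opaque string.
database = {"INV-1001": ET.fromstring(INVOICE_XML.strip())}


def purchased_items(db, invoice_id):
    """Steps 2 and 3: return only the purchased items of one invoice."""
    invoice = db[invoice_id]
    return [
        {
            "sku": item.get("sku"),
            "quantity": int(item.get("quantity")),
            "name": item.text,
        }
        for item in invoice.findall("./items/item")
    ]


items = purchased_items(database, "INV-1001")
for entry in items:
    print(entry)
```

Because the stored invoice keeps its internal structure, the customer and total elements are never returned to the caller, and no post-query filtering of the full document is needed on the client side.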

This pattern is covered in BDSCP Module 10: Fundamental Big Data Architecture.

For more information regarding the Big Data Science Certified Professional (BDSCP) curriculum,
visit www.arcitura.com/bdscp.

Big Data Fundamentals

The official textbook for the BDSCP curriculum is:

Big Data Fundamentals: Concepts, Drivers & Techniques
by Paul Buhler, PhD, Thomas Erl, Wajid Khattak
(ISBN: 9780134291079, Paperback, 218 pages)

Please note that this textbook covers fundamental topics only and does not cover design patterns.
For more information about this book, visit www.arcitura.com/books.