High Volume Tabular Storage (Buhler, Erl, Khattak)
How can large amounts of non-relational data be stored in a table-like form where each record may consist of a very large number of fields or related groups of fields?
![High Volume Tabular Storage](https://patterns.arcitura.com/wp-content/uploads/2018/09/high_volume_tabular_storage.png)
A column-family NoSQL database is used to apply the High Volume Tabular Storage pattern. Such a database typically allows multiple key-value pairs to be added under a column family and allows rows within the same table to have different columns. Some level of schema conformance can be achieved by specifying a table schema before the table is populated. Some column-family implementations support generic data types such as integer, float and double, while others persist column data in binary form, in which case values must be serialized before they are stored and deserialized when they are retrieved. Such databases may provide SQL-like or API-based access.
This pattern is also applicable when a relational database needs to be replaced with a highly scalable alternative, provided that ACID support is not required.
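The column-family model described above can be illustrated with a minimal in-memory sketch. It is not the API of any particular database: the `ColumnFamilyTable` class, its methods, and the row keys and column names are all hypothetical and exist only for illustration. The sketch assumes that column families are declared up front (a limited form of schema conformance), that each row may carry its own set of columns, and that values are persisted as bytes, so integers are serialized before storage and deserialized on retrieval.

```python
import struct


class ColumnFamilyTable:
    """Toy in-memory stand-in for a column-family table (hypothetical, not a real client API)."""

    def __init__(self, families):
        # Declaring the families before loading data gives some schema conformance;
        # the columns inside each family remain free-form per row.
        self.families = set(families)
        self.rows = {}  # row key -> {(family, column): bytes}

    def put(self, row_key, family, column, value: bytes):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {})[(family, column)] = value

    def get(self, row_key, family, column) -> bytes:
        return self.rows[row_key][(family, column)]

    def columns(self, row_key):
        # Sorted list of (family, column) pairs present in this row.
        return sorted(self.rows.get(row_key, {}))


def encode_int(n: int) -> bytes:
    # Serialize to an 8-byte big-endian signed integer before storing.
    return struct.pack(">q", n)


def decode_int(raw: bytes) -> int:
    # Deserialize the stored bytes back into a Python int on retrieval.
    return struct.unpack(">q", raw)[0]


table = ColumnFamilyTable(families=["profile", "metrics"])

# Two rows in the same table with different columns.
table.put("user#1", "profile", "name", "Ada".encode("utf-8"))
table.put("user#1", "metrics", "logins", encode_int(42))
table.put("user#2", "profile", "name", "Lin".encode("utf-8"))
table.put("user#2", "profile", "email", "lin@example.com".encode("utf-8"))

logins = decode_int(table.get("user#1", "metrics", "logins"))
```

Implementations that support typed columns make the `encode_int` and `decode_int` step unnecessary; the explicit byte handling here reflects only the binary-storage case.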
![High Volume Tabular Storage: A database based on NoSQL technology is used that is capable of storing data in a hierarchical format and understanding the internal structure of the data. Saving data based on a nested structure further enables relational-like storage such that the related child table records can be embedded inside the parent table record.](https://patterns.arcitura.com/wp-content/uploads/2018/09/fig1-167.png)
A database based on NoSQL technology is used that is capable of storing data in a hierarchical format and understanding the internal structure of the data. Saving data based on a nested structure further enables relational-like storage such that the related child table records can be embedded inside the parent table record.
- A dataset consists of rows in which each record has one million attributes.
- The user imports the dataset into a column-family NoSQL database.
- The import succeeds because the database can store more than a billion attributes.
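The import scenario in the steps above can be sketched as follows, under stated assumptions. The function names, the row key, and the `attr_…` column names are hypothetical, a plain dictionary stands in for the database, and the attribute count is scaled down from one million so the example runs quickly. The sketch also assumes that attributes with no value are simply not written as cells, which is one common way for wide rows to stay manageable; the source does not say how a given product handles this.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

Cell = Tuple[str, Optional[bytes]]

# Scaled down from the one million attributes in the scenario.
N_ATTRIBUTES = 10_000


def generate_record(n_attributes: int) -> Iterator[Cell]:
    """Yield (column name, value) pairs for one synthetic wide record.

    Every tenth attribute has no value, to mimic a sparse record.
    """
    for i in range(n_attributes):
        value = None if i % 10 == 0 else str(i).encode("utf-8")
        yield f"attr_{i:07d}", value


def import_record(table: Dict[str, Dict[str, bytes]], row_key: str,
                  cells: Iterable[Cell]) -> int:
    """Write one record as a single wide row and return the number of cells stored."""
    row = table.setdefault(row_key, {})
    stored = 0
    for column, value in cells:
        if value is None:
            continue  # assumption: empty attributes are skipped, so no cell is written
        row[column] = value
        stored += 1
    return stored


table: Dict[str, Dict[str, bytes]] = {}
stored = import_record(table, "record-1", generate_record(N_ATTRIBUTES))
```

Streaming the cells from a generator means the whole record is never held in memory as a separate structure before it is written, which matters more as the attribute count approaches the scenario's scale.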
This pattern is covered in BDSCP Module 10: Fundamental Big Data Architecture.