Storage Device
Storage devices provide the underlying data storage environment for persisting the datasets that are processed by Big Data solutions. A storage device can exist as a distributed file system or a database.
Distributed file systems can be used for persisting immutable data that is intended for streaming access or batch processing. Databases, such as NoSQL repositories, can be used for storing structured and unstructured data with read/write access, as shown in Figure 1. Note that distributed file systems and databases are both on-disk storage devices.
Figure 1 – Structured data is imported into a storage device (1) using a data transfer engine (2). Unstructured data is imported (3) using another type of data transfer engine (4).
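The contrast between the two kinds of storage device can be illustrated with a minimal sketch. It is not taken from any particular product: `WriteOnceFileStore`, `KeyValueStore`, and `demo` are hypothetical names, a local temporary directory stands in for a distributed file system, and a JSON file stands in for a NoSQL repository. The sketch only shows the access models described above, write-once data read sequentially versus records that can be read and overwritten, with both devices persisting to disk.

```python
import json
import os
import tempfile


class WriteOnceFileStore:
    """Toy stand-in for a distributed file system (assumption: local directory)."""

    def __init__(self, root):
        self.root = root

    def put(self, name, lines):
        path = os.path.join(self.root, name)
        # Mode "x" raises FileExistsError if the file already exists,
        # which models immutable, write-once data.
        with open(path, "x", encoding="utf-8") as f:
            for line in lines:
                f.write(line + "\n")

    def stream(self, name):
        # Sequential, line-by-line access, as in streaming or batch reads.
        with open(os.path.join(self.root, name), encoding="utf-8") as f:
            for line in f:
                yield line.rstrip("\n")


class KeyValueStore:
    """Toy stand-in for a NoSQL database (assumption: one JSON file on disk)."""

    def __init__(self, path):
        self.path = path
        self._records = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self._records = json.load(f)

    def put(self, key, value):
        # Records can be overwritten by key (read/write access).
        self._records[key] = value
        # Persist on every write: this device is also on-disk storage.
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self._records, f)

    def get(self, key):
        return self._records[key]


def demo():
    with tempfile.TemporaryDirectory() as root:
        fs = WriteOnceFileStore(root)
        fs.put("events.log", ["a", "b"])
        try:
            fs.put("events.log", ["c"])  # second write to the same file
            overwritten = True
        except FileExistsError:
            overwritten = False

        db_path = os.path.join(root, "db.json")
        db = KeyValueStore(db_path)
        db.put("user:1", {"name": "Ada"})
        db.put("user:1", {"name": "Ada L."})  # in-place update by key
        reloaded = KeyValueStore(db_path)  # re-read from disk

        return list(fs.stream("events.log")), overwritten, reloaded.get("user:1")
```

Calling `demo()` writes a file once, shows that a second write to the same name is rejected, and then updates a record in the key-value store and reads it back from disk. A real distributed file system or NoSQL database adds replication, sharding, and network access, none of which this sketch attempts to model.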
Related Patterns:
- Automated Dataset Execution
- Automated Processing Metadata Insertion
- Automatic Data Replication and Reconstruction
- Automatic Data Sharding
- Canonical Data Format
- Cloud-based Big Data Processing
- Cloud-based Big Data Storage
- Complex Logic Decomposition
- Confidential Data Storage
- Data Size Reduction
- Dataset Decomposition
- Dataset Denormalization
- Direct Data Access
- Fan-in Ingress
- Fan-out Ingress
- File-based Sink
- File-based Source
- High Velocity Realtime Processing
- High Volume Binary Storage
- High Volume Hierarchical Storage
- High Volume Linked Storage
- High Volume Tabular Storage
- Indirect Data Access
- Intermediate Results Storage
- Large-Scale Batch Processing
- Processing Abstraction
- Realtime Access Storage
- Relational Sink
- Relational Source
- Streaming Egress
- Streaming Source
- Streaming Storage