Big Data Patterns | Design Patterns | Streaming Egress


Streaming Egress (Buhler, Erl, Khattak)

How can processed data be exported in realtime from a Big Data platform to other systems?

Problem

Realtime decision-making requires that the results of processing high-velocity data be available immediately, which makes batch export unsuitable.

Solution

An automatic push notification system exports computed results from the Big Data platform as soon as they become available.

Application

A publish-subscribe system based on a queuing mechanism is implemented that exports processed data as events to the configured subscriber(s).

A queue-based, publish-subscribe messaging system is developed. The system is configured to copy data from a storage device and forward it to the downstream sinks as events. Copying either runs at a set interval or is triggered as soon as data appears in the configured source location. Using a queue provides high availability, fault tolerance, scalability and delivery assurance, and it also allows results to be exported to multiple downstream systems at the same time.
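The mechanism described above can be sketched in a few lines of Python. This is a minimal, in-process illustration only, not the pattern's reference implementation: the names EgressBroker, export_new_results and drain are hypothetical, the storage device is modelled as a plain list, and an in-memory queue does not provide the availability or delivery guarantees mentioned above (a real deployment would presumably use a durable message broker).

```python
import queue
import threading


class EgressBroker:
    """Hypothetical in-process publish-subscribe broker: one queue per subscriber."""

    def __init__(self):
        self._queues = {}              # subscriber name -> queue.Queue
        self._lock = threading.Lock()  # protects the subscriber registry

    def subscribe(self, name):
        """Register a downstream sink and return the queue it reads events from."""
        with self._lock:
            q = queue.Queue()
            self._queues[name] = q
            return q

    def publish(self, event):
        """Fan the event out to every registered subscriber's queue."""
        with self._lock:
            targets = list(self._queues.values())
        for q in targets:
            q.put(event)


def export_new_results(storage, broker, cursor):
    """Publish every result stored since `cursor` and return the new cursor.

    `storage` stands in for the storage device (assumed append-only here).
    """
    for result in storage[cursor:]:
        broker.publish({"type": "result", "payload": result})
    return len(storage)


def drain(q):
    """Read all events currently waiting in a subscriber queue."""
    events = []
    while True:
        try:
            events.append(q.get_nowait())
        except queue.Empty:
            return events


if __name__ == "__main__":
    broker = EgressBroker()
    dashboard = broker.subscribe("dashboard")
    alerting = broker.subscribe("alerting")

    storage = [{"avg_pressure": 101.2}]
    cursor = export_new_results(storage, broker, 0)

    storage.append({"avg_pressure": 99.8})
    cursor = export_new_results(storage, broker, cursor)

    # Both sinks receive the same two events, in order.
    print(len(drain(dashboard)), len(drain(alerting)))
```

Calling export_new_results from a timer corresponds to the interval-based mode, while calling it whenever a result is written corresponds to the triggered mode. Giving each subscriber its own queue is what lets a slow sink fall behind without blocking the others.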

The Streaming Egress pattern is generally applied together with the Streaming Source, Realtime Access Storage and High Velocity Realtime Processing patterns.

Functionality is added to the Big Data platform to enable exporting analysis results as soon as they get computed. A system is developed that continuously copies the computed result to the interested clients in realtime.

  1. Readings from a pressure sensor that arrive every five seconds are analyzed in realtime by a realtime processing engine.
  2. The analysis results are stored in a storage device.
  3. A publish-subscribe system copies the analysis results to a pressure monitoring application every ten seconds.
  4. The export adds very little latency to the availability of the analysis results.
  5. As a result, the engineer is able to make timely decisions.
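The timing in the scenario above can be checked with a small simulation. The sketch below rests on several assumptions: time advances in whole seconds, analysis is treated as instantaneous (a result is stored at the moment its reading arrives), and the function names simulate and max_latency are hypothetical. It records when each result becomes available and when the publish-subscribe system exports it.

```python
def simulate(duration_s, reading_interval_s=5, export_interval_s=10):
    """Return (available_at, exported_at) pairs for every exported result.

    Simulated clock in whole seconds; no real waiting takes place.
    """
    storage = []    # times at which analysis results were stored
    exported = []   # (available_at, exported_at) pairs
    cursor = 0      # index of the first result not yet exported

    for t in range(1, duration_s + 1):
        # Steps 1 and 2: a reading arrives, is analyzed and the result is stored.
        if t % reading_interval_s == 0:
            storage.append(t)
        # Step 3: the publish-subscribe system copies all new results.
        if t % export_interval_s == 0:
            for available_at in storage[cursor:]:
                exported.append((available_at, t))
            cursor = len(storage)
    return exported


def max_latency(exported):
    """Largest delay, in seconds, between a result being stored and exported."""
    return max(exported_at - available_at for available_at, exported_at in exported)


if __name__ == "__main__":
    pairs = simulate(30)
    print(len(pairs))          # 6 results exported in 30 seconds
    print(max_latency(pairs))  # 5 seconds in this configuration
```

Under these assumptions the worst-case delay stays below the export interval, which is why step 4 can describe the added latency as small. Shortening the interval, or triggering the export when a result is written, reduces it further.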

This pattern is covered in BDSCP Module 11: Advanced Big Data Architecture.

For more information regarding the Big Data Science Certified Professional (BDSCP) curriculum, visit www.arcitura.com/bdscp.

The official textbook for the BDSCP curriculum is:

Big Data Fundamentals: Concepts, Drivers & Techniques
by Paul Buhler, PhD, Thomas Erl, Wajid Khattak
(ISBN: 9780134291079, Paperback, 218 pages)

Please note that this textbook covers fundamental topics only and does not cover design patterns.
For more information about this book, visit www.arcitura.com/books.