Relational Sink (Buhler, Erl, Khattak)
How can large amounts of processed data be ported from a Big Data platform directly to a relational database?
A relational data transfer engine component is added to the Big Data platform. Internally, the engine uses different drivers and connectors to connect to different types of relational databases. The user specifies the connection string of the relational database and the table to which the data is to be exported. Depending on its capabilities, the relational data transfer engine may internally make use of a processing engine that parallelizes the export by executing multiple SQL commands (INSERT/UPDATE) concurrently. Provided that suitable connectors are available, the Relational Sink pattern can also be applied to populate data warehouses.
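The parallelized export described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual implementation: Python's built-in sqlite3 module stands in for the target relational database, a file path stands in for the connection string, and the names `chunked`, `export_batch`, `parallel_export`, and the `results` table are hypothetical. Because SQLite serializes writers, the sketch shows only the structure (partition the data, let each worker issue its own INSERT batch over its own connection), not a real speedup.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def chunked(rows, size):
    """Split rows into consecutive batches of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def export_batch(db_path, table, batch):
    """Insert one batch over a connection private to this worker."""
    # isolation_level=None lets us control the transaction explicitly.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
        # The table name comes from trusted configuration, not user input.
        conn.executemany(
            f"INSERT INTO {table} (id, total) VALUES (?, ?)", batch
        )
        conn.execute("COMMIT")
    finally:
        conn.close()  # an uncommitted transaction is rolled back on close
    return len(batch)


def parallel_export(db_path, table, rows, batch_size=250, workers=4):
    """Export all rows by running the batch inserts on a thread pool."""
    batches = chunked(rows, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda b: export_batch(db_path, table, b), batches)
        return sum(counts)


def run_parallel_demo():
    """Export 1000 synthetic rows and return the row count in the table."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "sink.db")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, total REAL)")
        conn.commit()
        conn.close()

        rows = [(i, i * 1.5) for i in range(1000)]
        parallel_export(db_path, "results", rows)

        conn = sqlite3.connect(db_path)
        count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
        conn.close()
        return count


if __name__ == "__main__":
    print(run_parallel_demo())
```

Each batch is committed as its own transaction, so a failed batch can be retried without redoing the others. With a server-based database, the same partitioning would let the batches proceed concurrently.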
The application of the Relational Sink pattern may be impeded if a database-specific connector is not available. A generic connector can usually be used in such a situation, although data export performance may suffer.
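One way to picture the connector fallback is a small lookup keyed on the connection string, sketched below. Everything here is an assumption made for illustration: the `Connector` type, the registry contents, the `bulk_load` flag, and the convention that the database type is the scheme at the start of the connection string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connector:
    name: str
    bulk_load: bool  # True if the connector has a fast, database-specific path


# Hypothetical registry of database-specific connectors.
CONNECTORS = {
    "postgresql": Connector("postgresql-specific", bulk_load=True),
    "mysql": Connector("mysql-specific", bulk_load=True),
}

# Fallback used when no database-specific connector is registered.
GENERIC = Connector("generic", bulk_load=False)


def pick_connector(connection_string):
    """Return the specific connector for the scheme, else the generic one."""
    scheme = connection_string.split(":", 1)[0].lower()
    return CONNECTORS.get(scheme, GENERIC)


if __name__ == "__main__":
    print(pick_connector("postgresql://dbhost/sales").name)
    print(pick_connector("exoticdb://dbhost/sales").name)
```

In this sketch the generic connector still completes the export, but without a bulk path it would fall back to plain row-by-row statements, which is where the performance penalty comes from.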
The Big Data platform is enabled to make a direct connection to the relational database, and data is transferred as a batch from the storage device. The export process can further be scheduled to automatically update the relational database whenever fresh computational results are available.
- The user configures the relational data transfer engine to extract the required data from the storage device.
- The relational data transfer engine automatically extracts the required data from the storage device.
- The relational data transfer engine then automatically inserts the data into the relational database without requiring any human intervention.
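The three steps above can be sketched end to end as follows. This is a simplified illustration under stated assumptions: a CSV file stands in for the storage device, Python's built-in sqlite3 module stands in for the relational database, and `ExportConfig`, `extract`, `load`, `run_export`, and the `results` table are hypothetical names rather than part of any particular engine.

```python
import csv
import os
import sqlite3
import tempfile
from dataclasses import dataclass


@dataclass
class ExportConfig:
    """Step 1: what the user configures."""
    source_path: str        # processed results on the storage device
    connection_string: str  # here simply a SQLite file path
    table: str              # target table, assumed to exist already


def extract(config):
    """Step 2: read the required rows from the storage device."""
    with open(config.source_path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return [(int(r[0]), float(r[1])) for r in reader]


def load(config, rows):
    """Step 3: insert the rows into the relational database as one batch."""
    conn = sqlite3.connect(config.connection_string)
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                f"INSERT INTO {config.table} (id, total) VALUES (?, ?)", rows
            )
    finally:
        conn.close()
    return len(rows)


def run_export(config):
    """Run extraction and load back to back, with no manual step between."""
    return load(config, extract(config))


def run_pipeline_demo():
    """Create a small source file and target table, then export."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "results.csv")
        with open(source, "w", newline="") as f:
            csv.writer(f).writerows(
                [["id", "total"], [1, 10.5], [2, 20.0], [3, 7.25]]
            )

        db_path = os.path.join(tmp, "sink.db")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, total REAL)")
        conn.commit()
        conn.close()

        exported = run_export(ExportConfig(source, db_path, "results"))

        conn = sqlite3.connect(db_path)
        stored = conn.execute(
            "SELECT COUNT(*), SUM(total) FROM results"
        ).fetchone()
        conn.close()
        return exported, stored


if __name__ == "__main__":
    print(run_pipeline_demo())
```

A scheduler could call `run_export` with the same configuration whenever fresh results land on the storage device. In that case the plain INSERT would need to become an update-or-insert so that repeated runs do not collide on existing keys.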