Three ways to turn old files into Hadoop data sets in a data lake


One reason Hadoop systems are being integrated with data warehouses is to move cold data that is rarely accessed out of the warehouse database and into Hive tables that run on top of the Hadoop Distributed File System (HDFS). This mingling of conventional databases with Hadoop is often a first step in the data modernization process, and it opens up a range of new options for creating useful Hadoop data sets.

A particularly promising option is to migrate the massive volumes of historical data hidden away in many data warehouses to big data environments, making that information more accessible for analysis. In many cases, the data is stored in mainframe formats such as VSAM files, IMS databases and COBOL files. When planning a legacy data migration to a data lake, you have to weigh the alternatives for the target format against the anticipated use cases for the data.

