Data lakes are typically standalone objects, meaning they are not designed around any particular requirements or upstream/downstream dependencies. In a way, they represent a rebirth of the original data warehouse concept of getting all the data in one place, but without the limitations of schema and scale, and also without anything that helps people actually use the data. In this form, organizations are finding that value from the investment is elusive: the data lake lacks most of the capabilities needed to be useful to anyone other than data scientists and IT developers.
The reason is that a yawning gap separates a Brobdingnagian (OK, huge) collection of datasets with mixed-up formats, semantics, and types from an organized data warehouse. Without an extensive enhancement and translation process that provides both an abstraction layer and pipelines from the data lake to operations and analytics, reaping value from the effort remains out of reach.
To read the entire article, please visit https://diginomica.com/big-data-didnt-fall-mapr-hadoop-fading-data-lakes-are-not