Smart Data Lake or Data Landfill? The Difference May Be ‘Semantic’
Featuring: Surya Mukherjee, Senior Analyst, Ovum Research, and Marty Loughlin, VP, Cambridge Semantics
Data lakes are fast becoming a front-burner issue as early Hadoop adopters plan or consider implementation. They are attractive for several reasons: independence from a fixed schema, use of commodity hardware, economical archiving, and cross-platform data discovery and insights.
In principle, data lakes should be transparent, manageable, and governable, even if these qualities are achieved incrementally; without them, organizations may be exposed to risk and lower ROI. Data platform and integration providers offer many approaches to making data lakes governable, each with strengths and tradeoffs. The semantic approach, which is driven by taxonomies and ontologies, can be extremely helpful for industries such as financial services and healthcare.
- Data lakes and the enterprise agenda
- Data landfill versus a ‘smart’ data lake
- Key components of a smart data lake
- The semantic approach to data lakes
- Recommendations for enterprises