Product Review

The data lake is a daring new approach for harnessing the power of big data technology and providing convenient self-service capabilities. But is it right for your company? This book is based on discussions with practitioners and executives from more than a hundred organizations, ranging from data-driven companies such as Google, LinkedIn, and Facebook, to governments and traditional corporate enterprises. You’ll learn what a data lake is, why enterprises need one, and how to build one successfully with the best practices in this book.

Alex Gorelik, CTO and founder of Waterline Data, explains why old systems and processes can no longer support data needs in the enterprise. Then, in a collection of essays about data lake implementation, you’ll examine data lake initiatives, analytic projects, experiences, and best practices from data experts working in various industries.

  • Get a succinct introduction to data warehousing, big data, and data science
  • Learn various paths enterprises take to build a data lake
  • Explore how to build a self-service model and best practices for providing analysts access to the data
  • Use different methods for architecting your data lake
  • Discover ways to implement a data lake from experts in different industries


Similar Products

  • Foundations for Architecting Data Solutions: Managing Successful Data Projects
  • Architecting Modern Data Platforms: A Guide to Enterprise Hadoop at Scale
  • Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing
  • Hands-On Unsupervised Learning Using Python: How to Build Applied Machine Learning Solutions from Unlabeled Data
  • Spark: The Definitive Guide: Big Data Processing Made Simple
  • Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
  • Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale
  • The Hundred-Page Machine Learning Book
  • Data Science from Scratch: First Principles with Python
  • Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale