In this hands-on workshop, you’ll learn the basics of data lakes and see why so many organizations are adopting them. We’ll cover how data lakes work, how they compare to traditional databases and big data tools, and what makes them powerful.
You’ll build your own data lake from the ground up, using an object store, a metastore, a query engine, and analytics tools. With the query engine, you’ll explore and manipulate your data to see how it flows through the lake and how the components work together.
We’ll also introduce analytics tools through real-world big data use cases, which you can then try on your own datasets. As we go deeper, you’ll learn about advanced topics such as Apache Iceberg tables for handling updates and deletes, along with key aspects of managing a data lake: security, best practices, and cost control.