This book is for data engineers looking to streamline data ingestion, transformation, and orchestration tasks. Data analysts responsible for managing and processing lakehouse data for analysis, reporting, and visualization will also find it beneficial, as will DataOps/DevOps engineers who want to automate the testing and deployment of data pipelines, optimize table tasks, and track data lineage within the lakehouse. Beginner-level knowledge of Apache Spark and Python is needed to make the most of this book.
Will Girten is a lead specialist solutions architect who joined Databricks in early 2019. With over a decade of experience in data and AI, Will has worked in various business verticals, from healthcare to government and financial services. Will's primary focus has been helping enterprises implement data warehousing strategies for the lakehouse and performance-tuning BI dashboards, reports, and queries. Will is a certified Databricks Data Engineering Professional and Databricks Machine Learning Professional. He holds a Bachelor of Science in computer engineering from the University of Delaware.