talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Data Engineering with Databricks Cookbook
2024-05-31
Pulkit Chadha
– author
In "Data Engineering with Databricks Cookbook," you'll learn how to efficiently build and manage data pipelines using Apache Spark, Delta Lake, and Databricks. This recipe-based guide offers techniques to transform, optimize, and orchestrate your data workflows.

What this Book will help me do
- Master Apache Spark for data ingestion, transformation, and analysis.
- Learn to optimize data processing and improve query performance with Delta Lake.
- Manage streaming data processing with Spark Structured Streaming capabilities.
- Implement DataOps and DevOps workflows tailored for Databricks.
- Enforce data governance policies using Unity Catalog for scalable solutions.

Author(s)
Pulkit Chadha, the author of this book, is a Senior Solutions Architect at Databricks. With extensive experience in data engineering and big data applications, he brings practical insights into implementing modern data solutions. His educational writings focus on empowering data professionals with actionable knowledge.

Who is it for?
This book is ideal for data engineers, data scientists, and analysts who want to deepen their knowledge in managing and transforming large datasets. Readers should have an intermediate understanding of SQL, Python programming, and basic data architecture concepts. It is especially well-suited for professionals working with Databricks or similar cloud-based data platforms. |
O'Reilly Data Engineering Books
|
|
Distributing Data Governance: How Unity Catalog Allows for a Collaborative Approach
2023-08-01 · 15:58
Gilad Asulin, Pulkit Chadha
– author
As one of the world’s largest providers of content delivery network (CDN) and security solutions, Akamai owns thousands of data assets of various shapes and sizes, some of them reaching multiple petabytes. Several departments within the company use Databricks for their data and AI workloads, so we have over a hundred Databricks workspaces within a single Databricks account. Some assets are shared across products, and some are product-specific. In this presentation, we will describe how to use the capabilities of Unity Catalog to distribute the administration burden between departments while still maintaining a unified governance model. We will also share the benefits we’ve found in using Unity Catalog, beyond just access management, such as:
Talk by: Gilad Asulin and Pulkit Chadha |
Databricks DATA + AI Summit 2023 |