Azure Databricks Mastery: Hands-on project with Unity Catalog, Delta Lake, CI/CD implementing Medallion Architecture
What you'll learn
Understand and implement Unity Catalog
Implement project with incremental loading
Understand Spark Structured Streaming
Implement Continuous Integration and Continuous Deployment in the project
Real time hands-on project experience
Implement and work with Delta Lake
Understand the features of Delta Lake
Implement Medallion Architecture in your project
Evolution of Delta Lake from the data lake
Understand workflows in Azure Databricks
Simulate a real-time environment with Unity Catalog
Implement and understand governance with Unity Catalog
Master compute cluster creation and management
How Spark Structured Streaming works
Implement Structured Streaming in Azure Databricks
Understand incremental loading with Autoloader
Code that can run in any environment
Understand and implement the Unity Catalog Object Model
Build an end-to-end CI/CD pipeline
Understand the implementation of Delta Live Tables
Practice tests to check your knowledge
Requirements
Basic knowledge of Python and SQL
Basic knowledge of Azure Cloud
An Azure account to implement the end-to-end project
Description
Embark on a transformative journey to master Azure Databricks with our comprehensive hands-on Udemy course. Tailored not just for learning but also to equip you with practical concepts essential for passing the Databricks Certified Data Engineer Associate certification, this course is your key to success.

Immerse yourself in real-world projects where you'll leverage the capabilities of Unity Catalog, Delta Lake, and CI/CD methodologies, all while implementing the cutting-edge Medallion Architecture. This training program serves as your gateway to seamlessly integrate and process data in the cloud, offering invaluable insights into the latest practices in data engineering.

Throughout the course, delve into the intricacies of Delta Lake, refine your skills with Unity Catalog, and become proficient in the art of Continuous Integration and Continuous Deployment. Whether you're a seasoned data professional aiming to enhance your skill set or a budding enthusiast eager to explore the world of data engineering, this course provides the tools and knowledge to elevate your expertise in Azure Databricks.

Join us on this educational journey to unlock the full potential of cloud-based data engineering, propelling yourself towards success in contemporary data projects. Enrich your career and knowledge with this comprehensive Udemy course, ensuring you don't miss the opportunity to become a proficient Azure Databricks Engineer. Your transformation begins here!
Overview
Section 1: Introduction
Lecture 1 Course Introduction
Lecture 2 Project Architecture and Concepts
Lecture 3 Course prerequisites and benefits
Lecture 4 Project Complete Code
Section 2: Environment Setup
Lecture 5 Section Introduction
Lecture 6 Creating a budget for project
Lecture 7 Creating an Azure Databricks Workspace
Lecture 8 Creating an Azure Data Lake Storage Gen2
Lecture 9 Walkthrough of the Databricks Workspace UI
Section 3: Azure Databricks - An Introduction
Lecture 10 Section Introduction
Lecture 11 Introduction to Distributed Data Processing
Lecture 12 What is Azure Databricks
Lecture 13 Azure Databricks Architecture
Lecture 14 Cluster types and configuration
Lecture 15 Behind the scenes when creating cluster
Lecture 16 Sign up for Databricks Community Edition
Lecture 17 Understanding notebook and Markdown basics
Lecture 18 Notebook - Magic Commands
Lecture 19 DBUtils - File System Utilities
Lecture 20 DBUtils - Widget Utilities
Lecture 21 DBUtils - Notebook Utils
Section 4: Delta Lake
Lecture 22 Section Intro
Lecture 23 Drawbacks of Azure Data Lake
Lecture 24 What is Delta Lake
Lecture 25 Understanding Lakehouse Architecture
Lecture 26 Creating Databricks workspace and ADLS for Delta Lake
Lecture 27 Accessing Data Lake storage using service principal
Lecture 28 Drawbacks of ADLS - practical
Lecture 29 Creating Delta Lake
Lecture 30 Understanding the Delta format
Lecture 31 Understanding Transaction Log
Lecture 32 Creating Delta tables using SQL Command
Lecture 33 Creating Delta table using PySpark Code
Lecture 34 Uploading files for next lectures
Lecture 35 Schema Enforcement
Lecture 36 Schema Evolution
Lecture 37 Time Travel and Versioning
Lecture 38 Vacuum Command
Lecture 39 Convert to Delta
Lecture 40 Understanding Optimize Command - Demo
Lecture 41 Optimize Command - Practical
Lecture 42 UPSERT using MERGE
Section 5: Unity Catalog
Lecture 43 Section Introduction
Lecture 44 What is Unity Catalog
Lecture 45 Creating Access Connector for Databricks
Lecture 46 Creating Metastore in Unity Catalog
Lecture 47 Unity Catalog Object Model
Lecture 48 Roles in Unity Catalog
Lecture 49 Creating users in Azure Entra ID
Lecture 50 User and groups management Practical
Lecture 51 Cluster Policies
Lecture 52 What are cluster pools
Lecture 53 Creating Cluster Pool
Lecture 54 Creating a Dev Catalog
Lecture 55 Unity Catalog Privileges
Lecture 56 Understanding Unity Catalog
Lecture 57 Creating and accessing External location and storage credentials
Lecture 58 Managed and External Tables in Unity Catalog
Section 6: Spark Structured Streaming
Lecture 59 Section Introduction
Lecture 60 Spark Structured Streaming - basics
Lecture 61 Understanding micro batches and background query
Lecture 62 Supported Sources and Sinks
Lecture 63 WriteStream and checkpoints
Lecture 64 Community Edition Drop databases
Lecture 65 Understanding outputModes
Lecture 66 Understanding Triggers
Lecture 67 Autoloader - Intro
Lecture 68 Autoloader - Schema inference
Lecture 69 Schema Evolution - Demo
Lecture 70 Schema Evolution - Practical
Section 7: Project Overview
Lecture 71 Section Introduction
Lecture 72 Typical Medallion Architecture
Lecture 73 Project Architecture
Lecture 74 Understanding the dataset
Section 8: Project Setup
Lecture 75 Section Introduction
Lecture 76 Expected Setup
Lecture 77 Creating containers and External Locations
Lecture 78 Creating all schemas dynamically
Lecture 79 Creating bronze Tables Dynamically
Section 9: Ingestion to Bronze
Lecture 80 Section Introduction
Lecture 81 Ingesting data to bronze layer - Demo
Lecture 82 Ingesting raw_traffic data to bronze table
Lecture 83 Assignment to get the raw_roads data to bronze table
Lecture 84 Ingesting raw_roads data to bronze Table
Lecture 85 To prove autoloader handles incremental loading
Section 10: Silver Layer Transformations
Lecture 86 Section Introduction
Lecture 87 Transforming Silver Traffic data
Lecture 88 To prove only incremented records were being transformed
Lecture 89 Creating a common Notebook
Lecture 90 Run one notebook from another notebook
Lecture 91 Transforming Silver Roads data
Section 11: Loading to Gold Layer
Lecture 92 Section Introduction
Lecture 93 Getting data to Gold Layer
Lecture 94 Gold Layer Transformations and loading
Section 12: Orchestrating with Workflows
Lecture 95 Section Introduction
Lecture 96 Adding run for common notebook in all notebooks
Lecture 97 Creating Jobs and executing end to end flow
Lecture 98 Attaching trigger to workflows
Section 13: Reporting with Power BI
Lecture 99 Installing Power BI Desktop
Lecture 100 Reporting data to Power BI
Section 14: Continuous Integration and Continuous Deployment (CI/CD)
Lecture 101 Section Introduction
Lecture 102 Expected Setup
Lecture 103 Understanding Continuous Integration
Lecture 104 Understanding Continuous Deployment
Lecture 105 Creating Required resources for UAT
Lecture 106 Configuring storage containers and external locations for UAT
Lecture 107 Login and create repository in Azure DevOps
Lecture 108 Integrating Azure DevOps with Databricks
Lecture 109 Creating feature branch and pull request to main branch
Lecture 110 Creating pull request as new user
Lecture 111 Uploading and understanding YAML Files for CI/CD
Lecture 112 Creating CI pipeline to have live folder
Lecture 113 Permissions to see Live Folder
Lecture 114 Creating Deployment pipeline and deploying
Lecture 115 End-to-end test of CI/CD pipeline
Lecture 116 Running notebooks in UAT
Section 15: Delta Live Tables (DLT)
Lecture 117 Section Intro
Lecture 118 Origin of Delta Live Tables
Lecture 119 Considerations in Lakehouse Architecture
Lecture 120 Understanding Declarative ETL
Lecture 121 Limitations of Delta Live Tables
Lecture 122 Defining Tables from datasets
Lecture 123 Creating DLT Pipeline
Lecture 124 End to end DLT Pipeline
Lecture 125 Deleting cluster by DLT pipeline
Section 16: Conclusion
Lecture 126 Course completion
Lecture 127 My other Data Engineering Courses
Who this course is for
Data Engineers who want to get real-time experience using Azure Databricks
Data professionals who want to build an end-to-end project in Azure Databricks
Engineers who want to learn Azure Databricks and its implementation