Data Engineering with Azure Databricks

UADB
4 days


Interested in a private company training? Request it here.

Getting Started with Azure Databricks

Azure Databricks lets you harness the power of Apache Spark without the hassle of manually creating and configuring Spark clusters. In this chapter you will learn how to set up an Azure Databricks environment and work with Databricks workspaces.

  • What is Azure Databricks
  • Introducing Apache Spark
  • Workspaces in Azure Databricks
  • Provision Azure Databricks Workspaces
  • Navigating Workspaces
  • Azure Databricks Configuration and Security
  • Azure Databricks Pricing
  • LAB: Getting started with Azure Databricks
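
Provisioning a workspace can be scripted as well as clicked together in the Azure portal. Below is a minimal sketch using the Azure SDK for Python (azure-mgmt-databricks); the subscription ID, resource group, and workspace names are placeholders, not part of the course material.

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.databricks import AzureDatabricksManagementClient

    # Authenticate with whatever identity is available (Azure CLI login,
    # managed identity, environment variables, ...).
    client = AzureDatabricksManagementClient(
        DefaultAzureCredential(), "<subscription-id>"
    )

    poller = client.workspaces.begin_create_or_update(
        resource_group_name="training-rg",          # hypothetical names
        workspace_name="training-databricks",
        parameters={
            "location": "westeurope",
            "sku": {"name": "premium"},
            # Azure Databricks keeps its own resources in a separate,
            # Databricks-managed resource group:
            "managed_resource_group_id": (
                "/subscriptions/<subscription-id>"
                "/resourceGroups/training-databricks-managed"
            ),
        },
    )
    workspace = poller.result()  # long-running operation; block until done
    print(workspace.workspace_url)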

Azure Storage and Data Lakes

Databricks does not come with its own cloud object storage. When you run Databricks on the Azure platform, it stores its data and metadata in one or more Azure Data Lake Storage Gen2 accounts.

  • Storing Data in Azure Databricks
  • An introduction to Azure Storage
  • Accessing an Azure Storage Account
  • Storing Data in a Data Lake
  • The Medallion Architecture
  • Storage Formats in Data Lakes
  • Delta Lake
  • Other Open Table Formats
  • LAB: Provision an Azure Storage Account
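
For example, a notebook can read files from a Data Lake Gen2 account over the abfss:// protocol once the account key (or, better, an OAuth credential) is configured. A minimal sketch, assuming a Databricks notebook where spark, dbutils, and display are predefined; the account, container, and secret scope names are hypothetical.

    # Fetch the storage account key from a Databricks secret scope rather
    # than hard-coding it in the notebook.
    storage_account = "mydatalake"  # hypothetical account name
    account_key = dbutils.secrets.get(scope="training-scope", key="storage-key")

    # Configure Spark to authenticate against the account.
    spark.conf.set(
        f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
        account_key,
    )

    # Read CSV files straight from the data lake's bronze container.
    df = spark.read.csv(
        f"abfss://bronze@{storage_account}.dfs.core.windows.net/sales/",
        header=True,
    )
    display(df)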

Introduction to the Unity Catalog

Unity Catalog provides centralized access control, auditing, lineage, and data discovery capabilities across Databricks workspaces. In this chapter you will learn how to set up and configure a Unity Catalog metastore for your workspaces.

  • Introduction to the Unity Catalog
  • Create a Unity Catalog Metastore
  • Creating Unity Catalog Artifacts
  • Working with Schemas, Tables and Volumes
  • LAB: Setup and configure a Unity Catalog Metastore
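
Unity Catalog objects live in a three-level namespace (catalog.schema.table) and can be created with plain SQL. A minimal sketch, run from a notebook attached to a Unity Catalog-enabled workspace; all names are hypothetical.

    # Create a catalog, a schema inside it, a managed table, and a volume
    # for non-tabular files. All statements are idempotent.
    spark.sql("CREATE CATALOG IF NOT EXISTS sales_catalog")
    spark.sql("CREATE SCHEMA IF NOT EXISTS sales_catalog.bronze")
    spark.sql("""
        CREATE TABLE IF NOT EXISTS sales_catalog.bronze.orders (
            order_id    BIGINT,
            customer_id BIGINT,
            amount      DECIMAL(10, 2)
        )
    """)
    spark.sql("CREATE VOLUME IF NOT EXISTS sales_catalog.bronze.raw_files")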

Configure Databricks Compute

Databricks compute refers to the computing resources available in a Databricks workspace. Users need access to compute to run data engineering, data science, and data analytics workloads, such as production ETL pipelines, streaming analytics, ad-hoc analytics, and machine learning. Learn about the different types of compute that can be provisioned in Azure Databricks.

  • Apache Spark
  • The Databricks Runtime
  • Databricks Compute Types
  • Provisioned Compute Types
  • Databricks Serverless Compute
  • Attaching Notebooks to Compute
  • Usage Monitoring
  • LAB: Creating and Using Databricks Compute
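
Compute can also be provisioned programmatically. A minimal sketch, assuming the Databricks SDK for Python (databricks-sdk) and an already-authenticated environment; the runtime version and VM size are placeholders that must be valid for your region.

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()  # picks up host and token from the environment

    # Create an all-purpose cluster and wait until it is running.
    cluster = w.clusters.create_and_wait(
        cluster_name="training-cluster",
        spark_version="15.4.x-scala2.12",  # a Databricks Runtime version
        node_type_id="Standard_DS3_v2",    # an Azure VM size
        num_workers=2,
        autotermination_minutes=30,        # shut down when idle to save cost
    )
    print(cluster.cluster_id)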

Using Notebooks in Azure Databricks

Using popular languages such as Python, SQL, and R, data can be loaded, visualized, transformed, and analyzed via interactive notebooks.

  • The Databricks File System (DBFS)
  • Working with Notebooks in Databricks
  • Magic Commands
  • Databricks Utilities
  • The Databricks Assistant
  • Working with IPython Widgets
  • Working with Databricks Widgets
  • Notebook Dashboards
  • Scheduling Notebooks
  • LAB: Using Notebooks in Azure Databricks
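
A few of the notebook building blocks from this chapter, sketched as they might appear in a Databricks notebook (where dbutils and spark are predefined); the widget and path names are illustrative.

    # Databricks widgets parameterize a notebook, e.g. for scheduled runs.
    dbutils.widgets.text("country", "BE", "Country code")
    country = dbutils.widgets.get("country")
    print(f"Running for country {country}")

    # Databricks Utilities: list files in the built-in sample datasets.
    for entry in dbutils.fs.ls("/databricks-datasets"):
        print(entry.path)

    # Magic commands switch a cell's language; in its own cell you could run:
    # %sql
    # SELECT current_catalog(), current_schema()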

Accessing data in Azure Databricks

There are many ways to access data in Azure Databricks: from uploading small files via the portal, over ad-hoc connections, to mounting Azure Storage accounts or data lakes. Files can also be exposed as tables, providing easy access.

  • The Spark Framework
  • Introduction to Spark DataFrames
  • Reading and writing data using Spark DataFrames
  • Mounting Azure Blob and Data Lake Gen2 Storage
  • Cleaning and Transforming data using the Spark DataFrame API
  • Schemas and Tables in Databricks
  • Managed vs Unmanaged Tables
  • Tables in the Unity Catalog
  • LAB: Working with Data in Azure Databricks
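
As an illustration of the DataFrame API, the sketch below reads raw JSON files, cleans them, and saves the result as a managed table; paths and column names are hypothetical.

    from pyspark.sql import functions as F

    # Read raw JSON files from the data lake into a DataFrame.
    raw = spark.read.json("abfss://bronze@mydatalake.dfs.core.windows.net/orders/")

    # Clean and transform: deduplicate, derive a date column, drop bad rows.
    cleaned = (
        raw.dropDuplicates(["order_id"])
           .withColumn("order_date", F.to_date("order_ts"))
           .filter(F.col("amount") > 0)
    )

    # saveAsTable without a path creates a *managed* table in Unity Catalog.
    cleaned.write.mode("overwrite").saveAsTable("sales_catalog.silver.orders")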

Building a Lakehouse using Azure Databricks

Delta Lake is an optimized storage layer that provides the foundation for storing data and tables in a Databricks lakehouse. Learn how to create, query, and optimize Delta tables in a Databricks lakehouse.

  • Implementing a Delta Lake
  • Working with Delta Tables
  • Managing Schema change
  • Version and Optimize Delta Tables
  • Data skipping and Z-order
  • Delta Tables and Change Data Feeds
  • Delta Tables and the Unity Catalog
  • Securing Tables in the Unity Catalog
  • LAB: Building a Lakehouse using Delta Tables
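
A minimal sketch of the maintenance and time-travel features covered here, assuming the managed table from the previous chapter exists.

    table = "sales_catalog.silver.orders"

    # Compact small files and co-locate rows on a frequently filtered
    # column (Z-ordering) to improve data skipping.
    spark.sql(f"OPTIMIZE {table} ZORDER BY (customer_id)")

    # Every write creates a new table version; inspect the history...
    spark.sql(f"DESCRIBE HISTORY {table}").show(truncate=False)

    # ...and query an older version with time travel.
    previous = spark.sql(f"SELECT * FROM {table} VERSION AS OF 0")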

Data Warehousing and Analysis with Databricks SQL

The lakehouse architecture and Databricks SQL Warehouse bring cloud data warehousing capabilities to your data lakes. A SQL warehouse is a compute resource that lets you run SQL commands on objects within Databricks SQL. Learn about the available warehouse types and how to query them.

  • What are SQL Warehouses?
  • Writing queries using the SQL Editor
  • Working with Tables and Views
  • Ingesting Data
  • Visualizing Data
  • Creating and using Dashboards
  • LAB: Using SQL Warehouses
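
SQL warehouses can be queried from the SQL Editor, but also from client tools over ODBC/JDBC or the Databricks SQL Connector for Python. A minimal sketch with the connector (databricks-sql-connector); the hostname, HTTP path, and token are placeholders copied from the warehouse's connection details.

    from databricks import sql

    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",
        http_path="/sql/1.0/warehouses/<warehouse-id>",
        access_token="<personal-access-token>",
    ) as conn:
        with conn.cursor() as cursor:
            cursor.execute(
                "SELECT customer_id, SUM(amount) AS total "
                "FROM sales_catalog.silver.orders "
                "GROUP BY customer_id"
            )
            for row in cursor.fetchall():
                print(row.customer_id, row.total)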

Databricks and Power BI

Microsoft Power BI is a business analytics tool that provides interactive visualizations with self-service business intelligence capabilities, enabling end users to create reports and dashboards. You can connect Power BI Desktop to both Databricks clusters and Databricks SQL warehouses.

  • Power BI Introduction
  • Connect Power BI Desktop to Databricks using Partner Connect
  • Connect Power BI Desktop to Databricks manually
  • LAB: Connecting Power BI to Databricks

Databricks is a data analytics platform powered by Apache Spark for data engineering, data science, and machine learning. This training teaches how to use Azure Databricks to design and build a data lakehouse architecture.

No prior knowledge of Azure Databricks is required.
