Data Engineering with Microsoft Fabric

5 days

UFAB

Introduction to Microsoft Fabric

This chapter introduces the data lake approach, gives a high-level overview of the building blocks of Microsoft Fabric, and shows how to get started. It also discusses the Data Mesh architecture and compares it with Microsoft Fabric.

What is Microsoft Fabric?
From traditional data warehousing to data lakes
Data Mesh Architecture
Working with Task Flow
Microsoft Fabric Licensing
Monitor Microsoft Fabric
Domains and Workspaces in Microsoft Fabric
LAB: Getting started with Microsoft Fabric

Introduction to Data Lakes

Microsoft Fabric is built on the idea of replacing a traditional data warehouse with a data lake. This module explains why and how the relational data warehouse could be replaced by a file-based data lake.

From Data Warehouse to Data Lake
Volume, velocity and variety problems
From Data Lake to Lakehouse

Microsoft OneLake

Microsoft OneLake is the OneDrive equivalent for business data: a place to host files (data lake or delta lake) and tables.

What is OneLake?
Creating Workspaces
Working with Domains
Workspaces and Source Control: Azure DevOps and GitHub integration

Storing Data in OneLake

OneLake provides a single, unified, logical data lake for your whole organization. Like OneDrive, OneLake comes automatically with every Microsoft Fabric tenant and is designed to be the single place for all your analytics data.

Creating a Lakehouse
Manually loading data into a Lakehouse
The Lakehouse SQL Analytics Endpoint
Create a semantic model
Working with Shortcuts
Connecting External Applications with Microsoft OneLake
LAB: Setting up Lakehouses in OneLake
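
As an illustration of the "Connecting External Applications" topic: OneLake exposes ADLS Gen2-compatible endpoints, so a file in a Lakehouse can be addressed by URI. The sketch below shows how such a URI is composed. The helper `onelake_uri` and the workspace, item, and file names are hypothetical and not part of any SDK; names are assumed to contain no spaces or characters that need URL encoding.

```python
from posixpath import join

# Host name of the OneLake DFS endpoint.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str,
                *path: str, scheme: str = "abfss") -> str:
    """Compose a OneLake URI for a file or folder inside a Fabric item.

    Hypothetical helper for illustration only.
    """
    # Items are addressed as "<item>.<itemtype>", e.g. "Sales.Lakehouse".
    relative = join(f"{item}.{item_type}", *path)
    if scheme == "abfss":
        return f"abfss://{workspace}@{ONELAKE_HOST}/{relative}"
    if scheme == "https":
        return f"https://{ONELAKE_HOST}/{workspace}/{relative}"
    raise ValueError(f"unsupported scheme: {scheme}")


# Hypothetical workspace "Analytics" with a Lakehouse named "Sales".
print(onelake_uri("Analytics", "Sales", "Lakehouse", "Files", "raw", "orders.csv"))
```

The `abfss` form is the one typically used from Spark; the `https` form is the one used by tools that speak the ADLS Gen2 REST API.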

Getting started with Data Factory

Data Factory allows you to ingest, prepare and transform data from a rich set of data sources such as databases, files and cloud data sources. This chapter illustrates how to use Activities to build pipelines that ingest data into a Lakehouse.

What is Data Factory?
Creating Data Pipelines
The Copy Data Activity
Executing and Monitoring Data Pipelines
LAB: Ingesting data using Pipelines

Authoring advanced Pipelines

This module dives deeper into the process of building a Fabric pipeline. It focuses mainly on how to work with expressions, variables and parameters to make pipelines dynamic.

Working with Expressions
Variables and Parameters
Using Looping and Conditional Logic in pipelines
Debugging a pipeline
LAB: Authoring and debugging advanced Pipelines

Ingest and Transform data using Dataflow Gen2

With Dataflows you can visually design data transformations without the need to learn yet another tool or language. Dataflows in Microsoft Fabric are based on Power Query Online.

Creating Queries to load data
Applying Transformations
Appending and Merging Queries
Query Folding
Using Dataflows inside a Pipeline
Managing connections
LAB: Ingesting and Transforming Data using Dataflows

Data Engineering with Spark

Data engineering is the process of designing and building systems that let people collect and analyze raw data from multiple sources and formats. Using popular languages such as Python, SQL and R, data can be loaded, transformed and analyzed via interactive notebooks.

Introducing Apache Spark
Creating Environments or Apache Spark clusters
Working with Notebooks in Fabric
Magic commands
Visual Studio Code integration
Scheduling Notebooks
Microsoft Fabric decision guide: Copy activity, Dataflow or Spark
Using Python Notebooks
LAB: Getting started with Notebooks in Microsoft Fabric

Data wrangling using PySpark and Spark SQL

PySpark and Spark SQL allow users to perform complex data processing tasks with few lines of code using Notebooks.

The SparkSession, SparkContext and SQLContext objects
Reading and writing data using DataFrames
Data Cleansing using PySpark
Grouping and aggregating data in PySpark
Joining DataFrames
Using Spark SQL to select and manipulate data
Visualizing data using Notebooks and DataFrames
LAB: Data wrangling using PySpark and Spark SQL
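
To give a feel for the joining, grouping and aggregating topics above, here is a small sketch of the kind of query involved. SQLite from the Python standard library is used as a stand-in engine so the example runs anywhere; it is not Spark. In a Fabric notebook, the same query text could be passed to `spark.sql(...)` against lakehouse tables. The `customers` and `orders` tables and their contents are made up for illustration.

```python
import sqlite3

# Join orders to customers, then total the order amounts per region.
QUERY = """
SELECT c.region, SUM(o.amount) AS total_amount
FROM orders AS o
JOIN customers AS c ON o.customer_id = c.id
GROUP BY c.region
ORDER BY total_amount DESC
"""


def run_demo() -> list[tuple[str, float]]:
    """Build two tiny in-memory tables and run the aggregate query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
        INSERT INTO orders VALUES (1, 1, 100.0), (2, 2, 250.0), (3, 3, 50.0), (4, 1, 25.0);
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for region, total in run_demo():
    print(region, total)
```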

Working with Delta Tables

Delta Lake is an optimized storage layer that provides the foundation for storing data and tables in a Fabric lakehouse. Learn how to create, query and optimize Delta Tables in Microsoft Fabric.

What is a Delta Lake?
Working with Delta Tables
Managing Schema change
Version and Optimize Delta Tables
LAB: Working with Delta Tables
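
The versioning topic above rests on one idea: a Delta table is a set of data files plus an ordered transaction log, and any version of the table can be reconstructed by replaying that log. The following is a toy model of that idea in plain Python, not the Delta Lake API. The `Commit` class, the `snapshot` function, and the file names are hypothetical; the real log stores JSON commit files under `_delta_log` and has more action types than the two shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """One log entry: data files added and removed by a single transaction."""
    add: list[str] = field(default_factory=list)
    remove: list[str] = field(default_factory=list)


def snapshot(log: list[Commit], version: int) -> set[str]:
    """Replay commits 0..version and return the data files live at that version."""
    if not 0 <= version < len(log):
        raise ValueError(f"version {version} does not exist")
    live: set[str] = set()
    for commit in log[: version + 1]:
        live -= set(commit.remove)
        live |= set(commit.add)
    return live


log = [
    Commit(add=["part-000.parquet"]),   # version 0: initial write
    Commit(add=["part-001.parquet"]),   # version 1: append
    Commit(add=["part-002.parquet"],    # version 2: compaction rewrites
           remove=["part-000.parquet", "part-001.parquet"]),  # two small files into one
]

# Reading an older version just means replaying fewer commits.
print(sorted(snapshot(log, 1)))
print(sorted(snapshot(log, 2)))
```

Because removed files are only dropped from the log's view and not deleted immediately, older versions stay readable until the files are physically cleaned up.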

Building a Fabric Data Warehouse

A Synapse Data Warehouse is a database that stores data in OneLake and provides a medium to interact with the database using SQL commands.

The SQL analytics endpoint of the Lakehouse
Creating tables in a Synapse Data Warehouse
Ingesting data using pipelines
Ingesting data using T-SQL
Querying the Warehouse
The Default Power BI semantic model
LAB: Creating and using a Warehouse

Fabric SQL Databases

Sometimes the restrictions on a Fabric Data Warehouse make it difficult to use for applications that are closer to the operational side. With Fabric SQL Databases, an operational database becomes available, with constraints, indexes, and many more features that SQL Server users might be used to.

What is a Fabric SQL Database?
Connecting clients to the database
Controlling security
Disaster recovery
Fabric SQL Database versus Fabric Warehouse

Real-Time Analytics in Fabric

Real-Time Analytics is a fully managed big data analytics platform optimized for streaming, time-series data. It contains a dedicated query language and engine for searching structured, semi-structured, and unstructured data in close to real time.

Creating a KQL database
Ingesting data into tables
Query data using KQL
Create and manage Eventstreams
LAB: Working with Real-Time Analytics

Reporting in Fabric

Power BI transforms your company's data into rich visuals for you to monitor your business and get answers quickly. Learn how to connect to your data stored in Microsoft Fabric using Power BI.

Creating Power BI Reports
DirectQuery vs Import with Microsoft OneLake
Using and configuring Direct Lake Mode
LAB: Creating Power BI Reports

Data Activator

Data Activator in Microsoft Fabric takes action based on what's happening in your data. Learn how to set up conditions against your data and trigger actions, such as running a Power Automate flow, when the conditions are met.

Creating and using Reflexes
Defining Triggers, Conditions and Actions
Getting data from Reports or Eventstreams
LAB: Use Data Activator in Fabric

Microsoft Fabric is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, real-time analytics, and business intelligence. It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place. In this 5-day course, you will learn about and experience the major parts of Microsoft Fabric.

This course is targeted at data engineers and BI professionals who want to build and use lakehouses and data warehouses using Microsoft Fabric.
