Moving to the cloud requires reconsidering some of the choices made for on-premises data handling. This module introduces the Azure services that can be used for data processing and compares them to the traditional on-premises data stack. It also provides a brief introduction to Azure and the Azure portal.
Azure Databricks allows us to use the power of Apache Spark without the hassle of manually creating and configuring Spark clusters. In this chapter you will learn how to set up an Azure Databricks environment and work with Databricks workspaces.
Interactive notebooks let you load, visualize, transform, and analyze data using popular languages such as Python, SQL, and R.
This module discusses the different types of storage available in Azure Storage and how to configure them for big data analytics. It also covers some of the tools for loading and managing files in Azure Storage.
There are many ways to access data in Azure Databricks, ranging from uploading small files via the portal, through ad-hoc connections, to mounting Azure Storage or data lakes. Files can also be treated as tables, providing easy access.
Delta Lake is an optimized storage layer that provides the foundation for storing data and tables in a Databricks lakehouse. Learn how to create, query, and optimize Delta tables in a Databricks lakehouse.
You can use Databricks for near real-time data ingestion and processing. Most incremental and streaming workloads on Databricks are powered by Structured Streaming, including Delta Live Tables and Auto Loader. This chapter focuses on how to load data incrementally into a lakehouse.
The lakehouse architecture and Databricks SQL Warehouse bring cloud data warehousing capabilities to your data lakes. A SQL warehouse is a compute resource that lets you run SQL commands on objects within Databricks SQL. Learn about the available warehouse types and how to query them.
Microsoft Power BI is a business analytics tool that provides interactive visualizations with self-service business intelligence capabilities, enabling end users to create reports and dashboards. You can connect Power BI Desktop to your Databricks clusters and Databricks SQL warehouses.
Databricks is a data analytics platform powered by Apache Spark for data engineering, data science, and machine learning. This training teaches how to use Azure Databricks to design and build a data lakehouse architecture.
No prior knowledge of Azure Databricks is required.