Which language for Apache Spark

Best Language for Apache Spark

The question comes up often: “What programming language should we choose for our Apache Spark project?” The short answer I give is to choose between Scala and Python. I admit, this is only slightly more helpful than saying it depends, which I try to avoid. The real question is what are the tradeoffs between the…


Apache Spark with Python

Azure Synapse Spark with Python

In this video, I introduce Apache Spark with the Python language, often referred to as PySpark. We’ll walk through a quick demo on Azure Synapse Analytics, an integrated analytics platform within the Microsoft Azure cloud. This short demo is meant for those who are curious about PySpark or just want to get…


Apache Spark with Scala

Azure Synapse Spark with Scala

In this video, I introduce Apache Spark with the Scala language. We’ll walk through a quick demo on Azure Synapse Analytics, an integrated analytics platform within the Microsoft Azure cloud. This short demo is meant for those who are curious about Spark with Scala or just want to get a peek at…


Apache Spark .NET

Azure Synapse Spark .NET (C#)

Spark .NET is the C# API for Apache Spark, a popular platform for big data processing. This demo is for you if you are curious to see a sample Spark .NET program in action or are interested in seeing Azure Synapse serverless Apache Spark notebooks. It includes guidance on how you can follow along to build a Spark .NET data load that reads linked sample data, transforms it, joins it to a lookup table, and saves the result in Delta Lake format to your Azure Data Lake Storage Gen2 account.
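
To give a feel for the shape of such a load, here is a minimal sketch written in PySpark rather than C#. It is not the code from the demo. The function name, paths, and column names (fare_amount, pickup_time, zone_id) are hypothetical, and it assumes an existing `spark` session (the kind a Synapse notebook provides) with Delta Lake available.

```python
def load_trips(spark, trips_path, lookup_path, output_path):
    """Read sample data, transform it, join a lookup table, save as Delta.

    All paths and column names here are hypothetical placeholders.
    """
    # Read the sample data from the lake
    trips = spark.read.parquet(trips_path)

    # Transform: drop non-positive fares and derive a date column
    cleaned = (
        trips
        .filter("fare_amount > 0")
        .selectExpr(
            "zone_id",
            "fare_amount",
            "to_date(pickup_time) AS pickup_date",
        )
    )

    # Join to the lookup table on the shared key, keeping unmatched trips
    zones = spark.read.parquet(lookup_path)
    joined = cleaned.join(zones, on="zone_id", how="left")

    # Save the result in Delta Lake format
    joined.write.format("delta").mode("overwrite").save(output_path)
    return joined
```

In a notebook you would call it with your own storage paths, for example `load_trips(spark, trips_path, lookup_path, output_path)`.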


Spark Summit Takeaways

I’m wrapping up my attendance at Spark + AI Summit 2020, and I found a lot of value in it. Here are my quick takeaways to try to save you time. To keep it real, some sessions were a big miss for me, either due to too much detail or not enough focus, but some were awesome. If…


Data Lake Introduction

Hearing a lot about data lakes but still not sure what the term means or why anyone cares? This video gives a brief introduction to what a data lake is and why so many organizations are adding one to their analytics ecosystem. To show what interacting with a data lake may look like for a typical data analyst, I included a demo of how you would use Spark SQL to query the data lake from Azure Databricks.
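
As a rough idea of what querying a data lake with Spark SQL can look like, here is a small PySpark sketch. It is not taken from the video. The function name, the `sales` view, and the `category` and `amount` columns are hypothetical, and it assumes a `spark` session already exists, as it does in a Databricks notebook.

```python
def top_categories(spark, lake_path, limit=10):
    """Query files in the lake with Spark SQL (hypothetical schema)."""
    # Expose the files at lake_path to SQL under a temporary view name
    spark.read.parquet(lake_path).createOrReplaceTempView("sales")

    # From here on, an analyst can work in plain SQL
    return spark.sql(
        f"""
        SELECT category, SUM(amount) AS total_amount
        FROM sales
        GROUP BY category
        ORDER BY total_amount DESC
        LIMIT {int(limit)}
        """
    )
```

Calling `.show()` on the returned DataFrame would print the result in the notebook.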


Apache Spark Introduction

In this video, we will quickly cover Apache Spark. The goal is to explain why you would use Spark and where it fits in the data ecosystem. If you just want to get hands-on with Spark, check out one of my next videos on Spark and Databricks. Watch the video to get my overview of Spark and…