Azure Log Analytics provides an easy way to query Spark logs and set up alerts in Azure, which is a huge help when monitoring Apache Spark. In this video I walk through the setup steps and a quick demo of this capability for the Azure Databricks log4j output and the Spark metrics. This post includes written instructions and troubleshooting guidance to help you set this up yourself.
Setup steps
1. Clone the repository (or just download the jars and skip to step 3): https://github.com/datakickstart/spark-monitoring
2. Build the jars (see the build sketch after this list).
3. Run the commands in upload_with_dbfs.sh to upload the files:

       dbfs mkdirs dbfs:/databricks/spark-monitoring
       dbfs cp --overwrite src/target/spark-listeners_3.1.1_2.12-1.0.0.jar dbfs:/databricks/spark-monitoring/
       dbfs cp --overwrite src/target/spark-listeners-loganalytics_3.1.1_2.12-1.0.0.jar dbfs:/databricks/spark-monitoring/
       dbfs cp --overwrite src/spark-listeners/scripts/spark-monitoring.sh dbfs:/databricks/spark-monitoring/
4. Create a Log Analytics workspace (if one doesn't already exist).
5. Get the Log Analytics workspace ID and key (from the "Agents management" pane).
6. Add the Log Analytics workspace ID and key to a Databricks secret scope (see the CLI sketch after this list).
7. Add the environment configs to the cluster environment variables (see the example after this list).
8. Add the spark-monitoring.sh init script in the cluster advanced options.
9. Start the cluster and confirm the Event Log shows a successful cluster init.
10. Confirm the custom logs (SparkListenerEvent_CL, SparkLoggingEvent_CL, and SparkMetric_CL) are created in Log Analytics and that messages are flowing to them.
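For step 2, here is a minimal sketch of the build, assuming Maven is installed and the pom lives at src/pom.xml as in the upstream repo. The profile name is an assumption inferred from the jar names above; check src/pom.xml for the exact profiles available for your Spark and Scala versions.

    # Build the listener jars from the repo root.
    # Profile name is an assumption -- verify it in src/pom.xml.
    mvn -f src/pom.xml install -P scala-2.12_spark-3.1.1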
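For step 6, a sketch using the Databricks CLI. The scope name demo matches the troubleshooting example below, and the key names (LogAnalyticsWorkspaceID, LogAnalyticsWorkspaceKey) are placeholders I chose; use whatever names your cluster environment variables will reference.

    # Create the scope, then store the workspace ID and key as secrets.
    # Scope and key names are assumptions -- pick your own if you prefer.
    databricks secrets create-scope --scope demo
    databricks secrets put --scope demo --key LogAnalyticsWorkspaceID --string-value "<your-workspace-id>"
    databricks secrets put --scope demo --key LogAnalyticsWorkspaceKey --string-value "<your-workspace-key>"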
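For step 7, the monitoring library reads the workspace ID and key from the LOG_ANALYTICS_WORKSPACE_ID and LOG_ANALYTICS_WORKSPACE_KEY environment variables. In the cluster's Advanced Options, you can reference the secrets with Databricks' {{secrets/scope/key}} syntax instead of pasting values in plain text; the scope and key names here assume the sketch above.

    LOG_ANALYTICS_WORKSPACE_ID={{secrets/demo/LogAnalyticsWorkspaceID}}
    LOG_ANALYTICS_WORKSPACE_KEY={{secrets/demo/LogAnalyticsWorkspaceKey}}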
Troubleshooting
What if custom logs do not show up in Azure Log Analytics?
There are a few things to check to see what has gone wrong.
- Start the cluster and watch the event log to confirm you see an INIT_SCRIPTS_FINISHED message for spark-monitoring.sh (setup step 9).
- Confirm the cluster environment variables were set (setup step 7) and that they reference secret names in a Databricks secret scope. To check what is in your Databricks secret scope, replace demo with your secret scope name and run the following from a notebook: dbutils.secrets.list(scope="demo")
- Confirm the init script was added properly (setup step 8). To confirm the script exists in the location you configured, run the following from a notebook: dbutils.fs.ls("dbfs:/databricks/spark-monitoring/spark-monitoring.sh")
Nice summary of how to set up spark-monitoring with a Log Analytics workspace.
I'm now facing an issue compiling the sample files to test it:
https://github.com/mspnp/spark-monitoring#run-the-sample-job-optional
It fails with this message:
[ERROR] Failed to execute goal on project spark-monitoring-sample: Could not resolve dependencies for project com.microsoft.pnp:spark-monitoring-sample:jar:1.0.0: com.microsoft.pnp:spark-listeners:jar:1.0.0 was not found in https://repo.maven.apache.org/maven2 during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of central has elapsed or updates are forced
Do you know what it could be?
Many thanks