DUSTIN VANNOY

Databricks CI/CD: Intro to Asset Bundles (DABs)

Databricks Asset Bundles provides a way to version and deploy Databricks assets – notebooks, workflows, Delta Live Tables pipelines, and more. It is a great option to let data teams set up CI/CD (Continuous Integration / Continuous Deployment). Some of the common approaches in the past have been Terraform, the REST API, the Databricks command line interface (CLI), or dbx. You can watch this video to hear why I think Databricks Asset Bundles is a good choice for many teams and see a demo of using it from your local environment or in your CI/CD pipeline.
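Whether you run it locally or from a CI/CD pipeline, the day-to-day workflow is the same few Databricks CLI commands. A typical sequence looks like this (the job name and target name here are placeholders for whatever your bundle defines):

```shell
# Validate the bundle configuration before deploying
databricks bundle validate

# Deploy all bundle resources to the "dev" target
databricks bundle deploy -t dev

# Run a deployed job by its resource key (hypothetical name)
databricks bundle run -t dev my_bundle_job
```

In a CI/CD pipeline you would run these same commands from your build agent, authenticating with a service principal rather than your personal credentials.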

Databricks Asset Bundles full example

A full repo with examples is available here: https://github.com/datakickstart/datakickstart_dabs

After you read (or watch) the intro material, go check out my advanced Databricks Asset Bundles post for more patterns and examples.

Let’s start by looking at the bundle file, which defines some base settings, declares a few target environments, and specifies other resource files to include.
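As a rough sketch of what that top-level bundle file (conventionally `databricks.yml`) can look like — the bundle name, workspace URLs, and target names below are illustrative assumptions, not the exact contents of the repo:

```yaml
# databricks.yml — illustrative sketch, not the repo's exact file
bundle:
  name: datakickstart_dabs

# Pull in resource definitions from separate files
include:
  - resources/*.yml

# Target environments the bundle can deploy to
targets:
  dev:
    mode: development   # deploys resources with a per-user prefix
    default: true
    workspace:
      host: https://<your-dev-workspace>.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://<your-prod-workspace>.cloud.databricks.com
```

The `mode: development` setting is handy for teams: it isolates each developer's deployed copy of the resources so people can iterate without overwriting each other.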

In the include section we point to a folder containing multiple .yml files that define resources to deploy with the bundle. First is a multi-step workflow defined in resources/datakickstart_dabs_jobs.yml.
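A job resource file generally follows this shape — the task keys, notebook path, and cluster settings below are hypothetical stand-ins for the repo's actual values:

```yaml
# resources/datakickstart_dabs_jobs.yml — illustrative sketch
resources:
  jobs:
    datakickstart_dabs_job:
      name: datakickstart_dabs_job
      tasks:
        # Step 1: a notebook task (path is a placeholder)
        - task_key: ingest
          notebook_task:
            notebook_path: ../src/ingest.py
          job_cluster_key: main_cluster
        # Step 2: trigger the DLT pipeline defined elsewhere in the bundle
        - task_key: dlt_refresh
          depends_on:
            - task_key: ingest
          pipeline_task:
            pipeline_id: ${resources.pipelines.datakickstart_dabs_pipeline.id}
      job_clusters:
        - job_cluster_key: main_cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
            num_workers: 2
```

Note the `${resources.pipelines...id}` reference: the bundle resolves the pipeline's ID at deploy time, so the job and pipeline stay wired together without hard-coding IDs per environment.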

The second step of the workflow is a Delta Live Tables pipeline, which is defined in resources/datakickstart_dabs_pipeline.yml.
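The pipeline resource file is typically small: it names the pipeline and points at the notebooks or files containing the DLT code. Again, the target schema and source path here are assumptions for illustration:

```yaml
# resources/datakickstart_dabs_pipeline.yml — illustrative sketch
resources:
  pipelines:
    datakickstart_dabs_pipeline:
      name: datakickstart_dabs_pipeline
      target: datakickstart_dev   # schema where DLT writes tables (placeholder)
      libraries:
        # Notebook(s) containing the DLT table definitions
        - notebook:
            path: ../src/dlt_transform.py
```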

Additional examples

Add libraries
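To attach libraries to a job task, add a `libraries` block under the task in the job resource file. The package versions and wheel path below are placeholders:

```yaml
# Fragment of a job resource file — illustrative sketch
tasks:
  - task_key: ingest
    notebook_task:
      notebook_path: ../src/ingest.py
    libraries:
      # A package installed from PyPI (version is a placeholder)
      - pypi:
          package: "pyyaml==6.0"
      # A wheel built and deployed as part of the bundle
      - whl: ../dist/*.whl
```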

References

Data & AI Summit Presentation

Data & AI Summit Repo

Add existing job to bundle
