Azure Synapse Analytics, the next evolution of Azure SQL Data Warehouse, has taken the BI world by storm since its introduction in November 2019. Consulting firms are shifting their attention to a Serverless SQL Pool architecture, which lets them offer their clients a mature and performant logical data warehouse that can store structured as well as unstructured data, at a fraction of the cost of other prevalent technologies thanks to pay-for-what-you-use pricing.
However, this shift does not come without challenges. Serverless SQL & Spark Pools bring a new way of cooking up syntax and best practices with regard to Continuous Integration and Continuous Deployment (CI/CD).
These days, when even ChatGPT cannot come up with a helpful answer, you know you have struck gold for a useful article. That is why, in this article, we will address:
When deciding how to implement CI/CD in Azure Synapse Analytics (ASA), the main consideration can be summed up as: ‘To YAML or not to YAML’.
In other words, you should ask yourself whether you would prefer to develop your CI/CD pipeline solely based on code (i.e., ‘To YAML’) or whether to use the DevOps Releases User Interface (i.e., ‘Not to YAML’). Both come with advantages and disadvantages.
In this CI/CD scenario, a YAML file contains all the necessary instructions as code for packaging the artifacts and deploying them to an environment of choice.
An example of such a YAML file can be found in the publicly available Git repository of Ryoma Nagata (Microsoft MVP), which contains the ‘azure-pipelines-ci-cd-synapse-artifacts.yml’ file.
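To give a feel for what such a file looks like, here is a minimal sketch of a two-stage Azure Pipelines YAML definition. It is not taken from the repository mentioned above. The branch, stage, job, artifact, and environment names are hypothetical placeholders, and the deployment step is deliberately left as an `echo` placeholder where the actual Synapse deployment task for your workspace would go.

```yaml
# Illustrative sketch only: all names below are assumed placeholders.
trigger:
  branches:
    include:
      - main                      # assumed branch that triggers the pipeline

pool:
  vmImage: ubuntu-latest

stages:
  # Stage 1: package the workspace files as a pipeline artifact.
  - stage: Build
    jobs:
      - job: PackageArtifacts
        steps:
          - checkout: self
          - publish: $(Build.SourcesDirectory)
            artifact: synapse     # hypothetical artifact name

  # Stage 2: deploy the packaged artifact to an environment of choice.
  - stage: DeployTest
    dependsOn: Build
    jobs:
      - deployment: DeployToTest
        environment: synapse-test # hypothetical environment name
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: synapse
                # Placeholder: replace with the real Synapse deployment step.
                - script: echo "Deploy $(Pipeline.Workspace)/synapse to the test workspace"
                  displayName: Deploy Synapse artifacts (placeholder)
```

Because the whole workflow lives in one file next to the code, adding another environment is a matter of copying the deployment stage and changing its names.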
Advantages of this approach are:
A disadvantage of this approach is that you need to understand the YAML language in order to make changes to the CI/CD workflow. With no prior experience with YAML, this can be challenging.
In this CI/CD scenario, we will not use a YAML script to describe our CI/CD workflow. Instead, we will add agent jobs and tasks in the Azure DevOps Pipelines & Releases components (user interface).
An advantage of this is that the CI/CD workflow can be managed visually by putting tasks in the right sequence in the user interface.
Disadvantages are that this setup:
Both approaches are valid ways to implement CI/CD on Azure Synapse Analytics. Choosing between them boils down to your affinity and/or experience with YAML files and their perks.
Thanks for reading, and enjoy deploying!