A cross-tenant, metadata-driven processing framework for Azure Data Factory and Azure Synapse Analytics, achieved by coupling orchestration pipelines with a SQL database and a set of Azure Functions.
During a run of the processing framework, worker pipelines are triggered for execution. These worker pipelines can be deployed to any orchestrator resource, anywhere within the Microsoft Azure platform. In addition, a single set of metadata can call out to worker pipelines in multiple orchestrator instances, as long as the framework is provided with the details needed to authenticate against the target orchestrator instance where the worker pipeline will be triggered.
To clarify further, within the metadata database these authentication details are linked to an individual worker pipeline. This gives granular authentication for each worker pipeline, wherever it is deployed.
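As a rough illustration of this link between a worker pipeline and its authentication details, the sketch below models it in Python. The class names, field names, orchestrator type labels, and sample values are all hypothetical and do not reflect the framework's actual database schema; the real metadata lives in the SQL database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServicePrincipal:
    """Hypothetical record: identifies a service principal, not its secret."""
    principal_id: str
    secret_reference: str  # assumed: a pointer to where the secret is stored


@dataclass(frozen=True)
class WorkerPipeline:
    """Hypothetical record: where a worker pipeline lives and how to reach it."""
    pipeline_name: str
    tenant_id: str
    subscription_id: str
    orchestrator_name: str
    orchestrator_type: str  # "ADF" or "SYN" (assumed labels)
    principal: ServicePrincipal


def build_lookup(pipelines):
    """Index worker pipelines by name so auth details resolve per pipeline."""
    return {p.pipeline_name: p for p in pipelines}


# Illustrative values only.
shared = ServicePrincipal("sp-shared", "secret-shared")
isolated = ServicePrincipal("sp-pii", "secret-pii")

lookup = build_lookup([
    WorkerPipeline("Load Sales", "tenant-a", "sub-1", "factory-workers", "ADF", shared),
    WorkerPipeline("Load Customers", "tenant-b", "sub-2", "synapse-pii", "SYN", isolated),
])

# Each pipeline resolves to its own target orchestrator and service principal.
target = lookup["Load Customers"]
print(target.orchestrator_name, target.principal.principal_id)
```

Because the service principal hangs off the pipeline record rather than the orchestrator, two pipelines in the same orchestrator could, in this sketch, still authenticate with different principals.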
For example, the processing framework could be set up in any of the following ways, depending on your requirements:
Using a single Data Factory or Synapse instance for all framework pipelines and all worker pipelines, authenticated with a single service principal.
Using multiple Data Factory instances, one for framework pipelines and a second for all worker pipelines. The worker factory instance uses a single service principal for all worker pipelines.
Using a single Synapse instance for all pipelines, but every worker pipeline requires a different service principal to authenticate.
Using three orchestrators: one for framework pipelines, one for worker pipelines doing non-PII data processing, and one for worker pipelines doing PII data processing. Each worker orchestrator uses a different set of service principal details that are returned from different Azure Key Vault instances.
Using one Data Factory instance for all framework pipelines that calls six worker Synapse instances in different Azure Subscriptions. Each worker Synapse instance has a different service principal to authenticate against its worker pipelines.
Using one Data Factory instance for all framework pipelines in one Azure Tenant that calls three worker Data Factory instances, all in different Azure Tenants and different Azure Regions. Each worker Data Factory requires localised service principal details for the target worker pipelines on the target tenant.
These are just example scenarios; any other combination of tenant, subscription, and orchestrator is possible. See service principal handling for more details on setup and the authentication storage options within the framework.
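To make one of these scenarios concrete, the sketch below represents the PII and non-PII split as a handful of metadata rows and checks that no service principal is shared across that boundary. The row layout, the names, and the check itself are assumptions made for illustration; the framework is not known to ship a check like this.

```python
from collections import defaultdict

# Hypothetical metadata rows, one per worker pipeline.
rows = [
    {"pipeline": "Stage Orders", "orchestrator": "workers-general",
     "pii": False, "principal": "sp-general", "key_vault": "kv-general"},
    {"pipeline": "Stage Products", "orchestrator": "workers-general",
     "pii": False, "principal": "sp-general", "key_vault": "kv-general"},
    {"pipeline": "Stage Customers", "orchestrator": "workers-pii",
     "pii": True, "principal": "sp-pii", "key_vault": "kv-pii"},
]


def principals_shared_across_pii_boundary(rows):
    """Return principals used by both PII and non-PII worker pipelines."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["principal"]].add(row["pii"])
    # A principal that saw both True and False crosses the boundary.
    return sorted(p for p, flags in seen.items() if len(flags) > 1)


violations = principals_shared_across_pii_boundary(rows)
print(violations or "PII and non-PII principals are isolated")
```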
Azure Data Factory and Azure Synapse Analytics are also interchangeable in any of the above statements, given the processing framework's support for different orchestrator types.
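One way to picture this interchangeability is that only the target endpoint changes with the orchestrator type. The sketch below builds the REST URL for creating a pipeline run in either service. It is not taken from the framework's own code: the function name, the "ADF"/"SYN" labels, and the sample values are assumptions, and the API versions shown should be checked against the current Azure REST documentation before use.

```python
from urllib.parse import quote

# API versions as published for the pipeline createRun operations;
# verify against current Azure REST documentation.
ADF_API_VERSION = "2018-06-01"
SYNAPSE_API_VERSION = "2020-12-01"


def create_run_url(orchestrator_type, orchestrator_name, pipeline_name,
                   subscription_id="", resource_group=""):
    """Build the createRun URL for a worker pipeline (hypothetical helper)."""
    name = quote(pipeline_name, safe="")  # pipeline names may contain spaces
    if orchestrator_type == "ADF":
        # Data Factory runs are created through Azure Resource Manager.
        return (
            "https://management.azure.com"
            f"/subscriptions/{subscription_id}"
            f"/resourceGroups/{resource_group}"
            "/providers/Microsoft.DataFactory"
            f"/factories/{orchestrator_name}"
            f"/pipelines/{name}/createRun"
            f"?api-version={ADF_API_VERSION}"
        )
    if orchestrator_type == "SYN":
        # Synapse runs are created through the workspace development endpoint.
        return (
            f"https://{orchestrator_name}.dev.azuresynapse.net"
            f"/pipelines/{name}/createRun"
            f"?api-version={SYNAPSE_API_VERSION}"
        )
    raise ValueError(f"Unknown orchestrator type: {orchestrator_type!r}")


print(create_run_url("ADF", "factory-workers", "Load Sales", "sub-1", "rg-1"))
print(create_run_url("SYN", "synapse-pii", "Load Customers"))
```

In both cases the request would still need a bearer token obtained for the service principal attached to that worker pipeline, issued by the tenant that hosts the target orchestrator.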
A demonstration of using cross tenant worker pipelines is available on YouTube here for Azure Data Factory: