A cross tenant metadata driven processing framework for Azure Data Factory and Azure Synapse Analytics achieved by coupling orchestration pipelines with a SQL database and a set of Azure Functions.
Schema: procfwk
Definition: Used to provide a static list of available pipeline status outcomes that can be compared to email recipient requirements for event alerts at runtime. These status values are used in the context of a bitmask position when the framework performs any email alerting. The position means an email recipient can subscribe to any combination of pipeline outcomes.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | OutcomeBitPosition | int | 4 | No |
2 | PipelineOutcomeStatus | nvarchar | 400 | No |
3 | BitValue | int | 4 | Yes |
Schema: procfwk
Definition: Base metadata information used in the running of batch executions. For a given execution values from this table are inspected and taken into the [BatchExecution] table.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | BatchId | uniqueidentifier | 16 | No |
2 | BatchName | varchar | 255 | No |
3 | BatchDescription | varchar | 4000 | Yes |
4 | Enabled | bit | 1 | No |
Schema: procfwk
Definition: For new and previous executions of the processing framework using the batch executions concept this table will house details of when the batch started/finished and its status. Either; Running, Stopping, Stopped or Succeeded.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | BatchId | uniqueidentifier | 16 | No |
2 | ExecutionId | uniqueidentifier | 16 | No |
3 | BatchName | varchar | 255 | No |
4 | BatchStatus | nvarchar | 400 | No |
5 | StartDateTime | datetime | 8 | No |
6 | EndDateTime | datetime | 8 | Yes |
Schema: procfwk
Definition: Provides a link between the batch execution table and the execution stages. It is expected that a defined batch will have one or many execution stages, these can also be reused with the same stage belonging to different batches. Both attributes within the table form the primary key.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | BatchId | uniqueidentifier | 16 | No |
2 | StageId | int | 4 | No |
Schema: procfwk
Definition: For a given execution run this table will be used to handle all metadata exchanges between the orchestrator and the database. After a successful run the table is truncated. Indexing is used here to ensure the current execution run has access to the requested metadata given common where clause requirements.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | LocalExecutionId | uniqueidentifier | 16 | No |
2 | StageId | int | 4 | No |
3 | PipelineId | int | 4 | No |
4 | CallingOrchestratorName | nvarchar | 400 | No |
5 | ResourceGroupName | nvarchar | 400 | No |
6 | OrchestratorType | char | 3 | No |
7 | OrchestratorName | nvarchar | 400 | No |
8 | PipelineName | nvarchar | 400 | No |
9 | StartDateTime | datetime | 8 | Yes |
10 | PipelineStatus | nvarchar | 400 | Yes |
11 | LastStatusCheckDateTime | datetime | 8 | Yes |
12 | EndDateTime | datetime | 8 | Yes |
13 | IsBlocked | bit | 1 | No |
14 | PipelineRunId | uniqueidentifier | 16 | Yes |
15 | PipelineParamsUsed | nvarchar | max | Yes |
Schema: procfwk
Definition: To support the decoupling of worker pipelines from the orchestration pipelines this table houses information about the orchestrators used by the framework.
Within the table metadata one orchestrator needs to be set as the framework orchestrator, where the orchestration pipelines (parent, child, infant) are running from. This is done using the ‘IsFrameworkOrchestrator’ attribute.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | OrchestratorId | int | 4 | No |
2 | OrchestratorName | nvarchar | 400 | No |
3 | OrchestratorType | char | 3 | No |
4 | IsFrameworkOrchestrator | bit | No | |
5 | ResourceGroupName | nvarchar | 400 | No |
6 | SubscriptionId | uniqueidentifier | 16 | No |
7 | Description | nvarchar | max | Yes |
Schema: procfwk
Definition: In the event of a worker pipeline failure, activity level error message details will be captured and inserted into this table.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | LogId | int | 4 | No |
2 | LocalExecutionId | uniqueidentifier | 16 | No |
3 | PipelineRunId | uniqueidentifier | 16 | No |
4 | ActivityRunId | uniqueidentifier | 16 | No |
5 | ActivityName | varchar | 100 | No |
6 | ActivityType | varchar | 100 | No |
7 | ErrorCode | varchar | 100 | No |
8 | ErrorType | varchar | 100 | No |
9 | ErrorMessage | nvarchar | max | Yes |
Schema: procfwk
Definition: This table is used as a long term store from the current execution table. When the current execution table is cleared now records will be moved here.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | LogId | int | 4 | No |
2 | LocalExecutionId | uniqueidentifier | 16 | No |
3 | StageId | int | 4 | No |
4 | PipelineId | int | 4 | No |
5 | CallingOrchestratorName | nvarchar | 400 | No |
6 | ResourceGroupName | nvarchar | 400 | No |
7 | OrchestratorType | char | 3 | No |
8 | OrchestratorName | nvarchar | 400 | No |
9 | PipelineName | nvarchar | 400 | No |
10 | StartDateTime | datetime | 8 | Yes |
11 | PipelineStatus | nvarchar | 400 | Yes |
12 | EndDateTime | datetime | 8 | Yes |
13 | PipelineRunId | uniqueidentifier | 16 | Yes |
14 | PipelineParamsUsed | nvarchar | max | Yes |
Schema: procfwk
Definition: This table provides a many to many connection between the email recipients and the pipelines they wish to subscribe to for email alerts.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | AlertId | int | 4 | No |
2 | PipelineId | int | 4 | No |
3 | RecipientId | int | 4 | No |
4 | OutcomesBitValue | int | 4 | No |
5 | Enabled | bit | 1 | No |
Schema: procfwk
Definition: For the purposes of granular security when providing service principal details that can be used to execute worker pipelines this table provides a many to many link between pipelines and credentials. Further referential integrity checks are then done to include the orchestrator in the metadata relationship.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | AuthId | int | 4 | No |
2 | PipelineId | int | 4 | No |
3 | OrchestratorId | int | 4 | No |
4 | CredentialId | int | 4 | No |
Schema: procfwk
Definition: When using the failure handling property this table is used to establish the links for worker pipelines across the execution stages.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | DependencyId | int | 4 | No |
2 | PipelineId | int | 4 | No |
3 | DependantPipelineId | int | 4 | No |
Schema: procfwk
Definition: Worker pipeline parameters are stored in this table as metadata and provided to the worker at runtime by the framework infant pipeline. A given worker can have none or many parameters limited only by the underlying resources in terms of amount and data value size. In addition, a parameter value used will get persisted to the last value used attribute, this is then over written every time the worker requests its parameters.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | ParameterId | int | 4 | No |
2 | PipelineId | int | 4 | No |
3 | ParameterName | varchar | 128 | No |
4 | ParameterValue | nvarchar | max | Yes |
5 | ParameterValueLastUsed | nvarchar | max | Yes |
Schema: procfwk
Definition: This core table in the framework houses all worker pipelines that the framework is expected to call per batch/stage executions and for a given worker orchestrator.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | PipelineId | int | 4 | No |
2 | OrchestratorId | int | 4 | No |
3 | StageId | int | 4 | No |
4 | PipelineName | nvarchar | 400 | No |
5 | LogicalPredecessorId | int | 4 | Yes |
6 | Enabled | bit | 1 | No |
Schema: procfwk
Definition: Properties and values housed in this table provide runtime configuration information for the framework to influence behaviour and setup.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | PropertyId | int | 4 | No |
2 | PropertyName | varchar | 128 | No |
3 | PropertyValue | nvarchar | max | No |
4 | Description | nvarchar | max | Yes |
5 | ValidFrom | datetime | 8 | No |
6 | ValidTo | datetime | 8 | Yes |
Schema: procfwk
Definition: Named people and email addresses are stored in this table for the purposes of providing email alerting when worker pipelines are executed by the framework.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | RecipientId | int | 4 | No |
2 | Name | varchar | 255 | Yes |
3 | EmailAddress | nvarchar | 1000 | No |
4 | MessagePreference | char | 3 | No |
5 | Enabled | bit | 1 | No |
Schema: dbo
Definition: At runtime a worker pipeline will be executed by the framework functions. The function will authenticate against the target worker orchestrator instance using SPN details sorted in this table directly, or referenced by this table to Azure Key Vault. See service principal handling for details.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | CredentialId | int | 4 | No |
2 | PrincipalName | nvarchar | 512 | Yes |
3 | PrincipalId | uniqueidentifier | 16 | Yes |
4 | PrincipalSecret | varbinary | 256 | Yes |
5 | PrincipalIdUrl | nvarchar | max | Yes |
6 | PrincipalSecretUrl | nvarchar | max | Yes |
Schema: procfwk
Definition: This core table in the framework houses details of all the sequential execution stages that need to process by the framework in order based on the stage Id.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | StageId | int | 4 | No |
2 | StageName | varchar | 225 | No |
3 | StageDescription | varchar | 4000 | Yes |
4 | Enabled | bit | 1 | No |
Schema: procfwk
Definition: To support the decoupling of pipelines and orchestrators this table houses details of Azure Subscriptions that are connected with 1 or many worker orchestrator instances.
At least 1 subscription must exist within the metadata.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | SubscriptionId | uniqueidentifier | 16 | No |
2 | Name | nvarchar | 400 | No |
3 | Description | nvarchar | max | Yes |
4 | TenantId | uniqueidentifier | 16 | No |
Schema: procfwk
Definition: To support the decoupling of pipelines within orchestrator at this final level tenant details are sorted within this table and connected to 1 or many Azure Subscriptions.
At least 1 tenant must exist within the metadata.
Id | Attribute | Data Type | Length | Nullable |
---|---|---|---|---|
1 | TenantId | uniqueidentifier | 16 | No |
2 | Name | nvarchar | 400 | No |
3 | Description | nvarchar | max | Yes |