Snowflake-Labs/snowflake-flow-diff
Description: GitHub Action for Comparing Differences in Apache NiFi Flow Definitions
Language: Java
License: Apache-2.0
Stars: 17
Forks: 4
Open issues: 3
Created: 2025-01-30T13:09:47Z
Pushed: 2026-06-04T20:13:50Z
Default branch: main
Fork: no
Archived: no
README:
Snowflake Flow Diff for Apache NiFi
This action is brought to you by Snowflake. Don't hesitate to visit our website and reach out to us if you have questions about this action.
Usage
When using the GitHub Flow Registry Client in NiFi to version control your flows, add the file below, .github/workflows/flowdiff.yml, to the repository in which your flow definitions are versioned.
This workflow is triggered whenever a pull request is opened or reopened, or when a new commit is pushed to an existing pull request. It compares the flow(s) modified in the pull request and automatically comments on the pull request with a human-readable description of the changes.
name: Snowflake Flow Diff on Pull Requests
on:
  pull_request:
    types: [opened, reopened, synchronize]
jobs:
  execute_flow_diff:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    name: Executing Flow Diff
    steps:
      # checking out the code of the pull request (merge commit - if the PR is mergeable)
      - name: Checkout PR code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.ref }}
          fetch-depth: 0
          path: submitted-changes
      # getting the path of the flow definition(s) that changed
      - name: Get changed files
        id: files
        run: |
          cd submitted-changes
          # 1. grab every changed connector JSON
          files=$(git diff --name-only $(git merge-base HEAD origin/${{ github.event.pull_request.base.ref }}) HEAD | grep '\.json$')
          # 2. make it a comma-separated list (no trailing comma)
          bare=$(echo "$files" | tr '\n' ',' | sed 's/,$//')
          # 3. prefix for original-code (flowA)
          flowA=$(echo "$bare" | sed 's|[^,]\+|original-code/&|g')
          # 4. prefix for submitted-changes (flowB)
          flowB=$(echo "$bare" | sed 's|[^,]\+|submitted-changes/&|g')
          # 5. export both as outputs
          echo "flowA=$flowA" >> $GITHUB_OUTPUT
          echo "flowB=$flowB" >> $GITHUB_OUTPUT
      # checking out the code without the change of the PR
      - name: Checkout original code
        uses: actions/checkout@v4
        with:
          fetch-depth: 2
          path: original-code
      - run: cd original-code && git checkout HEAD^
      # Running the diff
      - name: Snowflake Flow Diff
        uses: snowflake-labs/snowflake-flow-diff@v0
        id: flowdiff
        with:
          flowA: ${{ steps.files.outputs.flowA }}
          flowB: ${{ steps.files.outputs.flowB }}

Note - you may want to replace grep '\.json$' with a more specific pattern to match your requirements.
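To make the list handling in the "Get changed files" step easier to follow, here is a minimal sketch that runs steps 2 to 4 on a hard-coded sample list. The helper name build_flow_list and the two sample file paths are hypothetical and only serve the illustration. The sketch uses sed -E where the workflow uses the GNU-style \+ so that it also runs with BSD sed.

```shell
# Sketch of steps 2-4 of the "Get changed files" step.
# build_flow_list is a hypothetical helper, not part of the action.
build_flow_list() {
  local prefix="$1" files="$2" bare
  # newline-separated list -> comma-separated list without trailing comma
  bare=$(echo "$files" | tr '\n' ',' | sed 's/,$//')
  # prepend "<prefix>/" to every comma-separated entry
  echo "$bare" | sed -E "s|[^,]+|$prefix/&|g"
}

# Hypothetical sample of changed flow definitions (one path per line)
changed='flows/ingest.json
flows/export.json'

flowA=$(build_flow_list original-code "$changed")
flowB=$(build_flow_list submitted-changes "$changed")

echo "flowA=$flowA"  # prints flowA=original-code/flows/ingest.json,original-code/flows/export.json
echo "flowB=$flowB"  # prints flowB=submitted-changes/flows/ingest.json,submitted-changes/flows/export.json
```

Both lists keep the same order, so the n-th entry of flowA and the n-th entry of flowB refer to the same flow definition before and after the change.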
GitHub Enterprise Server
This action works with GitHub Enterprise Server (GHE) out of the box. The GitHub API URL is automatically detected via github.api_url. If you need to override it for any reason, you can explicitly set the api-url input:
- name: Snowflake Flow Diff
  uses: snowflake-labs/snowflake-flow-diff@v0
  id: flowdiff
  with:
    flowA: ${{ steps.files.outputs.flowA }}
    flowB: ${{ steps.files.outputs.flowB }}
    api-url: https://github.example.com/api/v3

Checkstyle
Optionally, you can enable a checkstyle check on the new version of the flow. If violations of NiFi best practices are found, a message is added to the comment published on the pull request.
To enable checkstyle:
# Running the diff
- name: Snowflake Flow Diff
  uses: snowflake-labs/snowflake-flow-diff@v0
  id: flowdiff
  with:
    flowA: ${{ steps.files.outputs.flowA }}
    flowB: ${{ steps.files.outputs.flowB }}
    checkstyle: true
    # optional: path to YAML configuration of the rules
    checkstyle-rules: submitted-changes/.github/checkstyle/checkstyle-rules.yaml
    # optional: fail the action when violations are detected
    checkstyle-fail: true

If checkstyle-fail is set to true, the GitHub Action will exit with a non-zero status whenever checkstyle violations are detected, which ensures the workflow (and therefore the pull request) is blocked until the issues are fixed.
The YAML file can be used to include or exclude specific rules and to configure rule parameters. For example:
include:
  - concurrentTasks
rules:
  concurrentTasks:
    parameters:
      limit: 2
    overrides:
      ".*sql-connector":
        limit: 4
    exclude:
      - "Ignore.*"
    componentExclusions:
      "ProdFlow.*":
        - "1a59f65f-8b3a-3db9-982e-e0d334bd7e9c" # processor UUID to ignore
At the root level, if include is specified, only those rules will be executed. If exclude is specified AND include is not specified, all rules except the ones specified will be executed. For each rule, it is possible to specify default values for parameters (when supported by the rule, see below), to override parameters for specific flows using a regular expression against the flow name, to exclude flows using a regular expression, and to silence violations for specific component UUIDs via componentExclusions.
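The selection and override semantics described above can be sketched in a few lines of shell. This is an illustration under stated assumptions, not the action's implementation: select_rules and resolve_limit are hypothetical helpers, the list of available rules is a sample taken from rule ids mentioned in this README, and the sketch assumes that a regular expression must match the whole flow name and that the first matching override wins.

```shell
# Hypothetical helper: decide which rules run.
# Arguments: all rule ids, include list, exclude list (space-separated).
select_rules() {
  local all="$1" include="$2" exclude="$3" rule
  if [ -n "$include" ]; then
    # include is set: only those rules run, exclude is not consulted
    printf '%s\n' $include
    return 0
  fi
  for rule in $all; do
    case " $exclude " in
      *" $rule "*) ;;                 # listed in exclude: skipped
      *) printf '%s\n' "$rule" ;;
    esac
  done
}

# Hypothetical helper: resolve the concurrentTasks limit for a flow name.
# OVERRIDES holds "<flow-name regex> <limit>" pairs, one per line.
DEFAULT_LIMIT=2
OVERRIDES='.*sql-connector 4'
resolve_limit() {
  local flow="$1" pattern value
  while read -r pattern value; do
    # assumption: the regex must match the whole flow name (grep -x)
    if printf '%s\n' "$flow" | grep -Eqx -- "$pattern"; then
      printf '%s\n' "$value"
      return 0
    fi
  done <<EOF
$OVERRIDES
EOF
  printf '%s\n' "$DEFAULT_LIMIT"
}

select_rules "concurrentTasks noSelfLoop snapshotMetadata" "" "noSelfLoop"
resolve_limit "orders-sql-connector"
resolve_limit "orders-kafka"
```

With the sample values, the first call lists concurrentTasks and snapshotMetadata, and the two resolve_limit calls return the override (4) and the default (2) respectively.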
Component-level exclusions
Some rules (for example concurrentTasks, noSelfLoop, enforcePrioritizer, or backpressureThreshold) may need an exception for a single processor or connection while you keep the rule enabled for the rest of the flow. You can scope exclusions down to the UUID under the componentExclusions map, keyed by flow-name regular expressions:
include:
  - noSelfLoop
rules:
  noSelfLoop:
    componentExclusions:
      "ProductionFlow.*":
        - "2d8da922-fd1f-3519-9d54-6482dfd42c56"
      ".*": # optional global fallback
        - "50a3b081-d54d-3ad8-b74c-caa7fef59bb2"
When the flow name matches the regex, violations produced for the listed component IDs are ignored, while all other components continue to be checked normally. Use the same pattern for rules that operate on connections (for example enforcePrioritizer or backpressureThreshold) by listing the connection identifiers to suppress.
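The lookup described above can be pictured with the following sketch, which reuses the two entries from the noSelfLoop example. It is only an approximation of the behavior described in this README: is_suppressed is a hypothetical helper, the flow names passed to it are made up, and the sketch assumes the regular expression must match the whole flow name.

```shell
# Exclusion table from the example above: "<flow-name regex> <component UUID>"
# per line (a regex containing spaces would need a different encoding).
EXCLUSIONS='ProductionFlow.* 2d8da922-fd1f-3519-9d54-6482dfd42c56
.* 50a3b081-d54d-3ad8-b74c-caa7fef59bb2'

# Hypothetical helper: succeeds (exit 0) when a violation on the given
# component should be ignored for the given flow name.
is_suppressed() {
  local flow="$1" uuid="$2" pattern excluded
  while read -r pattern excluded; do
    # the UUID must be listed AND the regex must match the whole flow name
    if [ "$excluded" = "$uuid" ] &&
       printf '%s\n' "$flow" | grep -Eqx -- "$pattern"; then
      return 0
    fi
  done <<EOF
$EXCLUSIONS
EOF
  return 1
}

# Hypothetical flow names, checked against the sample UUIDs
if is_suppressed "ProductionFlowEU" "2d8da922-fd1f-3519-9d54-6482dfd42c56"; then
  echo "violation ignored"
fi
if ! is_suppressed "DevFlow" "2d8da922-fd1f-3519-9d54-6482dfd42c56"; then
  echo "violation reported"
fi
```

The second check reports the violation because DevFlow only matches the ".*" fallback, which lists a different UUID.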
Rule identifiers and exclusion targets
The following table summarizes what each rule reports and which identifiers you can expect when configuring componentExclusions:
| Rule id | Violation references | componentExclusions target |
| --- | --- | --- |
| concurrentTasks | Processor name and UUID | Processor UUID (VersionedProcessor#getIdentifier) |
| snapshotMetadata | Flow name only | _Not applicable_ |
| emptyParameter |…
Excerpt shown — open the source for the full document.