Collectors
The release process with Konflux is well-structured, and the documentation provides clear examples of how to supply data to the Release, ReleasePlan, or ReleasePlanAdmission resources for use within the release workflow.
Despite this, a limitation remains that prevents full workflow automation. In scenarios where a data field in one of the release resources needs to be populated with dynamic information retrieved from an external service before initiating the release, relying on manual steps or custom scripts introduces inefficiency and potential for error.
To address this limitation, Konflux includes a feature called collectors.
A collector is essentially a Python script executed as part of the tenant and managed collectors pipelines. It generates information that is embedded into the Release status. These pipelines are integrated into the release workflow and run at the very beginning, immediately after the validation step. As a result, the collected data becomes available to both the tenant and managed pipelines.
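To make the idea concrete, the following is a minimal sketch of what a collector-style script could look like. It is not taken from the official collectors repository: the function names, the injected `search` callable (a stand-in for the real Jira lookup), and the assumption that the script emits JSON on standard output are all hypothetical. Only the shape of the payload follows the examples shown later on this page.

```python
import json
from typing import Any, Callable, Dict, List


def collect_issues(
    url: str,
    query: str,
    search: Callable[[str, str], List[str]],
) -> Dict[str, Any]:
    """Build a releaseNotes-style payload from the issue keys returned by `search`.

    `search` is a hypothetical callable standing in for the real issue lookup.
    """
    # Derive the "source" value from the server URL, e.g. "issues.redhat.com".
    host = url.split("://", 1)[-1].rstrip("/")
    fixed = [{"id": key, "source": host} for key in search(url, query)]
    return {"releaseNotes": {"issues": {"fixed": fixed}}}


def fake_search(url: str, query: str) -> List[str]:
    # Stand-in for a network call; returns a fixed result for illustration only.
    return ["CPAAS-1234"]


if __name__ == "__main__":
    result = collect_issues(
        "https://issues.redhat.com",
        'project = "My Project" AND summary ~ "test issue"',
        fake_search,
    )
    # Assumption: the collector hands its result back as JSON on stdout.
    print(json.dumps(result, indent=2))
```

Injecting the lookup as a parameter keeps the sketch runnable without network access or credentials; a real collector would replace `fake_search` with an authenticated call to the external service.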
Using a collector in a Konflux release
To use a collector, the first step is to select one from the available options in the official repository. The structure of this repository may evolve over time, but the README.md file provides useful details about the available collectors and the data they produce. The key piece of information needed is the collector’s name, which will be referenced in one of the release resources.
Collectors can be defined in the following resources:
- ReleasePlan: Collectors defined here are executed by the tenant collectors pipeline, which runs in the tenant namespace.
- ReleasePlanAdmission: Collectors defined here are executed by the managed collectors pipeline, which runs in the managed namespace.
For example, to run the jira collector, which retrieves a list of Jira issues when provided with a server and a query, add the following configuration to the ReleasePlan:
apiVersion: appstudio.redhat.com/v1alpha1
kind: ReleasePlan
metadata:
  labels:
    release.appstudio.openshift.io/auto-release: 'true'  # (1)
    release.appstudio.openshift.io/standing-attribution: 'true'
  name: collectors-rp
  namespace: dev-workspace  # (2)
spec:
  application: <application-name>  # (3)
  collectors:
    serviceAccountName: <service-account>  # (4)
    items:  # (5)
      - name: project-issues
        params:
          - name: url
            value: https://issues.redhat.com
          - name: query
            value: 'project = "My Project" AND summary ~ "test issue"'
          - name: secretName
            value: "jira-collectors-secret"
        timeout: 60
        type: jira  # (6)
    secrets:  # (7)
      - jira-collectors-secret
  data: <key>  # (8)
  target: managed-workspace
(1) Optional: Controls whether Releases are created automatically for this ReleasePlan when tests pass. Defaults to true.
(2) The development team’s workspace. The collector pipeline will be executed in this namespace.
(3) The name of the application that you want to release via a pipeline in the development workspace.
(4) The ServiceAccount that the pipeline will use.
(5) List of collectors to run, along with the parameters to be passed to each.
(6) The collector type as seen in the official collectors repository.
(7) Secrets to be provided to the collectors.
(8) Optional: An unstructured key used for providing data for the managed Pipeline.
Retrieving collectors data
After the collectors pipelines complete execution, the output from each collector is added to the Release resource under the status.collectors field. Below is an example showing the result of a collector defined in the previously mentioned ReleasePlan:
apiVersion: appstudio.redhat.com/v1alpha1
kind: Release
...
status:
  collectors:
    tenant:
      - project-issues:
          releaseNotes:
            fixed:
              - id: "CVE-3444"
                source: "issues.redhat.com"
In this case, the project-issues collector generated a list of issues, which is included under status.collectors.tenant. Since this collector was defined in the ReleasePlan, its output is categorized under the tenant section. Collectors defined in a ReleasePlanAdmission will have their results stored under the managed key instead.
The following example shows a Release status containing results from multiple collectors, both tenant and managed:
apiVersion: appstudio.redhat.com/v1alpha1
kind: Release
...
status:
  collectors:
    managed:
      - foo:
          releaseNotes:
            cves:
              - key: "CVE-3444"
                component: "my-component"
    tenant:
      - bar:
          baz: qux
      - project-issues:
          releaseNotes:
            issues:
              fixed:
                - id: "CPAAS-1234"
                  source: "issues.redhat.com"
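As an illustration of how this structure can be read programmatically, here is a small sketch that looks up one collector's output in a Release that has already been loaded into a Python dictionary. The helper name `collector_output` is hypothetical, and the layout it assumes (a list of single-key mappings under each of `tenant` and `managed`) is inferred from the examples above rather than from a published schema.

```python
from typing import Any, Dict, Optional


def collector_output(release: Dict[str, Any], scope: str, name: str) -> Optional[Any]:
    """Return the output of collector `name` under status.collectors[scope], or None."""
    entries = release.get("status", {}).get("collectors", {}).get(scope, [])
    for entry in entries:
        # Each entry is assumed to be a mapping keyed by the collector's name.
        if name in entry:
            return entry[name]
    return None


# The multi-collector Release status from the example, as a Python dictionary.
release = {
    "status": {
        "collectors": {
            "managed": [
                {"foo": {"releaseNotes": {"cves": [
                    {"key": "CVE-3444", "component": "my-component"}
                ]}}},
            ],
            "tenant": [
                {"bar": {"baz": "qux"}},
                {"project-issues": {"releaseNotes": {"issues": {"fixed": [
                    {"id": "CPAAS-1234", "source": "issues.redhat.com"}
                ]}}}},
            ],
        }
    }
}

issues = collector_output(release, "tenant", "project-issues")
print(issues["releaseNotes"]["issues"]["fixed"][0]["id"])  # prints CPAAS-1234
```

Returning None for a missing collector, instead of raising, lets callers distinguish "collector did not run" from "collector ran and produced an empty result".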
Collectors in the managed pipeline
Releases can reference managed pipelines, which, as described in other sections, rely on the data field to retrieve user-provided information. To ensure that data generated by collectors is also considered, the contents of status.collectors are merged with the data fields from the Release, ReleasePlan, and ReleasePlanAdmission resources.
The order of precedence follows the same hierarchy previously described, with status.collectors having the lowest priority. This means that if both the collector output and any data field define the same key, the value from the data field will take precedence.
For example, if a collector like jira produces the following output:
status:
  collectors:
    tenant:
      - project-issues:
          releaseNotes:
            issues:
              fixed:
                - id: "CPAAS-1234"
                  source: "issues.redhat.com"
            cves:
              - key: "CVE-3444"
                component: "my-component"
And the ReleasePlanAdmission defines this:
data:
  releaseNotes:
    issues:
      fixed: []
Then the empty issues.fixed array from the data field will override the collector’s output.
In contrast, if the data field contains unrelated content:
data:
  foo: bar
Then both sources will be merged, and the final data used by the managed pipeline will be:
data:
  foo: bar
  releaseNotes:
    issues:
      fixed:
        - id: "CPAAS-1234"
          source: "issues.redhat.com"
    cves:
      - key: "CVE-3444"
        component: "my-component"
This merging strategy ensures flexibility while allowing user-defined data to take precedence when needed.
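The precedence rules above can be sketched in a few lines of Python. This is only an illustration of the behaviour described in this section, not the release service's actual implementation: the helper names `deep_merge` and `effective_data` are hypothetical, and the sketch assumes that nested mappings are merged key by key while any other value, including a list, is replaced wholesale by the higher-priority source.

```python
from copy import deepcopy
from typing import Any, Dict, List


def deep_merge(low: Dict[str, Any], high: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two mappings; values from `high` win when both define a key."""
    merged = deepcopy(low)
    for key, value in high.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend and merge key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced, not combined (an assumption).
            merged[key] = deepcopy(value)
    return merged


def effective_data(
    collector_outputs: List[Dict[str, Any]],
    *data_fields: Dict[str, Any],
) -> Dict[str, Any]:
    """Combine collector outputs (lowest priority) with data fields.

    `data_fields` must be passed in increasing order of precedence.
    """
    result: Dict[str, Any] = {}
    for output in collector_outputs:
        result = deep_merge(result, output)
    for data in data_fields:
        result = deep_merge(result, data)
    return result


collector = {
    "releaseNotes": {
        "issues": {"fixed": [{"id": "CPAAS-1234", "source": "issues.redhat.com"}]}
    }
}

# A data field that defines the same key overrides the collector output.
print(effective_data([collector], {"releaseNotes": {"issues": {"fixed": []}}}))

# Unrelated keys are simply combined with the collector output.
print(effective_data([collector], {"foo": "bar"}))
```

Copying values with `deepcopy` keeps the inputs untouched, so the same collector output can be merged against several data fields without one merge affecting the next.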