Architecture Of Konflux

51. KITE Architecture and Components

Date: 2025-09-08

# Status

Implementable

# Context

KITE is a proof of concept that detects problems blocking application releases in Konflux and creates and tracks issue records for them.

It prevents duplicate issue records, automates issue creation and resolution, and powers the Issues Dashboard where teams can view and manage disruptions.

# Concrete Use Cases and User Workflows

# Why Konflux Issues Dashboard vs External Issue Trackers

The Konflux Issues Dashboard serves a fundamentally different purpose from traditional issue tracking systems like Jira. Rather than replacing external issue management, it acts as an operational monitoring dashboard. It is similar to a car dashboard, which alerts you to mechanical problems that need attention now and says nothing about planning or project management.

# Primary Use Cases

1. Developer Morning Check-in Workflow

Scenario: A developer is managing 8 components across 3 different applications.

Current Problem:

With Issues Dashboard:

Value:

2. Cross-Component Failure Correlation

Scenario: A developer notices that their application is failing but doesn’t realize the failure is caused by a shared library issue affecting multiple teams.

Current Problem: Multiple teams independently debug the same root cause, wasting collective hours.

With Issues Dashboard:

3. User Support Rotation

Scenario: An associate on a Konflux support rotation hears from a user whose application won’t build and who doesn’t know why.

Current Problem:

With Issues Dashboard:

Value:

4. Security Vulnerability Response

Scenario: A critical CVE is discovered in a widely-used dependency (e.g. a container base image, Python package, or RPM package) that affects multiple components across different teams.

Current Problem:

With Issues Dashboard:

- Issue: "[SECURITY] Critical CVE-2024-1234 in base image - MintMaker updates failing"
- Severity: Critical
- Affected Scope: 15 components across 6 namespaces
- Failure Details:
  • 4 components: GitLab Renovate token expired in namespace secrets
  • 7 components: Container registry authentication failed
  • 3 components: Dependency version conflicts preventing clean updates
  • 1 component: Repository lacks required GitHub App installation
- Direct Links:
  • Failed Renovate execution logs for each component
  • CVE vulnerability details and severity assessment
  • Registry credential setup instructions
  • Dependency conflict resolution guide

Resolution Workflow:

  1. Immediate Alert: All affected teams see the security issue in their dashboard when MintMaker detects failures
  2. Targeted Remediation: Teams can address specific failure causes (refresh tokens, fix registry auth, resolve conflicts)
  3. Progress Tracking: As MintMaker successfully creates security PRs for each component and those PRs get merged, the issue scope automatically shrinks
  4. Auto-Resolution: Issue resolves completely when all components have successful security updates merged
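
The progress-tracking and auto-resolution steps above can be sketched as an issue whose scope is the set of still-affected components. This is a minimal illustration, not KITE's data model: the `ScopedIssue` type, its fields, and the `MarkFixed` method are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"sort"
)

// ScopedIssue is a hypothetical model of an issue whose scope is the set of
// components that are still affected. Field names are illustrative only and
// are not taken from the KITE schema.
type ScopedIssue struct {
	Title    string
	Affected map[string]bool // component name -> still failing
	Resolved bool
}

// NewScopedIssue creates an active issue covering the given components.
func NewScopedIssue(title string, components ...string) *ScopedIssue {
	affected := make(map[string]bool, len(components))
	for _, c := range components {
		affected[c] = true
	}
	return &ScopedIssue{Title: title, Affected: affected}
}

// MarkFixed removes a component from the scope (for example, after its
// security PR is merged) and resolves the issue once the scope is empty.
func (i *ScopedIssue) MarkFixed(component string) {
	delete(i.Affected, component)
	if len(i.Affected) == 0 {
		i.Resolved = true
	}
}

// Remaining returns the still-affected components in sorted order.
func (i *ScopedIssue) Remaining() []string {
	out := make([]string, 0, len(i.Affected))
	for c := range i.Affected {
		out = append(out, c)
	}
	sort.Strings(out)
	return out
}

func main() {
	issue := NewScopedIssue("[SECURITY] Critical CVE in base image", "api", "worker", "frontend")

	issue.MarkFixed("worker") // scope shrinks, issue stays active
	fmt.Println(issue.Remaining(), issue.Resolved)

	issue.MarkFixed("api")
	issue.MarkFixed("frontend") // last component fixed, issue resolves
	fmt.Println(issue.Remaining(), issue.Resolved)
}
```

The point of the sketch is that nobody closes the issue by hand: resolution is a side effect of the last affected component leaving the scope.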

Key Value:

# Why This Can’t Be Replaced by Jira

  1. Real-time Pipeline Integration: Issue records for failed PipelineRuns are created and resolved automatically as pipeline state changes.
  2. (Potentially) Zero Manual Overhead: Once teams are integrated, no human needs to create, categorize, or close issues.
  3. Temporal Context: Issues exist only while the underlying problems exist, so no stale-issue cleanup should be needed.
  4. Operational Focus: The Issues Dashboard shows “what is broken right now”, not “what work needs to be done”.
  5. Cross-team Correlation: The Issues Dashboard can group related failures across components and applications and inform every affected team.

# Common Concerns

Why not use existing issue trackers?

Isn’t this just another monitoring tool?

This dashboard fills the gap between low-level monitoring alerts and high-level project management. It gives developers a single-pane view of “what needs my immediate attention to keep shipping software”.

# Architecture Overview

The following diagram illustrates the key components and data flow of the KITE system:

graph TB
    subgraph "Konflux Cluster"
        K8S[Kubernetes Resources:<br/> PipelineRuns, Deployments, etc.]
        BO[KITE Bridge Operator]
        CTRL1[PipelineRun Controller]
        CTRL2[Custom Controller A]
        CTRL3[Custom Controller B]
    end

    subgraph "KITE Backend"
        API[KITE Backend API]
        WH[Webhook Endpoints]
        SVC[Issue Service]
        REPO[Issue Repository]
    end

    subgraph "Data Layer"
        PG[(PostgreSQL Database:<br/> Issues Records)]
    end

    subgraph "User Interfaces"
        DASH[Issues Dashboard]
        CLI[KITE CLI Tool]
        EXT[External Tools:<br/> Monitoring, Alerts]
    end

    %% Operator Flow
    BO -->|watches| K8S
    BO -->|watches specific resource| CTRL1
    BO -->|watches specific resource| CTRL2
    BO -->|watches specific resource| CTRL3
    CTRL1 -->|HTTP POST| API
    CTRL2 -->|HTTP POST| API
    CTRL3 -->|HTTP POST| API

    %% Backend Flow
    API -->|handle issue creation or update| SVC
    WH -->|handle issue creation or update| SVC
    SVC -->|Duplicate Prevention| REPO
    REPO -->|ACID Transactions| PG

    %% User Interface Flow
    DASH -->|Query Issues| API
    CLI -->|Query Issues| API
    EXT -->|REST API| API

# Decision

We will implement KITE as a distributed system with the following key architectural decisions:

# Bridge Operator Architecture

The KITE Bridge Operator implements the “bridge operator” pattern, which connects a Kubernetes environment with external systems not natively managed by Kubernetes.

The operator:

The operator runs as a standard Kubernetes controller with cluster-wide permissions to monitor resources across namespaces.

# KITE Backend Service

The KITE Backend is a Go-based REST API service that:

The backend is built using the Gin HTTP web framework and follows standard HTTP API patterns.
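
To make the shape of such a service concrete, here is a minimal sketch of an issue endpoint. It deliberately uses the standard library's `net/http` instead of Gin so that it runs without third-party modules. The `/issues` route, the `Issue` fields, and the in-memory `store` are assumptions for illustration, not the actual KITE API or schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// Issue is a hypothetical, minimal issue record; the real KITE schema is not
// described in this document.
type Issue struct {
	Title     string `json:"title"`
	Namespace string `json:"namespace"`
	Severity  string `json:"severity"`
}

// store is an in-memory stand-in for the PostgreSQL-backed repository.
type store struct {
	mu     sync.Mutex
	issues []Issue
}

// newMux wires a hypothetical /issues route: POST creates, GET lists.
func newMux(s *store) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/issues", func(w http.ResponseWriter, r *http.Request) {
		switch r.Method {
		case http.MethodPost:
			var in Issue
			err := json.NewDecoder(r.Body).Decode(&in)
			if err != nil || in.Title == "" || in.Namespace == "" {
				http.Error(w, "title and namespace are required", http.StatusBadRequest)
				return
			}
			s.mu.Lock()
			s.issues = append(s.issues, in)
			s.mu.Unlock()
			w.WriteHeader(http.StatusCreated)
		case http.MethodGet:
			s.mu.Lock()
			defer s.mu.Unlock()
			w.Header().Set("Content-Type", "application/json")
			json.NewEncoder(w).Encode(s.issues)
		default:
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		}
	})
	return mux
}

// do sends a request to the mux in-process, without opening a network socket.
func do(mux *http.ServeMux, method, body string) *httptest.ResponseRecorder {
	req := httptest.NewRequest(method, "/issues", strings.NewReader(body))
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, req)
	return rec
}

func main() {
	mux := newMux(&store{})

	rec := do(mux, http.MethodPost, `{"title":"Build failed","namespace":"team-a","severity":"major"}`)
	fmt.Println("create status:", rec.Code)

	rec = do(mux, http.MethodGet, "")
	fmt.Println("list status:", rec.Code, strings.TrimSpace(rec.Body.String()))
}
```

With Gin, the same routing would be expressed through its router and context types; the request validation and status codes would stay conceptually the same.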

# Team Integration Strategy

KITE provides two primary integration paths for teams to onboard their services and start tracking issues:

1. Build Custom Controllers: Teams develop custom controllers for the KITE Bridge Operator that:

2. Implement Custom Webhook Endpoints: Teams can develop custom webhook endpoints tailored to their specific events, giving them:

Recommended Integration Path Benefits

# Alternative Integration Approach

For teams that cannot integrate directly with KITE controllers or webhooks, external service integration is available through:

# External PostgreSQL Database

We have chosen to use an external PostgreSQL database instead of storing issues as Kubernetes Custom Resources in etcd for the following critical reasons:

# Protecting etcd from Overload

# Volume and Performance Considerations

# Duplicate Issue Prevention

The KITE backend implements several mechanisms to prevent duplicate issues from being created:

# Database-Level Protection

# Application-Level Logic
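
As one illustration of application-level duplicate prevention, the sketch below upserts issues by a natural key: a second report for the same key updates the existing active issue instead of creating a new one. The key fields (`Namespace`, `IssueType`, `Resource`), the `Repository` type, and its methods are hypothetical and chosen for this example; the document does not specify which fields KITE uses to identify duplicates.

```go
package main

import (
	"fmt"
	"sync"
)

// IssueKey is a hypothetical natural key. Two reports with the same key are
// treated as the same underlying problem.
type IssueKey struct {
	Namespace string
	IssueType string
	Resource  string
}

// Issue is an illustrative record, not the KITE schema.
type Issue struct {
	Key         IssueKey
	Description string
	State       string // "ACTIVE" or "RESOLVED"
	Occurrences int
}

// Repository keeps at most one active issue per key.
type Repository struct {
	mu     sync.Mutex
	active map[IssueKey]*Issue
}

func NewRepository() *Repository {
	return &Repository{active: make(map[IssueKey]*Issue)}
}

// Upsert creates an issue when no active one exists for the key; otherwise it
// updates the existing issue. The boolean reports whether a new record was created.
func (r *Repository) Upsert(key IssueKey, description string) (*Issue, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if existing, ok := r.active[key]; ok {
		existing.Description = description
		existing.Occurrences++
		return existing, false
	}
	issue := &Issue{Key: key, Description: description, State: "ACTIVE", Occurrences: 1}
	r.active[key] = issue
	return issue, true
}

// Resolve marks the active issue for the key as resolved and reports whether
// there was anything to resolve.
func (r *Repository) Resolve(key IssueKey) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	issue, ok := r.active[key]
	if !ok {
		return false
	}
	issue.State = "RESOLVED"
	delete(r.active, key)
	return true
}

// ActiveCount returns the number of active issues.
func (r *Repository) ActiveCount() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.active)
}

func main() {
	repo := NewRepository()
	key := IssueKey{Namespace: "team-a", IssueType: "build", Resource: "component-x"}

	_, created := repo.Upsert(key, "PipelineRun failed")
	fmt.Println("created:", created, "active:", repo.ActiveCount())

	issue, created := repo.Upsert(key, "PipelineRun failed again")
	fmt.Println("created:", created, "occurrences:", issue.Occurrences, "active:", repo.ActiveCount())

	fmt.Println("resolved:", repo.Resolve(key), "active:", repo.ActiveCount())
}
```

In a relational database, a uniqueness constraint over comparable columns could play the role that the map key plays here, so that concurrent writers cannot both insert the same issue.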

# Automatic Issue Lifecycle Management

KITE automates issue creation, updating, and resolution by pairing a custom controller with a webhook endpoint. This minimizes manual intervention and prevents duplicate issue records.

# High-Level Issue Automation Overview

This diagram shows a simplified flow of how KITE automatically detects and manages issues:

flowchart LR
    subgraph "Kubernetes Cluster"
        RESOURCE[Kubernetes Resource:<br/> PipelineRun, Deployment, etc.]
        CONTROLLER[KITE Bridge Operator:<br/> Controllers Watch & Evaluate Kubernetes Resource]
    end

    subgraph "KITE Backend"
        API[KITE API]
        LOGIC[Issue Management: <br/> Create/Update/Resolve]
        DB[(PostgreSQL: <br/>Issue Storage)]
    end

    subgraph "User Interface"
        DASHBOARD[Issues Dashboard]
    end

    %% Main Flow
    RESOURCE -->|State Changes| CONTROLLER
    CONTROLLER --> DECISION{Success or Failure?}

    DECISION -->|Failure| FAIL_REQ[POST endpoint to upsert issue]
    DECISION -->|Success| SUCCESS_REQ[POST endpoint to resolve active issues]

    FAIL_REQ -->|Send event| API
    SUCCESS_REQ -->|Send event| API

    API --> LOGIC
    LOGIC <--> DB
    DB --> DASHBOARD
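
The "Success or Failure?" decision in the flow above can be sketched as a small function that picks which backend endpoint receives the event. This is an illustration under stated assumptions: `RunState`, `endpointFor`, `report`, and both endpoint paths are hypothetical, and a real controller would obtain the state from the Kubernetes API instead of a literal.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// RunState is a simplified, hypothetical view of a watched resource such as a
// PipelineRun after it has finished.
type RunState struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
	Succeeded bool   `json:"-"`
	Reason    string `json:"reason,omitempty"`
}

// Hypothetical endpoint paths; the real KITE routes are not given in this document.
const (
	upsertPath  = "/hypothetical/issues/upsert"
	resolvePath = "/hypothetical/issues/resolve"
)

// endpointFor mirrors the decision node: failures upsert an issue, successes
// resolve any active issues for the resource.
func endpointFor(s RunState) string {
	if s.Succeeded {
		return resolvePath
	}
	return upsertPath
}

// report posts the event as JSON to the chosen endpoint and returns the HTTP status.
func report(client *http.Client, baseURL string, s RunState) (int, error) {
	body, err := json.Marshal(s)
	if err != nil {
		return 0, err
	}
	resp, err := client.Post(baseURL+endpointFor(s), "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	events := []RunState{
		{Name: "build-1", Namespace: "team-a", Reason: "TaskRun failed"},
		{Name: "build-2", Namespace: "team-a", Succeeded: true},
	}

	// Fake backend that records which path each event reached.
	paths := make(chan string, len(events))
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		paths <- r.URL.Path
		w.WriteHeader(http.StatusAccepted)
	}))
	defer backend.Close()

	for _, e := range events {
		code, err := report(backend.Client(), backend.URL, e)
		fmt.Println(e.Name, "->", <-paths, code, err)
	}
}
```

Keeping the decision in a pure function such as `endpointFor` makes the success/failure rule easy to unit test separately from the HTTP call.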

# Additional Architectural Decisions

# Modular Design

# Configuration Management

# Development & Operations

# Requirements Alignment

This section demonstrates how KITE’s architecture addresses the specified project requirements.

# Dashboard with Issues

Requirement: A dashboard with issues, where an issue groups one or more events that share a cause or are otherwise connected.

# Scope Support

Requirement:

Implementation:

# Filtering and Search Capabilities

Requirement:

Implementation:

Requirement: Users can follow the links in an issue to the logs or other information needed to debug and resolve the problem.

Implementation:

# Extensibility

Requirement: Dashboard must be easily extendable, especially when it comes to adding new issue types.

Implementation:

# Issue Types and Automatic Resolution

Requirements:

Implementation:

# Deployment and Integration

Requirements:

Implementation:

# API Access for External Tools

Requirement: Provide an API so external CI tools (for example RHEL on GitLab) can query issues related to a particular pipeline run.

Implementation:
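
As a hedged illustration of how an external tool might consume such an API, the sketch below builds a query URL for one pipeline run and decodes a JSON list of issues. The `/issues` path, the `namespace` and `pipelineRun` query parameters, and the response fields are assumptions made for this example; they are not documented KITE routes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// Issue holds the hypothetical response fields this example cares about.
type Issue struct {
	ID    string `json:"id"`
	Title string `json:"title"`
	State string `json:"state"`
}

// issuesURL builds the query URL for issues tied to one pipeline run.
// The path and parameter names are illustrative assumptions.
func issuesURL(baseURL, namespace, pipelineRun string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/issues"
	q := url.Values{}
	q.Set("namespace", namespace)
	q.Set("pipelineRun", pipelineRun)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// fetchIssues performs the GET request and decodes the JSON array of issues.
func fetchIssues(client *http.Client, rawURL string) ([]Issue, error) {
	resp, err := client.Get(rawURL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var issues []Issue
	if err := json.NewDecoder(resp.Body).Decode(&issues); err != nil {
		return nil, err
	}
	return issues, nil
}

func main() {
	// Fake backend that answers with one canned issue for the requested run.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		run := r.URL.Query().Get("pipelineRun")
		fmt.Fprintf(w, `[{"id":"1","title":"Failure in %s","state":"ACTIVE"}]`, run)
	}))
	defer backend.Close()

	u, err := issuesURL(backend.URL, "team-a", "build-123")
	if err != nil {
		panic(err)
	}
	issues, err := fetchIssues(backend.Client(), u)
	fmt.Println(issues, err)
}
```

Building the query with `url.Values` keeps parameter escaping correct when a pipeline run name contains spaces or other reserved characters.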

# Error Handling and Debugging

Requirement: The UI should surface backend API errors where they occur, for easier debugging.

Implementation:

# Consequences

# Positive

# Negative

# Future Considerations