agents-govern in this project

Language: Finnish → governance-study.md

This page describes what the agents-govern framework is, what problems it tries to solve, what "learning" means inside the framework, and how this project (blue-marlin) uses it.

The structural side (agents, communication channels, gates) lives in its own document: Agents map.

What is agents-govern?

Source, lightly adapted: agents-govern README (CC-BY-SA-4.0).

agents-govern is a governance framework for multi-agent AI systems in software development. When multiple AI agents work together on a codebase — planning, coding, testing, reviewing, deploying — they need boundaries, quality gates, and accountability. Without these, you get authority conflicts, capability drift, accountability gaps, and knowledge decay. The framework defines the structure to prevent those failures.

What this is: Governance of multi-agent collaboration in software development workflows — the boundaries, gates, and accountability needed when AI agents (and humans) collaborate to plan, code, review, test, and deploy software.

What this is not:

The framework is open source (CC-BY-SA-4.0). This project uses v0.34.0 (installed from the release tarball on 2026-04-26).

Five problems the framework addresses

Source, lightly adapted: framework.md §1 (CC-BY-SA-4.0).

Before designing agents, understand what goes wrong without governance. These problems were identified empirically in production multi-agent systems:

  1. Authority without boundaries. Two agents both believe they own a technical decision. The Planner scopes a feature one way; the Architect redesigns it. Neither knows the other acted — the result is incoherent oscillation between competing visions.

  2. Capability drift. An agent asked to "improve the documentation" decides that means refactoring the codebase. A "review this PR" agent starts making its own commits. Without constraints, agents expand their scope to match their capabilities, not their mandate.

  3. The accountability gap. Agent A delegates to Agent B, which calls Agent C, which modifies a shared resource. When something breaks, there is no trace of the delegation chain. You see the symptom but not the cause.

  4. Local optimization, global misalignment. Each agent optimizes its local objective. The coder writes elegant code, the tester achieves high coverage, the deployer ships fast. Each is right within its own scope, but the system-level outcome can still be wrong.

  5. Knowledge decay. What the framework has previously learned evaporates. The same bug is rediscovered repeatedly because no prior solution (or attempted solution) is recorded in any searchable form.

"Learning" in this context

agents-govern is an evidence-driven framework. That phrase has concrete structural meaning:

What "learning" is NOT

What "learning" IS

A learning record in the framework is a YAML structure that captures one concrete observation from running the project's governed pipeline. Each entry contains, at minimum:

Records live in learnings/<codename>.yaml — for this project, blue-marlin.yaml.

Where learning leads

Learning is a feedback loop into the framework's own evolution:

  1. An adopter project hits a gap, validates an assumption, or adapts a rule → records a learning entry
  2. The entry is submitted upstream (issue / MR)
  3. The InfoSec Sentinel and Contribution Auditor agents review the entry (does it leak information? is it manipulative?)
  4. Once an observation has corroboration from multiple projects, the framework version is revised — into a rule, a new gate check, or a tier promotion
  5. Single-adopter evidence stays provisional until a second adopter hits the same thing

This is why even individual entries are valuable: they are the raw evidence from which the framework evolves, and they don't need a "solution" at submission time.
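The promotion rule in steps 4 and 5 can be sketched in a few lines. This is only an illustration of the idea, not the framework's own code: the `Observation` class, the `CORROBORATION_THRESHOLD` constant, the status strings, and the second codename `red-heron` are all hypothetical names. The threshold of two adopters is read from "until a second adopter hits the same thing".

```python
from dataclasses import dataclass, field

# Assumption: "a second adopter" in the text implies a threshold of two
# distinct projects. The constant name is hypothetical.
CORROBORATION_THRESHOLD = 2


@dataclass
class Observation:
    """One upstream observation plus the adopter codenames that reported it."""

    summary: str
    adopters: set = field(default_factory=set)

    def record(self, codename: str) -> None:
        # A set ignores repeats, so one project reporting twice still counts once.
        self.adopters.add(codename)

    @property
    def status(self) -> str:
        if len(self.adopters) >= CORROBORATION_THRESHOLD:
            return "corroborated"
        return "provisional"


if __name__ == "__main__":
    obs = Observation("output-level invariant missing")
    obs.record("blue-marlin")
    print(obs.status)          # single adopter: provisional
    obs.record("red-heron")    # hypothetical second adopter
    print(obs.status)          # two distinct adopters: corroborated
```

Keeping adopters in a set means repeated reports from the same project never promote an observation by themselves, which matches the requirement for corroboration across projects.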

This project's adoption

| Setting | Value |
| --- | --- |
| Adoption layout | Layout B: framework vendored under agents-govern/ |
| Codename | blue-marlin (anonymous identifier in upstream learnings) |
| Framework version | v0.34.0 |
| Adoption started | 2026-04-26 |
| Active agents | 6 (Agents map) |
| Active gates | 2 (Gate 1 + Gate 2) |
| Human Governor | Jani Päijänen |
| LLM driver | Claude AI (via Claude Code) |

What this project has surfaced so far

The project has captured 17 learning entries in blue-marlin.yaml. Distribution:

| Category | Count | Severity | Count |
| --- | --- | --- | --- |
| gap | 6 | critical | 1 |
| adaptation | 6 | significant | 5 |
| validation | 5 | minor | 8 |
| | | informational | 3 |
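As a quick sanity check, the two distributions above should each add up to the 17 recorded entries. The sketch below encodes the reported counts and verifies that. It also shows how such a tally could be produced from parsed entries, assuming each entry is a mapping with `category` and `severity` keys. Those key names and the `tally` helper are assumptions for illustration, since the record schema is not spelled out on this page.

```python
from collections import Counter

# Counts as reported on this page.
TOTAL_ENTRIES = 17
CATEGORY_COUNTS = {"gap": 6, "adaptation": 6, "validation": 5}
SEVERITY_COUNTS = {"critical": 1, "significant": 5, "minor": 8, "informational": 3}


def sums_to_total(counts, total=TOTAL_ENTRIES):
    """Return True when the per-bucket counts add up to the entry total."""
    return sum(counts.values()) == total


def tally(entries, field_name):
    """Count entries by one field. The field names are hypothetical."""
    return Counter(entry[field_name] for entry in entries)


if __name__ == "__main__":
    print(sums_to_total(CATEGORY_COUNTS))  # 6 + 6 + 5 == 17
    print(sums_to_total(SEVERITY_COUNTS))  # 1 + 5 + 8 + 3 == 17

    # Made-up sample entries, only to show the shape of the tally.
    sample = [
        {"category": "gap", "severity": "critical"},
        {"category": "gap", "severity": "minor"},
        {"category": "validation", "severity": "minor"},
    ]
    print(tally(sample, "category"))
```

Running a check like this whenever the table is updated would catch a count that was changed in one column but not the other.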

From the framework's perspective, the most valuable entries fall into two groups. The first is the gap-class entries, where the framework did not cover the situation; three of these became upstream issues and one became a feature proposal. The second is the single critical-severity entry, which is only one data point but a meaningful demonstration:

Upstream proposals

| ID | Topic | Status |
| --- | --- | --- |
| C1 | Output-level invariants (Iter 7 gap) | Submitted (issue #39) |
| C2 | Explicit visual acceptance gate (Iter 13 gap) | Submitted (issue #40) |
| C3 | Lowest-common-denominator output (Iter 9–10 gap) | Submitted (issue #41) |
| D1–D4 | Documentary batch (4 minor) | Draft ready |
| E1 | agov-render-agents-map (new framework command + prototype) | Draft ready |

What the gates have caught

Concrete examples where Gate 2 review produced value (Gate 1 has mostly been fast-tracked in this project for small tasks):

| Iter | What the gate caught | Severity |
| --- | --- | --- |
| 13→15 | Beam placed above deck (trip hazard) | Critical |
| 7 | X-cross brace rendered horizontal due to rotation bug | Significant |
| 7 | Lower-waling z-formula placed it ABOVE the "upper" waling | Significant |
| 6 | Pytest invariants didn't catch the visual bug | Significant |
| 9–10 | DXF $INSUNITS missing; CAD tools misinterpreted the scale | Minor |
| 14a | DXFs missing unit suffix on dimension labels | Minor |

What the study shows so far

Observation. The governance process surfaces incidents that would otherwise ship unnoticed:

Caveats. This is illustrative, not statistical:


The concept sections (What is agents-govern, Five problems) are adapted from the framework's own README and framework.md, both licensed CC-BY-SA-4.0. The remaining sections are this project's own content.