Editor's Note
191-configuration_6818209b
Complete reference for configuring ZERG — config files, environment variables, quality gates, logging, plugins, and tuning.
Install
npx skills add https://github.com/tools-only/X-Skills --skill 191-configuration_6818209b
ZERG Configuration Guide
Table of Contents
- Configuration File
- Workers
- Quality Gates
- Pre-commit Hooks
- Logging
- Plugins
- Context Engineering
- Security
- MCP Servers
- Environment Variables
- Container Mode
- Tuning Guide
Configuration File
Location: .zerg/config.yaml
Created automatically by zerg init. All settings have sensible defaults.
Full Example
version: "1.0"
project_type: python
workers:
  default_count: 5
  max_count: 10
  context_threshold: 0.7
  timeout_seconds: 3600
  retry_attempts: 3
security:
  network_isolation: true
  filesystem_sandbox: true
  secrets_scanning: true
quality_gates:
  lint:
    command: "ruff check ."
    required: true
  typecheck:
    command: "mypy ."
    required: false
  test:
    command: "pytest"
    required: true
hooks:
  pre_commit:
    enabled: true
    security_checks:
      secrets_detection: true
      shell_injection: true
      block_on_violation: true
    quality_checks:
      ruff_lint: true
      warn_on_violation: true
plugins:
  enabled: true
  hooks:
    - event: level_complete
      command: echo "Level {level} done"
      timeout: 60
  quality_gates:
    - name: security-scan
      command: bandit -r src/
      required: false
      timeout: 300
mcp_servers:
  - name: filesystem
    command: npx
    args: ["-y", "@anthropic/mcp-filesystem"]
Workers
Control zergling behavior and resource allocation.
workers:
  default_count: 5 # Default workers for /zerg:rush
  max_count: 10 # Hard limit on concurrent workers
  context_threshold: 0.7 # Checkpoint at this context usage (0.0-1.0)
  timeout_seconds: 3600 # Max time per worker session
  retry_attempts: 3 # Max retries per task before marking blocked
| Setting | Range | Default | Description |
|---|---|---|---|
| default_count | 1-10 | 5 | Workers spawned when count not specified |
| max_count | 1-10 | 10 | Never exceed this many concurrent workers |
| context_threshold | 0.1-1.0 | 0.7 | Workers checkpoint at this context usage |
| timeout_seconds | 60-86400 | 3600 | Kill worker after this many seconds |
| retry_attempts | 1-10 | 3 | Retries before marking a task blocked |
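The ranges in the table above can be checked before a run. The following is a minimal sketch, not ZERG's own validation code: `validate_workers` and `WORKER_RANGES` are hypothetical names, and the rule that default_count should not exceed max_count is an assumption rather than something this guide states.

```python
# Hypothetical validator: the ranges are copied from the settings table above.
WORKER_RANGES = {
    "default_count": (1, 10),
    "max_count": (1, 10),
    "context_threshold": (0.1, 1.0),
    "timeout_seconds": (60, 86400),
    "retry_attempts": (1, 10),
}


def validate_workers(cfg: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key, (low, high) in WORKER_RANGES.items():
        if key not in cfg:
            continue  # missing keys are assumed to fall back to their defaults
        value = cfg[key]
        if not low <= value <= high:
            problems.append(f"{key}={value} is outside {low}-{high}")
    # Assumption: a default above the hard limit is treated as a mistake.
    if cfg.get("default_count", 5) > cfg.get("max_count", 10):
        problems.append("default_count exceeds max_count")
    return problems
```

Calling `validate_workers` on the parsed `workers` mapping before starting a run would surface out-of-range values early instead of at worker spawn time.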
Worker Count Guidelines
| Workers | Best For |
|---|---|
| 1-2 | Small features, learning ZERG |
| 3-5 | Medium features, balanced throughput |
| 6-10 | Large features, maximum parallelism |
Adding workers beyond the number of parallelizable tasks in the widest level yields diminishing returns.
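One way to read this guideline is as a cap on the useful worker count. The sketch below is illustrative only: `effective_workers` is a hypothetical helper, and `level_widths` (the number of parallelizable tasks in each level) is assumed to come from your own task graph.

```python
def effective_workers(requested: int, max_count: int, level_widths: list[int]) -> int:
    """Cap the worker count at max_count and at the widest level's task count."""
    widest = max(level_widths, default=1)  # an empty graph still needs one worker
    return max(1, min(requested, max_count, widest))
```

For example, requesting 8 workers for a feature whose levels contain 2, 5, and 3 tasks gives 5, because no level can keep more than 5 workers busy.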
Quality Gates
Quality gates run after each level merge. They validate the merged code before workers proceed to the next level.
quality_gates:
  lint:
    command: "ruff check ."
    required: true
  typecheck:
    command: "mypy . --ignore-missing-imports"
    required: false
  test:
    command: "pytest tests/ -v --tb=short"
    required: true
| Field | Description | Default |
|---|---|---|
| command | Shell command to run | Required |
| required | If true, failure blocks the merge | true |
Gate Results
| Result | Meaning | Action |
|---|---|---|
| pass | Exit code 0 | Continue to next level |
| fail | Non-zero exit | Block merge if required: true |
| timeout | Exceeded limit | Treated as failure |
| error | Could not run | Pause for intervention |
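The result table above maps onto a small classifier. This is a hedged sketch using only the Python standard library, not ZERG's gate runner: `run_gate` and `blocks_merge` are hypothetical names, the 300-second default timeout is borrowed from the plugin gate example, and treating an error as always blocking is an assumption drawn from "Pause for intervention".

```python
import shlex
import subprocess


def run_gate(command: str, timeout: float = 300.0) -> str:
    """Run one gate command and classify the outcome as pass/fail/timeout/error."""
    argv = shlex.split(command)  # tokenized, never passed through a shell
    if not argv:
        return "error"
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    except OSError:
        return "error"  # e.g. the executable does not exist
    return "pass" if completed.returncode == 0 else "fail"


def blocks_merge(result: str, required: bool) -> bool:
    """Decide whether a gate result should stop the merge."""
    if result == "pass":
        return False
    if result == "error":
        return True  # assumption: pausing for intervention halts regardless
    return required  # fail and timeout block only when the gate is required
```

With these helpers, `blocks_merge(run_gate("ruff check ."), required=True)` would report whether the lint gate stops the level merge.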
Adding Custom Gates
Via YAML (simple shell commands):
plugins:
  quality_gates:
    - name: security-scan
      command: bandit -r src/ --severity medium
      required: false
      timeout: 300
Via Python entry points (complex logic): See Plugin System.
Pre-commit Hooks
ZERG installs comprehensive pre-commit hooks at .zerg/hooks/pre-commit.
hooks:
  pre_commit:
    enabled: true
    security_checks:
      secrets_detection: true
      shell_injection: true
      block_on_violation: true
    quality_checks:
      ruff_lint: true
      warn_on_violation: true
Security Checks (Block Commit)
These patterns cause commits to be rejected:
| Check | Description |
|---|---|
| AWS Keys | AWS Access Key IDs |
| GitHub PATs | Personal Access Tokens |
| OpenAI Keys | OpenAI API Keys |
| Anthropic Keys | Anthropic API Keys |
| Private Keys | PEM key file headers |
| Dangerous shell patterns | Unsafe subprocess usage |
| Dynamic code patterns | Unsafe dynamic code patterns |
| Unsafe deserialization | Unsafe deserialization patterns |
| Sensitive Files | .env, credentials.json |
Quality Checks (Warn Only)
| Check | Description |
|---|---|
| Ruff Lint | Style issues in Python files |
| Debugger statements | Debug breakpoints left in code |
| Merge Markers | Unresolved conflict markers |
| Large Files | Files over 5MB |
ZERG-Specific Checks (Warn Only)
| Check | Validation |
|---|---|
| Branch Naming | zerg/{feature}/worker-{N} format |
| Print Statements | print calls in zerg/ directory |
| Hardcoded URLs | localhost:PORT outside tests |
Exempt Paths
Tests and fixtures are exempt: tests/, fixtures/, *_test.py, test_*.py, conftest.py
Logging
ZERG uses structured JSONL logging with per-worker and per-task output.
Log Locations
| Type | Path | Format |
|---|---|---|
| Worker logs | .zerg/logs/workers/worker-{id}.jsonl | Structured JSONL |
| Orchestrator | .zerg/logs/orchestrator.jsonl | Structured JSONL |
| Task execution | .zerg/logs/tasks/{TASK-ID}/execution.jsonl | Structured JSONL |
| Claude output | .zerg/logs/tasks/{TASK-ID}/claude_output.txt | Plain text |
| Verification | .zerg/logs/tasks/{TASK-ID}/verification_output.txt | Plain text |
| Git diff | .zerg/logs/tasks/{TASK-ID}/git_diff.patch | Patch format |
JSONL Entry Format
{
  "ts": "2026-01-28T10:30:45.123Z",
  "level": "info",
  "worker_id": 0,
  "feature": "user-auth",
  "message": "Task T1.1 started",
  "task_id": "T1.1",
  "phase": "execute",
  "event": "task_started",
  "data": {},
  "duration_ms": null
}
Log Rotation
Worker logs auto-rotate at 50 MB (renamed to .jsonl.1).
Aggregation
LogAggregator merges JSONL files by timestamp at query time. No pre-built aggregate file exists on disk. Use zerg logs --aggregate to query across all workers.
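Merging at query time can be sketched with a lazy k-way merge. This is not the LogAggregator implementation: `read_jsonl` and `merge_logs` are hypothetical names, and the sketch assumes each file is already in timestamp order and that every entry carries a `ts` field in the same UTC ISO 8601 format shown above, so string comparison matches chronological order.

```python
import heapq
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed entry per non-empty line."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def merge_logs(paths: list[Path]) -> Iterator[dict]:
    """Lazily interleave several JSONL files by their "ts" field."""
    streams = [read_jsonl(path) for path in paths]
    # heapq.merge keeps only one pending entry per file in memory.
    return heapq.merge(*streams, key=lambda entry: entry["ts"])
```

Because the merge is lazy, no aggregate file needs to exist on disk, which matches the behavior described above.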
Plugins
plugins:
  enabled: true
  hooks:
    - event: task_completed
      command: echo "Task {task_id} done"
      timeout: 60
    - event: level_complete
      command: ./scripts/notify.sh "Level {level} done"
      timeout: 120
  quality_gates:
    - name: security-scan
      command: bandit -r src/ --severity medium
      required: false
      timeout: 300
See the Plugin System documentation for Python entry point plugins, the security model, and examples.
Context Engineering Plugin
The context engineering plugin minimizes token usage across workers. See Context Engineering for full details.
plugins:
  context_engineering:
    enabled: true # Master switch
    command_splitting: true # Split large commands into core/details
    security_rule_filtering: true # Filter security rules by task file types
    task_context_budget_tokens: 4000 # Max tokens per task context
    fallback_to_full: true # Fall back to full context on errors
| Setting | Default | Description |
|---|---|---|
| enabled | true | Enable/disable all context engineering |
| command_splitting | true | Split commands into .core.md and .details.md |
| security_rule_filtering | true | Filter security rules by task file types |
| task_context_budget_tokens | 4000 | Maximum tokens for task-scoped context |
| fallback_to_full | true | If context engineering fails, load full context |
Hook Event Types
| Event | When |
|---|---|
| task_started | Worker begins task |
| task_completed | Task passes verification |
| level_complete | All tasks in level done |
| merge_complete | Level branches merged |
| worker_spawned | New worker starts |
| quality_gate_run | Gate runs |
| rush_started | /zerg:rush begins |
| rush_finished | All levels complete |
Variable Substitution
Hook commands support: {level}, {feature}, {task_id}, {worker_id}
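A sketch of how such substitution might be done without a shell follows. It is an illustration, not ZERG's code: `render_hook_command` is a hypothetical name, and splitting the template before substituting is a design assumption chosen so that a substituted value can never introduce extra arguments.

```python
import re
import shlex

# The four placeholders listed above; anything else is left untouched.
ALLOWED_VARS = ("level", "feature", "task_id", "worker_id")
_PLACEHOLDER = re.compile(r"\{(" + "|".join(ALLOWED_VARS) + r")\}")


def render_hook_command(template: str, context: dict) -> list[str]:
    """Tokenize the template, then fill placeholders inside each token."""

    def substitute(token: str) -> str:
        # Placeholders without a value in context stay as written.
        return _PLACEHOLDER.sub(
            lambda match: str(context.get(match.group(1), match.group(0))), token
        )

    return [substitute(token) for token in shlex.split(template)]
```

For the first hook in the example configuration, `render_hook_command('echo "Level {level} done"', {"level": 2})` returns `["echo", "Level 2 done"]`, which can be passed directly to `subprocess.run`.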
Security
security:
  network_isolation: true
  filesystem_sandbox: true
  secrets_scanning: true
Environment Variable Filtering
ZERG controls which environment variables are passed to workers:
Allowed:
- ZERG_WORKER_ID, ZERG_FEATURE, ZERG_WORKTREE
- ANTHROPIC_API_KEY, OPENAI_API_KEY
- CI, DEBUG, LOG_LEVEL
Blocked:
- LD_PRELOAD, DYLD_INSERT_LIBRARIES
- PYTHONPATH, HOME, USER, SHELL
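The filtering above can be pictured as an allowlist applied to the parent environment. This is a minimal sketch with a hypothetical `filter_worker_env` helper. It assumes allowlist semantics, meaning variables that appear in neither list are dropped; the guide does not say how unlisted variables are handled.

```python
ALLOWED = {
    "ZERG_WORKER_ID", "ZERG_FEATURE", "ZERG_WORKTREE",
    "ANTHROPIC_API_KEY", "OPENAI_API_KEY",
    "CI", "DEBUG", "LOG_LEVEL",
}
BLOCKED = {
    "LD_PRELOAD", "DYLD_INSERT_LIBRARIES",
    "PYTHONPATH", "HOME", "USER", "SHELL",
}


def filter_worker_env(env: dict[str, str]) -> dict[str, str]:
    """Keep only allowed variables; blocked names are never passed through."""
    return {
        name: value
        for name, value in env.items()
        if name in ALLOWED and name not in BLOCKED
    }
```

A caller would typically pass `dict(os.environ)` and hand the result to the worker process as its environment.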
Command Safety
| Protection | Implementation |
|---|---|
| No shell=True | Commands parsed with shlex |
| Allowlist | Commands checked against config |
| Timeout | Every command has max duration |
| Output capture | Separate stdout/stderr |
Task ID Validation
Task IDs are validated against: [A-Za-z][A-Za-z0-9_-]{0,63}
Rejects shell metacharacters, path traversal, and IDs longer than 64 chars.
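The pattern above can be exercised directly with Python's `re` module. `is_valid_task_id` is a hypothetical helper name; using `fullmatch` rather than `match` is what enforces the 64-character limit.

```python
import re

# Pattern copied from the text above: one letter, then up to 63 more characters.
TASK_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,63}")


def is_valid_task_id(task_id: str) -> bool:
    """True only when the whole string matches the pattern."""
    return TASK_ID_PATTERN.fullmatch(task_id) is not None
```

Strings such as `../etc/passwd` or `T1; rm -rf /` fail because `/`, `.`, `;`, and spaces are outside the allowed character set.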
MCP Servers
Configure MCP servers available to workers:
mcp_servers:
  - name: filesystem
    command: npx
    args: ["-y", "@anthropic/mcp-filesystem", "/workspace"]
  - name: github
    command: npx
    args: ["-y", "@anthropic/mcp-github"]
MCP server configuration is copied to worker containers at .devcontainer/mcp-servers/config.json.
Cross-Cutting Capabilities
The following configuration sections control the cross-cutting capabilities framework.
Engineering Rules
rules:
  enabled: true # Master switch for rule injection
  base_rules: true # Include built-in safety/quality/efficiency rules
  custom_rules: true # Include project-specific custom rules
  disabled_rules: [] # List of rule IDs to disable
  inject_into_workers: true # Inject relevant rules into worker context
Rule files are YAML in .zerg/rules/ (safety.yaml, quality.yaml, efficiency.yaml). Rules are filtered by file extension and injected into worker context at ~15% of the task context budget.
Efficiency
efficiency:
  auto_compact_threshold: 0.75 # Context usage fraction (0.75 = 75%) that triggers compact mode
  symbol_system: true # Use symbols for status/domain indicators
  abbreviations: true # Abbreviate common terms (configuration→cfg)
Improvement Loops
improvement_loops:
  max_iterations: 5 # Max loop iterations (1-10)
  plateau_threshold: 2 # Consecutive no-improvement rounds to stop (1-5)
  rollback_on_regression: true # Revert if score decreases
  convergence_threshold: 0.02 # Min improvement to count as progress (0.001-0.5)
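How these four settings might interact is sketched below. This is an interpretation, not ZERG's loop: `run_improvement_loop` is a hypothetical function, `scores` stands in for whatever each iteration measures, and two points are assumptions: a gain smaller than convergence_threshold leaves the best score unchanged, and a regression is modeled as a rollback to the previous best.

```python
from typing import Iterable


def run_improvement_loop(
    scores: Iterable[float],
    baseline: float,
    max_iterations: int = 5,
    plateau_threshold: int = 2,
    convergence_threshold: float = 0.02,
) -> tuple[float, str]:
    """Return (best score kept, reason the loop stopped)."""
    best = baseline
    stalled_rounds = 0
    # zip stops after max_iterations rounds or when scores run out.
    for _, score in zip(range(max_iterations), scores):
        if score - best >= convergence_threshold:
            best = score  # real progress: keep the new state
            stalled_rounds = 0
        else:
            # Too small a gain, or a regression that is rolled back.
            stalled_rounds += 1
            if stalled_rounds >= plateau_threshold:
                return best, "plateau"
    return best, "completed"
```

With a baseline of 0.50 and iteration scores 0.60, 0.61, 0.605, the loop keeps 0.60 and stops with "plateau" after two consecutive rounds without sufficient gain.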
Verification Gates
verification:
  require_before_completion: true # Require verification before marking done
  staleness_threshold_seconds: 300 # Re-run if older than this (10-3600)
  store_artifacts: true # Store verification results as JSON
  artifact_dir: ".zerg/artifacts" # Artifact storage directory
Behavioral Modes
behavioral_modes:
  auto_detect: true # Auto-detect mode from task keywords
  default_mode: precision # Default when no mode detected
  log_transitions: true # Log mode changes
Available modes: precision, speed, exploration, refactor, debug.
MCP Auto-Routing
mcp_routing:
  auto_detect: true # Enable capability-based server matching
  available_servers: # Servers to consider
    - sequential
    - context7
    - playwright
    - morphllm
    - magic
    - serena
  cost_aware: true # Optimize for lower-cost servers
  telemetry: true # Record routing decisions
  max_servers: 3 # Max servers per task (1-6)
TDD Enforcement
tdd:
  enabled: false # Master switch (off by default)
  enforce_red_green: true # Require red→green→refactor order
  anti_patterns: # Anti-patterns to detect
    - mock_heavy
    - testing_impl
    - no_assertions
Heartbeat Monitoring
heartbeat:
  interval_seconds: 15 # How often workers write heartbeat (5-300)
  stall_timeout_seconds: 120 # Seconds before declaring a worker stalled (30-600)
  max_restarts: 2 # Auto-restarts before reassigning tasks (0-5)
The orchestrator reads heartbeat files from .zerg/state/heartbeat-{worker_id}.json. When a worker's heartbeat is older than stall_timeout_seconds, the orchestrator marks it as STALLED and triggers auto-restart. After max_restarts consecutive stalls, the worker's tasks are reassigned to a fresh worker.
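The stall logic described above reduces to a small decision function. The sketch below is illustrative: `WorkerHealth` and `next_action` are hypothetical names, timestamps are plain epoch seconds, and reading and parsing the heartbeat file is left out because its schema is not documented here.

```python
from dataclasses import dataclass


@dataclass
class WorkerHealth:
    last_heartbeat: float  # epoch seconds of the newest heartbeat
    restarts: int = 0      # consecutive stalls already answered with a restart


def next_action(
    worker: WorkerHealth,
    now: float,
    stall_timeout_seconds: float = 120,
    max_restarts: int = 2,
) -> str:
    """Return "healthy", "restart", or "reassign" for one worker."""
    if now - worker.last_heartbeat <= stall_timeout_seconds:
        return "healthy"
    if worker.restarts < max_restarts:
        return "restart"  # stalled, but restarts remain
    return "reassign"     # restart budget used up: move tasks to a fresh worker
```

With the defaults, a worker that writes a heartbeat every 15 seconds can miss several writes before the 120-second timeout marks it as stalled.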
Escalation Handling
escalation:
  auto_interrupt: true # Alert terminal on new escalations
  poll_interval_seconds: 5 # How often orchestrator checks for escalations (1-60)
Workers escalate ambiguous failures (unclear spec, missing dependency, unclear verification criteria) to .zerg/state/escalations.json. When auto_interrupt is enabled, the orchestrator prints escalation alerts to stderr.
Three-Tier Verification
verification_tiers:
  tier1_blocking: true # Tier 1 (syntax) blocks on failure
  tier1_command: null # Override: custom lint/typecheck command
  tier2_blocking: true # Tier 2 (correctness) blocks on failure
  tier2_command: null # Override: custom test command (defaults to task verification)
  tier3_blocking: false # Tier 3 (quality) does not block by default
  tier3_command: null # Override: custom quality check command
Workers execute verification in three tiers. Blocking tiers must pass for the task to complete. Non-blocking tiers are logged but don't prevent progress. If no custom command is set, Tier 2 uses the task's verification.command from the task graph.
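The blocking rule can be expressed in a few lines. This is a sketch with hypothetical names (`DEFAULT_BLOCKING`, `task_can_complete`); it assumes a tier with no recorded result counts as not passed.

```python
# Defaults mirror tier1_blocking, tier2_blocking, and tier3_blocking above.
DEFAULT_BLOCKING = {1: True, 2: True, 3: False}


def task_can_complete(
    tier_passed: dict[int, bool],
    blocking: dict[int, bool] = DEFAULT_BLOCKING,
) -> bool:
    """A task completes only if every blocking tier passed."""
    return all(
        tier_passed.get(tier, False)  # assumption: a missing result is a failure
        for tier, is_blocking in blocking.items()
        if is_blocking
    )
```

Under the defaults, a Tier 3 failure is only logged, while a Tier 1 or Tier 2 failure keeps the task from completing.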
Repository Symbol Map
repo_map:
  enabled: true # Build symbol graph at rush start
  languages: # Languages to extract symbols from
    - python
    - javascript
    - typescript
  max_tokens_per_module: 3000 # Max tokens for symbol output per module (500-10000)
  context_budget_percent: 15 # % of task context budget for repo map (5-30)
At rush start, ZERG builds a symbol graph using Python AST for .py files and regex extraction for .js/.ts/.jsx/.tsx files. The context plugin queries the graph per-task to inject relevant symbols (functions, classes, imports) into worker prompts, giving workers awareness of nearby code without reading full source files.
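To make the percentages concrete, the arithmetic below splits the default task context budget of 4000 tokens. It is a back-of-the-envelope sketch: `split_context_budget` is a hypothetical helper, and it assumes the repo map share and the roughly 15% engineering rules share are both carved out of the same budget, which the guide implies but does not spell out.

```python
def split_context_budget(
    task_context_budget_tokens: int = 4000,
    repo_map_percent: int = 15,
    rules_percent: int = 15,
) -> dict[str, int]:
    """Return the token allowance for each consumer of the task context budget."""
    repo_map = task_context_budget_tokens * repo_map_percent // 100
    rules = task_context_budget_tokens * rules_percent // 100
    return {
        "repo_map": repo_map,
        "rules": rules,
        "remaining": task_context_budget_tokens - repo_map - rules,
    }
```

With the defaults this gives 600 tokens for the repo map, 600 for rules, and 2800 for the rest of the task context.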
Token Metrics
token_metrics:
  enabled: true # Master switch for token usage tracking
  api_counting: false # Use Anthropic API for exact counts (requires API key)
  cache_enabled: true # Cache token counts to avoid re-counting
  cache_ttl_seconds: 3600 # Cache time-to-live in seconds (60-86400)
  fallback_chars_per_token: 4.0 # Heuristic ratio when API counting is off (1.0-10.0)
Each worker writes token usage to .zerg/state/tokens-{worker_id}.json with per-task breakdowns (command template, task context, repo map, security rules, spec excerpt). Use /zerg:status to view aggregate token consumption across workers. When api_counting is enabled, the Anthropic count_tokens API provides exact counts; otherwise a character-based heuristic (len(text) / fallback_chars_per_token) is used.
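The character-based heuristic is simple enough to show in full. `estimate_tokens` is a hypothetical name, and rounding up to a whole token is an assumption; the guide gives only the ratio `len(text) / fallback_chars_per_token`.

```python
import math


def estimate_tokens(text: str, fallback_chars_per_token: float = 4.0) -> int:
    """Estimate a token count from character length."""
    if not 1.0 <= fallback_chars_per_token <= 10.0:
        raise ValueError("fallback_chars_per_token must be within 1.0-10.0")
    # Assumption: partial tokens are rounded up.
    return math.ceil(len(text) / fallback_chars_per_token)
```

At the default ratio, a 10-character string is estimated at 3 tokens and a 4000-token budget corresponds to about 16,000 characters.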
Error Recovery
error_recovery:
  circuit_breaker:
    enabled: true
    failure_threshold: 3 # Failures before tripping (1-20)
    cooldown_seconds: 60 # Recovery wait time (5-600)
  backpressure:
    enabled: true
    failure_rate_threshold: 0.5 # Rate to trigger throttling (0.1-1.0)
    window_size: 10 # Rolling window size (3-100)
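The two mechanisms configured above follow well-known patterns, sketched here in generic form. Neither class is ZERG's implementation: `CircuitBreaker` and `Backpressure` are hypothetical names, the breaker counts consecutive failures, and the backpressure check waits for a full window before throttling; both of those behaviors are assumptions.

```python
from collections import deque
from typing import Optional


class CircuitBreaker:
    """Stop dispatching after repeated failures until a cooldown passes."""

    def __init__(self, failure_threshold: int = 3, cooldown_seconds: float = 60.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown_seconds:
            self.opened_at = None  # cooldown elapsed: close the breaker again
            self.failures = 0
            return True
        return False

    def record(self, success: bool, now: float) -> None:
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # trip the breaker


class Backpressure:
    """Throttle when the failure rate over a rolling window is too high."""

    def __init__(self, failure_rate_threshold: float = 0.5, window_size: int = 10):
        self.failure_rate_threshold = failure_rate_threshold
        self.results: deque = deque(maxlen=window_size)

    def record(self, success: bool) -> None:
        self.results.append(success)

    def should_throttle(self) -> bool:
        if len(self.results) < self.results.maxlen:
            return False  # assumption: wait until the window is full
        failure_rate = self.results.count(False) / len(self.results)
        return failure_rate >= self.failure_rate_threshold
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the breaker deterministic and easy to test.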
Environment Variables
Required
| Variable | Description |
|---|---|
| ANTHROPIC_API_KEY | Claude API key (for container/subprocess modes) |
ZERG-Specific (Set by Orchestrator)
| Variable | Description |
|---|---|
| ZERG_WORKER_ID | Worker identifier (0-N) |
| ZERG_FEATURE | Current feature name |
| ZERG_BRANCH | Worker's git branch |
| ZERG_ANALYSIS_DEPTH | Analysis depth tier (quick/standard/think/think_hard/ultrathink) |
| ZERG_COMPACT_MODE | Compact output mode (true/false) |
| ZERG_MCP_HINT | Recommended MCP servers for the task |
| CLAUDE_CODE_TASK_LIST_ID | Shared task list for coordination |
Optional
| Variable | Description | Default |
|---|---|---|
| LOG_LEVEL | Logging verbosity | info |
| DEBUG | Enable debug output | false |
| CI | CI environment flag | unset |
Container Mode
Setup
# 1. Initialize with container support
zerg init --with-containers
# 2. Build the devcontainer image
devcontainer build --workspace-folder .
# 3. Run with container mode
zerg rush --mode container --workers 5
Authentication
Container workers authenticate using one of two methods:
| Method | How | Best For |
|---|---|---|
| OAuth | Mount ~/.claude into container | Claude Pro/Team accounts |
| API Key | Pass ANTHROPIC_API_KEY env var | API key authentication |
Container Configuration
The devcontainer is configured at .devcontainer/devcontainer.json:
{
  "name": "project-zerg",
  "build": {
    "dockerfile": "Dockerfile",
    "context": ".."
  },
  "mounts": [
    "source=${localWorkspaceFolder},target=/workspace,type=bind",
    "source=zerg-claude-tasks,target=/root/.claude/tasks,type=volume"
  ],
  "containerEnv": {
    "CLAUDE_CODE_TASK_LIST_ID": "${localEnv:ZERG_FEATURE}",
    "ZERG_WORKER_ID": "${localEnv:ZERG_WORKER_ID:-0}"
  }
}
Docker Network
Container mode creates a zerg-internal Docker network for worker isolation. Workers communicate via state files mounted from the host.
Tuning Guide
For Speed
workers:
  default_count: 8 # More workers
  timeout_seconds: 1800 # Shorter timeout
  context_threshold: 0.8 # Use more context before checkpoint
For Reliability
workers:
  default_count: 3 # Fewer workers, less contention
  retry_attempts: 5 # More retries
  context_threshold: 0.6 # Checkpoint earlier
quality_gates:
  lint:
    required: true
  typecheck:
    required: true
  test:
    required: true
For Large Features
workers:
  default_count: 10
  max_count: 10
  timeout_seconds: 7200 # 2 hours per worker
For CI/CD
workers:
  default_count: 5
  timeout_seconds: 3600
quality_gates:
  lint:
    command: "ruff check . --select ALL"
    required: true
  test:
    command: "pytest --cov --cov-fail-under=80"
    required: true
  security:
    command: "bandit -r src/"
    required: true
Directory Structure Reference
.zerg/
├── config.yaml                  # Main configuration file
├── hooks/
│   └── pre-commit               # Pre-commit hook script
├── state/
│   ├── {feature}.json           # Runtime state per feature
│   ├── heartbeat-{id}.json      # Per-worker heartbeat (worker intelligence)
│   ├── progress-{id}.json       # Per-worker progress (worker intelligence)
│   └── escalations.json         # Shared escalation file (worker intelligence)
└── logs/
    ├── workers/
    │   └── worker-{id}.jsonl    # Structured per-worker logs
    ├── tasks/
    │   └── {TASK-ID}/           # Per-task artifacts
    │       ├── execution.jsonl
    │       ├── claude_output.txt
    │       ├── verification_output.txt
    │       └── git_diff.patch
    └── orchestrator.jsonl       # Orchestrator log