Files
claw-code/src/main.py

755 lines
32 KiB
Python
Raw Normal View History

from __future__ import annotations
import argparse
from .bootstrap_graph import build_bootstrap_graph
from .command_graph import build_command_graph
from .commands import execute_command, get_command, get_commands, render_command_index
from .direct_modes import run_deep_link, run_direct_connect
from .parity_audit import run_parity_audit
from .permissions import ToolPermissionContext
from .port_manifest import build_port_manifest
from .query_engine import QueryEnginePort
from .remote_runtime import run_remote_mode, run_ssh_mode, run_teleport_mode
from .runtime import PortRuntime
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
from .session_store import (
SessionDeleteError,
SessionNotFoundError,
delete_session,
list_sessions,
load_session,
session_exists,
)
from .setup import run_setup
from .tool_pool import assemble_tool_pool
from .tools import execute_tool, get_tool, get_tools, render_tool_index
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
def wrap_json_envelope(data: dict, command: str, exit_code: int = 0) -> dict:
"""Wrap command output in canonical JSON envelope per SCHEMAS.md."""
from datetime import datetime, timezone
now_utc = datetime.now(timezone.utc).isoformat(timespec='seconds').replace('+00:00', 'Z')
return {
'timestamp': now_utc,
'command': command,
'exit_code': exit_code,
'output_format': 'json',
'schema_version': '1.0',
**data,
}
def build_parser() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(description='Python porting workspace for the Claude Code rewrite effort')
feat: #180 implement --version flag for metadata protocol (#28 proactive demand) Cycle #28 closes the low-hanging metadata protocol gap identified in #180. ## The Gap Pinpoint #180 (filed cycle #24) documented a metadata protocol gap: - `--help` works (argparse default) - `--version` does NOT exist The ROADMAP entry deferred implementation pending demand. Cycle #28 dogfood probe found this during routine invariant audit (attempt to call `--version` as part of comprehensive CLI surface coverage). This is concrete evidence of real friction, not speculative gap-filling. ## Implementation Added `--version` flag to argparse in `build_parser()`: ```python parser.add_argument('--version', action='version', version='claw-code 1.0.0 (Python harness)') ``` Simple one-liner. Follows Python argparse conventions (built-in action='version'). ## Tests Added (3) TestMetadataFlags in test_exec_route_bootstrap_output_format.py: 1. test_version_flag_returns_version_text — `claw --version` prints version 2. test_help_flag_returns_help_text — `claw --help` still works 3. test_help_still_works_after_version_added — Both -h and --help work Regression guard on the original help surface. ## Test Status - 214 → 217 tests passing (+3) - Zero regressions - Full suite green ## Discipline This cycle exemplifies the cycle #24 calibration: - #180 was filed as 'deferred pending demand' - Cycle #28 dogfood found actual friction (proactive test coverage gap) - Evidence = concrete ('--version not found during invariant audit') - Action = minimal implementation + regression tests - No speculation, no feature creep, no implementation before evidence Not 'we imagined someone might want this.' Instead: 'we tried to call it during routine maintenance, got ENOENT, fixed it.' ## Related - #180 (cycle #24): Metadata protocol gap filed - Cycle #27: Cross-channel consistency audit established framework - Cycle #28 invariant audit: Discovered actual friction, triggered fix --- Classification (per cycle #24 calibration): - Red-state bug? ✗ (not a malfunction, just an absence) - Real friction? ✓ (audit probe could not call the flag, had to special-case) - Evidence-backed? ✓ (proactive test coverage revealed the gap) Source: Jobdori cycle #28 dogfood — invariant audit attempting comprehensive CLI surface coverage found that --version was unsupported.
2026-04-22 21:56:20 +09:00
# #180: Add --version flag to match canonical CLI contract
parser.add_argument('--version', action='version', version='claw-code 1.0.0 (Python harness)')
subparsers = parser.add_subparsers(dest='command', required=True)
subparsers.add_parser('summary', help='render a Markdown summary of the Python porting workspace')
subparsers.add_parser('manifest', help='print the current Python workspace manifest')
subparsers.add_parser('parity-audit', help='compare the Python workspace against the local ignored TypeScript archive when available')
subparsers.add_parser('setup-report', help='render the startup/prefetch setup report')
2026-04-22 18:47:34 +09:00
command_graph_parser = subparsers.add_parser('command-graph', help='show command graph segmentation')
command_graph_parser.add_argument('--output-format', choices=['text', 'json'], default='text')
tool_pool_parser = subparsers.add_parser('tool-pool', help='show assembled tool pool with default settings')
tool_pool_parser.add_argument('--output-format', choices=['text', 'json'], default='text')
bootstrap_graph_parser = subparsers.add_parser('bootstrap-graph', help='show the mirrored bootstrap/runtime graph stages')
bootstrap_graph_parser.add_argument('--output-format', choices=['text', 'json'], default='text')
list_parser = subparsers.add_parser('subsystems', help='list the current Python modules in the workspace')
list_parser.add_argument('--limit', type=int, default=32)
commands_parser = subparsers.add_parser('commands', help='list mirrored command entries from the archived snapshot')
commands_parser.add_argument('--limit', type=int, default=20)
commands_parser.add_argument('--query')
commands_parser.add_argument('--no-plugin-commands', action='store_true')
commands_parser.add_argument('--no-skill-commands', action='store_true')
tools_parser = subparsers.add_parser('tools', help='list mirrored tool entries from the archived snapshot')
tools_parser.add_argument('--limit', type=int, default=20)
tools_parser.add_argument('--query')
tools_parser.add_argument('--simple-mode', action='store_true')
tools_parser.add_argument('--no-mcp', action='store_true')
tools_parser.add_argument('--deny-tool', action='append', default=[])
tools_parser.add_argument('--deny-prefix', action='append', default=[])
route_parser = subparsers.add_parser('route', help='route a prompt across mirrored command/tool inventories')
route_parser.add_argument('prompt')
route_parser.add_argument('--limit', type=int, default=5)
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
# #168: parity with show-command/show-tool/session-lifecycle CLI family
route_parser.add_argument('--output-format', choices=['text', 'json'], default='text')
bootstrap_parser = subparsers.add_parser('bootstrap', help='build a runtime-style session report from the mirrored inventories')
bootstrap_parser.add_argument('prompt')
bootstrap_parser.add_argument('--limit', type=int, default=5)
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
# #168: parity with CLI family
bootstrap_parser.add_argument('--output-format', choices=['text', 'json'], default='text')
loop_parser = subparsers.add_parser('turn-loop', help='run a small stateful turn loop for the mirrored runtime')
loop_parser.add_argument('prompt')
loop_parser.add_argument('--limit', type=int, default=5)
loop_parser.add_argument('--max-turns', type=int, default=3)
loop_parser.add_argument('--structured-output', action='store_true')
fix: #161 — wall-clock timeout for run_turn_loop; stalled turns now abort with stop_reason='timeout' Previously, run_turn_loop was bounded only by max_turns (turn count). If engine.submit_message stalled — slow provider, hung network, infinite stream — the loop blocked indefinitely with no cancellation path. Claws calling run_turn_loop in CI or orchestration had no reliable way to enforce a deadline; the loop would hang until OS kill or human intervention. Fix: - Add timeout_seconds parameter to run_turn_loop (default None = legacy unbounded). - When set, each submit_message call runs inside a ThreadPoolExecutor and is bounded by the remaining wall-clock budget (total across all turns, not per-turn). - On timeout, synthesize a TurnResult with stop_reason='timeout' carrying the turn's prompt and routed matches so transcripts preserve orchestration context. - Exhausted/negative budget short-circuits before calling submit_message. - Legacy path (timeout_seconds=None) bypasses the executor entirely — zero overhead for callers that don't opt in. CLI: - Added --timeout-seconds flag to 'turn-loop' command. - Exit code 2 when the loop terminated on timeout (vs 0 for completed), so shell scripts can distinguish 'done' from 'budget exhausted'. Tests (tests/test_run_turn_loop_timeout.py, 6 tests): - Legacy unbounded path unchanged (timeout_seconds=None never emits 'timeout') - Hung submit_message aborted within budget (0.3s budget, 5s mock hang → exit <1.5s) - Budget is cumulative across turns (0.6s budget, 0.4s per turn, not per-turn) - timeout_seconds=0 short-circuits first turn without calling submit_message - Negative timeout treated as exhausted (guard against caller bugs) - Timeout TurnResult carries correct prompt, matches, UsageSummary shape Full suite: 49/49 passing, zero regression. Blocker: none. Closes ROADMAP #161.
2026-04-22 17:23:43 +09:00
loop_parser.add_argument(
'--timeout-seconds',
type=float,
default=None,
help='total wall-clock budget across all turns (#161). Default: unbounded.',
)
fix: #163 — remove [turn N] suffix pollution from run_turn_loop; file #164 timeout-cancellation followup #163: run_turn_loop no longer injects f'{prompt} [turn N]' into follow-up prompts. The suffix was never defined or interpreted anywhere — not by the engine, not by the system prompt, not by any LLM. It looked like a real user-typed annotation in the transcript and made replay/analysis fragile. New behaviour: - turn 0 submits the original prompt (unchanged) - turn > 0 submits caller-supplied continuation_prompt if provided, else the loop stops cleanly — no fabricated user turn - added continuation_prompt: str | None = None parameter to run_turn_loop - added --continuation-prompt CLI flag for claws scripting multi-turn loops - zero '[turn' strings ever appear in mutable_messages or stdout now Behaviour change for existing callers: - Before: run_turn_loop(prompt, max_turns=3) submitted 3 turns ('prompt', 'prompt [turn 2]', 'prompt [turn 3]') - After: run_turn_loop(prompt, max_turns=3) submits 1 turn ('prompt') - To preserve old multi-turn behaviour, pass continuation_prompt='Continue.' or any structured follow-up text One existing timeout test (test_budget_is_cumulative_across_turns) updated to pass continuation_prompt so the cumulative-budget contract is actually exercised across turns instead of trivially satisfied by a one-turn loop. #164 filed: addresses reviewer feedback on #161. The wall-clock timeout bounds the caller-facing wait, but the underlying submit_message worker thread keeps running and can mutate engine state after the timeout TurnResult is returned. A cooperative cancel_event pattern is sketched in the pinpoint; real asyncio.Task.cancel() support will come once provider IO is async-native (larger refactor). Tests (tests/test_run_turn_loop_continuation.py, 8 tests): - TestNoTurnSuffixInjection (2): zero '[turn' strings in any submitted prompt, both default and explicit-continuation paths - TestContinuationDefaultStopsAfterTurnZero (2): default loops run exactly one turn; engine.submit_message called exactly once despite max_turns=10 - TestExplicitContinuationBehaviour (2): turn 0 = original, turn N = continuation verbatim; max_turns still respected - TestCLIContinuationFlag (2): CLI default emits only '## Turn 1'; --continuation-prompt wires through to multi-turn behaviour Full suite: 67/67 passing. Closes ROADMAP #163. Files #164.
2026-04-22 17:37:22 +09:00
loop_parser.add_argument(
'--continuation-prompt',
default=None,
help=(
'prompt to submit on turns after the first (#163). Default: None '
'(loop stops after turn 0). Replaces the deprecated implicit "[turn N]" '
'suffix that used to pollute the transcript.'
),
)
feat: #164 Stage B CLOSURE — turn-loop JSON + cancel_observed coverage + CLAWABLE promotion Closes all three gaebal-gajae-identified closure criteria for #164 Stage B: 1. turn-loop runtime surface exposes cancel_observed consistently 2. cancellation path tests validate safe-to-reuse semantics 3. turn-loop promoted from OPT_OUT to CLAWABLE surface Changes: src/main.py: - turn-loop accepts --output-format {text,json} - JSON envelope includes per-turn cancel_observed + final_cancel_observed - All turn fields exposed: prompt, output, stop_reason, cancel_observed, matched_commands, matched_tools - Exit code 2 on final timeout preserved tests/test_cli_parity_audit.py: - CLAWABLE_SURFACES now contains 14 commands (was 13) - Removed 'turn-loop' from OPT_OUT_SURFACES - Parametrized --output-format test auto-validates turn-loop JSON tests/test_cancel_observed_field.py (new, 9 tests): - TestCancelObservedField (5 tests): field contract - default False - explicit True preserved - normal completion → False - bootstrap JSON exposes field - turn-loop JSON exposes per-turn field - TestCancelObservedSafeReuseSemantics (2 tests): reuse contract - timeout result has cancel_observed=True when signaled - engine.mutable_messages not corrupted after cancelled turn - engine accepts fresh message after cancellation - TestCancelObservedSchemaCompliance (2 tests): SCHEMAS.md contract - cancel_observed is always bool - final_cancel_observed convenience field present Closure criteria validated: - ✅ Field exposed in bootstrap JSON - ✅ Field exposed per-turn in turn-loop JSON - ✅ Field is always bool, never null - ✅ Safe-to-reuse: engine can accept fresh messages after cancellation - ✅ mutable_messages not corrupted by cancelled turn - ✅ turn-loop promoted from OPT_OUT (14 clawable commands now) Protocol now distinguishes at runtime: timeout + cancel_observed=false → infra/wedge (escalate) timeout + cancel_observed=true → cooperative cancellation (safe to retry) Test results: 182 → 192 passing, +10 tests, zero regression, 3 skipped unchanged. Closes #164 Stage B. Stage C (async-native preemption) remains future work.
2026-04-22 19:49:20 +09:00
loop_parser.add_argument(
'--output-format',
choices=['text', 'json'],
default='text',
help='output format (#164 Stage B: JSON includes cancel_observed per turn)',
)
flush_parser = subparsers.add_parser(
'flush-transcript',
help='persist and flush a temporary session transcript (#160/#166: claw-native session API)',
)
flush_parser.add_argument('prompt')
flush_parser.add_argument(
'--directory', help='session storage directory (default: .port_sessions)'
)
flush_parser.add_argument(
'--output-format',
choices=['text', 'json'],
default='text',
help='output format',
)
flush_parser.add_argument(
'--session-id',
help='deterministic session ID (default: auto-generated UUID)',
)
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
load_session_parser = subparsers.add_parser(
'load-session',
help='load a previously persisted session (#160/#165: claw-native session API)',
)
load_session_parser.add_argument('session_id')
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
load_session_parser.add_argument(
'--directory', help='session storage directory (default: .port_sessions)'
)
load_session_parser.add_argument(
'--output-format',
choices=['text', 'json'],
default='text',
help='output format',
)
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
list_sessions_parser = subparsers.add_parser(
'list-sessions',
help='enumerate stored session IDs (#160: claw-native session API)',
)
list_sessions_parser.add_argument(
'--directory', help='session storage directory (default: .port_sessions)'
)
list_sessions_parser.add_argument(
'--output-format',
choices=['text', 'json'],
default='text',
help='output format',
)
delete_session_parser = subparsers.add_parser(
'delete-session',
help='delete a persisted session (#160: idempotent, race-safe)',
)
delete_session_parser.add_argument('session_id')
delete_session_parser.add_argument(
'--directory', help='session storage directory (default: .port_sessions)'
)
delete_session_parser.add_argument(
'--output-format',
choices=['text', 'json'],
default='text',
help='output format',
)
remote_parser = subparsers.add_parser('remote-mode', help='simulate remote-control runtime branching')
remote_parser.add_argument('target')
ssh_parser = subparsers.add_parser('ssh-mode', help='simulate SSH runtime branching')
ssh_parser.add_argument('target')
teleport_parser = subparsers.add_parser('teleport-mode', help='simulate teleport runtime branching')
teleport_parser.add_argument('target')
direct_parser = subparsers.add_parser('direct-connect-mode', help='simulate direct-connect runtime branching')
direct_parser.add_argument('target')
deep_link_parser = subparsers.add_parser('deep-link-mode', help='simulate deep-link runtime branching')
deep_link_parser.add_argument('target')
show_command = subparsers.add_parser('show-command', help='show one mirrored command entry by exact name')
show_command.add_argument('name')
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
show_command.add_argument('--output-format', choices=['text', 'json'], default='text')
show_tool = subparsers.add_parser('show-tool', help='show one mirrored tool entry by exact name')
show_tool.add_argument('name')
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
show_tool.add_argument('--output-format', choices=['text', 'json'], default='text')
exec_command_parser = subparsers.add_parser('exec-command', help='execute a mirrored command shim by exact name')
exec_command_parser.add_argument('name')
exec_command_parser.add_argument('prompt')
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
# #168: parity with CLI family
exec_command_parser.add_argument('--output-format', choices=['text', 'json'], default='text')
exec_tool_parser = subparsers.add_parser('exec-tool', help='execute a mirrored tool shim by exact name')
exec_tool_parser.add_argument('name')
exec_tool_parser.add_argument('payload')
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
# #168: parity with CLI family
exec_tool_parser.add_argument('--output-format', choices=['text', 'json'], default='text')
return parser
fix: #179 — JSON mode now fully suppresses argparse stderr + preserves real error message Dogfood discovered #178 had two residual gaps: 1. Stderr pollution: argparse usage + error text still leaked to stderr even in JSON mode (envelope was correct on stdout, but stderr noise broke the 'machine-first protocol' contract — claws capturing both streams got dual output) 2. Generic error message: envelope carried 'invalid command or argument (argparse rejection)' instead of argparse's actual text like 'the following arguments are required: session_id' or 'invalid choice: typo (choose from ...)' Before #179: $ claw load-session --output-format json [stdout] {"error": {"message": "invalid command or argument (argparse rejection)"}} [stderr] usage: main.py load-session [-h] ... main.py load-session: error: the following arguments are required: session_id [exit 1] After #179: $ claw load-session --output-format json [stdout] {"error": {"message": "the following arguments are required: session_id"}} [stderr] (empty) [exit 1] Implementation: - New _ArgparseError exception class captures argparse's real message - main() monkey-patches parser.error (+ all subparser.error) in JSON mode to raise _ArgparseError instead of print-to-stderr + sys.exit(2) - _emit_parse_error_envelope() now receives the real message verbatim - Text mode path unchanged: still uses original argparse print+exit behavior Contract: - JSON mode: stdout carries envelope with argparse's actual error; stderr silent - Text mode: unchanged — argparse usage to stderr, exit 2 - Parse errors still error.kind='parse', retryable=false Test additions (5 new, 14 total in test_parse_error_envelope.py): - TestParseErrorStderrHygiene (5): - test_json_mode_stderr_is_silent_on_unknown_command - test_json_mode_stderr_is_silent_on_missing_arg - test_json_mode_envelope_carries_real_argparse_message - test_json_mode_envelope_carries_invalid_choice_details (verifies valid-choices list) - test_text_mode_stderr_preserved_on_unknown_command (backward compat) Operational impact: Claws capturing both stdout and stderr no longer get garbled output. The envelope message now carries discoverability info (valid command list, missing-arg name) that claws can use for retry/recovery without probing the CLI a second time. Test results: 201 → 206 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 20:30 KST (cycle #20).
2026-04-22 20:32:28 +09:00
class _ArgparseError(Exception):
"""#179: internal exception capturing argparse's real error message.
Subclassed ArgumentParser raises this instead of printing + exiting,
so JSON mode can preserve the actual error (e.g. 'the following arguments
are required: session_id') in the envelope.
"""
def __init__(self, message: str) -> None:
super().__init__(message)
self.message = message
feat: #178 — argparse errors emit JSON envelope when --output-format json requested Dogfood pinpoint: running 'claw nonexistent-command --output-format json' bypasses the JSON envelope contract — argparse dumps human-readable usage to stderr with exit 2, breaking the SCHEMAS.md guarantee that JSON mode returns structured output. Problem: $ claw nonexistent --output-format json usage: main.py [-h] {summary,manifest,...} ... main.py: error: argument command: invalid choice: 'nonexistent' (choose from ...) [exit 2 — no envelope, claws must parse argparse usage messages] Fix: $ claw nonexistent --output-format json { "timestamp": "2026-04-22T11:00:29Z", "command": "nonexistent-command", "exit_code": 1, "output_format": "json", "schema_version": "1.0", "error": { "kind": "parse", "operation": "argparse", "target": "nonexistent-command", "retryable": false, "message": "invalid command or argument (argparse rejection)", "hint": "run with no arguments to see available subcommands" } } [exit 1, clean JSON envelope on stdout per SCHEMAS.md] Changes: - src/main.py: - _wants_json_output(argv): pre-scan for --output-format json before parsing - _emit_parse_error_envelope(argv, message): emit wrapped envelope on stdout - main(): catch SystemExit from argparse; if JSON requested, emit envelope instead of letting argparse's help dump go through - tests/test_parse_error_envelope.py (new, 9 tests): - TestParseErrorJsonEnvelope (7): unknown command, =syntax, text mode unchanged, invalid flag, missing command, valid command unaffected, common fields - TestParseErrorSchemaCompliance (2): error.kind='parse', retryable=false Contract: - text mode (default): unchanged — argparse dumps help to stderr, exits 2 - JSON mode: envelope per SCHEMAS.md, error.kind='parse', exit 1 - Parse errors always retryable=false (typo won't self-fix) - error.kind='parse' already enumerated in SCHEMAS.md (no schema changes) This closes a real gap: claws invoking unknown commands in JSON mode can now route via exit code + envelope.kind='parse' instead of scraping argparse output. Test results: 192 → 201 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 19:59 KST (cycle #19).
2026-04-22 20:02:39 +09:00
def _emit_parse_error_envelope(argv: list[str], message: str) -> None:
fix: #179 — JSON mode now fully suppresses argparse stderr + preserves real error message Dogfood discovered #178 had two residual gaps: 1. Stderr pollution: argparse usage + error text still leaked to stderr even in JSON mode (envelope was correct on stdout, but stderr noise broke the 'machine-first protocol' contract — claws capturing both streams got dual output) 2. Generic error message: envelope carried 'invalid command or argument (argparse rejection)' instead of argparse's actual text like 'the following arguments are required: session_id' or 'invalid choice: typo (choose from ...)' Before #179: $ claw load-session --output-format json [stdout] {"error": {"message": "invalid command or argument (argparse rejection)"}} [stderr] usage: main.py load-session [-h] ... main.py load-session: error: the following arguments are required: session_id [exit 1] After #179: $ claw load-session --output-format json [stdout] {"error": {"message": "the following arguments are required: session_id"}} [stderr] (empty) [exit 1] Implementation: - New _ArgparseError exception class captures argparse's real message - main() monkey-patches parser.error (+ all subparser.error) in JSON mode to raise _ArgparseError instead of print-to-stderr + sys.exit(2) - _emit_parse_error_envelope() now receives the real message verbatim - Text mode path unchanged: still uses original argparse print+exit behavior Contract: - JSON mode: stdout carries envelope with argparse's actual error; stderr silent - Text mode: unchanged — argparse usage to stderr, exit 2 - Parse errors still error.kind='parse', retryable=false Test additions (5 new, 14 total in test_parse_error_envelope.py): - TestParseErrorStderrHygiene (5): - test_json_mode_stderr_is_silent_on_unknown_command - test_json_mode_stderr_is_silent_on_missing_arg - test_json_mode_envelope_carries_real_argparse_message - test_json_mode_envelope_carries_invalid_choice_details (verifies valid-choices list) - test_text_mode_stderr_preserved_on_unknown_command (backward compat) Operational impact: Claws capturing both stdout and stderr no longer get garbled output. The envelope message now carries discoverability info (valid command list, missing-arg name) that claws can use for retry/recovery without probing the CLI a second time. Test results: 201 → 206 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 20:30 KST (cycle #20).
2026-04-22 20:32:28 +09:00
"""#178/#179: emit JSON envelope for argparse-level errors when --output-format json is requested.
feat: #178 — argparse errors emit JSON envelope when --output-format json requested Dogfood pinpoint: running 'claw nonexistent-command --output-format json' bypasses the JSON envelope contract — argparse dumps human-readable usage to stderr with exit 2, breaking the SCHEMAS.md guarantee that JSON mode returns structured output. Problem: $ claw nonexistent --output-format json usage: main.py [-h] {summary,manifest,...} ... main.py: error: argument command: invalid choice: 'nonexistent' (choose from ...) [exit 2 — no envelope, claws must parse argparse usage messages] Fix: $ claw nonexistent --output-format json { "timestamp": "2026-04-22T11:00:29Z", "command": "nonexistent-command", "exit_code": 1, "output_format": "json", "schema_version": "1.0", "error": { "kind": "parse", "operation": "argparse", "target": "nonexistent-command", "retryable": false, "message": "invalid command or argument (argparse rejection)", "hint": "run with no arguments to see available subcommands" } } [exit 1, clean JSON envelope on stdout per SCHEMAS.md] Changes: - src/main.py: - _wants_json_output(argv): pre-scan for --output-format json before parsing - _emit_parse_error_envelope(argv, message): emit wrapped envelope on stdout - main(): catch SystemExit from argparse; if JSON requested, emit envelope instead of letting argparse's help dump go through - tests/test_parse_error_envelope.py (new, 9 tests): - TestParseErrorJsonEnvelope (7): unknown command, =syntax, text mode unchanged, invalid flag, missing command, valid command unaffected, common fields - TestParseErrorSchemaCompliance (2): error.kind='parse', retryable=false Contract: - text mode (default): unchanged — argparse dumps help to stderr, exits 2 - JSON mode: envelope per SCHEMAS.md, error.kind='parse', exit 1 - Parse errors always retryable=false (typo won't self-fix) - error.kind='parse' already enumerated in SCHEMAS.md (no schema changes) This closes a real gap: claws invoking unknown commands in JSON mode can now route via exit code + envelope.kind='parse' instead of scraping argparse output. Test results: 192 → 201 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 19:59 KST (cycle #19).
2026-04-22 20:02:39 +09:00
Pre-scans argv for --output-format json. If found, prints a parse-error envelope
to stdout (per SCHEMAS.md 'error' envelope shape) instead of letting argparse
dump help text to stderr. This preserves the JSON contract for claws that can't
parse argparse usage messages.
fix: #179 — JSON mode now fully suppresses argparse stderr + preserves real error message Dogfood discovered #178 had two residual gaps: 1. Stderr pollution: argparse usage + error text still leaked to stderr even in JSON mode (envelope was correct on stdout, but stderr noise broke the 'machine-first protocol' contract — claws capturing both streams got dual output) 2. Generic error message: envelope carried 'invalid command or argument (argparse rejection)' instead of argparse's actual text like 'the following arguments are required: session_id' or 'invalid choice: typo (choose from ...)' Before #179: $ claw load-session --output-format json [stdout] {"error": {"message": "invalid command or argument (argparse rejection)"}} [stderr] usage: main.py load-session [-h] ... main.py load-session: error: the following arguments are required: session_id [exit 1] After #179: $ claw load-session --output-format json [stdout] {"error": {"message": "the following arguments are required: session_id"}} [stderr] (empty) [exit 1] Implementation: - New _ArgparseError exception class captures argparse's real message - main() monkey-patches parser.error (+ all subparser.error) in JSON mode to raise _ArgparseError instead of print-to-stderr + sys.exit(2) - _emit_parse_error_envelope() now receives the real message verbatim - Text mode path unchanged: still uses original argparse print+exit behavior Contract: - JSON mode: stdout carries envelope with argparse's actual error; stderr silent - Text mode: unchanged — argparse usage to stderr, exit 2 - Parse errors still error.kind='parse', retryable=false Test additions (5 new, 14 total in test_parse_error_envelope.py): - TestParseErrorStderrHygiene (5): - test_json_mode_stderr_is_silent_on_unknown_command - test_json_mode_stderr_is_silent_on_missing_arg - test_json_mode_envelope_carries_real_argparse_message - test_json_mode_envelope_carries_invalid_choice_details (verifies valid-choices list) - test_text_mode_stderr_preserved_on_unknown_command (backward compat) Operational impact: Claws capturing both stdout and stderr no longer get garbled output. The envelope message now carries discoverability info (valid command list, missing-arg name) that claws can use for retry/recovery without probing the CLI a second time. Test results: 201 → 206 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 20:30 KST (cycle #20).
2026-04-22 20:32:28 +09:00
#179 update: `message` now carries argparse's actual error text, not a generic
rejection string. Stderr is fully suppressed in JSON mode.
feat: #178 — argparse errors emit JSON envelope when --output-format json requested Dogfood pinpoint: running 'claw nonexistent-command --output-format json' bypasses the JSON envelope contract — argparse dumps human-readable usage to stderr with exit 2, breaking the SCHEMAS.md guarantee that JSON mode returns structured output. Problem: $ claw nonexistent --output-format json usage: main.py [-h] {summary,manifest,...} ... main.py: error: argument command: invalid choice: 'nonexistent' (choose from ...) [exit 2 — no envelope, claws must parse argparse usage messages] Fix: $ claw nonexistent --output-format json { "timestamp": "2026-04-22T11:00:29Z", "command": "nonexistent-command", "exit_code": 1, "output_format": "json", "schema_version": "1.0", "error": { "kind": "parse", "operation": "argparse", "target": "nonexistent-command", "retryable": false, "message": "invalid command or argument (argparse rejection)", "hint": "run with no arguments to see available subcommands" } } [exit 1, clean JSON envelope on stdout per SCHEMAS.md] Changes: - src/main.py: - _wants_json_output(argv): pre-scan for --output-format json before parsing - _emit_parse_error_envelope(argv, message): emit wrapped envelope on stdout - main(): catch SystemExit from argparse; if JSON requested, emit envelope instead of letting argparse's help dump go through - tests/test_parse_error_envelope.py (new, 9 tests): - TestParseErrorJsonEnvelope (7): unknown command, =syntax, text mode unchanged, invalid flag, missing command, valid command unaffected, common fields - TestParseErrorSchemaCompliance (2): error.kind='parse', retryable=false Contract: - text mode (default): unchanged — argparse dumps help to stderr, exits 2 - JSON mode: envelope per SCHEMAS.md, error.kind='parse', exit 1 - Parse errors always retryable=false (typo won't self-fix) - error.kind='parse' already enumerated in SCHEMAS.md (no schema changes) This closes a real gap: claws invoking unknown commands in JSON mode can now route via exit code + envelope.kind='parse' instead of scraping argparse output. Test results: 192 → 201 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 19:59 KST (cycle #19).
2026-04-22 20:02:39 +09:00
"""
import json
# Extract the attempted command (argv[0] is the first positional)
attempted = argv[0] if argv and not argv[0].startswith('-') else '<missing>'
envelope = wrap_json_envelope(
{
'error': {
'kind': 'parse',
'operation': 'argparse',
'target': attempted,
'retryable': False,
'message': message,
'hint': 'run with no arguments to see available subcommands',
},
},
command=attempted,
exit_code=1,
)
print(json.dumps(envelope))
def _wants_json_output(argv: list[str]) -> bool:
"""#178: check if argv contains --output-format json anywhere (for parse-error routing)."""
for i, arg in enumerate(argv):
if arg == '--output-format' and i + 1 < len(argv) and argv[i + 1] == 'json':
return True
if arg == '--output-format=json':
return True
return False
def main(argv: list[str] | None = None) -> int:
feat: #178 — argparse errors emit JSON envelope when --output-format json requested Dogfood pinpoint: running 'claw nonexistent-command --output-format json' bypasses the JSON envelope contract — argparse dumps human-readable usage to stderr with exit 2, breaking the SCHEMAS.md guarantee that JSON mode returns structured output. Problem: $ claw nonexistent --output-format json usage: main.py [-h] {summary,manifest,...} ... main.py: error: argument command: invalid choice: 'nonexistent' (choose from ...) [exit 2 — no envelope, claws must parse argparse usage messages] Fix: $ claw nonexistent --output-format json { "timestamp": "2026-04-22T11:00:29Z", "command": "nonexistent-command", "exit_code": 1, "output_format": "json", "schema_version": "1.0", "error": { "kind": "parse", "operation": "argparse", "target": "nonexistent-command", "retryable": false, "message": "invalid command or argument (argparse rejection)", "hint": "run with no arguments to see available subcommands" } } [exit 1, clean JSON envelope on stdout per SCHEMAS.md] Changes: - src/main.py: - _wants_json_output(argv): pre-scan for --output-format json before parsing - _emit_parse_error_envelope(argv, message): emit wrapped envelope on stdout - main(): catch SystemExit from argparse; if JSON requested, emit envelope instead of letting argparse's help dump go through - tests/test_parse_error_envelope.py (new, 9 tests): - TestParseErrorJsonEnvelope (7): unknown command, =syntax, text mode unchanged, invalid flag, missing command, valid command unaffected, common fields - TestParseErrorSchemaCompliance (2): error.kind='parse', retryable=false Contract: - text mode (default): unchanged — argparse dumps help to stderr, exits 2 - JSON mode: envelope per SCHEMAS.md, error.kind='parse', exit 1 - Parse errors always retryable=false (typo won't self-fix) - error.kind='parse' already enumerated in SCHEMAS.md (no schema changes) This closes a real gap: claws invoking unknown commands in JSON mode can now route via exit code + envelope.kind='parse' instead of scraping argparse output. Test results: 192 → 201 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 19:59 KST (cycle #19).
2026-04-22 20:02:39 +09:00
import sys
if argv is None:
argv = sys.argv[1:]
parser = build_parser()
fix: #179 — JSON mode now fully suppresses argparse stderr + preserves real error message Dogfood discovered #178 had two residual gaps: 1. Stderr pollution: argparse usage + error text still leaked to stderr even in JSON mode (envelope was correct on stdout, but stderr noise broke the 'machine-first protocol' contract — claws capturing both streams got dual output) 2. Generic error message: envelope carried 'invalid command or argument (argparse rejection)' instead of argparse's actual text like 'the following arguments are required: session_id' or 'invalid choice: typo (choose from ...)' Before #179: $ claw load-session --output-format json [stdout] {"error": {"message": "invalid command or argument (argparse rejection)"}} [stderr] usage: main.py load-session [-h] ... main.py load-session: error: the following arguments are required: session_id [exit 1] After #179: $ claw load-session --output-format json [stdout] {"error": {"message": "the following arguments are required: session_id"}} [stderr] (empty) [exit 1] Implementation: - New _ArgparseError exception class captures argparse's real message - main() monkey-patches parser.error (+ all subparser.error) in JSON mode to raise _ArgparseError instead of print-to-stderr + sys.exit(2) - _emit_parse_error_envelope() now receives the real message verbatim - Text mode path unchanged: still uses original argparse print+exit behavior Contract: - JSON mode: stdout carries envelope with argparse's actual error; stderr silent - Text mode: unchanged — argparse usage to stderr, exit 2 - Parse errors still error.kind='parse', retryable=false Test additions (5 new, 14 total in test_parse_error_envelope.py): - TestParseErrorStderrHygiene (5): - test_json_mode_stderr_is_silent_on_unknown_command - test_json_mode_stderr_is_silent_on_missing_arg - test_json_mode_envelope_carries_real_argparse_message - test_json_mode_envelope_carries_invalid_choice_details (verifies valid-choices list) - test_text_mode_stderr_preserved_on_unknown_command (backward compat) Operational impact: Claws capturing both stdout and stderr no longer get garbled output. The envelope message now carries discoverability info (valid command list, missing-arg name) that claws can use for retry/recovery without probing the CLI a second time. Test results: 201 → 206 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 20:30 KST (cycle #20).
2026-04-22 20:32:28 +09:00
json_mode = _wants_json_output(argv)
# #178/#179: capture argparse errors with real message and emit JSON envelope
# when --output-format json is requested. In JSON mode, stderr is silenced
# so claws only see the envelope on stdout.
if json_mode:
# Monkey-patch parser.error to raise instead of print+exit. This preserves
# the original error message text (e.g. 'argument X: invalid choice: ...').
original_error = parser.error
def _json_mode_error(message: str) -> None:
raise _ArgparseError(message)
parser.error = _json_mode_error # type: ignore[method-assign]
# Also patch all subparsers
for action in parser._actions:
if hasattr(action, 'choices') and isinstance(action.choices, dict):
for subp in action.choices.values():
subp.error = _json_mode_error # type: ignore[method-assign]
try:
args = parser.parse_args(argv)
except _ArgparseError as err:
_emit_parse_error_envelope(argv, err.message)
feat: #178 — argparse errors emit JSON envelope when --output-format json requested Dogfood pinpoint: running 'claw nonexistent-command --output-format json' bypasses the JSON envelope contract — argparse dumps human-readable usage to stderr with exit 2, breaking the SCHEMAS.md guarantee that JSON mode returns structured output. Problem: $ claw nonexistent --output-format json usage: main.py [-h] {summary,manifest,...} ... main.py: error: argument command: invalid choice: 'nonexistent' (choose from ...) [exit 2 — no envelope, claws must parse argparse usage messages] Fix: $ claw nonexistent --output-format json { "timestamp": "2026-04-22T11:00:29Z", "command": "nonexistent-command", "exit_code": 1, "output_format": "json", "schema_version": "1.0", "error": { "kind": "parse", "operation": "argparse", "target": "nonexistent-command", "retryable": false, "message": "invalid command or argument (argparse rejection)", "hint": "run with no arguments to see available subcommands" } } [exit 1, clean JSON envelope on stdout per SCHEMAS.md] Changes: - src/main.py: - _wants_json_output(argv): pre-scan for --output-format json before parsing - _emit_parse_error_envelope(argv, message): emit wrapped envelope on stdout - main(): catch SystemExit from argparse; if JSON requested, emit envelope instead of letting argparse's help dump go through - tests/test_parse_error_envelope.py (new, 9 tests): - TestParseErrorJsonEnvelope (7): unknown command, =syntax, text mode unchanged, invalid flag, missing command, valid command unaffected, common fields - TestParseErrorSchemaCompliance (2): error.kind='parse', retryable=false Contract: - text mode (default): unchanged — argparse dumps help to stderr, exits 2 - JSON mode: envelope per SCHEMAS.md, error.kind='parse', exit 1 - Parse errors always retryable=false (typo won't self-fix) - error.kind='parse' already enumerated in SCHEMAS.md (no schema changes) This closes a real gap: claws invoking unknown commands in JSON mode can now route via exit code + envelope.kind='parse' instead of scraping argparse output. Test results: 192 → 201 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 19:59 KST (cycle #19).
2026-04-22 20:02:39 +09:00
return 1
fix: #179 — JSON mode now fully suppresses argparse stderr + preserves real error message Dogfood discovered #178 had two residual gaps: 1. Stderr pollution: argparse usage + error text still leaked to stderr even in JSON mode (envelope was correct on stdout, but stderr noise broke the 'machine-first protocol' contract — claws capturing both streams got dual output) 2. Generic error message: envelope carried 'invalid command or argument (argparse rejection)' instead of argparse's actual text like 'the following arguments are required: session_id' or 'invalid choice: typo (choose from ...)' Before #179: $ claw load-session --output-format json [stdout] {"error": {"message": "invalid command or argument (argparse rejection)"}} [stderr] usage: main.py load-session [-h] ... main.py load-session: error: the following arguments are required: session_id [exit 1] After #179: $ claw load-session --output-format json [stdout] {"error": {"message": "the following arguments are required: session_id"}} [stderr] (empty) [exit 1] Implementation: - New _ArgparseError exception class captures argparse's real message - main() monkey-patches parser.error (+ all subparser.error) in JSON mode to raise _ArgparseError instead of print-to-stderr + sys.exit(2) - _emit_parse_error_envelope() now receives the real message verbatim - Text mode path unchanged: still uses original argparse print+exit behavior Contract: - JSON mode: stdout carries envelope with argparse's actual error; stderr silent - Text mode: unchanged — argparse usage to stderr, exit 2 - Parse errors still error.kind='parse', retryable=false Test additions (5 new, 14 total in test_parse_error_envelope.py): - TestParseErrorStderrHygiene (5): - test_json_mode_stderr_is_silent_on_unknown_command - test_json_mode_stderr_is_silent_on_missing_arg - test_json_mode_envelope_carries_real_argparse_message - test_json_mode_envelope_carries_invalid_choice_details (verifies valid-choices list) - test_text_mode_stderr_preserved_on_unknown_command (backward compat) Operational impact: Claws capturing both stdout and stderr no longer get garbled output. The envelope message now carries discoverability info (valid command list, missing-arg name) that claws can use for retry/recovery without probing the CLI a second time. Test results: 201 → 206 passing, 3 skipped unchanged, zero regression. Pinpoint discovered via dogfood at 2026-04-22 20:30 KST (cycle #20).
2026-04-22 20:32:28 +09:00
except SystemExit as exc:
# Defensive: if argparse exits via some other path (e.g. --help in JSON mode)
if exc.code != 0:
_emit_parse_error_envelope(argv, 'argparse exited with non-zero code')
return 1
raise
else:
args = parser.parse_args(argv)
manifest = build_port_manifest()
if args.command == 'summary':
print(QueryEnginePort(manifest).render_summary())
return 0
if args.command == 'manifest':
print(manifest.to_markdown())
return 0
if args.command == 'parity-audit':
print(run_parity_audit().to_markdown())
return 0
if args.command == 'setup-report':
print(run_setup().as_markdown())
return 0
if args.command == 'command-graph':
2026-04-22 18:47:34 +09:00
graph = build_command_graph()
if args.output_format == 'json':
import json
envelope = {
'builtins_count': len(graph.builtins),
'plugin_like_count': len(graph.plugin_like),
'skill_like_count': len(graph.skill_like),
'total_count': len(graph.flattened()),
'builtins': [{'name': m.name, 'source_hint': m.source_hint} for m in graph.builtins],
'plugin_like': [{'name': m.name, 'source_hint': m.source_hint} for m in graph.plugin_like],
'skill_like': [{'name': m.name, 'source_hint': m.source_hint} for m in graph.skill_like],
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(envelope, args.command)))
2026-04-22 18:47:34 +09:00
else:
print(graph.as_markdown())
return 0
if args.command == 'tool-pool':
2026-04-22 18:47:34 +09:00
pool = assemble_tool_pool()
if args.output_format == 'json':
import json
envelope = {
'simple_mode': pool.simple_mode,
'include_mcp': pool.include_mcp,
'tool_count': len(pool.tools),
'tools': [{'name': t.name, 'source_hint': t.source_hint} for t in pool.tools],
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(envelope, args.command)))
2026-04-22 18:47:34 +09:00
else:
print(pool.as_markdown())
return 0
if args.command == 'bootstrap-graph':
graph = build_bootstrap_graph()
if args.output_format == 'json':
import json
envelope = {'stages': graph.as_markdown().split('\n'), 'note': 'bootstrap-graph is markdown-only in this version'}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(envelope, args.command)))
else:
print(graph.as_markdown())
return 0
if args.command == 'subsystems':
for subsystem in manifest.top_level_modules[: args.limit]:
print(f'{subsystem.name}\t{subsystem.file_count}\t{subsystem.notes}')
return 0
if args.command == 'commands':
if args.query:
print(render_command_index(limit=args.limit, query=args.query))
else:
commands = get_commands(include_plugin_commands=not args.no_plugin_commands, include_skill_commands=not args.no_skill_commands)
output_lines = [f'Command entries: {len(commands)}', '']
output_lines.extend(f'- {module.name}{module.source_hint}' for module in commands[: args.limit])
print('\n'.join(output_lines))
return 0
if args.command == 'tools':
if args.query:
print(render_tool_index(limit=args.limit, query=args.query))
else:
permission_context = ToolPermissionContext.from_iterables(args.deny_tool, args.deny_prefix)
tools = get_tools(simple_mode=args.simple_mode, include_mcp=not args.no_mcp, permission_context=permission_context)
output_lines = [f'Tool entries: {len(tools)}', '']
output_lines.extend(f'- {module.name}{module.source_hint}' for module in tools[: args.limit])
print('\n'.join(output_lines))
return 0
if args.command == 'route':
matches = PortRuntime().route_prompt(args.prompt, limit=args.limit)
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
# #168: JSON envelope for machine parsing
if args.output_format == 'json':
import json
envelope = {
'prompt': args.prompt,
'limit': args.limit,
'match_count': len(matches),
'matches': [
{
'kind': m.kind,
'name': m.name,
'score': m.score,
'source_hint': m.source_hint,
}
for m in matches
],
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(envelope, args.command)))
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
return 0
if not matches:
print('No mirrored command/tool matches found.')
return 0
for match in matches:
print(f'{match.kind}\t{match.name}\t{match.score}\t{match.source_hint}')
return 0
if args.command == 'bootstrap':
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
session = PortRuntime().bootstrap_session(args.prompt, limit=args.limit)
# #168: JSON envelope for machine parsing
if args.output_format == 'json':
import json
envelope = {
'prompt': session.prompt,
'limit': args.limit,
'setup': {
'python_version': session.setup.python_version,
'implementation': session.setup.implementation,
'platform_name': session.setup.platform_name,
'test_command': session.setup.test_command,
},
'routed_matches': [
{
'kind': m.kind,
'name': m.name,
'score': m.score,
'source_hint': m.source_hint,
}
for m in session.routed_matches
],
'command_execution_messages': list(session.command_execution_messages),
'tool_execution_messages': list(session.tool_execution_messages),
'turn': {
'prompt': session.turn_result.prompt,
'output': session.turn_result.output,
'stop_reason': session.turn_result.stop_reason,
'cancel_observed': session.turn_result.cancel_observed,
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
},
'persisted_session_path': session.persisted_session_path,
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(envelope, args.command)))
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
return 0
print(session.as_markdown())
return 0
if args.command == 'turn-loop':
fix: #161 — wall-clock timeout for run_turn_loop; stalled turns now abort with stop_reason='timeout' Previously, run_turn_loop was bounded only by max_turns (turn count). If engine.submit_message stalled — slow provider, hung network, infinite stream — the loop blocked indefinitely with no cancellation path. Claws calling run_turn_loop in CI or orchestration had no reliable way to enforce a deadline; the loop would hang until OS kill or human intervention. Fix: - Add timeout_seconds parameter to run_turn_loop (default None = legacy unbounded). - When set, each submit_message call runs inside a ThreadPoolExecutor and is bounded by the remaining wall-clock budget (total across all turns, not per-turn). - On timeout, synthesize a TurnResult with stop_reason='timeout' carrying the turn's prompt and routed matches so transcripts preserve orchestration context. - Exhausted/negative budget short-circuits before calling submit_message. - Legacy path (timeout_seconds=None) bypasses the executor entirely — zero overhead for callers that don't opt in. CLI: - Added --timeout-seconds flag to 'turn-loop' command. - Exit code 2 when the loop terminated on timeout (vs 0 for completed), so shell scripts can distinguish 'done' from 'budget exhausted'. Tests (tests/test_run_turn_loop_timeout.py, 6 tests): - Legacy unbounded path unchanged (timeout_seconds=None never emits 'timeout') - Hung submit_message aborted within budget (0.3s budget, 5s mock hang → exit <1.5s) - Budget is cumulative across turns (0.6s budget, 0.4s per turn, not per-turn) - timeout_seconds=0 short-circuits first turn without calling submit_message - Negative timeout treated as exhausted (guard against caller bugs) - Timeout TurnResult carries correct prompt, matches, UsageSummary shape Full suite: 49/49 passing, zero regression. Blocker: none. Closes ROADMAP #161.
2026-04-22 17:23:43 +09:00
results = PortRuntime().run_turn_loop(
args.prompt,
limit=args.limit,
max_turns=args.max_turns,
structured_output=args.structured_output,
timeout_seconds=args.timeout_seconds,
fix: #163 — remove [turn N] suffix pollution from run_turn_loop; file #164 timeout-cancellation followup #163: run_turn_loop no longer injects f'{prompt} [turn N]' into follow-up prompts. The suffix was never defined or interpreted anywhere — not by the engine, not by the system prompt, not by any LLM. It looked like a real user-typed annotation in the transcript and made replay/analysis fragile. New behaviour: - turn 0 submits the original prompt (unchanged) - turn > 0 submits caller-supplied continuation_prompt if provided, else the loop stops cleanly — no fabricated user turn - added continuation_prompt: str | None = None parameter to run_turn_loop - added --continuation-prompt CLI flag for claws scripting multi-turn loops - zero '[turn' strings ever appear in mutable_messages or stdout now Behaviour change for existing callers: - Before: run_turn_loop(prompt, max_turns=3) submitted 3 turns ('prompt', 'prompt [turn 2]', 'prompt [turn 3]') - After: run_turn_loop(prompt, max_turns=3) submits 1 turn ('prompt') - To preserve old multi-turn behaviour, pass continuation_prompt='Continue.' or any structured follow-up text One existing timeout test (test_budget_is_cumulative_across_turns) updated to pass continuation_prompt so the cumulative-budget contract is actually exercised across turns instead of trivially satisfied by a one-turn loop. #164 filed: addresses reviewer feedback on #161. The wall-clock timeout bounds the caller-facing wait, but the underlying submit_message worker thread keeps running and can mutate engine state after the timeout TurnResult is returned. A cooperative cancel_event pattern is sketched in the pinpoint; real asyncio.Task.cancel() support will come once provider IO is async-native (larger refactor). Tests (tests/test_run_turn_loop_continuation.py, 8 tests): - TestNoTurnSuffixInjection (2): zero '[turn' strings in any submitted prompt, both default and explicit-continuation paths - TestContinuationDefaultStopsAfterTurnZero (2): default loops run exactly one turn; engine.submit_message called exactly once despite max_turns=10 - TestExplicitContinuationBehaviour (2): turn 0 = original, turn N = continuation verbatim; max_turns still respected - TestCLIContinuationFlag (2): CLI default emits only '## Turn 1'; --continuation-prompt wires through to multi-turn behaviour Full suite: 67/67 passing. Closes ROADMAP #163. Files #164.
2026-04-22 17:37:22 +09:00
continuation_prompt=args.continuation_prompt,
fix: #161 — wall-clock timeout for run_turn_loop; stalled turns now abort with stop_reason='timeout' Previously, run_turn_loop was bounded only by max_turns (turn count). If engine.submit_message stalled — slow provider, hung network, infinite stream — the loop blocked indefinitely with no cancellation path. Claws calling run_turn_loop in CI or orchestration had no reliable way to enforce a deadline; the loop would hang until OS kill or human intervention. Fix: - Add timeout_seconds parameter to run_turn_loop (default None = legacy unbounded). - When set, each submit_message call runs inside a ThreadPoolExecutor and is bounded by the remaining wall-clock budget (total across all turns, not per-turn). - On timeout, synthesize a TurnResult with stop_reason='timeout' carrying the turn's prompt and routed matches so transcripts preserve orchestration context. - Exhausted/negative budget short-circuits before calling submit_message. - Legacy path (timeout_seconds=None) bypasses the executor entirely — zero overhead for callers that don't opt in. CLI: - Added --timeout-seconds flag to 'turn-loop' command. - Exit code 2 when the loop terminated on timeout (vs 0 for completed), so shell scripts can distinguish 'done' from 'budget exhausted'. Tests (tests/test_run_turn_loop_timeout.py, 6 tests): - Legacy unbounded path unchanged (timeout_seconds=None never emits 'timeout') - Hung submit_message aborted within budget (0.3s budget, 5s mock hang → exit <1.5s) - Budget is cumulative across turns (0.6s budget, 0.4s per turn, not per-turn) - timeout_seconds=0 short-circuits first turn without calling submit_message - Negative timeout treated as exhausted (guard against caller bugs) - Timeout TurnResult carries correct prompt, matches, UsageSummary shape Full suite: 49/49 passing, zero regression. Blocker: none. Closes ROADMAP #161.
2026-04-22 17:23:43 +09:00
)
feat: #164 Stage B CLOSURE — turn-loop JSON + cancel_observed coverage + CLAWABLE promotion Closes all three gaebal-gajae-identified closure criteria for #164 Stage B: 1. turn-loop runtime surface exposes cancel_observed consistently 2. cancellation path tests validate safe-to-reuse semantics 3. turn-loop promoted from OPT_OUT to CLAWABLE surface Changes: src/main.py: - turn-loop accepts --output-format {text,json} - JSON envelope includes per-turn cancel_observed + final_cancel_observed - All turn fields exposed: prompt, output, stop_reason, cancel_observed, matched_commands, matched_tools - Exit code 2 on final timeout preserved tests/test_cli_parity_audit.py: - CLAWABLE_SURFACES now contains 14 commands (was 13) - Removed 'turn-loop' from OPT_OUT_SURFACES - Parametrized --output-format test auto-validates turn-loop JSON tests/test_cancel_observed_field.py (new, 9 tests): - TestCancelObservedField (5 tests): field contract - default False - explicit True preserved - normal completion → False - bootstrap JSON exposes field - turn-loop JSON exposes per-turn field - TestCancelObservedSafeReuseSemantics (2 tests): reuse contract - timeout result has cancel_observed=True when signaled - engine.mutable_messages not corrupted after cancelled turn - engine accepts fresh message after cancellation - TestCancelObservedSchemaCompliance (2 tests): SCHEMAS.md contract - cancel_observed is always bool - final_cancel_observed convenience field present Closure criteria validated: - ✅ Field exposed in bootstrap JSON - ✅ Field exposed per-turn in turn-loop JSON - ✅ Field is always bool, never null - ✅ Safe-to-reuse: engine can accept fresh messages after cancellation - ✅ mutable_messages not corrupted by cancelled turn - ✅ turn-loop promoted from OPT_OUT (14 clawable commands now) Protocol now distinguishes at runtime: timeout + cancel_observed=false → infra/wedge (escalate) timeout + cancel_observed=true → cooperative cancellation (safe to retry) Test results: 182 → 192 passing, +10 tests, zero regression, 3 skipped unchanged. Closes #164 Stage B. Stage C (async-native preemption) remains future work.
2026-04-22 19:49:20 +09:00
# Exit 2 when a timeout terminated the loop so claws can distinguish
# 'ran to completion' from 'hit wall-clock budget'.
loop_exit_code = 2 if results and results[-1].stop_reason == 'timeout' else 0
if args.output_format == 'json':
# #164 Stage B + #173: JSON envelope with per-turn cancel_observed
# Promotes turn-loop from OPT_OUT to CLAWABLE surface.
import json
envelope = {
'prompt': args.prompt,
'max_turns': args.max_turns,
'turns_completed': len(results),
'timeout_seconds': args.timeout_seconds,
'continuation_prompt': args.continuation_prompt,
'turns': [
{
'prompt': r.prompt,
'output': r.output,
'stop_reason': r.stop_reason,
'cancel_observed': r.cancel_observed,
'matched_commands': list(r.matched_commands),
'matched_tools': list(r.matched_tools),
}
for r in results
],
'final_stop_reason': results[-1].stop_reason if results else None,
'final_cancel_observed': results[-1].cancel_observed if results else False,
}
print(json.dumps(wrap_json_envelope(envelope, args.command, exit_code=loop_exit_code)))
return loop_exit_code
for idx, result in enumerate(results, start=1):
print(f'## Turn {idx}')
print(result.output)
print(f'stop_reason={result.stop_reason}')
feat: #164 Stage B CLOSURE — turn-loop JSON + cancel_observed coverage + CLAWABLE promotion Closes all three gaebal-gajae-identified closure criteria for #164 Stage B: 1. turn-loop runtime surface exposes cancel_observed consistently 2. cancellation path tests validate safe-to-reuse semantics 3. turn-loop promoted from OPT_OUT to CLAWABLE surface Changes: src/main.py: - turn-loop accepts --output-format {text,json} - JSON envelope includes per-turn cancel_observed + final_cancel_observed - All turn fields exposed: prompt, output, stop_reason, cancel_observed, matched_commands, matched_tools - Exit code 2 on final timeout preserved tests/test_cli_parity_audit.py: - CLAWABLE_SURFACES now contains 14 commands (was 13) - Removed 'turn-loop' from OPT_OUT_SURFACES - Parametrized --output-format test auto-validates turn-loop JSON tests/test_cancel_observed_field.py (new, 9 tests): - TestCancelObservedField (5 tests): field contract - default False - explicit True preserved - normal completion → False - bootstrap JSON exposes field - turn-loop JSON exposes per-turn field - TestCancelObservedSafeReuseSemantics (2 tests): reuse contract - timeout result has cancel_observed=True when signaled - engine.mutable_messages not corrupted after cancelled turn - engine accepts fresh message after cancellation - TestCancelObservedSchemaCompliance (2 tests): SCHEMAS.md contract - cancel_observed is always bool - final_cancel_observed convenience field present Closure criteria validated: - ✅ Field exposed in bootstrap JSON - ✅ Field exposed per-turn in turn-loop JSON - ✅ Field is always bool, never null - ✅ Safe-to-reuse: engine can accept fresh messages after cancellation - ✅ mutable_messages not corrupted by cancelled turn - ✅ turn-loop promoted from OPT_OUT (14 clawable commands now) Protocol now distinguishes at runtime: timeout + cancel_observed=false → infra/wedge (escalate) timeout + cancel_observed=true → cooperative cancellation (safe to retry) Test results: 182 → 192 passing, +10 tests, zero regression, 3 skipped unchanged. Closes #164 Stage B. Stage C (async-native preemption) remains future work.
2026-04-22 19:49:20 +09:00
return loop_exit_code
if args.command == 'flush-transcript':
from pathlib import Path as _Path
engine = QueryEnginePort.from_workspace()
# #166: allow deterministic session IDs for claw checkpointing/replay.
# When unset, the engine's auto-generated UUID is used (backward compat).
if args.session_id:
engine.session_id = args.session_id
engine.submit_message(args.prompt)
directory = _Path(args.directory) if args.directory else None
path = engine.persist_session(directory)
if args.output_format == 'json':
import json as _json
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
_env = {
'session_id': engine.session_id,
'path': path,
'flushed': engine.transcript_store.flushed,
'messages_count': len(engine.mutable_messages),
'input_tokens': engine.total_usage.input_tokens,
'output_tokens': engine.total_usage.output_tokens,
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
}
print(_json.dumps(wrap_json_envelope(_env, args.command)))
else:
# #166: legacy text output preserved byte-for-byte for backward compat.
print(path)
print(f'flushed={engine.transcript_store.flushed}')
return 0
if args.command == 'load-session':
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
from pathlib import Path as _Path
directory = _Path(args.directory) if args.directory else None
# #165: catch typed SessionNotFoundError + surface a JSON error envelope
# matching the delete-session contract shape. No more raw tracebacks.
try:
session = load_session(args.session_id, directory)
except SessionNotFoundError as exc:
if args.output_format == 'json':
import json as _json
resolved_dir = str(directory) if directory else '.port_sessions'
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
_env = {
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
'session_id': args.session_id,
'loaded': False,
'error': {
'kind': 'session_not_found',
'message': str(exc),
'directory': resolved_dir,
'retryable': False,
},
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
}
print(_json.dumps(wrap_json_envelope(_env, args.command, exit_code=1)))
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
else:
print(f'error: {exc}')
return 1
except (OSError, ValueError) as exc:
# Corrupted session file, IO error, JSON decode error — distinct
# from 'not found'. Callers may retry here (fs glitch).
if args.output_format == 'json':
import json as _json
resolved_dir = str(directory) if directory else '.port_sessions'
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
_env = {
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
'session_id': args.session_id,
'loaded': False,
'error': {
'kind': 'session_load_failed',
'message': str(exc),
'directory': resolved_dir,
'retryable': True,
},
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
}
print(_json.dumps(wrap_json_envelope(_env, args.command, exit_code=1)))
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
else:
print(f'error: {exc}')
return 1
if args.output_format == 'json':
import json as _json
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
_env = {
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
'session_id': session.session_id,
'loaded': True,
'messages_count': len(session.messages),
'input_tokens': session.input_tokens,
'output_tokens': session.output_tokens,
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
}
print(_json.dumps(wrap_json_envelope(_env, args.command)))
fix: #165 — load-session CLI now parity-matches list/delete (--directory, --output-format, typed JSON errors) The #160 session-lifecycle CLI triplet was asymmetric: list-sessions and delete-session accepted --directory + --output-format and emitted typed JSON error envelopes, but load-session had neither flag and dumped a raw Python traceback (including the SessionNotFoundError class name) on a missing session. Three concrete impacts this fix closes: 1. Alternate session-store locations (e.g. /tmp/claw-run-XXX/.port_sessions) were unreachable via load-session; claws had to chdir or monkeypatch DEFAULT_SESSION_DIR to work around it. 2. Not-found emitted a multi-line Python stack, not a parseable envelope. Claws deciding retry/escalate/give-up had only exit code 1 to work with. 3. The traceback leaked 'src.session_store.SessionNotFoundError' verbatim, coupling version-pinned claws to our internal exception class name. Now all three triplet commands accept the same flag pair and emit the same JSON error shape: Success (json mode): {"session_id": "alpha", "loaded": true, "messages_count": 3, "input_tokens": 42, "output_tokens": 99} Not-found: {"session_id": "missing", "loaded": false, "error": {"kind": "session_not_found", "message": "session 'missing' not found in /path", "directory": "/path", "retryable": false}} Corrupted file: {"session_id": "broken", "loaded": false, "error": {"kind": "session_load_failed", "message": "...", "directory": "/path", "retryable": true}} Exit code contract: - 0 on successful load - 1 on not-found (preserves existing $?) - 1 on OSError/JSONDecodeError (distinct 'kind' in JSON) Backward compat: legacy 'claw load-session ID' text output unchanged byte-for-byte. Only new behaviour is the flags and structured error path. Tests (tests/test_load_session_cli.py, 13 tests): - TestDirectoryFlagParity (2): --directory works + fallback to CWD/.port_sessions - TestOutputFormatFlagParity (2): json schema + text-mode backward compat - TestNotFoundTypedError (2): JSON envelope on not-found; no traceback in either mode; no internal class name leak - TestLoadFailedDistinctFromNotFound (1): corrupted file = session_load_failed with retryable=true, distinct from session_not_found - TestTripletParityConsistency (6): parametrised over [list, delete, load] * [--directory, --output-format] — explicit parity guard for future regressions Full suite: 80/80 passing, zero regression. Discovered via Jobdori dogfood sweep 2026-04-22 17:44 KST — ran 'claw load-session nonexistent' expecting a clean error, got a Python traceback. Filed #165 + fixed in same commit. Closes ROADMAP #165.
2026-04-22 17:44:48 +09:00
else:
print(f'{session.session_id}\n{len(session.messages)} messages\nin={session.input_tokens} out={session.output_tokens}')
return 0
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
if args.command == 'list-sessions':
from pathlib import Path as _Path
directory = _Path(args.directory) if args.directory else None
ids = list_sessions(directory)
if args.output_format == 'json':
import json as _json
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
_env = {'sessions': ids, 'count': len(ids)}
print(_json.dumps(wrap_json_envelope(_env, args.command)))
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
else:
if not ids:
print('(no sessions)')
else:
for sid in ids:
print(sid)
return 0
if args.command == 'delete-session':
from pathlib import Path as _Path
directory = _Path(args.directory) if args.directory else None
try:
deleted = delete_session(args.session_id, directory)
except SessionDeleteError as exc:
if args.output_format == 'json':
import json as _json
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
_env = {
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
'session_id': args.session_id,
'deleted': False,
'error': {
'kind': 'session_delete_failed',
'message': str(exc),
'retryable': True,
},
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
}
print(_json.dumps(wrap_json_envelope(_env, args.command, exit_code=1)))
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
else:
print(f'error: {exc}')
return 1
if args.output_format == 'json':
import json as _json
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
_env = {
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
'session_id': args.session_id,
'deleted': deleted,
'status': 'deleted' if deleted else 'not_found',
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
}
print(_json.dumps(wrap_json_envelope(_env, args.command)))
feat(#160): wire claw list-sessions and delete-session CLI commands Closes the last #160 gap: claws can now manage session lifecycle entirely through the CLI without filesystem hacks. New commands: - claw list-sessions [--directory DIR] [--output-format text|json] Enumerates stored session IDs. JSON mode emits {sessions, count}. Missing/empty directories return empty list (exit 0), not an error. - claw delete-session SESSION_ID [--directory DIR] [--output-format text|json] Idempotent: not-found is exit 0 with status='not_found' (no raise). Partial-failure: exit 1 with typed JSON error envelope: {session_id, deleted: false, error: {kind, message, retryable}} The 'session_delete_failed' kind is retryable=true so orchestrators know to retry vs escalate. Public API surface extended in src/__init__.py: - list_sessions, session_exists, delete_session - SessionNotFoundError, SessionDeleteError Tests added (tests/test_porting_workspace.py): - test_list_sessions_cli_runs: text + json modes against tempdir - test_delete_session_cli_idempotent: first call deleted=true, second call deleted=false (exit 0, status=not_found) - test_delete_session_cli_partial_failure_exit_1: permission error surfaces as exit 1 + typed JSON error with retryable=true All 43 tests pass. The session storage abstraction chapter is closed: - storage layer decoupled from claw code (#160 initial impl) - delete contract hardened + caller-audited (#160 hardening pass) - CLI wired with idempotency preserved at exit-code boundary (this commit)
2026-04-22 17:16:53 +09:00
else:
if deleted:
print(f'deleted: {args.session_id}')
else:
print(f'not found: {args.session_id}')
# Exit 0 for both cases — delete_session is idempotent,
# not-found is success from a cleanup perspective
return 0
if args.command == 'remote-mode':
print(run_remote_mode(args.target).as_text())
return 0
if args.command == 'ssh-mode':
print(run_ssh_mode(args.target).as_text())
return 0
if args.command == 'teleport-mode':
print(run_teleport_mode(args.target).as_text())
return 0
if args.command == 'direct-connect-mode':
print(run_direct_connect(args.target).as_text())
return 0
if args.command == 'deep-link-mode':
print(run_deep_link(args.target).as_text())
return 0
if args.command == 'show-command':
module = get_command(args.name)
if module is None:
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
if args.output_format == 'json':
import json
error_envelope = {
'name': args.name,
'found': False,
'error': {
'kind': 'command_not_found',
'message': f'Unknown command: {args.name}',
'retryable': False,
},
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(error_envelope, args.command, exit_code=1)))
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
else:
print(f'Command not found: {args.name}')
return 1
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
if args.output_format == 'json':
import json
output = {
'name': module.name,
'found': True,
'source_hint': module.source_hint,
'responsibility': module.responsibility,
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(output, args.command)))
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
else:
print('\n'.join([module.name, module.source_hint, module.responsibility]))
return 0
if args.command == 'show-tool':
module = get_tool(args.name)
if module is None:
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
if args.output_format == 'json':
import json
error_envelope = {
'name': args.name,
'found': False,
'error': {
'kind': 'tool_not_found',
'message': f'Unknown tool: {args.name}',
'retryable': False,
},
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(error_envelope, args.command, exit_code=1)))
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
else:
print(f'Tool not found: {args.name}')
return 1
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
if args.output_format == 'json':
import json
output = {
'name': module.name,
'found': True,
'source_hint': module.source_hint,
'responsibility': module.responsibility,
}
feat: #173 — wrap_json_envelope() applied to all 13 clawable commands (LOOP CLOSED) Completes the coverage → enforcement → documentation → alignment cycle. Every clawable command now emits the canonical JSON envelope per SCHEMAS.md: Common fields (now real in output): - timestamp (ISO 8601 UTC) - command (argv[1]) - exit_code (0/1/2) - output_format ('json') - schema_version ('1.0') 13 commands wrapped: - list-sessions, delete-session, load-session, flush-transcript - show-command, show-tool - exec-command, exec-tool, route, bootstrap - command-graph, tool-pool, bootstrap-graph Implementation: - Added wrap_json_envelope() helper in src/main.py - Wrapped all 18 JSON output paths (13 success + 5 error paths) - Applied exit_code=1 to error/not-found envelopes - Kept text mode byte-identical (backward compat preserved) Test updates: - 3 skipped common-field tests now pass automatically - 3 existing tests updated to verify common envelope fields while preserving command-specific field checks - test_list_sessions_cli_runs, test_delete_session_cli_idempotent, test_load_session_cli::test_json_mode_on_success Full suite: 179 → 182 passing (+3 activated from skipped), zero regression. Loop completion: Coverage (#167-#170) ✅ All 13 commands accept --output-format Enforcement (#171) ✅ CI blocks new commands without --output-format Documentation (#172) ✅ SCHEMAS.md defines envelope contract Alignment (#173 this) ✅ Actual output matches SCHEMAS.md contract Example output now: $ claw list-sessions --output-format json { "timestamp": "2026-04-22T10:34:12Z", "command": "list-sessions", "exit_code": 0, "output_format": "json", "schema_version": "1.0", "sessions": ["alpha", "bravo"], "count": 2 } Closes ROADMAP #173. Protocol is now documented AND real. Claws can build ONE error handler, ONE timestamp parser, ONE version check instead of 13 special cases.
2026-04-22 19:35:37 +09:00
print(json.dumps(wrap_json_envelope(output, args.command)))
fix: #167 — show-command and show-tool now accept --output-format flag; CLI parity with session-lifecycle family Closes the inspect-capability parity gap: show-command and show-tool were the only discovery/inspection CLI commands lacking --output-format support, making them outliers in the ecosystem that already had unified JSON contracts across list-sessions, load-session, delete-session, and flush-transcript (#160/#165/#166). Concrete additions: - show-command: --output-format {text,json} - show-tool: --output-format {text,json} JSON envelope shape (found case): {name, found: true, source_hint, responsibility} JSON envelope shape (not-found case): {name, found: false, error: {kind:'command_not_found'|'tool_not_found', message, retryable: false}} Exit codes: 0 = success 1 = not found Backward compatibility: - Default (no --output-format) is 'text' (unchanged) - Text output byte-identical to pre-#167 (three newline-separated lines) Tests (10 new, test_show_command_tool_output_format.py): - TestShowCommandOutputFormat (5): found + not-found in JSON; text mode backward compat; text is default - TestShowToolOutputFormat (3): found + not-found in JSON; text mode backward compat - TestShowCommandToolFormatParity (2): both accept same flag choices; consistent JSON envelope shape Full suite: 114 → 124 passing, zero regression. Closes ROADMAP #167. Why this matters: Before: Claws calling show-command/show-tool had to parse human-readable prose output via regex, with no structured error signal. After: Same envelope contract as load-session and friends: JSON-first, typed errors, machine-parseable. Related clusters: - Session-lifecycle CLI parity family (#160, #165, #166, #167) - Machine-readable error contracts (same vein as #162 atomicity + #164 cancellation state-safety: structured boundaries for orchestration)
2026-04-22 18:21:38 +09:00
else:
print('\n'.join([module.name, module.source_hint, module.responsibility]))
return 0
if args.command == 'exec-command':
result = execute_command(args.name, args.prompt)
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
# #168: JSON envelope with typed not-found error
fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool) Cycle #26 dogfood found a real red-state bug in the JSON envelope contract. ## The Bug exec-command and exec-tool not-found cases return exit code 1 from the process, but the envelope reports exit_code: 0 (the default from wrap_json_envelope). This is a protocol violation. Repro (before fix): $ claw exec-command unknown-cmd test --output-format json > out.json $ echo $? 1 $ jq '.exit_code' out.json 0 # WRONG — envelope lies about exit code Claws reading the envelope's exit_code field get misinformation. A claw implementing the canonical ERROR_HANDLING.md pattern (check exit_code, then classify by error.kind) would incorrectly treat failures as successes when dispatching on the envelope alone. ## Root Cause main.py lines 687–739 (exec-command + exec-tool handlers): - Return statement: 'return 0 if result.handled else 1' (correct) - Envelope wrap: 'wrap_json_envelope(envelope, args.command)' (uses default exit_code=0, IGNORES the return value) The envelope wrap was called BEFORE the return value was computed, so the exit_code field was never synchronized with the actual exit code. ## The Fix Compute exit_code ONCE at the top: exit_code = 0 if result.handled else 1 Pass it explicitly to wrap_json_envelope: wrap_json_envelope(envelope, args.command, exit_code=exit_code) Return the same value: return exit_code This ensures the envelope's exit_code field is always truth — the SAME value the process returns. ## Tests Added (3) TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py: 1. test_exec_command_not_found_envelope_exit_matches: Verifies exec-command unknown-cmd returns exit 1 in both envelope and process. 2. test_exec_tool_not_found_envelope_exit_matches: Same for exec-tool. 3. test_all_commands_exit_code_invariant: Audit across 4 known non-zero cases (show-command, show-tool, exec-command, exec-tool not-found). Guards against the same bug in other surfaces. ## Impact - 206 → 209 passing tests (+3) - Zero regressions - Protocol contract now truthful: envelope.exit_code == process exit - Claws using the one-handler pattern from ERROR_HANDLING.md now get correct information ## Related - ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable contract field - #178/#179 (cycles #19/#20): Closed parser-front-door contract - This closes a gap in the WORK PROTOCOL contract — envelope values must match reality, not just be structurally present. Classification (per cycle #24 calibration): - Red-state bug: ✓ (contract violation, claws get misinformation) - Real friction: ✓ (discovered via dogfood, not speculative) - Fix ships same-cycle: ✓ (discipline per maintainership mode) Source: Jobdori cycle #26 dogfood — ran multiple edge-case probes, noticed exec-command envelope showed exit_code: 0 while process exited 1. Investigated wrap_json_envelope default behavior, confirmed bug, fixed and tested in same cycle.
2026-04-22 21:33:57 +09:00
# #181: envelope exit_code must match process exit code
exit_code = 0 if result.handled else 1
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
if args.output_format == 'json':
import json
if not result.handled:
envelope = {
'name': args.name,
'prompt': args.prompt,
'handled': False,
'error': {
'kind': 'command_not_found',
'message': result.message,
'retryable': False,
},
}
else:
envelope = {
'name': result.name,
'prompt': result.prompt,
'source_hint': result.source_hint,
'handled': True,
'message': result.message,
}
fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool) Cycle #26 dogfood found a real red-state bug in the JSON envelope contract. ## The Bug exec-command and exec-tool not-found cases return exit code 1 from the process, but the envelope reports exit_code: 0 (the default from wrap_json_envelope). This is a protocol violation. Repro (before fix): $ claw exec-command unknown-cmd test --output-format json > out.json $ echo $? 1 $ jq '.exit_code' out.json 0 # WRONG — envelope lies about exit code Claws reading the envelope's exit_code field get misinformation. A claw implementing the canonical ERROR_HANDLING.md pattern (check exit_code, then classify by error.kind) would incorrectly treat failures as successes when dispatching on the envelope alone. ## Root Cause main.py lines 687–739 (exec-command + exec-tool handlers): - Return statement: 'return 0 if result.handled else 1' (correct) - Envelope wrap: 'wrap_json_envelope(envelope, args.command)' (uses default exit_code=0, IGNORES the return value) The envelope wrap was called BEFORE the return value was computed, so the exit_code field was never synchronized with the actual exit code. ## The Fix Compute exit_code ONCE at the top: exit_code = 0 if result.handled else 1 Pass it explicitly to wrap_json_envelope: wrap_json_envelope(envelope, args.command, exit_code=exit_code) Return the same value: return exit_code This ensures the envelope's exit_code field is always truth — the SAME value the process returns. ## Tests Added (3) TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py: 1. test_exec_command_not_found_envelope_exit_matches: Verifies exec-command unknown-cmd returns exit 1 in both envelope and process. 2. test_exec_tool_not_found_envelope_exit_matches: Same for exec-tool. 3. test_all_commands_exit_code_invariant: Audit across 4 known non-zero cases (show-command, show-tool, exec-command, exec-tool not-found). Guards against the same bug in other surfaces. ## Impact - 206 → 209 passing tests (+3) - Zero regressions - Protocol contract now truthful: envelope.exit_code == process exit - Claws using the one-handler pattern from ERROR_HANDLING.md now get correct information ## Related - ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable contract field - #178/#179 (cycles #19/#20): Closed parser-front-door contract - This closes a gap in the WORK PROTOCOL contract — envelope values must match reality, not just be structurally present. Classification (per cycle #24 calibration): - Red-state bug: ✓ (contract violation, claws get misinformation) - Real friction: ✓ (discovered via dogfood, not speculative) - Fix ships same-cycle: ✓ (discipline per maintainership mode) Source: Jobdori cycle #26 dogfood — ran multiple edge-case probes, noticed exec-command envelope showed exit_code: 0 while process exited 1. Investigated wrap_json_envelope default behavior, confirmed bug, fixed and tested in same cycle.
2026-04-22 21:33:57 +09:00
print(json.dumps(wrap_json_envelope(envelope, args.command, exit_code=exit_code)))
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
else:
print(result.message)
fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool) Cycle #26 dogfood found a real red-state bug in the JSON envelope contract. ## The Bug exec-command and exec-tool not-found cases return exit code 1 from the process, but the envelope reports exit_code: 0 (the default from wrap_json_envelope). This is a protocol violation. Repro (before fix): $ claw exec-command unknown-cmd test --output-format json > out.json $ echo $? 1 $ jq '.exit_code' out.json 0 # WRONG — envelope lies about exit code Claws reading the envelope's exit_code field get misinformation. A claw implementing the canonical ERROR_HANDLING.md pattern (check exit_code, then classify by error.kind) would incorrectly treat failures as successes when dispatching on the envelope alone. ## Root Cause main.py lines 687–739 (exec-command + exec-tool handlers): - Return statement: 'return 0 if result.handled else 1' (correct) - Envelope wrap: 'wrap_json_envelope(envelope, args.command)' (uses default exit_code=0, IGNORES the return value) The envelope wrap was called BEFORE the return value was computed, so the exit_code field was never synchronized with the actual exit code. ## The Fix Compute exit_code ONCE at the top: exit_code = 0 if result.handled else 1 Pass it explicitly to wrap_json_envelope: wrap_json_envelope(envelope, args.command, exit_code=exit_code) Return the same value: return exit_code This ensures the envelope's exit_code field is always truth — the SAME value the process returns. ## Tests Added (3) TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py: 1. test_exec_command_not_found_envelope_exit_matches: Verifies exec-command unknown-cmd returns exit 1 in both envelope and process. 2. test_exec_tool_not_found_envelope_exit_matches: Same for exec-tool. 3. test_all_commands_exit_code_invariant: Audit across 4 known non-zero cases (show-command, show-tool, exec-command, exec-tool not-found). Guards against the same bug in other surfaces. ## Impact - 206 → 209 passing tests (+3) - Zero regressions - Protocol contract now truthful: envelope.exit_code == process exit - Claws using the one-handler pattern from ERROR_HANDLING.md now get correct information ## Related - ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable contract field - #178/#179 (cycles #19/#20): Closed parser-front-door contract - This closes a gap in the WORK PROTOCOL contract — envelope values must match reality, not just be structurally present. Classification (per cycle #24 calibration): - Red-state bug: ✓ (contract violation, claws get misinformation) - Real friction: ✓ (discovered via dogfood, not speculative) - Fix ships same-cycle: ✓ (discipline per maintainership mode) Source: Jobdori cycle #26 dogfood — ran multiple edge-case probes, noticed exec-command envelope showed exit_code: 0 while process exited 1. Investigated wrap_json_envelope default behavior, confirmed bug, fixed and tested in same cycle.
2026-04-22 21:33:57 +09:00
return exit_code
if args.command == 'exec-tool':
result = execute_tool(args.name, args.payload)
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
# #168: JSON envelope with typed not-found error
fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool) Cycle #26 dogfood found a real red-state bug in the JSON envelope contract. ## The Bug exec-command and exec-tool not-found cases return exit code 1 from the process, but the envelope reports exit_code: 0 (the default from wrap_json_envelope). This is a protocol violation. Repro (before fix): $ claw exec-command unknown-cmd test --output-format json > out.json $ echo $? 1 $ jq '.exit_code' out.json 0 # WRONG — envelope lies about exit code Claws reading the envelope's exit_code field get misinformation. A claw implementing the canonical ERROR_HANDLING.md pattern (check exit_code, then classify by error.kind) would incorrectly treat failures as successes when dispatching on the envelope alone. ## Root Cause main.py lines 687–739 (exec-command + exec-tool handlers): - Return statement: 'return 0 if result.handled else 1' (correct) - Envelope wrap: 'wrap_json_envelope(envelope, args.command)' (uses default exit_code=0, IGNORES the return value) The envelope wrap was called BEFORE the return value was computed, so the exit_code field was never synchronized with the actual exit code. ## The Fix Compute exit_code ONCE at the top: exit_code = 0 if result.handled else 1 Pass it explicitly to wrap_json_envelope: wrap_json_envelope(envelope, args.command, exit_code=exit_code) Return the same value: return exit_code This ensures the envelope's exit_code field is always truth — the SAME value the process returns. ## Tests Added (3) TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py: 1. test_exec_command_not_found_envelope_exit_matches: Verifies exec-command unknown-cmd returns exit 1 in both envelope and process. 2. test_exec_tool_not_found_envelope_exit_matches: Same for exec-tool. 3. test_all_commands_exit_code_invariant: Audit across 4 known non-zero cases (show-command, show-tool, exec-command, exec-tool not-found). Guards against the same bug in other surfaces. ## Impact - 206 → 209 passing tests (+3) - Zero regressions - Protocol contract now truthful: envelope.exit_code == process exit - Claws using the one-handler pattern from ERROR_HANDLING.md now get correct information ## Related - ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable contract field - #178/#179 (cycles #19/#20): Closed parser-front-door contract - This closes a gap in the WORK PROTOCOL contract — envelope values must match reality, not just be structurally present. Classification (per cycle #24 calibration): - Red-state bug: ✓ (contract violation, claws get misinformation) - Real friction: ✓ (discovered via dogfood, not speculative) - Fix ships same-cycle: ✓ (discipline per maintainership mode) Source: Jobdori cycle #26 dogfood — ran multiple edge-case probes, noticed exec-command envelope showed exit_code: 0 while process exited 1. Investigated wrap_json_envelope default behavior, confirmed bug, fixed and tested in same cycle.
2026-04-22 21:33:57 +09:00
# #181: envelope exit_code must match process exit code
exit_code = 0 if result.handled else 1
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
if args.output_format == 'json':
import json
if not result.handled:
envelope = {
'name': args.name,
'payload': args.payload,
'handled': False,
'error': {
'kind': 'tool_not_found',
'message': result.message,
'retryable': False,
},
}
else:
envelope = {
'name': result.name,
'payload': result.payload,
'source_hint': result.source_hint,
'handled': True,
'message': result.message,
}
fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool) Cycle #26 dogfood found a real red-state bug in the JSON envelope contract. ## The Bug exec-command and exec-tool not-found cases return exit code 1 from the process, but the envelope reports exit_code: 0 (the default from wrap_json_envelope). This is a protocol violation. Repro (before fix): $ claw exec-command unknown-cmd test --output-format json > out.json $ echo $? 1 $ jq '.exit_code' out.json 0 # WRONG — envelope lies about exit code Claws reading the envelope's exit_code field get misinformation. A claw implementing the canonical ERROR_HANDLING.md pattern (check exit_code, then classify by error.kind) would incorrectly treat failures as successes when dispatching on the envelope alone. ## Root Cause main.py lines 687–739 (exec-command + exec-tool handlers): - Return statement: 'return 0 if result.handled else 1' (correct) - Envelope wrap: 'wrap_json_envelope(envelope, args.command)' (uses default exit_code=0, IGNORES the return value) The envelope wrap was called BEFORE the return value was computed, so the exit_code field was never synchronized with the actual exit code. ## The Fix Compute exit_code ONCE at the top: exit_code = 0 if result.handled else 1 Pass it explicitly to wrap_json_envelope: wrap_json_envelope(envelope, args.command, exit_code=exit_code) Return the same value: return exit_code This ensures the envelope's exit_code field is always truth — the SAME value the process returns. ## Tests Added (3) TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py: 1. test_exec_command_not_found_envelope_exit_matches: Verifies exec-command unknown-cmd returns exit 1 in both envelope and process. 2. test_exec_tool_not_found_envelope_exit_matches: Same for exec-tool. 3. test_all_commands_exit_code_invariant: Audit across 4 known non-zero cases (show-command, show-tool, exec-command, exec-tool not-found). Guards against the same bug in other surfaces. ## Impact - 206 → 209 passing tests (+3) - Zero regressions - Protocol contract now truthful: envelope.exit_code == process exit - Claws using the one-handler pattern from ERROR_HANDLING.md now get correct information ## Related - ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable contract field - #178/#179 (cycles #19/#20): Closed parser-front-door contract - This closes a gap in the WORK PROTOCOL contract — envelope values must match reality, not just be structurally present. Classification (per cycle #24 calibration): - Red-state bug: ✓ (contract violation, claws get misinformation) - Real friction: ✓ (discovered via dogfood, not speculative) - Fix ships same-cycle: ✓ (discipline per maintainership mode) Source: Jobdori cycle #26 dogfood — ran multiple edge-case probes, noticed exec-command envelope showed exit_code: 0 while process exited 1. Investigated wrap_json_envelope default behavior, confirmed bug, fixed and tested in same cycle.
2026-04-22 21:33:57 +09:00
print(json.dumps(wrap_json_envelope(envelope, args.command, exit_code=exit_code)))
fix: #168 — exec-command / exec-tool / route / bootstrap now accept --output-format; CLI family JSON parity COMPLETE Extends the #167 inspect-surface parity fix to the four remaining CLI outliers: the commands claws actually invoke to DO work, not just inspect state. After this commit, the entire claw-code CLI family speaks a unified JSON envelope contract. Concrete additions: - exec-command: --output-format {text,json} - exec-tool: --output-format {text,json} - route: --output-format {text,json} - bootstrap: --output-format {text,json} JSON envelope shapes: exec-command (handled): {name, prompt, source_hint, handled: true, message} exec-command (not-found): {name, prompt, handled: false, error: {kind:'command_not_found', message, retryable: false}} exec-tool (handled): {name, payload, source_hint, handled: true, message} exec-tool (not-found): {name, payload, handled: false, error: {kind:'tool_not_found', message, retryable: false}} route: {prompt, limit, match_count, matches: [{kind, name, score, source_hint}]} bootstrap: {prompt, limit, setup: {python_version, implementation, platform_name, test_command}, routed_matches: [{kind, name, score, source_hint}], command_execution_messages: [str], tool_execution_messages: [str], turn: {prompt, output, stop_reason}, persisted_session_path} Exit codes (unchanged from pre-#168): 0 = success 1 = exec not-found (exec-command, exec-tool only) Backward compatibility: - Default (no --output-format) is 'text' - exec-command/exec-tool text output byte-identical - route text output: unchanged tab-separated kind/name/score/source_hint - bootstrap text output: unchanged Markdown runtime session report Tests (13 new, test_exec_route_bootstrap_output_format.py): - TestExecCommandOutputFormat (3): handled + not-found JSON; text compat - TestExecToolOutputFormat (3): handled + not-found JSON; text compat - TestRouteOutputFormat (3): JSON envelope; zero-matches case; text compat - TestBootstrapOutputFormat (2): JSON envelope; text-mode Markdown compat - TestFamilyWideJsonParity (2): parametrised over ALL 6 family commands (show-command, show-tool, exec-command, exec-tool, route, bootstrap) — every one accepts --output-format json and emits parseable JSON; every one defaults to text mode without a leading {. One future regression on any family member breaks this test. Full suite: 124 → 137 passing, zero regression. Closes ROADMAP #168. This completes the CLI-wide JSON parity sweep: - Session-lifecycle family: #160 (list/delete), #165 (load), #166 (flush) - Inspect family: #167 (show-command, show-tool) - Work-verb family: #168 (exec-command, exec-tool, route, bootstrap) ENTIRE CLI SURFACE is now machine-readable via --output-format json with typed errors, deterministic exit codes, and consistent envelope shape. Claws no longer need to regex-parse any CLI output. Related clusters: - Clawability principle: 'machine-readable in state and failure modes' (ROADMAP top-level). 9 pinpoints in this cluster; all now landed. - Typed-error envelope consistency: command_not_found / tool_not_found / session_not_found / session_load_failed all share {kind, message, retryable} shape. - Work-verb semantics: exec-* surfaces expose 'handled' boolean (not 'found') because 'not handled' is the operational signal — claws dispatch on whether the work was performed, not whether the entry exists in the inventory.
2026-04-22 18:34:26 +09:00
else:
print(result.message)
fix: #181 — envelope exit_code must match process exit code (exec-command/exec-tool) Cycle #26 dogfood found a real red-state bug in the JSON envelope contract. ## The Bug exec-command and exec-tool not-found cases return exit code 1 from the process, but the envelope reports exit_code: 0 (the default from wrap_json_envelope). This is a protocol violation. Repro (before fix): $ claw exec-command unknown-cmd test --output-format json > out.json $ echo $? 1 $ jq '.exit_code' out.json 0 # WRONG — envelope lies about exit code Claws reading the envelope's exit_code field get misinformation. A claw implementing the canonical ERROR_HANDLING.md pattern (check exit_code, then classify by error.kind) would incorrectly treat failures as successes when dispatching on the envelope alone. ## Root Cause main.py lines 687–739 (exec-command + exec-tool handlers): - Return statement: 'return 0 if result.handled else 1' (correct) - Envelope wrap: 'wrap_json_envelope(envelope, args.command)' (uses default exit_code=0, IGNORES the return value) The envelope wrap was called BEFORE the return value was computed, so the exit_code field was never synchronized with the actual exit code. ## The Fix Compute exit_code ONCE at the top: exit_code = 0 if result.handled else 1 Pass it explicitly to wrap_json_envelope: wrap_json_envelope(envelope, args.command, exit_code=exit_code) Return the same value: return exit_code This ensures the envelope's exit_code field is always truth — the SAME value the process returns. ## Tests Added (3) TestEnvelopeExitCodeMatchesProcessExit in test_exec_route_bootstrap_output_format.py: 1. test_exec_command_not_found_envelope_exit_matches: Verifies exec-command unknown-cmd returns exit 1 in both envelope and process. 2. test_exec_tool_not_found_envelope_exit_matches: Same for exec-tool. 3. test_all_commands_exit_code_invariant: Audit across 4 known non-zero cases (show-command, show-tool, exec-command, exec-tool not-found). Guards against the same bug in other surfaces. ## Impact - 206 → 209 passing tests (+3) - Zero regressions - Protocol contract now truthful: envelope.exit_code == process exit - Claws using the one-handler pattern from ERROR_HANDLING.md now get correct information ## Related - ERROR_HANDLING.md (cycle #22): Documented exit_code as machine-readable contract field - #178/#179 (cycles #19/#20): Closed parser-front-door contract - This closes a gap in the WORK PROTOCOL contract — envelope values must match reality, not just be structurally present. Classification (per cycle #24 calibration): - Red-state bug: ✓ (contract violation, claws get misinformation) - Real friction: ✓ (discovered via dogfood, not speculative) - Fix ships same-cycle: ✓ (discipline per maintainership mode) Source: Jobdori cycle #26 dogfood — ran multiple edge-case probes, noticed exec-command envelope showed exit_code: 0 while process exited 1. Investigated wrap_json_envelope default behavior, confirmed bug, fixed and tested in same cycle.
2026-04-22 21:33:57 +09:00
return exit_code
parser.error(f'unknown command: {args.command}')
return 2
if __name__ == '__main__':
raise SystemExit(main())