resq-health

Version: v0.1.16 · License: Apache-2.0 · Crate: crates.io · API docs: docs.rs
Service health diagnostic dashboard for ResQ platform

Overview

Service health monitoring dashboard for the ResQ platform. Polls all service health endpoints concurrently and displays live status with latency in a Ratatui TUI. Also serves as a CI health gate via --check mode, and supports integration test execution via --test.

resq-health monitors five core ResQ services by issuing HTTP health checks on a configurable interval. Standard services are probed with GET /health, while the Neo N3 RPC node uses a JSON-RPC getversion call. Results are displayed in a navigable table with color-coded status indicators, latency measurements, and error diagnostics. A non-interactive --check mode provides deterministic exit codes for CI pipelines.
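
The two probe styles described above can be sketched roughly as follows. This is an illustration only, not the crate's actual code: the `Probe` enum and the `request_parts` function are hypothetical names, and the request body assumes the standard JSON-RPC 2.0 envelope with an empty parameter list.

```rust
/// How a service is probed (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Probe {
    /// Plain HTTP GET against a path such as `/health`.
    HttpGet { path: &'static str },
    /// JSON-RPC 2.0 POST with the given method name and no parameters.
    JsonRpc { method: &'static str },
}

/// Returns (HTTP method, full URL, optional request body) for a probe.
fn request_parts(base_url: &str, probe: &Probe) -> (&'static str, String, Option<String>) {
    // Normalise the base URL so that joining never produces a double slash.
    let base = base_url.trim_end_matches('/');
    match probe {
        Probe::HttpGet { path } => ("GET", format!("{base}{path}"), None),
        Probe::JsonRpc { method } => {
            // Assumed envelope: JSON-RPC 2.0 with an empty `params` array.
            let body = format!(
                r#"{{"jsonrpc":"2.0","method":"{method}","params":[],"id":1}}"#
            );
            ("POST", format!("{base}/"), Some(body))
        }
    }
}
```

Under these assumptions, a standard service maps to a GET on its base URL plus /health, while the Neo N3 node receives a POST carrying the getversion call.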

Installation

# From the workspace root
cargo install --path crates/resq-health

# Or build locally
cargo build --release -p resq-health
# Binary: target/release/resq-health

CLI Arguments

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| --check | -c | off | CI mode: run a single health check cycle, print results, and exit with a status code |
| --interval <N> | -i | 5 | Polling interval in seconds (TUI mode only) |
| --test <PATH> | -t | | Path to an integration test script or directory to execute |
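
To make the flag semantics and defaults concrete, here is a dependency-free sketch of how the arguments could map to a settings struct. The crate itself parses arguments with clap (see Dependencies); the `Cli` struct and `parse_args` function below are hypothetical stand-ins written with the standard library only.

```rust
/// Parsed command-line options (hypothetical struct mirroring the flag table).
#[derive(Debug, PartialEq)]
struct Cli {
    check: bool,
    interval: u64,
    test: Option<String>,
}

/// Parses arguments (excluding the program name) into `Cli`.
fn parse_args(args: &[&str]) -> Result<Cli, String> {
    // Defaults from the table: --check off, --interval 5, no --test path.
    let mut cli = Cli { check: false, interval: 5, test: None };
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        match *arg {
            "--check" | "-c" => cli.check = true,
            "--interval" | "-i" => {
                let value = iter.next().ok_or("--interval requires a value")?;
                cli.interval = value
                    .parse()
                    .map_err(|_| format!("invalid interval: {value}"))?;
            }
            "--test" | "-t" => {
                let path = iter.next().ok_or("--test requires a path")?;
                cli.test = Some(path.to_string());
            }
            other => return Err(format!("unknown argument: {other}")),
        }
    }
    Ok(cli)
}
```

For example, `parse_args(&["-i", "10"])` yields an interval of 10 with check mode off, matching the short-form invocation shown under Usage Examples.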

Usage Examples

Interactive TUI (default)

# Launch the health dashboard with default 5-second polling
resq-health

# Increase polling interval to 10 seconds
resq-health --interval 10

# Short form
resq-health -i 10

CI / Health Gate Mode

# Single check, exits with status code
resq-health --check

# Use as a deployment gate
resq-health --check || { echo "Services not ready"; exit 1; }

# Combine with deploy
resq-deploy --env dev --action up
resq-health --check

Integration Tests

# Run integration test scripts
resq-health --test tests/smoke.sh

# Run tests from a directory
resq-health --test tests/integration/

Integration Test JSON Format

[
  {
    "name": "infrastructure-api health",
    "method": "GET",
    "url": "http://localhost:5000/health",
    "expect_status": 200
  },
  {
    "name": "create incident",
    "method": "POST",
    "url": "http://localhost:5000/incidents",
    "body": { "incident_type": "FLOOD", "severity": "HIGH" },
    "expect_status": 201
  }
]
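
As a rough sketch of how one entry in this format could be represented and checked, the snippet below mirrors the JSON fields in a plain struct and compares the observed HTTP status with expect_status. The `TestCase` struct, the `evaluate` function, and the PASS/FAIL line format are assumptions made for illustration; the crate's real types and output may differ.

```rust
/// One entry of the JSON array shown above (hypothetical in-memory form).
#[derive(Debug, Clone)]
struct TestCase {
    name: String,
    method: String,
    url: String,
    /// Raw JSON request body, if the case defines one.
    body: Option<String>,
    expect_status: u16,
}

/// Compares the observed HTTP status with `expect_status` and formats a result line.
fn evaluate(case: &TestCase, actual_status: u16) -> (bool, String) {
    let passed = actual_status == case.expect_status;
    let verdict = if passed { "PASS" } else { "FAIL" };
    let line = format!(
        "{verdict} {} ({} {}): expected {}, got {actual_status}",
        case.name, case.method, case.url, case.expect_status
    );
    (passed, line)
}
```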

TUI Layout

+-----------------------------------------------------+
|  ResQ Health Monitor          Up: 42s                |
|  3/5 SERVICES HEALTHY                                |
+-----------------------------------------------------+
|  SERVICE              STATUS      LATENCY  MESSAGE   |
|                                                      |
|  infrastructure-api   Healthy     45ms     -         |
|  coordination-hce     Healthy     23ms     -         |
|  intelligence-pdie    Degraded    1250ms   -         |
|  neo-n3-rpc           Healthy     89ms     -         |
|  web-dashboard        Unhealthy   5000ms   Timeout   |
+-----------------------------------------------------+
|  [Q] Quit   [Up/Down] Nav   [Enter] Details          |
+-----------------------------------------------------+

Detail View

Pressing Enter on a service shows an expanded detail panel with the full URL, status, latency, and diagnostic error message.

Keyboard Shortcuts

| Key | Action |
|-----|--------|
| q / Esc | Quit the application |
| Down / j | Move selection down |
| Up / k | Move selection up |
| Enter | Toggle detail view for selected service |
| h | Toggle help popup |
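
The key bindings above amount to a small state machine. The sketch below shows one way to express it; the `Key` enum, `UiState` struct, and `handle_key` function are hypothetical, and it assumes the selection stops at the first and last rows instead of wrapping around.

```rust
/// Keys the dashboard reacts to (simplified, hypothetical input type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Key {
    Char(char),
    Esc,
    Up,
    Down,
    Enter,
}

/// Minimal UI state: the selected row plus the toggles from the table.
#[derive(Debug, Default, PartialEq)]
struct UiState {
    selected: usize,
    show_detail: bool,
    show_help: bool,
    should_quit: bool,
}

/// Applies one key press to the state.
fn handle_key(state: &mut UiState, key: Key, service_count: usize) {
    match key {
        Key::Char('q') | Key::Esc => state.should_quit = true,
        Key::Down | Key::Char('j') => {
            // Assumption: selection stops at the last row rather than wrapping.
            if state.selected + 1 < service_count {
                state.selected += 1;
            }
        }
        // saturating_sub keeps the selection at row 0 when already at the top.
        Key::Up | Key::Char('k') => state.selected = state.selected.saturating_sub(1),
        Key::Enter => state.show_detail = !state.show_detail,
        Key::Char('h') => state.show_help = !state.show_help,
        _ => {}
    }
}
```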

Health Status Levels

| Status | Color | Description |
|--------|-------|-------------|
| Healthy | Green | Service responded successfully within timeout |
| Degraded | Yellow | Service responded but reported non-"ok" status or high latency |
| Unhealthy | Red | Service timed out, returned an error HTTP status, or is unreachable |
| Unknown | Gray | Service has not been checked yet (initial state) |
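
The rules in the table can be read as a classification function. The following is a minimal sketch under stated assumptions: the `Outcome` type is hypothetical, any non-2xx response is treated as an error HTTP status, and the 1000 ms "high latency" threshold is a guess, since the actual cutoff is not documented here.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Healthy,
    Degraded,
    Unhealthy,
    Unknown,
}

/// Outcome of one probe (hypothetical shape, for illustration only).
enum Outcome<'a> {
    NotCheckedYet,
    TimedOut,
    Unreachable,
    Responded { http_status: u16, reported: &'a str, latency: Duration },
}

/// Assumed latency threshold; the real cutoff is not documented here.
const SLOW_THRESHOLD: Duration = Duration::from_millis(1000);

/// Maps a probe outcome to one of the four status levels.
fn classify(outcome: &Outcome<'_>) -> Status {
    match outcome {
        Outcome::NotCheckedYet => Status::Unknown,
        Outcome::TimedOut | Outcome::Unreachable => Status::Unhealthy,
        // Assumption: anything outside 2xx counts as an error HTTP status.
        Outcome::Responded { http_status, .. } if !(200..300).contains(http_status) => {
            Status::Unhealthy
        }
        Outcome::Responded { reported, latency, .. } => {
            if *reported != "ok" || *latency > SLOW_THRESHOLD {
                Status::Degraded
            } else {
                Status::Healthy
            }
        }
    }
}
```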

Exit Codes (--check mode)

| Exit Code | Meaning |
|-----------|---------|
| 0 | All services report Healthy |
| 1 | One or more services are not Healthy |
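
Expressed as code, the exit-code rule is a single all-Healthy check. The sketch below uses hypothetical names (`exit_code`, `summary`) and also shows a summary line in the style of the TUI header; the exact text printed in --check mode is not specified here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Healthy,
    Degraded,
    Unhealthy,
    Unknown,
}

/// Maps one check cycle to an exit code: 0 only if every service is Healthy.
/// Note: an empty list yields 0, because there is nothing unhealthy to report.
fn exit_code(statuses: &[Status]) -> i32 {
    if statuses.iter().all(|s| *s == Status::Healthy) {
        0
    } else {
        1
    }
}

/// Summary line in the style of the TUI header, e.g. "3/5 SERVICES HEALTHY".
fn summary(statuses: &[Status]) -> String {
    let healthy = statuses.iter().filter(|s| **s == Status::Healthy).count();
    format!("{healthy}/{} SERVICES HEALTHY", statuses.len())
}
```

A binary would typically finish with `std::process::exit(exit_code(&statuses))`, so that Degraded, Unhealthy, and Unknown all fail the gate.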

Monitored Services

| Service | Default URL | Health Endpoint | Protocol |
|---------|-------------|-----------------|----------|
| coordination-hce | http://localhost:5000 | /health | HTTP GET |
| infrastructure-api | http://localhost:8080 | /health | HTTP GET |
| intelligence-pdie | http://localhost:8000 | /health | HTTP GET |
| neo-n3-rpc | http://localhost:20332 | / | JSON-RPC POST (getversion) |
| ipfs-gateway | http://localhost:8081 | /api/v0/version | HTTP GET |

All requests use a 5-second timeout. Services that do not respond within the timeout are marked Unhealthy.
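
The combination of concurrent polling and a timeout can be illustrated without any dependencies. The crate's dependency list indicates it uses tokio and futures join_all for this; the sketch below swaps in standard-library threads and a channel so it runs on its own. The `poll_all` function and the `Probe` alias are hypothetical, and the sketch applies a single overall deadline instead of a per-request timeout.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// A probe is any one-shot closure that succeeds or returns an error message.
type Probe = Box<dyn FnOnce() -> Result<(), String> + Send + 'static>;
/// Ok(latency) on success, Err(message) on failure or timeout.
type ProbeResult = Result<Duration, String>;

/// Runs every probe on its own thread and waits at most `timeout` overall.
/// Probes that have not reported by the deadline keep the "Timeout" error.
fn poll_all(probes: Vec<(String, Probe)>, timeout: Duration) -> Vec<(String, ProbeResult)> {
    let (tx, rx) = mpsc::channel();
    let mut results = Vec::with_capacity(probes.len());
    for (index, (name, probe)) in probes.into_iter().enumerate() {
        // Pessimistic default, overwritten if the probe reports in time.
        results.push((name, Err("Timeout".to_string())));
        let tx = tx.clone();
        thread::spawn(move || {
            let started = Instant::now();
            let outcome = probe().map(|()| started.elapsed());
            // The receiver may be gone after a timeout; ignore send errors.
            let _ = tx.send((index, outcome));
        });
    }
    drop(tx); // lets `rx` disconnect once every worker has finished

    let deadline = Instant::now() + timeout;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok((index, outcome)) => results[index].1 = outcome,
            Err(_) => break, // deadline reached or all workers done
        }
    }
    results
}
```

Because all probes start together, one slow service delays the cycle by at most the timeout and does not hold up the results of the others.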

Environment Variables

Service URLs can be overridden via environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| HCE_URL | http://localhost:5000 | Base URL for the coordination-hce service |
| INFRA_API_URL | http://localhost:8080 | Base URL for the infrastructure-api service |
| PDIE_URL | http://localhost:8000 | Base URL for the intelligence-pdie service |
| NEO_RPC_URL | http://localhost:20332 | URL for the Neo N3 RPC node |
| IPFS_URL | http://localhost:8081 | Base URL for the IPFS gateway |

# Example: point at a remote staging cluster
HCE_URL=http://staging.internal:5000 \
INFRA_API_URL=http://staging.internal:8080 \
resq-health --check
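
The override behaviour can be sketched as a lookup with a fallback. The names below (`DEFAULTS`, `resolve_url`, `resolve_all`) are hypothetical, and treating an empty variable the same as an unset one is an assumption, not documented behaviour.

```rust
use std::env;

/// (variable name, default URL) pairs taken from the table above.
const DEFAULTS: [(&str, &str); 5] = [
    ("HCE_URL", "http://localhost:5000"),
    ("INFRA_API_URL", "http://localhost:8080"),
    ("PDIE_URL", "http://localhost:8000"),
    ("NEO_RPC_URL", "http://localhost:20332"),
    ("IPFS_URL", "http://localhost:8081"),
];

/// Resolves one URL via `lookup`, falling back to `default`.
/// Assumption: an empty or whitespace-only value counts as unset.
fn resolve_url(var: &str, default: &str, lookup: impl Fn(&str) -> Option<String>) -> String {
    lookup(var)
        .filter(|value| !value.trim().is_empty())
        .unwrap_or_else(|| default.to_string())
}

/// Resolves all service URLs from the process environment.
fn resolve_all() -> Vec<(&'static str, String)> {
    DEFAULTS
        .iter()
        .map(|(var, default)| (*var, resolve_url(var, default, |name| env::var(name).ok())))
        .collect()
}
```

Passing the lookup as a closure keeps `resolve_url` testable without mutating the real process environment.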

Configuration

resq-health does not use configuration files. All configuration is handled through CLI flags and environment variables as described above. To add a new service to the monitoring list, add it to the ServiceRegistry::new() function in src/services.rs.

Dependencies

| Crate | Purpose |
|-------|---------|
| resq-tui | Shared TUI components, theme, header/footer/popup widgets |
| clap | CLI argument parsing (derive mode) |
| tokio | Async runtime with timer support |
| reqwest | HTTP client for health check requests |
| futures | Concurrent polling via join_all |
| serde / serde_json | JSON deserialization of health responses |
| chrono | Timestamp handling |
| walkdir | Directory traversal for integration test discovery |
| anyhow | Error propagation |

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.