food-market/scripts/quality-dashboard.py
nns 019c57ae3b
feat(s25): autonomous continuous quality monitoring (8/8)
Hourly smoke watchdog + auto-fix loop + dashboard + multi-tenant guard
+ perf regression + cleanup job + README badge.

1. ~/quality-watchdog.sh (cron 5 * * * *) — 8 checks (~60s):
   /health/ready, signup→login→/api/me, GET products, Playwright UI
   smoke (3.1 product CRUD), /metrics format, /hubs/notifications
   negotiate with token, multi-tenant isolation, perf p95.
2. Auto-fix loop: 2× consecutive red → ~/.fm-watchdog/incident-*.txt
   + queue/0000-incident-* to bump it ahead of Server-Claude's
   sprint queue. fm-watchdog.sh sees prefix 0000- as next.
3. scripts/quality-dashboard.py — renders docs/quality-status.md
   (current emoji, 8-step table, perf baseline, 7-day history,
   24-run sparkline) + injects README badge 🟢/🟡/🔴.
4. Multi-tenant smoke: signup 2 orgs `quality-{epoch}-A/B`, create
   product in A, verify B sees 404/403 + total=0.
5. Perf regression: p95 over 10 reqs for /api/me, products,
   sales/retail/stats. Baseline = median of last 10 samples
   (robust to noise). >50% from baseline → alert. First 5 runs
   always green (warm-up).
6. HousekeepingJobs.PruneQualityTestOrgsAsync (cron 30 2 * * * UTC):
   finds orgs `quality-%` older than 24h, dynamically scans
   information_schema for tables with OrganizationId, iteratively
   DELETEs with FK-violation retry (up to 10 passes), then cleans
   AspNetUser*/OpenIddict* by email pattern `quality-%@test-fm.local`,
   finally users + organizations.
7. README badge: <!-- quality-badge --> marker updated each run.
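
The perf rule in item 5 could be sketched roughly as follows. This is a minimal illustration, not the watchdog's actual code: the function names, the nearest-rank percentile method, and the reading of ">50% from baseline" as "more than 50% above baseline" are all assumptions.

```python
import math
from statistics import median

WARMUP_RUNS = 5    # from item 5: the first 5 runs are always green
WINDOW = 10        # from item 5: baseline = median of the last 10 samples
THRESHOLD = 0.50   # from item 5: alert when p95 is >50% off the baseline


def p95(latencies_ms: list[float]) -> float:
    """Nearest-rank 95th percentile (assumed method; the real script may differ)."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return ordered[rank - 1]


def is_regression(samples: list[float], current_p95: float) -> bool:
    """samples holds earlier p95 values for one endpoint, oldest first."""
    if len(samples) < WARMUP_RUNS:
        return False  # warm-up: not enough history to judge
    baseline = median(samples[-WINDOW:])
    # Assumption: only slowdowns count as a regression.
    return current_p95 > baseline * (1 + THRESHOLD)
```

Using the median rather than the mean means a single slow outlier among the last 10 samples does not move the baseline, which matches the "robust to noise" remark.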

Validated: stage deploy ✓, Hangfire job registered ✓, dry-run SQL on
24 stage candidates → 0 remaining ✓, 3 cron-triggered runs all 8/8
green (12:42/12:45/12:48 +05) ✓.
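
The "2× consecutive red" rule from item 2 could be tracked along these lines. The per-step fields `consecutive_fail`, `last_green` and `last_red` are the ones the dashboard script reads from `quality-state.json`; the function name, its signature, and the choice to report an incident only at the moment the counter reaches 2 are hypothetical.

```python
from __future__ import annotations

INCIDENT_THRESHOLD = 2  # item 2: two consecutive red runs open an incident


def update_state(state: dict, green: list[str], red: list[str], ts: str) -> list[str]:
    """Update per-step counters in place.

    Returns the steps whose failure streak has just reached the threshold,
    so the caller can write an incident file for each of them.
    """
    incidents: list[str] = []
    for step in green:
        entry = state.setdefault(step, {})
        entry["consecutive_fail"] = 0  # a green run resets the streak
        entry["last_green"] = ts
    for step in red:
        entry = state.setdefault(step, {})
        entry["consecutive_fail"] = int(entry.get("consecutive_fail", 0)) + 1
        entry["last_red"] = ts
        if entry["consecutive_fail"] == INCIDENT_THRESHOLD:
            incidents.append(step)
    return incidents
```

Reporting only on the transition to 2 avoids opening a new incident on every later red run of the same streak; whether the real watchdog deduplicates this way is not stated in the commit message.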

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-08 12:50:35 +05:00

240 lines
8.5 KiB
Python
Executable file

#!/usr/bin/env python3
"""
Sprint 25: renders docs/quality-status.md from ~/.fm-watchdog/quality-history.jsonl.
Invoked by ~/quality-watchdog.sh after every run.
Also updates the status badge in README.md (🟢/🟡/🔴).

Usage: python3 scripts/quality-dashboard.py
"""
from __future__ import annotations
import json
import os
import re
import sys
from datetime import datetime, timezone, timedelta
from pathlib import Path
REPO = Path(__file__).resolve().parent.parent
HISTORY = Path.home() / ".fm-watchdog" / "quality-history.jsonl"
STATE = Path.home() / ".fm-watchdog" / "quality-state.json"
BASELINE = Path.home() / ".fm-watchdog" / "quality-perf-baseline.json"
DASHBOARD = REPO / "docs" / "quality-status.md"
README = REPO / "README.md"

STEP_NAMES = {
    "health": "/health/ready",
    "auth_me": "signup→login→/api/me",
    "products": "GET /api/catalog/products",
    "ui_flow": "Playwright UI (product CRUD)",
    "metrics": "/metrics (Prometheus)",
    "signalr": "/hubs/notifications/negotiate",
    "multi_tenant": "Multi-tenant isolation",
    "perf": "Performance p95 vs baseline",
}


def load_history() -> list[dict]:
    if not HISTORY.exists():
        return []
    out = []
    for line in HISTORY.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            out.append(json.loads(line))
        except Exception:
            continue
    return out


def load_state() -> dict:
    if not STATE.exists():
        return {}
    try:
        return json.loads(STATE.read_text())
    except Exception:
        return {}


def load_baseline() -> dict:
    """Return a dict {endpoint: median_ms} from the new samples/median format."""
    if not BASELINE.exists():
        return {}
    try:
        raw = json.loads(BASELINE.read_text())
    except Exception:
        return {}
    if "median" in raw:
        return raw["median"]
    # Backward compatibility with the old format (a plain {key: value} dict).
    return {k: v for k, v in raw.items() if isinstance(v, (int, float))}


def parse_ts(s: str) -> datetime:
    # Python < 3.11: datetime.fromisoformat rejects a trailing "Z", so normalize it to +00:00.
    s = s.replace("Z", "+00:00")
    try:
        return datetime.fromisoformat(s)
    except Exception:
        # Unparseable timestamps fall back to the current time.
        return datetime.now(timezone.utc)


def determine_color(history: list[dict]) -> str:
    """🟢 if the last run is all green; 🟡 if it has red steps but none has
    failed twice in a row; 🔴 if any red step has 2+ consecutive failures
    (incident level). An empty history also yields 🟡."""
    if not history:
        return "🟡"
    last = history[-1]
    if not last.get("red"):
        return "🟢"
    # Read consecutive_fail for each red step from the state file.
    state = load_state()
    for step in last.get("red", []):
        if int(state.get(step, {}).get("consecutive_fail", 0)) >= 2:
            return "🔴"
    return "🟡"


def render_dashboard(history: list[dict], state: dict, baseline: dict) -> str:
    last = history[-1] if history else None
    color = determine_color(history)
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    out = ["# Quality status", "", f"_Updated: {now} · auto-generated from `~/quality-watchdog.sh`_", ""]
    out.append(f"## {color} Current status")
    out.append("")
    if last:
        # Two trailing spaces force a Markdown line break.
        out.append(f"**Last run:** `{last['ts']}`  ")
        out.append(f"**Green steps:** {len(last.get('green', []))}/{len(STEP_NAMES)}  ")
        out.append(f"**Red steps:** {len(last.get('red', []))}  ")
    else:
        out.append("_No history yet: quality-watchdog has not been run by cron._")
    out.append("")
    # Step-by-step table.
    out.append("## Smoke-suite steps")
    out.append("")
    out.append("| Step | Status | Last change | Consecutive fail |")
    out.append("|---|---|---|---|")
    last_green = set(last.get("green", [])) if last else set()
    last_red = set(last.get("red", [])) if last else set()
    for key, label in STEP_NAMES.items():
        if key in last_green:
            icon = "🟢"
        elif key in last_red:
            icon = "🔴"
        else:
            icon = ""
        s = state.get(key, {})
        # Green steps show when they last passed; everything else shows the last failure.
        recent = s.get("last_green") if icon == "🟢" else s.get("last_red", "")
        cf = s.get("consecutive_fail", 0)
        out.append(f"| {label} | {icon} | `{recent or ''}` | {cf} |")
    out.append("")
    # Failure details for the current run.
    if last and last.get("details"):
        out.append("## Failure details (last run)")
        out.append("")
        for d in last["details"]:
            out.append(f"- `{d}`")
        out.append("")
    # Performance baseline.
    if baseline:
        out.append("## Performance baseline (p95, ms)")
        out.append("")
        out.append("| Endpoint | p95 (ms) |")
        out.append("|---|---|")
        for k, v in sorted(baseline.items()):
            # _api_me → /api/me. A rough reconstruction of the readable name.
            pretty = k.replace("_", "/")
            out.append(f"| `{pretty}` | {v} |")
        out.append("")
        out.append("_Regression = current p95 more than 50% above baseline. The baseline is updated only when there is no regression (median of the last 10 samples)._")
        out.append("")
    # History for the last week.
    week_ago = datetime.now(timezone.utc) - timedelta(days=7)
    week_runs = []
    for h in history:
        ts = parse_ts(h["ts"])
        if ts.tzinfo is None:
            # Naive timestamps are treated as UTC.
            ts = ts.replace(tzinfo=timezone.utc)
        if ts >= week_ago:
            week_runs.append(h)
    out.append("## Last 7 days")
    out.append("")
    if not week_runs:
        out.append("_No runs in the last week._")
    else:
        red_runs = [h for h in week_runs if h.get("red")]
        out.append(f"**Runs:** {len(week_runs)}  ")
        out.append(f"**With red:** {len(red_runs)}  ")
        out.append(f"**Green ratio:** {((len(week_runs)-len(red_runs))*100)//max(1,len(week_runs))}%  ")
        out.append("")
        if red_runs:
            out.append("### Runs with a red step")
            out.append("")
            out.append("| Time | Red steps |")
            out.append("|---|---|")
            # Show at most the 20 most recent red runs.
            for h in red_runs[-20:]:
                out.append(f"| `{h['ts']}` | {', '.join(h.get('red', []))} |")
    out.append("")
    # The last 24 runs as a sparkline.
    out.append("## Last 24 runs")
    out.append("")
    spark = "".join("🟢" if not h.get("red") else "🔴" for h in history[-24:])
    out.append(f"`{spark or '(no data)'}`")
    out.append("")
    out.append("---")
    out.append("")
    out.append("Script: `~/quality-watchdog.sh` (cron `0 * * * *`).  ")
    out.append("Source: `~/.fm-watchdog/quality-history.jsonl`.  ")
    out.append("Sprint 25: autonomous continuous quality monitoring.")
    out.append("")
    return "\n".join(out)


BADGE_RE = re.compile(r"<!-- quality-badge -->.*?<!-- /quality-badge -->", re.S)


def update_readme_badge(color: str) -> None:
    if not README.exists():
        return
    txt = README.read_text()
    badge = f"<!-- quality-badge --> {color} **Quality:** [`docs/quality-status.md`](docs/quality-status.md) <!-- /quality-badge -->"
    if BADGE_RE.search(txt):
        new_txt = BADGE_RE.sub(badge, txt)
    else:
        # Insert the badge after the first top-level heading.
        lines = txt.splitlines()
        for i, ln in enumerate(lines):
            if ln.startswith("# "):
                lines.insert(i + 1, "")
                lines.insert(i + 2, badge)
                break
        new_txt = "\n".join(lines)
        # splitlines() drops the final newline; restore it so an unchanged
        # README is not rewritten.
        if txt.endswith("\n"):
            new_txt += "\n"
    if new_txt != txt:
        README.write_text(new_txt)


def main() -> int:
    history = load_history()
    state = load_state()
    baseline = load_baseline()
    DASHBOARD.parent.mkdir(parents=True, exist_ok=True)
    DASHBOARD.write_text(render_dashboard(history, state, baseline))
    color = determine_color(history)
    update_readme_badge(color)
    print(f"dashboard updated: {DASHBOARD} (status={color})")
    return 0


if __name__ == "__main__":
    sys.exit(main())
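
The script never spells out the schema of the three files it reads, so the sketch below shows the shapes its field accesses seem to expect and how they map to the badge colour. All values are invented for illustration, and `badge_color` is a hypothetical standalone restatement of `determine_color` that takes the state as an argument instead of reading it from disk.

```python
import json

# One line of quality-history.jsonl: the script reads "ts", "green", "red", "details".
history_line = json.dumps({
    "ts": "2026-06-08T07:42:00Z",
    "green": ["health", "auth_me", "products", "ui_flow",
              "metrics", "signalr", "multi_tenant"],
    "red": ["perf"],
    "details": ["perf: p95 above baseline (illustrative message)"],
})

# quality-state.json: one entry per step key.
state = {
    "perf": {
        "consecutive_fail": 2,
        "last_green": "2026-06-08T05:42:00Z",
        "last_red": "2026-06-08T07:42:00Z",
    }
}

# quality-perf-baseline.json, new format: medians live under "median".
baseline = {"median": {"_api_me": 42}}


def badge_color(last_run, state):
    """Same decision as determine_color, with the state passed in explicitly."""
    if last_run is None:
        return "🟡"  # no history yet
    if not last_run.get("red"):
        return "🟢"
    for step in last_run["red"]:
        if int(state.get(step, {}).get("consecutive_fail", 0)) >= 2:
            return "🔴"  # incident level
    return "🟡"


last_run = json.loads(history_line)
print(badge_color(last_run, state))
```

With the sample state above the `perf` step has failed twice in a row, so the call reports the incident-level colour; lowering `consecutive_fail` to 1 would downgrade it to yellow.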