Commit graph

8 commits

Author SHA1 Message Date
nurdotnet a2fa311a5d ci(docker): add retries for login and push on flaky network
Our upstream is dropping TCP SYNs to github.com/ghcr.io often enough
that single docker login/push attempts time out. Wrap in a 5-attempt
retry loop with 15s backoff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 00:49:33 +05:00
nurdotnet a17ca1b90c ci(docker): drop docker/login-action and build-push-action
These actions' tarballs are downloaded from api.github.com, and downloads
from our runner's network intermittently time out past the 100s
HttpClient limit. The job then fails after 3 retries. Replace them with
plain docker CLI commands: system docker already has buildx (via apt)
and can login + push to ghcr.io directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 00:06:29 +05:00
nurdotnet 29cefb64be ci(backend): map postgres service to host:5441 instead of 5432
The self-hosted runner host already has brew postgres@14 listening on
127.0.0.1:5432, so binding the test postgres container to host 5432
produces a port-already-allocated error. Pick 5441 and update the
connection string accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:55:12 +05:00
nurdotnet 8ac9e04bcf ci(docker): drop setup-buildx-action — use system buildx on self-hosted
docker/setup-buildx-action pulls the buildx binary from
release-assets.githubusercontent.com, which is unreachable from our
network (TLS handshake never completes — SNI-level block from upstream,
same host works for objects.githubusercontent.com). The self-hosted
runner already has docker-buildx 0.30.1 installed via apt, so
build-push-action can use it directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:44:11 +05:00
nurdotnet 5dce324f24 ci: move Linux jobs (backend, web, docker api/web) to self-hosted runner
POS stays on windows-latest (tag/manual only). Runner is registered on
the stage server, systemd-managed, labels [self-hosted, Linux, X64].
Goal: drop dependency on the 2000 GitHub-hosted minute quota — Windows
POS build now runs at most once per release tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:17:52 +05:00
nurdotnet 75d73b9dcd ci/deploy: stage deploy workflow + notifications + server plan
.github/workflows/deploy-stage.yml:
- Triggers on successful "Docker Images" workflow or manual dispatch.
- SSHes to stage server via STAGE_SSH_KEY, copies deploy/docker-compose.yml
  and nginx.conf, writes .env with current SHA + POSTGRES_PASSWORD.
- `docker compose pull && up -d --remove-orphans`.
- Smoke-tests /health with 5 retries (5s each).
- Pings Telegram on success/failure with commit SHA + stage URL.

.github/workflows/notify.yml:
- Separate workflow_run listener for CI/Docker failures, sends Telegram
  message with link to the failed run.

deploy/docker-compose.yml port remap (stage server already uses 80/443/5000/5432):
- API: 8080 (was 8080, confirmed free)
- Web: 8081 (was 80 — taken by legacy nginx)
- Postgres: 127.0.0.1:5434 (was 5433 — and now localhost-only, safer)

docs/stage-setup.md — one-time server setup runbook:
- Verified specs: Ubuntu 24.04, 4 CPU, 15 GB RAM, 4 GB free disk (tight).
- Step 1: `sudo usermod -aG docker nns` so deploy doesn't need sudo.
- Step 2: generate STAGE_POSTGRES_PASSWORD secret via `openssl rand`.
- Step 3: port-conflict check.
- Step 4: first manual deploy via gh workflow run.
- Disk-usage monitoring via cron → Telegram when >85%.

Secrets now in repo:
  TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID,
  STAGE_SSH_HOST, STAGE_SSH_PORT, STAGE_SSH_USER, STAGE_SSH_KEY
Still needed from user: STAGE_POSTGRES_PASSWORD (one openssl command).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:46:03 +05:00
nurdotnet fa2fae9503 ci: move POS (Windows, 2x multiplier) to tag/manual only; document budget
Each push previously burned ~21 billable GitHub Actions minutes because the
Windows POS build cost 10 (5 real × 2x Windows multiplier). That gives us
~95 pushes/month on the 2000-minute free tier — too tight for active dev.

- POS job now gates on `startsWith(github.ref, 'refs/tags/v')` OR
  workflow_dispatch. Every-commit CI stays Linux-only.
- CI trigger adds `tags: ['v*']` and workflow_dispatch so releases can build
  the .exe on demand.
- docs/24x7.md: new table with per-job minute/multiplier breakdown and the
  break-even point where a self-hosted runner becomes cheaper (~200 commits/mo).

Post-change estimate: ~11 billable min/commit → fits 180 commits/month in
the free tier. Windows minutes only spent when tagging a release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 11:36:29 +05:00
nurdotnet 5bcbff66de ci/deploy: GitHub Actions + Docker images + DB backup + 24x7 plan
.github/workflows/ci.yml — on push/PR:
  - backend job: dotnet restore/build/test with a live postgres service
  - web job: pnpm install + vite build + tsc, uploads dist artifact
  - pos job: windows-latest, dotnet publish self-contained win-x64
    single-file exe as artifact

.github/workflows/docker.yml — on push to main (if src changed) or manual:
  - api image → ghcr.io/nurdotnet/food-market-api:{latest,sha}
  - web image → ghcr.io/nurdotnet/food-market-web:{latest,sha}
  - uses buildx + GHA cache

deploy/Dockerfile.api — multi-stage (.NET 8 sdk → aspnet runtime),
  healthcheck on /health, App_Data + logs volumes mounted.

deploy/Dockerfile.web — node20 build → nginx 1.27 runtime; ships the
  Vite dist + nginx.conf that proxies /api, /connect, /health to api
  service and serves the SPA with fallback to index.html.

deploy/nginx.conf — SPA + API reverse-proxy configuration.

deploy/docker-compose.yml — production-shape stack: postgres 16 +
  api (from ghcr image) + web (from ghcr image), named volumes, env-
  driven tags so stage/prod can pin specific SHAs.

deploy/backup.sh — pg_dump wrapper with 3 modes: local (brew
  postgres), --docker (compose container), --remote HOST:PORT. Writes
  gzipped dumps to ~/food-market-backups, 30-day retention.

docs/24x7.md — explains where Claude/CI/stage live, which pieces
  depend on the Mac, and the exact steps to hand off secrets via
  ~/.food-market-secrets/ so I can push them into GitHub Secrets.

Next, once user supplies Proxmox + FTP + Telegram creds: stage deploy
workflow, notification workflow, and (optional) claude-runner VM so
I no longer depend on the Mac being awake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 11:26:01 +05:00