Monitoring & Security (Reliability, Scaling, Guardrails)

Ensure the Vespa + FastAPI + OpenAI stack runs safely, predictably, and transparently once deployed beyond localhost. This page summarizes how the live system is observed, protected, and governed.

GKE · Grafana / Prometheus · Cloud Logging · Cloud Armor · JWT + OIDC · Secret Manager

Infrastructure Overview

Where everything runs

Kubernetes (GKE)

Autoscaled serving + labeling

Vespa content + container nodes
FastAPI pods (search, labeling, demos)
Cloud SQL sidecars for Postgres

Networking

Ingress + Firewall

HTTPS Ingress (Managed Cert)
Cloud Armor WAF / rules
Internal Services for Vespa/DB

Storage & Logs

Durable + auditable

GCS for datasets & artifacts
Cloud Logging (formerly Stackdriver)
Object versioning + backups

Monitoring / Dashboards

Real-time visibility

Prometheus scrape → Grafana
Alert Policies → Slack/PagerDuty
Cloud Logging → error analytics

Identity / Access

Zero-trust model

Service Accounts (least privilege)
API tokens per role
Admin via OIDC + 2FA

Pod Health & Performance

Latency, throughput, and autoscaling

Alerting

Cluster Summary: CPU/Memory per nodepool, pod status.
Vespa Query Latency: p50/p95/p99 by rank profile.
Feed Throughput: docs/sec + retries per content node.
Autoscaler Events: node adds/removes vs load.
FastAPI: requests/sec + error rate by endpoint.

Alerts: p95 > 500 ms, restarts > 3/10 min, CPU > 85% sustained.
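
The thresholds above can be read as a small rule check. This is a minimal sketch, not the deployed alert policy: the function names, the nearest-rank percentile method, and the reading of "sustained" as "every sample in the window is above 85%" are all assumptions made for illustration.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (one of several common definitions)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def fired_alerts(
    latencies_ms: Sequence[float],
    restart_times_s: Sequence[float],
    cpu_samples: Sequence[float],
    now_s: float,
) -> list[str]:
    """Return the names of the alert rules that currently fire."""
    alerts = []
    # Rule 1: p95 query latency above 500 ms.
    if latencies_ms and percentile(latencies_ms, 95) > 500:
        alerts.append("p95_latency")
    # Rule 2: more than 3 restarts in the last 10 minutes (600 s).
    recent = [t for t in restart_times_s if 0 <= now_s - t <= 600]
    if len(recent) > 3:
        alerts.append("pod_restarts")
    # Rule 3 (assumed meaning of "sustained"): all CPU samples exceed 85%.
    if cpu_samples and all(c > 0.85 for c in cpu_samples):
        alerts.append("cpu_sustained")
    return alerts
```

In a Prometheus setup these rules would normally live in alerting rule files rather than application code; the sketch only makes the thresholds concrete.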

Grafana (live)

Add NEXT_PUBLIC_GRAFANA_URL to embed

Provide a public or auth-proxied Grafana URL via NEXT_PUBLIC_GRAFANA_URL to render an embed here.

Security Controls

Perimeter → AuthZ → Data privacy

1) Perimeter

Cloud Armor WAF (IP reputation, geo)
Rate limit (e.g., 3 req/min per token)
hCaptcha/reCAPTCHA on public forms
HTTPS-only via Managed Cert
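
The per-token rate limit above could be sketched as a sliding-window counter. This is an illustration under stated assumptions: `SlidingWindowLimiter` is a hypothetical class, state is held in process memory (a multi-pod deployment would need a shared store), and the clock is passed in explicitly to keep the example deterministic.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per token within a rolling window."""

    def __init__(self, max_requests: int = 3, window_s: float = 60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self._hits = defaultdict(deque)  # token -> timestamps of accepted requests

    def allow(self, token: str, now_s: float) -> bool:
        hits = self._hits[token]
        # Drop timestamps that have aged out of the window.
        while hits and now_s - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # caller would typically respond with HTTP 429
        hits.append(now_s)
        return True
```

With the defaults, a token's fourth request inside 60 seconds is rejected, while other tokens are unaffected.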

2) AuthN/AuthZ

JWT bearer tokens for raters/admins
Anonymous read endpoints w/ caps
Admin via GCP OIDC + 2FA
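
To make the bearer-token check concrete, here is a minimal sketch of verifying a JWT and its role claim using only the standard library. It assumes HS256 signing with a shared secret and `role` / `exp` claim names; the page does not state which algorithm or claims are used, and a real deployment would more likely rely on a maintained JWT library and the OIDC provider's public keys.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # JWTs strip base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT (used here only to produce test tokens)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now_s: float, required_role: str) -> dict:
    """Return the claims if signature, expiry, and role all check out."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise PermissionError("malformed token") from exc
    # Pin the algorithm so a token cannot downgrade itself (e.g. alg "none").
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise PermissionError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= now_s:
        raise PermissionError("token expired")
    if claims.get("role") != required_role:
        raise PermissionError("insufficient role")
    return claims
```

The signature is checked before the payload is parsed, so unauthenticated input never reaches the claim logic.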

3) Data Integrity & Privacy

PII-stripped logs, hashed rater IDs
Nightly GCS backups + versioning
Secrets in Secret Manager
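
One way to get PII-stripped logs and hashed rater IDs is sketched below. The regular expressions, the placeholder strings, and the use of a keyed hash (HMAC-SHA256 with a secret "pepper", truncated to 16 hex characters) are assumptions for illustration; the page does not describe the actual scrubbing rules or hashing scheme.

```python
import hashlib
import hmac
import re

# Deliberately simple patterns; real scrubbing rules would be broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def hash_rater_id(rater_id: str, pepper: bytes) -> str:
    """Stable pseudonym for a rater; the pepper would come from Secret Manager."""
    digest = hmac.new(pepper, rater_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def scrub(message: str) -> str:
    """Replace email addresses and IPv4 addresses before a line is logged."""
    message = EMAIL_RE.sub("[email]", message)
    return IPV4_RE.sub("[ip]", message)
```

A keyed hash is used instead of a bare SHA-256 so that pseudonyms cannot be reversed by hashing a list of known rater IDs without the secret.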

AI Use Guardrails

Transparency, reversibility, bias checks

Transparency & Opt-Out

GPT models assist with auto-labeling and embeddings; no personal data is processed or stored. Users may pause auto-labeling at any time and label manually.

Ethical Guidelines

Explainability (“Show relevance math” tooltips)
Reversibility (versioned, revertible adjustments)
Non-manipulative UX (clear consent)
Bias checks (weekly category imbalance scan)
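
The weekly category imbalance scan could be as simple as comparing the most and least frequent label. This is a sketch under assumptions: `imbalance_report` and the default `max_ratio` of 3.0 are hypothetical, and the real scan may use a different statistic.

```python
from collections import Counter
from typing import Iterable


def imbalance_report(labels: Iterable[str], max_ratio: float = 3.0) -> dict:
    """Flag a label set when the largest category dwarfs the smallest."""
    counts = Counter(labels)
    if not counts:
        return {"counts": {}, "ratio": None, "flagged": False}
    # Ratio of most common to least common category (1.0 means balanced).
    ratio = max(counts.values()) / min(counts.values())
    return {"counts": dict(counts), "ratio": ratio, "flagged": ratio > max_ratio}
```

Note that a category with zero labels never appears in the counts, so a production scan would also compare against the expected category list.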

Scaling Reliability & Failover

How we grow and recover

Component	Scaling	Recovery
Vespa content nodes	HPA (1–5) + replica sync	Warm replica failover
FastAPI containers	Autoscale (GKE/Cloud Run)	0→N cold start
Cloud SQL (Postgres)	Managed HA	Point-in-time restore
GCS buckets	Multi-region	Immutable history
Grafana/Prometheus	StatefulSet + PVC	Snapshot restore job

Load test (Locust) sustained ~3,000 req/min under default limits; bursts absorbed by queue buffering.

Bot Detection & Abuse Mitigation

Keep signals human and trustworthy

Behavior heuristics (keypress timing entropy)
Consensus API: ≤1 vote/sec per session
Honeytoken queries as controls
Violations → token suspension + Cloud Armor quarantine
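
Two of the signals above can be sketched in a few lines. The bin width, the function names, and the reading of "≤1 vote/sec" as "at least one second between consecutive votes" are assumptions; the page does not specify how entropy is computed or what threshold triggers a violation.

```python
import math
from collections import Counter
from typing import Sequence


def timing_entropy_bits(intervals_ms: Sequence[float], bin_ms: float = 20.0) -> float:
    """Shannon entropy of inter-key intervals after fixed-width binning.

    Scripted input tends to produce near-identical intervals (low entropy);
    human typing is usually more varied.
    """
    if not intervals_ms:
        return 0.0
    bins = Counter(int(i // bin_ms) for i in intervals_ms)
    total = len(intervals_ms)
    return -sum((n / total) * math.log2(n / total) for n in bins.values())


def votes_within_limit(vote_times_s: Sequence[float], min_gap_s: float = 1.0) -> bool:
    """True if no two consecutive votes in a session are closer than min_gap_s."""
    ordered = sorted(vote_times_s)
    return all(b - a >= min_gap_s for a, b in zip(ordered, ordered[1:]))
```

Perfectly regular keypresses score 0 bits, while intervals spread evenly over four bins score 2 bits. Low entropy alone is a weak signal, which is presumably why it is combined with honeytoken queries and rate checks.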

Compliance & Logging

Retention, export, verification

GDPR/CCPA disclosure + data export endpoint
30-day user-interaction logs; aggregate metrics kept indefinitely
Access logs SHA-256 signed and verified hourly
Periodic vulnerability scans via Web Security Scanner (formerly Cloud Security Scanner)
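
The signed, hourly-verified access logs can be illustrated with a keyed hash chain. SHA-256 on its own is a hash rather than a signature, so this sketch assumes HMAC-SHA256 with a secret key, and chains each tag to the previous one so that edits, deletions, and reordering are all detectable. The function names and the chaining design are assumptions, not the documented implementation.

```python
import hashlib
import hmac
from typing import Iterable, Sequence


def sign_log(lines: Iterable[str], key: bytes) -> list[tuple[str, str]]:
    """Attach to each line an HMAC-SHA256 tag that also covers the previous tag."""
    signed, prev = [], b""
    for line in lines:
        tag = hmac.new(key, prev + line.encode("utf-8"), hashlib.sha256).hexdigest()
        signed.append((line, tag))
        prev = tag.encode("ascii")
    return signed


def verify_log(signed: Sequence[tuple[str, str]], key: bytes) -> bool:
    """Recompute the chain; any edited, dropped, or reordered entry breaks it."""
    prev = b""
    for line, tag in signed:
        expected = hmac.new(key, prev + line.encode("utf-8"), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, tag):
            return False
        prev = tag.encode("ascii")
    return True
```

An hourly job would run `verify_log` over the last hour's entries and raise an alert on `False`.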

Grafana Embed

Pod CPU + Query latency

Firewall & Request Path

Browser → Ingress → Cloud Armor → FastAPI → Vespa

Bot Detection Flow

Ethical AI Statement

“Our system amplifies human expertise rather than replacing it.” We keep users in the loop, show how scores are computed, and allow automated decisions to be reverted.
