Stack Overview (Architecture, DevOps, and Technical Credits)
The full technical ecosystem powering the Scientific Search & Evaluation Platform: data ingestion and cloud processing, demonstrable applications, and automated, scalable deployments. This page is both documentation and recognition: a map of every tool, service, and design choice that keeps the system reproducible, maintainable, and scalable.
Next.js · FastAPI · Vespa · OpenAI · PostgreSQL · GKE · GCS · Grafana
High-Level System Architecture
From frontend views to backend pipelines
Frontend Layer (User-Facing)
Static pages + interactive demos
| Component | Role |
|---|---|
| Next.js site | Informational & visualization pages (ArXiv → Vector Docs → Vespa → Auto-Labeler → Consensus → Eval → Demos) |
| FastAPI web apps | Interactive demos (single-search, fine-tune), REST + WebSocket APIs |
| Chart.js / D3.js | Live evaluation plots (P@K, MAP, nDCG) |
| TailwindCSS (select pages) | Cohesive, accessible styling |
Frontend assets deploy via Cloud Run (static) and GCS web buckets with CDN caching.
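The evaluation plots listed in the table above chart P@K, MAP, and nDCG. As a rough illustration of what those metrics measure, here is a minimal sketch in plain Python. The function names are hypothetical and not taken from the platform's code, and the nDCG variant shown uses linear gain with the ideal ordering computed from the retrieved list only, which is an assumption; the platform may define it differently. MAP is the mean of `average_precision` over all queries.

```python
import math
from typing import Sequence


def precision_at_k(relevances: Sequence[int], k: int) -> float:
    """Fraction of the top-k ranked results that are relevant (relevance > 0)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for r in relevances[:k] if r > 0) / k


def average_precision(relevances: Sequence[int]) -> float:
    """Average of P@i over every rank i that holds a relevant result.

    Normalizes by the number of relevant results present in the list
    (an assumption; some definitions divide by all known relevant documents).
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg_at_k(relevances: Sequence[int], k: int) -> float:
    """DCG of the ranking divided by DCG of the best possible ordering."""

    def dcg(rels: Sequence[int]) -> float:
        return sum(r / math.log2(rank + 1) for rank, r in enumerate(rels, start=1))

    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0
```

For a ranking with relevance labels `[1, 0, 1, 0]`, `precision_at_k(..., 2)` is 0.5, because only one of the top two results is relevant.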
Intelligence & Data Layer
Models + storage
| Component | Description |
|---|---|
| OpenAI (GPT-4o-mini) | Auto-labeler, enrichment completions, context summaries |
| Sentence Embeddings | Fixed-length vectors (384/1536 dims) for dense retrieval |
| CloudSQL (PostgreSQL) | Queries, labels, votes, evaluation metrics, runs |
| GCS Buckets | Enriched JSONs, CSV metrics, model outputs, PDFs, diagrams |
| Redis / cache (planned) | Embedding cache to avoid repeated embedding API calls for the same text |
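The planned cache in the last row could work roughly as follows. This is a sketch under stated assumptions, not the platform's design: it uses an in-memory dictionary in place of Redis, keys entries by a hash of the model name plus the text, and uses a hypothetical `fake_embed` stand-in rather than a real embedding model.

```python
import hashlib
from typing import Callable, Dict, List

Vector = List[float]


class EmbeddingCache:
    """In-memory stand-in for the planned Redis cache (hypothetical design)."""

    def __init__(self, embed_fn: Callable[[str], Vector], model_name: str) -> None:
        self._embed_fn = embed_fn
        self._model_name = model_name
        self._store: Dict[str, Vector] = {}
        self.hits = 0
        self.misses = 0

    def _key(self, text: str) -> str:
        # Include the model name so vectors from different models never collide.
        payload = f"{self._model_name}\x00{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, text: str) -> Vector:
        key = self._key(text)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: call the (expensive) embedder once and remember the result.
        self.misses += 1
        vector = self._embed_fn(text)
        self._store[key] = vector
        return vector


def fake_embed(text: str) -> Vector:
    """Stand-in embedder: 4 deterministic floats from a hash, NOT a real model."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:4]]
```

Calling `cache.get()` twice with the same text invokes the embedder only once; the second call is served from the store.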
Search Infrastructure — Vespa on GKE
Schema + rank profiles running on a multi-node cluster
Cluster Composition
| Node Role | Purpose |
|---|---|
| Configserver | Central coordination and schema control |
| Content Nodes | Document storage + indexing for enriched arXiv vectors |
| Query Container | YQL, ANN queries, multi-vector ranking |
| Feed Container | Writes, schema refresh, ingest jobs |
| Admin / Headless | Config deploys, upgrades, CI/CD hooks |
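To make the Query Container's role more concrete, the sketch below builds a request body for an approximate nearest-neighbor query in YQL. The field name `embedding`, the rank profile name `semantic`, and the query tensor name `q` are assumptions for illustration; the actual schema is not shown on this page. The rank profile would also need to declare `query(q)` as an input tensor of matching dimension.

```python
import json
from typing import Any, Dict, List


def build_ann_query(
    query_vector: List[float],
    target_hits: int = 10,
    hits: int = 10,
    field: str = "embedding",      # hypothetical tensor field name
    rank_profile: str = "semantic",  # hypothetical rank profile name
) -> Dict[str, Any]:
    """Build a JSON body for a nearestNeighbor query against Vespa's search API."""
    yql = (
        "select * from sources * where "
        f"{{targetHits:{target_hits}}}nearestNeighbor({field}, q)"
    )
    return {
        "yql": yql,
        "hits": hits,
        "ranking": rank_profile,
        "input.query(q)": query_vector,
    }


if __name__ == "__main__":
    # Print the body that would be POSTed to the query container's search endpoint.
    print(json.dumps(build_ann_query([0.1, 0.2, 0.3], target_hits=5), indent=2))
```

Building the body separately from the HTTP call keeps the query logic easy to unit-test without a running cluster.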
Scaling & Health
- HorizontalPodAutoscaler scales 1–20 pods
- PodDisruptionBudgets cap simultaneous voluntary evictions (node drains, upgrades) to guard against cascading outages
- Prometheus + Grafana track latency/throughput/utilization
- GKE cluster autoscaling profile: optimize-utilization
- Based on Vespa’s multinode-ha pattern with custom GKE YAMLs
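As a rough guide to how the 1–20 pod range above plays out, the sketch below mirrors the replica calculation documented for the Kubernetes HorizontalPodAutoscaler: scale the current replica count by the ratio of observed to target metric, round up, and clamp to the configured bounds. The function name and the example metric values are hypothetical; the 0.1 tolerance matches the Kubernetes default, but the cluster's actual settings are not stated here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 20,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA rule: ceil(current * observed / target), clamped."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: keep the current size (still within bounds).
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 pods averaging 90% CPU against a 60% target would scale to 6, while a spike that calls for 40 pods is capped at the maximum of 20.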
Cluster Diagram (placeholder)

North/South & Headless Services (placeholder)

DevOps & Automation
CI/CD pipelines, registries, and rollout flow
Tools
| Tool | Purpose |
|---|---|
| GitHub Actions | Build, test, push manifests |
| Cloud Build / Artifact Registry | Build + store images |
| Kubernetes YAMLs | Deployments, Services, BackendConfigs, certs |
| Helm (planned) | Parametrize Vespa & schema upgrades |
| Cloud Logging + Error Reporting | Stack traces + alerts |
Dev & Ops Safeguards
Reliability + recovery baked in
- Schema versioning (staging → prod rollouts)
- Readiness/liveness probes on every deployment
- Blue-Green for FastAPI (zero-downtime)
- Secret rotation for API keys & service accounts
- Daily PostgreSQL backups + GCS snapshots
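The Blue-Green item in the list above relies on one rule: traffic moves to the idle environment only after it reports ready, and otherwise stays where it is. Below is a minimal sketch of that cutover decision, assuming a hypothetical `Router` object and a caller-supplied `is_ready` check. In a Kubernetes setup the switch itself would typically be a Service selector change, which is not shown here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Router:
    """Tracks which environment currently receives traffic (hypothetical)."""

    live: str = "blue"


def idle_color(live: str) -> str:
    """Return the environment that is not serving traffic."""
    return "green" if live == "blue" else "blue"


def cut_over(router: Router, is_ready: Callable[[str], bool]) -> bool:
    """Switch traffic to the idle environment only if it passes readiness.

    Returns True when the switch happened, False when traffic was left alone.
    """
    candidate = idle_color(router.live)
    if not is_ready(candidate):
        return False  # keep serving from the current environment
    router.live = candidate
    return True
```

Because the previous environment keeps running after a switch, rolling back is the same operation in the opposite direction.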
Deployment Diagram
End-to-end data & control flow (placeholder)

User → Cloud Run → FastAPI → Vespa GKE Cluster → CloudSQL → GCS → Grafana. Sidecar services (auto-labeler, evaluator, consensus, metrics) share the namespace.
Technology Stack Summary
Layers & choices
| Layer | Technology | Description |
|---|---|---|
| Frontend | Next.js, TailwindCSS, Chart.js | UI + Visualization |
| API | FastAPI, Uvicorn, SQLAlchemy | Core web & data endpoints |
| AI Models | OpenAI GPT-4o-mini, sentence transformers | Enrichment + embeddings |
| Search Engine | Vespa.ai (ANN + BM25) | Vector + keyword ranking |
| Database | PostgreSQL (CloudSQL) | Persistent store |
| Orchestration | Kubernetes (GKE) | Containerized scaling |
| Storage | GCS Buckets | Data, logs, artifacts |
| Monitoring | Grafana + Prometheus | Pod health, latency, throughput |
| Security | Cloud Armor, IAM, JWT | Firewall + Auth |
| Automation | GitHub Actions, Cloud Build | CI/CD pipeline |