🟢 Open to roles · Full Stack ML Data Scientist

Stack Overview (Architecture, DevOps, and Technical Credits)

The full technical ecosystem powering the Scientific Search & Evaluation Platform, from data ingestion and cloud processing to demonstrable applications and automated, scalable deployments. This page is both documentation and recognition: a map of every tool, service, and design choice that keeps the system reproducible, maintainable, and scalable.

Next.js · FastAPI · Vespa · OpenAI · PostgreSQL · GKE · GCS · Grafana

High-Level System Architecture

From frontend views to back-end pipelines

Frontend Layer (User-Facing)

Static pages + interactive demos

Component	Role
Next.js site	Informational & visualization pages (ArXiv → Vector Docs → Vespa → Auto-Labeler → Consensus → Eval → Demos)
FastAPI web apps	Interactive demos (single-search, fine-tune), REST + WebSocket APIs
Chart.js / D3.js	Live evaluation plots (P@K, MAP, nDCG)
TailwindCSS (select pages)	Cohesive, accessible styling

Frontend assets deploy via Cloud Run (static) and GCS web buckets with CDN caching.
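The evaluation plots listed in the table above chart P@K, MAP, and nDCG. As a minimal sketch of what those metrics compute, assuming each query's results arrive as a ranked list of relevance labels (0 = not relevant, higher = more relevant), the function names below are illustrative and not taken from the platform's code. Note that this average precision normalizes by the relevant results present in the list, which is one common convention among several.

```python
import math
from typing import Sequence


def precision_at_k(relevance: Sequence[int], k: int) -> float:
    """Fraction of the top-k results that are relevant (label > 0)."""
    if k <= 0:
        return 0.0
    return sum(1 for r in relevance[:k] if r > 0) / k


def average_precision(relevance: Sequence[int]) -> float:
    """Mean of P@i over the ranks i that hold a relevant result."""
    hits = 0
    total = 0.0
    for i, r in enumerate(relevance, start=1):
        if r > 0:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[int]]) -> float:
    """MAP: average precision averaged over all queries."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0


def ndcg_at_k(relevance: Sequence[int], k: int) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    def dcg(labels: Sequence[int]) -> float:
        return sum(r / math.log2(i + 1) for i, r in enumerate(labels[:k], start=1))

    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(list(relevance)) / ideal if ideal else 0.0
```

For a ranking labeled [1, 0, 1, 0], P@2 is 0.5 and average precision is (1/1 + 2/3) / 2.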

Intelligence & Data Layer

Models + storage

Component	Description
OpenAI (GPT-4o-mini)	Auto-labeler, enrichment completions, context summaries
Sentence Embeddings	Fixed-length vectors (384/1536 dims) for dense retrieval
CloudSQL (PostgreSQL)	Queries, labels, votes, evaluation metrics, runs
GCS Buckets	Enriched JSONs, CSV metrics, model outputs, PDFs, diagrams
Redis / cache (planned)	Embedding cache to avoid repeated, token-billed API calls
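Since the cache is still planned, the following is only a sketch of the idea: key each embedding by a hash of its input text so the same text is never embedded twice. An in-memory dict stands in for Redis, and EmbeddingCache, embed_fn, and the key format are hypothetical names chosen for illustration.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """In-memory stand-in for the planned Redis cache (hypothetical helper)."""

    def __init__(self, embed_fn: Callable[[str], List[float]], model: str) -> None:
        self._embed_fn = embed_fn  # assumed: wraps the real embedding call
        self._model = model
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # counts how often the underlying model was called

    def _key(self, text: str) -> str:
        # Include the model name so vectors from different models never collide.
        digest = hashlib.sha256(text.strip().encode("utf-8")).hexdigest()
        return f"emb:{self._model}:{digest}"

    def get(self, text: str) -> List[float]:
        key = self._key(text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def fake_embed(text: str) -> List[float]:
    """Toy embedding used only to exercise the cache."""
    return [float(len(text))]
```

Swapping the dict for a Redis client would keep the same key scheme; only the get and set calls inside get() would change.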

Search Infrastructure — Vespa on GKE

Schema + rank profiles running on a multi-node cluster

Cluster Composition

Node Role	Purpose
Configserver	Central coordination and schema control
Content Nodes	Document storage + indexing for enriched arXiv vectors
Query Container	YQL, ANN queries, multi-vector ranking
Feed Container	Writes, schema refresh, ingest jobs
Admin / Headless	Config deploys, upgrades, CI/CD hooks
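To illustrate the kind of request the query container handles, here is a sketch that builds a hybrid (ANN plus keyword) request body for Vespa's query API. The field name embedding, the query tensor name q, and the rank profile name hybrid are assumptions; the platform's actual schema and rank profiles are not shown on this page, and the rank profile would need to declare a matching query(q) input.

```python
from typing import Any, Dict, List


def build_hybrid_query(
    text: str,
    vector: List[float],
    hits: int = 10,
    target_hits: int = 100,
    embedding_field: str = "embedding",  # assumed schema field name
    rank_profile: str = "hybrid",        # assumed rank profile name
) -> Dict[str, Any]:
    """Return a JSON-serializable body for a POST to Vespa's /search/ endpoint."""
    # nearestNeighbor retrieves approximate neighbors of query tensor q;
    # userQuery() adds keyword matches for the text in the "query" parameter.
    yql = (
        "select * from sources * where "
        f"({{targetHits:{target_hits}}}nearestNeighbor({embedding_field}, q)) "
        "or userQuery()"
    )
    return {
        "yql": yql,
        "query": text,
        "hits": hits,
        "ranking.profile": rank_profile,
        "input.query(q)": vector,
    }
```

The returned dict can be sent as the JSON body of a search request with any HTTP client.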

Scaling & Health

HorizontalPodAutoscaler scales 1–20 pods
PodDisruptionBudgets keep a minimum number of pods running during voluntary disruptions (node drains, upgrades)
Prometheus + Grafana track latency/throughput/utilization
GKE cluster autoscaler profile: optimize-utilization
Based on Vespa’s multinode-ha pattern with custom GKE YAMLs
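The 1–20 pod range above can be read through the replica formula in the Kubernetes HorizontalPodAutoscaler documentation: desired = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The sketch below applies only that core formula; the real controller also uses a tolerance band and stabilization windows, and the metric values in the usage note are made up for illustration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 20,
) -> int:
    """Core HPA scaling formula, clamped to the min/max replica bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 pods averaging 90% CPU against a 60% target scale to 6 pods, while a load that would call for 40 pods is capped at 20.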

Cluster Diagram (placeholder)

North/South & Headless Services (placeholder)

DevOps & Automation

CI/CD pipelines, registries, and rollout flow

Tools

Tool	Purpose
GitHub Actions	Build, test, push manifests
Cloud Build / Artifact Registry	Build + store images
Kubernetes YAMLs	Deployments, Services, BackendConfigs, certs
Helm (planned)	Parametrize Vespa & schema upgrades
Cloud Logging + Error Reporting	Stack traces + alerts

Dev & Ops Safeguards

Reliability + recovery baked in

Schema versioning (staging → prod rollouts)
Readiness/liveness probes on every deployment
Blue-Green for FastAPI (zero-downtime)
Secret rotation for API keys & service accounts
Daily PostgreSQL backups + GCS snapshots
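The blue-green and readiness-probe items in the list above work together: traffic moves to the idle environment only after it reports ready. A toy sketch of that rule follows; BlueGreenRouter and the is_ready callback are hypothetical, standing in for whatever load balancer switch and health check the real deployment uses.

```python
from typing import Callable


class BlueGreenRouter:
    """Tracks which of two environments receives live traffic."""

    def __init__(self, live: str = "blue") -> None:
        self.live = live

    @property
    def idle(self) -> str:
        # The environment that is not currently serving traffic.
        return "green" if self.live == "blue" else "blue"

    def promote(self, is_ready: Callable[[str], bool]) -> bool:
        """Switch traffic to the idle environment only if it passes readiness."""
        candidate = self.idle
        if not is_ready(candidate):
            return False  # keep serving from the current environment
        self.live = candidate
        return True
```

Because the previous environment stays deployed after a switch, rolling back is just another promote call.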

Deployment Diagram

End-to-end data & control flow (placeholder)

User → Cloud Run → FastAPI → Vespa GKE Cluster → CloudSQL → GCS → Grafana. Sidecar services (auto-labeler, evaluator, consensus, metrics) share the namespace.

Technology Stack Summary

Layers & choices

Layer	Technology	Description
Frontend	Next.js, TailwindCSS, Chart.js	UI + Visualization
API	FastAPI, Uvicorn, SQLAlchemy	Core web & data endpoints
AI Models	OpenAI GPT-4o-mini, sentence transformers	Enrichment + embeddings
Search Engine	Vespa.ai (ANN + BM25)	Vector + keyword ranking
Database	PostgreSQL (CloudSQL)	Persistent store
Orchestration	Kubernetes (GKE)	Containerized scaling
Storage	GCS Buckets	Data, logs, artifacts
Monitoring	Grafana + Prometheus	Pod health, latency, throughput
Security	Cloud Armor, IAM, JWT	Firewall + Auth
Automation	GitHub Actions, Cloud Build	CI/CD pipeline

