🟢 Open to roles · Full Stack ML Data Scientist

Single Search Evaluation Playground

Search anything in the embedded app, toggle relevance on results, and watch Precision, Recall, MAP, and nDCG update live. Powered by Vespa (ann_summary_2), FastAPI, and PostgreSQL!

Vespa + ANN · Live metrics · User A/B Tests · Search History

How to use this page

Enter a search. Use Popular or Recent queries in the app to explore.
Decide what’s relevant. Check the Relevant box for results that truly answer your query.
Watch metrics update. Precision, Recall, MAP, and nDCG all recalculate live at k = 3, 5, and 10.
Repeat with new queries to see how consistent the model is.

Goal: make the top results both accurate and complete.

What the metrics mean

Precision@k

(# relevant in top k) / k

Higher ⇒ fewer junk results up top.
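
As a rough sketch of the formula above, Precision@k can be computed from the list of Relevant checkboxes in ranked order. The function name `precision_at_k` and the list-of-booleans input are illustrative assumptions, not the playground's actual code.

```python
def precision_at_k(relevant_flags, k):
    """Fraction of the top-k results that were marked relevant.

    relevant_flags: booleans in ranked order (True = Relevant box checked).
    Divides by k even when fewer than k results came back, matching
    the formula (# relevant in top k) / k.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return sum(relevant_flags[:k]) / k


# Hypothetical judgments for one query: results 1 and 3 are relevant.
flags = [True, False, True, False, False]
for k in (3, 5, 10):
    print(f"Precision@{k} = {precision_at_k(flags, k):.3f}")
```

With these example judgments, two of the top three results are relevant, so Precision@3 is 2/3.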

Recall@k

(# relevant in top k) / (total relevant)

Checks coverage; never decreases as k grows.
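
A minimal sketch of Recall@k follows. It assumes that "total relevant" defaults to the number of results marked relevant in the full list shown; how the app actually counts total relevant is not stated here, so `total_relevant` is left as an optional, hypothetical parameter.

```python
def recall_at_k(relevant_flags, k, total_relevant=None):
    """Share of all relevant results that appear in the top k.

    relevant_flags: booleans in ranked order (True = marked relevant).
    total_relevant: assumed to be the count of marked results when omitted.
    """
    if total_relevant is None:
        total_relevant = sum(relevant_flags)
    if total_relevant == 0:
        return 0.0  # nothing marked relevant, so recall is defined as 0 here
    return sum(relevant_flags[:k]) / total_relevant


# Hypothetical judgments: relevant results at ranks 1, 3, and 6.
flags = [True, False, True, False, False, True]
for k in (3, 5, 10):
    print(f"Recall@{k} = {recall_at_k(flags, k):.3f}")
```

In this example Recall stays at 2/3 for k = 3 and k = 5, then reaches 1.0 at k = 10 once the third relevant result is included.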

MAP

mean over queries of AP, where AP = average of Precision@i at each relevant hit i (for a single query, MAP = AP)

Rewards early, consistent hits.
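
The idea can be sketched as two small functions: average precision (AP) for one ranked list, and MAP as the mean of AP over several queries. The names are illustrative, and AP here is normalized by the number of relevant results found in the list, which is an assumption about how the playground counts them.

```python
def average_precision(relevant_flags):
    """Average of Precision@i taken at each rank i that holds a relevant hit."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant_flags, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # Precision@rank at this hit
    return precision_sum / hits if hits else 0.0


def mean_average_precision(judged_queries):
    """Mean of AP over a list of per-query relevance lists."""
    if not judged_queries:
        return 0.0
    return sum(average_precision(q) for q in judged_queries) / len(judged_queries)


# Two hypothetical queries with their ranked relevance judgments.
queries = [[True, False, True], [False, True]]
print(mean_average_precision(queries))
```

The first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so a relevant result at rank 1 contributes far more than one at rank 2 or 3.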

nDCG

DCG@k / IDCG@k, where DCG@k = Σ_{i=1..k} rel_i / log₂(i+1) and IDCG@k is the DCG@k of the ideal ordering

Rewards good ordering by rank.
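
A minimal sketch of nDCG@k, assuming binary gains (1 for a checked Relevant box, 0 otherwise) and an ideal ordering built by sorting the same judged list. Both choices are assumptions for illustration, and the function names are hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1) for ranks 1..k."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains, k):
    """DCG@k divided by the DCG@k of the best possible ordering of the same gains."""
    ideal_dcg = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Same two relevant results, ranked well versus ranked poorly.
print(ndcg_at_k([1, 1, 0], 3))  # relevant results first
print(ndcg_at_k([0, 1, 1], 3))  # relevant results pushed down
```

Both lists contain the same two relevant results, but the second ordering scores lower because each relevant hit is discounted by its rank.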

Embedded App

🎧 Audio Guide: Page 8 · Single Search Demo 🎧

Another Demo! Correcting Results with Keywords!

Are important searches returning the wrong results? Apply a relevance boost or penalty for specific keywords!

Fine-Tuning Search Results! →