2025-09-14 – Track 3
Many retrieval-augmented generation (RAG) and code-search pipelines rely on ad-hoc checks and break when deployed at scale.
This talk presents an evaluation-first development workflow applied to a production code-search engine built with Python, PostgreSQL (pgvector), and OpenAI rerankers. Introducing automated evaluation suites before optimisation cut average query latency from 20 minutes to 30 seconds (a 40× speed-up) and raised relevance by roughly 30 %. We will cover:
- Constructing task-specific evaluation datasets and metrics (see the sketch after this list)
- Hybrid (lexical + ANN) retrieval with score fusion (sketched after the abstract)
- Cross-encoder reranking for precision boosts
- Semantic caching strategies that keep indexes fresh and queries fast
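To give a flavour of the evaluation-first approach, here is a minimal sketch of a retrieval evaluation harness computing recall@k and MRR over a small labelled query set. The dataset format and function names are illustrative assumptions, not the talk's reference implementation.

```python
# Minimal retrieval-eval sketch: recall@k and MRR over a labelled query set.
# `search` is a placeholder for any retriever returning ranked document ids.
from typing import Callable, Iterable

def recall_at_k(ranked: list, relevant: set, k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / max(len(relevant), 1)

def reciprocal_rank(ranked: list, relevant: set) -> float:
    """1 / rank of the first relevant result (0 if none was retrieved)."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def evaluate(search: Callable[[str], list],
             dataset: Iterable[tuple], k: int = 10) -> dict:
    """Run every labelled (query, relevant_ids) pair through `search` and average the metrics."""
    recalls, rrs = [], []
    for query, relevant in dataset:
        ranked = search(query)
        recalls.append(recall_at_k(ranked, relevant, k))
        rrs.append(reciprocal_rank(ranked, relevant))
    n = max(len(recalls), 1)
    return {"recall@k": sum(recalls) / n, "mrr": sum(rrs) / n}
```

Running a harness like this before and after each change is what makes regressions visible instead of invisible.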
The session includes benchmark results, a live demonstration, and an MIT-licensed reference implementation that attendees can clone and extend.
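To make the hybrid-retrieval bullet concrete, here is a minimal sketch of lexical + ANN retrieval fused with reciprocal rank fusion (RRF). It assumes a hypothetical `chunks(id, content, embedding)` table with a pgvector column and uses psycopg; the schema, SQL, and parameters are illustrative, not the talk's reference implementation.

```python
# Hybrid retrieval sketch: Postgres full-text search + pgvector ANN,
# combined with reciprocal rank fusion (RRF). Table/column names are assumed.
import psycopg

LEXICAL_SQL = """
    SELECT id
    FROM chunks
    WHERE to_tsvector('english', content) @@ plainto_tsquery('english', %(q)s)
    ORDER BY ts_rank_cd(to_tsvector('english', content),
                        plainto_tsquery('english', %(q)s)) DESC
    LIMIT %(k)s
"""

ANN_SQL = """
    SELECT id
    FROM chunks
    ORDER BY embedding <=> %(vec)s::vector  -- cosine distance
    LIMIT %(k)s
"""

def rrf(ranked_lists: list, k: int = 60) -> list:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def hybrid_search(conn: psycopg.Connection, query: str,
                  query_embedding: list, top_k: int = 20) -> list:
    """Run both retrievers and fuse their rankings."""
    vec = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with conn.cursor() as cur:
        cur.execute(LEXICAL_SQL, {"q": query, "k": top_k})
        lexical = [row[0] for row in cur.fetchall()]
        cur.execute(ANN_SQL, {"vec": vec, "k": top_k})
        ann = [row[0] for row in cur.fetchall()]
    return rrf([lexical, ann])[:top_k]
```

RRF is used here because it needs no score normalisation across the two retrievers; weighted score fusion is a reasonable alternative when calibrated scores are available.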
Time | Section |
---|---|
0–3 min | Introduction |
3–6 min | Why Evals First: avoiding invisible failures |
6–9 min | Chunking strategies: myths, length / overlap, heuristics, & findings |
9–12 min | Hybrid retrieval: BM25 + ANN + score fusion |
12–16 min | Rerankers: what works, when, and why |
16–19 min | Semantic caching: avoid redundant computation, stay fresh (sketch below) |
19–23 min | Live demo: see relevance jump |
23–25 min | Cheatsheet + Repo walkthrough |
25–30 min | Q&A |
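For the semantic-caching segment, here is a minimal in-memory sketch of the idea: cache results keyed by query embedding, reuse them when a new query is close enough, and expire entries so the cache stays fresh. The threshold, TTL, and data structure are illustrative assumptions, not the talk's implementation.

```python
# Semantic cache sketch: reuse a cached result when a new query's embedding
# is close enough (cosine similarity) to a previously seen one, with a TTL
# so stale entries expire. All parameter values are illustrative.
import math
import time

class SemanticCache:
    def __init__(self, threshold: float = 0.95, ttl_seconds: float = 3600.0):
        self.threshold = threshold
        self.ttl = ttl_seconds
        self._entries: list = []  # (embedding, result, timestamp)

    @staticmethod
    def _cosine(a: list, b: list) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return dot / norm if norm else 0.0

    def get(self, query_embedding: list):
        """Return a cached result for a semantically similar query, or None."""
        now = time.time()
        self._entries = [e for e in self._entries if now - e[2] < self.ttl]  # drop stale entries
        for embedding, result, _ in self._entries:
            if self._cosine(query_embedding, embedding) >= self.threshold:
                return result
        return None

    def put(self, query_embedding: list, result) -> None:
        """Store a freshly computed result under its query embedding."""
        self._entries.append((query_embedding, result, time.time()))
```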
Beginner
Prerequisites
- Basic proficiency in Python
- Prior use of a database (exposure to a vector DB is helpful but not required)