Rutvik Acharya
Applied ML scientist with over 13 years of experience building scalable, impact-driven ML and LLM solutions.
Session
Large language models (LLMs) don't always need to be bigger to be better; sometimes they just need to think more efficiently. Inference-time compute scaling improves LLM reasoning by dynamically increasing computational effort during inference, much as humans perform better when given more time to think. This session will explore cutting-edge techniques such as Chain-of-Thought (CoT) prompting, voting-based search, and the latest research in test-time scaling. We'll dive into methods like "Wait" tokens, preference optimization, and dynamic depth scaling, which let models refine their responses on the fly. Whether you're interested in boosting LLM accuracy, improving robustness, or optimizing compute budgets, this talk will offer key insights into the future of smarter, more adaptable AI.
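To make the core idea concrete, here is a minimal sketch of voting-based search (self-consistency), one of the techniques above: it spends extra inference-time compute by sampling several chain-of-thought completions and majority-voting their final answers. The `generate` callable and the "Answer:" output convention are assumptions for illustration, not any specific model's API.

```python
from collections import Counter

def self_consistency_answer(generate, prompt, n_samples=8, temperature=0.8):
    """Voting-based search (self-consistency): sample several
    chain-of-thought completions and majority-vote the final answers.

    `generate` is an assumed stand-in for any LLM sampling call that
    takes a prompt and temperature and returns a completion string.
    """
    answers = []
    for _ in range(n_samples):
        # Elicit step-by-step reasoning so each sample "thinks" before answering.
        completion = generate(prompt + "\nLet's think step by step.",
                              temperature=temperature)
        # Assumption: the model ends its reasoning with a line like "Answer: <value>".
        for line in reversed(completion.splitlines()):
            if line.lower().startswith("answer:"):
                answers.append(line.split(":", 1)[1].strip())
                break
    # More samples means more inference-time compute and a more reliable vote.
    return Counter(answers).most_common(1)[0][0] if answers else None
```

The trade-off is exactly the one the session examines: accuracy tends to improve with `n_samples`, at the cost of a proportionally larger inference budget.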