PyCon India 2025

Dating, Dining, and Doomscrolling: The Algorithms Behind Modern Choice
2025-09-12 , Room 4

Abstract

Recommendation systems are no longer just about suggesting the next movie. They now shape everything from what we eat and wear to who we date, what we watch, and even what we believe. In this hands-on workshop, we’ll explore the powerful systems behind these “invisible puppeteers” of modern digital life.

You’ll learn not just how recommendation systems work, but also how they shape your choices and reinforce (or challenge) your biases, and how companies fine-tune them to maximize retention, revenue, and engagement. We’ll build and break down a recommendation pipeline from retrieval to ranking, then dive into production challenges, optimization tricks, and the ethical trade-offs involved.

We’ll also discuss how users can hack these systems to personalize their feeds and how developers can design fairer, more diverse systems. Whether you're into dating apps, content feeds, or building ML pipelines, this workshop will help you understand the full spectrum of RecSys.


1. Introduction (10 mins)

  • Welcome and Overview

  • Introduction to the workshop: What are recommendation systems and why do they matter?

  • Real-world examples:

    • Dating apps (e.g., Tinder)
    • Content feeds (e.g., Instagram)
    • Food recommendations (e.g., Zomato)
  • Goals of the workshop:

    • Learn how RecSys work

    • Dive into their challenges
    • Understand their ethical implications
  • Audience Poll / Engagement

  • Quick questions to gauge familiarity with RecSys

  • Brief survey:

    • Who uses these systems?
    • Who’s familiar with the tech behind them?

2. Foundations of Recommendation Systems (30 mins)

Goal: Give a strong conceptual grounding before diving into implementation.

  • Types of Training Data

  • Explicit Feedback: Ratings, thumbs-up/down

  • Implicit Feedback: Clicks, dwell time, views, purchases

  • Interaction Types

  • Positive-only vs. Positive & Negative Feedback

  • System Categories by Use-Case

  • General-purpose, Context-aware, Sequential, Knowledge-based RecSys

  • Core Approaches

  • Collaborative Filtering (User-based vs. Item-based)

  • Content-Based Filtering
  • Hybrid Models

  • Application Examples

  • Netflix → Hybrid

  • Amazon → Item-based CF
  • Tinder → Context-aware
  • Spotify → Sequential
  • Zomato → Content + CF
  • Pratilipi → Long-tail engagement via diversity-aware models

  • Recommendation Pipeline

  • Retrieval Stage: Candidate generation

  • Ranking Stage: Sorting by relevance
  • Exploration vs. Exploitation

  • Ranking Methods

  • Pointwise / Pairwise / Listwise

  • Metrics that Matter

  • Offline Metrics: Precision@K, Recall@K, MAP, NDCG

  • Business Goal Alignment:

    • YouTube → Watch Time
    • Tinder → Match Rate
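
Of the offline metrics listed above, NDCG is the one that rewards placing the most relevant items nearest the top. The following is a minimal sketch in plain Python, assuming graded relevance labels are available for every ranked item and that the ideal ordering is computed from that same list (a common simplification). The function names and the sample relevances are illustrative and do not come from the workshop materials.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevances."""
    # Rank 1 is discounted by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevance of the items a model ranked 1st to 5th.
ranked_relevances = [3, 0, 2, 0, 1]
score = ndcg_at_k(ranked_relevances, k=5)
print(f"NDCG@5 = {score:.3f}")
```

A perfectly ordered list scores 1.0, and a list with no relevant items scores 0.0 by the convention chosen here.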

3. Hands-On Session: Building a Recommendation System (40 mins)

Goal: Implement and understand the inner workings of a real model.

  • Exercise 1: Building a Two-Tower Model (30 mins)

  • Use PyTorch to build a simple two-tower architecture

  • Demonstrate user-item embedding and dot-product similarity
  • Step-by-step: Data loading → Model → Loss → Training
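
The steps in Exercise 1 can be sketched roughly as follows. This is a minimal illustration rather than the workshop's actual code: it assumes ID-only features, a toy dataset in which user i interacted only with item i, and an in-batch softmax loss (one common choice for two-tower training). The class names `Tower` and `TwoTower` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Tower(nn.Module):
    """Maps an integer ID to a dense vector (embedding plus a small MLP)."""

    def __init__(self, num_ids: int, dim: int = 16):
        super().__init__()
        self.embed = nn.Embedding(num_ids, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.embed(ids))


class TwoTower(nn.Module):
    """Separate user and item towers scored by dot product."""

    def __init__(self, num_users: int, num_items: int, dim: int = 16):
        super().__init__()
        self.user_tower = Tower(num_users, dim)
        self.item_tower = Tower(num_items, dim)

    def forward(self, user_ids: torch.Tensor, item_ids: torch.Tensor) -> torch.Tensor:
        u = self.user_tower(user_ids)  # (batch, dim)
        v = self.item_tower(item_ids)  # (batch, dim)
        return u @ v.T                 # (batch, batch) similarity matrix


# Data loading (toy assumption): user i interacted with item i.
torch.manual_seed(0)
users = torch.arange(8)
items = torch.arange(8)

model = TwoTower(num_users=8, num_items=8)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# In-batch softmax: the diagonal holds the positives, and every other
# item in the batch acts as a negative for that user.
labels = torch.arange(len(users))

losses = []
for step in range(100):
    logits = model(users, items)
    loss = F.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because the two towers never share inputs, item vectors can be precomputed and indexed for fast nearest-neighbour lookup at serving time, which is one reason this design is popular for the retrieval stage.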

  • Exercise 2: Evaluating Your Model (10 mins)

  • Use standard metrics: Precision, Recall, MAP

  • Compare online metrics vs. offline metrics
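
The offline metrics named in Exercise 2 can be computed with a few lines of plain Python. The sketch below assumes binary relevance (an item is either relevant or not) and normalizes average precision by min(number of relevant items, K), which is one of several conventions in use. The function names and sample data are illustrative.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(item in relevant for item in recommended[:k]) / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(item in relevant for item in recommended[:k]) / len(relevant)


def average_precision(recommended, relevant, k):
    """Mean of precision values taken at each rank where a hit occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0


def mean_average_precision(all_recommended, all_relevant, k):
    """Average precision averaged over users."""
    aps = [average_precision(rec, rel, k) for rec, rel in zip(all_recommended, all_relevant)]
    return sum(aps) / len(aps)


# Hypothetical single-user example: hits at ranks 1 and 3.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}

print(precision_at_k(recommended, relevant, k=4))    # 2 hits out of 4
print(recall_at_k(recommended, relevant, k=4))       # 2 of 3 relevant items found
print(average_precision(recommended, relevant, k=4))
```

Online metrics such as watch time or match rate cannot be computed this way; they require live traffic, which is why offline scores are treated as a proxy.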

Break (10 mins)


4. Challenges in Recommendation Systems (30 mins)

Real-World Problems

  • Bias in Training Data: Echo chambers, skewed preferences
  • Cold Start: New users/items with little interaction data
  • Scalability: Latency and compute at scale

Ethical Challenges

  • Algorithmic Bias & Fairness: Impact on sensitive attributes
  • Diversity vs. Personalization: Avoiding homogenized content
  • Debiasing Techniques for:

  • Exposure Bias: Overexposure to certain items

  • Popularity Bias: Trend dominance
  • Selection Bias: Skewed relevance due to interaction patterns
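
As one illustration of how popularity bias might be countered, the sketch below subtracts a log-scaled popularity penalty from each item's model score before ranking. This is a simple heuristic chosen for clarity, not necessarily the technique covered in the session. The function name, the `alpha` parameter, and the sample data are assumptions.

```python
import math
from collections import Counter


def popularity_adjusted_scores(scores, interactions, alpha=0.5):
    """Subtract alpha * log(1 + interaction count) from each item's score."""
    counts = Counter(interactions)
    return {item: s - alpha * math.log1p(counts[item]) for item, s in scores.items()}


# Hypothetical interaction log: one blockbuster and one long-tail item.
interactions = ["hit"] * 100 + ["niche"] * 3
scores = {"hit": 2.0, "niche": 1.5}

raw_ranking = sorted(scores, key=scores.get, reverse=True)
adjusted = popularity_adjusted_scores(scores, interactions, alpha=0.5)
debiased_ranking = sorted(adjusted, key=adjusted.get, reverse=True)

print(raw_ranking)       # ['hit', 'niche']
print(debiased_ranking)  # ['niche', 'hit']
```

Setting `alpha` to 0 recovers the original ranking, so the parameter acts as a dial between raw relevance and long-tail exposure.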

Case Study: Pratilipi

  • How an Indian storytelling platform balances fairness, diversity, and engagement

5. Hands-On Recap & Final Exercise (30 mins)

Mini Lab: Use your model to explore bias, diversity, and fairness.

  • Apply:

  • Diversity boosting

  • Popularity debiasing
  • Freshness filters

  • Re-evaluate:

  • Compare fairness, diversity, and relevance metrics before and after

  • Group Reflection:

  • Discuss improvements, regressions, and trade-offs
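
Diversity boosting in the mini lab could be approached in many ways. One well-known option is Maximal Marginal Relevance (MMR) re-ranking, sketched below under the assumption that each item has a relevance score and an embedding vector. The item names, vectors, and the `lam` trade-off parameter are illustrative only.

```python
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(relevance, vectors, k, lam=0.7):
    """Greedily pick items, trading relevance against similarity to picks so far."""
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def mmr(item):
            # Penalize an item by its closest match among already selected items.
            max_sim = max((cosine(vectors[item], vectors[s]) for s in selected), default=0.0)
            return lam * relevance[item] - (1 - lam) * max_sim

        best = max(sorted(candidates), key=mmr)  # sorted() keeps ties deterministic
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical catalogue: two near-duplicate thrillers and one romance.
relevance = {"thriller_1": 0.9, "thriller_2": 0.85, "romance_1": 0.6}
vectors = {"thriller_1": [1.0, 0.0], "thriller_2": [0.9, 0.1], "romance_1": [0.0, 1.0]}

print(mmr_rerank(relevance, vectors, k=2, lam=1.0))  # ['thriller_1', 'thriller_2']
print(mmr_rerank(relevance, vectors, k=2, lam=0.5))  # ['thriller_1', 'romance_1']
```

With `lam=1.0` the list is ordered purely by relevance; lowering it swaps the near-duplicate thriller for the romance title, which makes the relevance cost of extra diversity easy to measure before and after.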


6. Gaming the System: How Users Can Personalize Their Feeds (10 mins)

  • How users can influence their recommendation feed (e.g., YouTube, Spotify)
  • Implications of over-personalization (filter bubbles, doomscrolling)
  • Tips for better and diverse recommendations

7. Conclusion & Q&A (20 mins)

  • Wrap-Up
  • Q&A: Open floor for questions on theory, implementation, or ethics
  • Additional Resources:

  • GitHub repo with code, datasets, and readings

  • Encourage continued learning and experimentation

Target Audience

Intermediate

Prerequisites
  • Proficiency in Python, pandas, NumPy, and PyTorch
  • A solid understanding of machine learning fundamentals, model evaluation, and training loops
  • Familiarity with basic data pipelines and handling model performance metrics

Audience –
This workshop is designed for:
- Machine Learning and Deep Learning practitioners with some experience in building models
- Students and data scientists curious about algorithmic personalization and its impact on user behavior
- Engineers focused on content feeds, recommendation engines, or user engagement systems

Expected Outcomes –
- Understand the core components of recommendation systems, including algorithmic selection and ranking mechanisms, and their business impact
- Build and train a two-tower recommendation model, understanding why it’s used in content recommendation
- Gain hands-on experience in evaluating recommendation systems
- Tackle real-world challenges like algorithmic bias, fairness, and diversity using techniques such as diversity boosting and popularity debiasing
- Explore case studies like Pratilipi to understand the practical implications of ethical decision-making in recommendation models
- Walk away with a GitHub repository containing code, datasets, and additional learning resources

About the Speaker

Aman Kumar Pandey is a Data Scientist specializing in NLP, Recommender Systems, and Applied AI, with 6+ years of experience across diverse industries. He has developed large-scale recommendation engines, fine-tuned LLMs, and worked on model monitoring, explainable AI, and multilingual NLP solutions.

He currently works at Pratilipi, building recommendation systems at scale.

His work spans industry and conservation tech, using AI for wildlife monitoring via computer vision and deep learning. Passionate about open science and AI for social impact, he collaborates with researchers and NGOs to drive real-world innovation.