PyCon India 2025

RapidDoc - A rule-based, fine-tuned AI model to enforce style guide rules
2025-09-14 , Track 1

Technical writers spend countless hours manually editing content to align with organizational style guides. What if an AI could do it for you - accurately and consistently?

In this talk, I'll introduce RapidDoc, an open-source AI model fine-tuned on real-world software documentation rules to flag and rewrite violations such as inconsistent terminology, passive voice, and ambiguous phrasing. I'll walk through how I built a high-quality dataset, fine-tuned the flan-t5-base model, and designed a Gradio-based frontend for instant feedback on .md, .docx, .adoc, and .pdf files.

Expect live demos, insights into model training challenges, and a peek at how GenAI is reshaping the future of editorial workflows.

Whether you're an AI practitioner, a documentation lead, or a tooling enthusiast, this talk offers an end-to-end blueprint for fine-tuning a model on real-world software documentation rules.


The Problem: Time-consuming, inconsistent manual editing of style violations in technical documents.
The Solution: RapidDoc — a fine-tuned flan-t5-base model that understands and enforces writing rules.
Dataset Curation: Crafting 1,000+ labeled examples with edge cases, ambiguity handling, and diverse tone/structure.
Model Training: Techniques used on Colab/Kaggle, quantization choices, and tokenizer strategies.
Demo Time: Gradio interface that accepts multiple file formats and outputs clean, rule-based suggestions.
The Future: Expanding to 20+ writing rules, style guide customization, and enterprise integration possibilities.
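
To make the dataset curation step concrete, here is a minimal sketch of how one labeled example might be turned into an input/target pair for a text-to-text model such as flan-t5-base. The rule identifiers, the `LabeledExample` fields, and the "fix <rule>: <sentence>" prompt layout are all assumptions for illustration; the talk does not specify RapidDoc's actual schema or prompt format.

```python
import json
from dataclasses import dataclass

# Hypothetical rule identifiers; RapidDoc's real rule names are not given in the abstract.
RULES = {"passive_voice", "ambiguous_phrasing", "inconsistent_terminology"}


@dataclass(frozen=True)
class LabeledExample:
    rule: str      # style rule the sentence violates
    original: str  # sentence as written
    rewrite: str   # corrected sentence


def to_seq2seq_pair(example: LabeledExample) -> dict:
    """Turn one labeled example into an input/target pair for a text-to-text model."""
    if example.rule not in RULES:
        raise ValueError(f"unknown rule: {example.rule}")
    # Assumed prompt layout: the rule name acts as a task prefix before the sentence.
    return {
        "input": f"fix {example.rule}: {example.original}",
        "target": example.rewrite,
    }


def to_jsonl(examples) -> str:
    """Serialize examples as JSON Lines, one training pair per line."""
    return "\n".join(json.dumps(to_seq2seq_pair(e)) for e in examples)


examples = [
    LabeledExample(
        rule="passive_voice",
        original="The file is saved by the system.",
        rewrite="The system saves the file.",
    ),
]
print(to_jsonl(examples))
```

Keeping the rule name in the prompt is one plausible way to let a single model handle several rules; rejecting unknown rule names early helps keep a hand-curated dataset consistent.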


Target Audience

Beginner

Gaurav Trivedi is a Principal Technical Writer specializing in developer-focused content, API documentation, and AI-assisted editorial tools. He is currently building AI-powered systems that enforce style guide rules through fine-tuned language models. Gaurav is passionate about simplifying complex ideas and frequently speaks about the intersection of technical writing, AI, and automation.