2025-09-15 – Track 1
Ragas is a comprehensive evaluation toolkit for Large Language Model (LLM) applications. It provides objective metrics, automated test generation, and data-driven insights that help developers replace time-consuming, subjective assessments with efficient, quantifiable evaluation workflows.
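As a taste of what that looks like in practice, here is a minimal evaluation sketch. It assumes the classic evaluate() API over a Hugging Face Dataset with v0.1-style column names (newer releases also accept an EvaluationDataset), and an evaluator LLM configured, e.g. the default OpenAI backend:

```python
# Minimal Ragas evaluation sketch (assumes v0.1-style imports and an
# LLM backend configured, e.g. OPENAI_API_KEY set in the environment).
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# Toy data: one question, the generated answer, and the retrieved contexts.
data = {
    "question": ["When was the Eiffel Tower completed?"],
    "answer": ["The Eiffel Tower was completed in 1889."],
    "contexts": [["The Eiffel Tower, completed in 1889, stands in Paris."]],
}

result = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(result)  # dict-like summary of per-metric scores
```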
Setup
- Environment Setup:
  git clone https://github.com/explodinggradients/ragas.git
  cd ragas
- Install uv (the recommended package manager):
  curl -LsSf https://astral.sh/uv/install.sh | sh
- Installation:
  make install
- Run the tests to make sure everything works:
  make test
- Explore the available commands:
  make help
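A quick way to confirm the install resolved to your local checkout (a small sketch; it assumes the package exposes __version__):

```python
# Smoke check after `make install`: the import should succeed and the
# version should print without errors.
import ragas
print(ragas.__version__)
```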
Important Links
- GitHub Repository: fork and clone the source
- Documentation: read through it in depth
- Issues & Bugs: pick something to work on
- Discord Community
- Blog
Development Workflow
- Daily Development:
  make format
  make type
  make test
- Before PR Submission:
  make run-ci
- Code Quality: automated formatting with ruff, type checking with pyright
  make check
- Testing: comprehensive unit tests, e2e tests, and benchmarks (see the sketch after this list)
  make test
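To give a feel for what `make test` exercises, here is the shape of a pytest-style unit test; the helper and test below are hypothetical, not taken from the repository:

```python
# Hypothetical unit test in the style `make test` runs via pytest.
import pytest

def normalize(scores):
    # Hypothetical helper: rescale raw metric scores so they sum to 1.
    total = sum(scores)
    return [s / total for s in scores]

def test_normalize_sums_to_one():
    assert sum(normalize([1.0, 3.0])) == pytest.approx(1.0)
```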
Key Development Areas
- Core Metrics: Evaluation metrics for LLM applications (see the custom-metric sketch after this list)
- Test Generation: Automated test data creation
- Integrations: Framework connectors and observability tools
- Documentation: Tutorials, guides, and examples
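For example, contributions in the Core Metrics area can start from the metric abstractions the library already ships. The sketch below assumes the AspectCritic metric from v0.2-style releases; the aspect definition and the evaluator-LLM wiring are illustrative:

```python
# Sketch of defining a simple LLM-judged metric with AspectCritic
# (assumes the v0.2-style API; wire up your own evaluator LLM before scoring).
from ragas.metrics import AspectCritic

conciseness = AspectCritic(
    name="conciseness",
    definition="Does the answer address the question directly, without filler?",
    # llm=evaluator_llm,  # an LLM wrapper is required before scoring
)
```

Passing such a metric to evaluate() alongside the built-ins is the usual way to try a new metric end to end.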
Prerequisites
- Python 3.9+ installed on your system
- Git for version control
- Basic Python knowledge and familiarity with virtual environments
- Understanding of LLM/ML concepts (helpful but not required)
- Optional: Experience with evaluation frameworks, testing, or LLM applications
Level: Intermediate
Hosts: Product and Tech consultant; maintainer from Ragas