
✅ Gemini API Configuration - COMPLETE GUIDE

🎯 Current Status

Google Generative AI SDK: Version 0.8.5 installed
Model Updated: Now using gemini-2.5-flash (stable)
⚠️ API Quota: Currently at limit (the most recent 429 response asked for a ~20-second wait before retrying)


📋 Available Gemini Models (40+ models!)

Your API key has access to these models:

Recommended Models for Legal RAG:

  1. gemini-2.5-flash [CURRENTLY CONFIGURED]

    • Stable, fast, and efficient
    • Best for production use
    • Good balance of speed and quality
  2. gemini-2.5-pro

    • More powerful reasoning
    • Better for complex legal analysis
    • Slower but higher quality
  3. gemini-flash-latest

    • Alias that always points to the latest Flash version
    • Updates automatically, so output can change without a code change
  4. gemini-2.0-flash

    • Alternative stable version
    • Slightly older but reliable

All Available Models:

models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash                    ⭐ Currently configured
models/gemini-2.5-flash-lite
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-lite
models/gemini-flash-latest
models/gemini-flash-lite-latest
models/gemini-pro-latest
... and 30+ more variants
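
A list like the one above can be produced with the SDK's `genai.list_models()` call. The sketch below is one possible way to do it and is not necessarily what check_gemini_models.py does. The helper names `generation_model_names` and `list_available_models` are hypothetical, and the sample objects are stand-ins rather than real API output.

```python
from types import SimpleNamespace


def generation_model_names(models):
    """Return sorted names of models that support generateContent."""
    return sorted(
        m.name
        for m in models
        if "generateContent" in getattr(m, "supported_generation_methods", [])
    )


def list_available_models(api_key):
    """Query the API for models usable for text generation (needs network access)."""
    # Imported lazily so the filter above can be used without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    return generation_model_names(genai.list_models())


# Offline demonstration with stand-in objects (not real API output).
sample = [
    SimpleNamespace(
        name="models/gemini-2.5-flash",
        supported_generation_methods=["generateContent", "countTokens"],
    ),
    SimpleNamespace(
        name="models/example-embedding",  # hypothetical embedding-only model
        supported_generation_methods=["embedContent"],
    ),
]
print(generation_model_names(sample))
```

Filtering on `generateContent` keeps embedding-only models out of the list, since those cannot be passed to `genai.GenerativeModel` for text generation.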

⚙️ How to Change the Model

Edit /home/neginegi/Desktop/rag/legal-rag-backend/rag_service.py:

geminiModel = genai.GenerativeModel("gemini-2.5-flash")  # Change here

Options:

  • "gemini-2.5-flash" - Fast and efficient (current)
  • "gemini-2.5-pro" - More powerful reasoning
  • "gemini-flash-latest" - Always latest version
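
If editing rag_service.py for every model switch becomes tedious, the model name could instead be read from an environment variable. This is a minimal sketch and not part of the current setup: the variable name `GEMINI_MODEL` and the helpers `resolve_model_name` and `build_model` are assumptions made for illustration.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash"
# The three options listed above; extend as needed.
ALLOWED_MODELS = {"gemini-2.5-flash", "gemini-2.5-pro", "gemini-flash-latest"}


def resolve_model_name(env=None):
    """Pick the model from GEMINI_MODEL (hypothetical variable), else the default."""
    env = os.environ if env is None else env
    name = env.get("GEMINI_MODEL", "").strip()
    if not name:
        return DEFAULT_MODEL
    if name not in ALLOWED_MODELS:
        raise ValueError(
            f"Unsupported model {name!r}; expected one of {sorted(ALLOWED_MODELS)}"
        )
    return name


def build_model(env=None):
    """Create the GenerativeModel for the resolved name."""
    # Lazy import: the SDK is only needed when a model object is created.
    import google.generativeai as genai

    return genai.GenerativeModel(resolve_model_name(env))
```

With this in place, `geminiModel = build_model()` would replace the hard-coded line, and `GEMINI_MODEL=gemini-2.5-pro python3 test_inference.py` would switch models for a single run.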

🔑 API Quota Information

Your current error shows:

429 You exceeded your current quota
Please retry in 20.181832555s
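
Because the error message states how long to wait, a caller can retry automatically instead of waiting by hand. The sketch below parses the delay from the message text shown above and retries a callable. It matches on the message string rather than a specific SDK exception class, which is an assumption. `parse_retry_delay` and `call_with_retry` are hypothetical helper names.

```python
import re
import time

_RETRY_RE = re.compile(r"retry in ([0-9]+(?:\.[0-9]+)?)s", re.IGNORECASE)


def parse_retry_delay(message, default=20.0):
    """Extract the suggested wait in seconds from a 429 error message."""
    match = _RETRY_RE.search(message)
    return float(match.group(1)) if match else default


def call_with_retry(fn, max_attempts=3, sleep=time.sleep):
    """Call fn(); on a quota error, wait for the suggested delay and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:  # the SDK's exact exception type is not assumed
            text = str(exc)
            # Re-raise anything that is not a quota error, or the final failure.
            if "429" not in text or attempt == max_attempts:
                raise
            sleep(parse_retry_delay(text))
```

A call such as `call_with_retry(lambda: geminiModel.generate_content(prompt))` would then wait out the quota window before trying again, where `prompt` stands for whatever text the pipeline sends.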

Free Tier Limits (these vary by model and change over time; confirm against the rate-limit page below):

  • ✓ 15 requests per minute
  • ✓ 1500 requests per day
  • ✓ 1M tokens per day (input)
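
One way to stay under a per-minute request limit is to throttle on the client side. The sketch below is a simple sliding-window limiter that uses the 15 requests per minute figure above as its default. The class name `RequestThrottle` is hypothetical, and it does not track the daily or token limits.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window limiter: at most max_calls per period seconds."""

    def __init__(self, max_calls=15, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of recent requests, oldest first

    def _evict(self, now):
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()

    def wait(self):
        """Block until another request fits in the window, then record it."""
        now = self._clock()
        self._evict(now)
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest request falls out of the window.
            self._sleep(self.period - (now - self._stamps[0]))
            now = self._clock()
            self._evict(now)
        self._stamps.append(now)
```

Calling `throttle.wait()` immediately before each Gemini request keeps the call rate within the window. The `clock` and `sleep` parameters exist so the limiter can be tested without real delays.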

To Monitor Usage:

Visit: https://ai.dev/usage?tab=rate-limit

To Increase Limits:

Visit: https://ai.google.dev/pricing


✅ Updated Configuration

Your rag_service.py is now configured with:

geminiModel = genai.GenerativeModel("gemini-2.5-flash")

This should work once your quota resets (wait ~20 seconds).


🧪 Testing Gemini Integration

Run this to test:

cd /home/neginegi/Desktop/rag/legal-rag-backend
python3 check_gemini_models.py

Or test the full pipeline:

python3 test_inference.py

🎉 Summary

SDK upgraded: google-generativeai 0.8.5
Model updated: gemini-2.5-flash (stable)
All 40+ models discovered: Access confirmed
Quota limit reached: Wait ~20 seconds and retry

Your Legal RAG backend is fully configured and ready!

Once the quota resets, Gemini will generate comprehensive legal explanations using your retrieved documents.