✅ Gemini API Configuration - COMPLETE GUIDE
🎯 Current Status
✅ Google Generative AI SDK: Version 0.8.5 installed
✅ Model Updated: Now using gemini-2.5-flash (stable)
⚠️ API Quota: Currently at limit (the last 429 response asked for a wait of about 20 seconds before retrying)
📋 Available Gemini Models (40+ models!)
Your API key has access to these models:
Recommended Models for Legal RAG:
gemini-2.5-flash ⭐ [CURRENTLY CONFIGURED]
- Stable, fast, and efficient
- Best for production use
- Good balance of speed and quality
gemini-2.5-pro
- More powerful reasoning
- Better for complex legal analysis
- Slower but higher quality
gemini-flash-latest
- Always points to the latest Flash version
- Auto-updates to the newest model
gemini-2.0-flash
- Alternative stable version
- Slightly older but reliable
All Available Models:
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash ⭐ Currently configured
models/gemini-2.5-flash-lite
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-lite
models/gemini-flash-latest
models/gemini-flash-lite-latest
models/gemini-pro-latest
... and 30+ more variants
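A list like the one above can be reproduced from the SDK. The following is a minimal sketch, not the contents of check_gemini_models.py: it assumes the key is stored in an environment variable named GEMINI_API_KEY (a hypothetical name) and keeps only models that advertise the generateContent method. The sample entries are stand-ins so the filter can be tried offline.

```python
import os
from types import SimpleNamespace


def generation_model_names(models):
    """Return the sorted names of models that support generateContent."""
    return sorted(
        m.name
        for m in models
        if "generateContent" in getattr(m, "supported_generation_methods", [])
    )


def list_available_models():
    """Ask the API which text-generation models this key can see."""
    # Imported lazily so the filter above works without the SDK installed.
    import google.generativeai as genai

    # GEMINI_API_KEY is an assumed variable name; use whatever your project sets.
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    return generation_model_names(genai.list_models())


# Offline demonstration with stand-in objects shaped like the SDK's entries.
sample = [
    SimpleNamespace(
        name="models/gemini-2.5-flash",
        supported_generation_methods=["generateContent", "countTokens"],
    ),
    SimpleNamespace(
        name="models/embedding-001",
        supported_generation_methods=["embedContent"],
    ),
]
print(generation_model_names(sample))
```

Calling list_available_models() needs network access and a valid key, and each call counts against the quota described below.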
⚙️ How to Change the Model
Edit /home/neginegi/Desktop/rag/legal-rag-backend/rag_service.py:
geminiModel = genai.GenerativeModel("gemini-2.5-flash") # Change here
Options:
"gemini-2.5-flash" - Fast and efficient (current)
"gemini-2.5-pro" - More powerful reasoning
"gemini-flash-latest" - Always latest version
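If editing the source file for every switch becomes tedious, the model name could be read from the environment instead. This is a sketch under assumptions: GEMINI_MODEL is a hypothetical variable name, and resolve_model_name is an illustrative helper that does not exist in rag_service.py.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash"


def resolve_model_name(env=None):
    """Pick the model from GEMINI_MODEL, falling back to the default."""
    env = os.environ if env is None else env
    name = env.get("GEMINI_MODEL", "").strip()
    if not name:
        return DEFAULT_MODEL
    # Accept the "models/..." form shown in the full model list as well.
    prefix = "models/"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return name


print(resolve_model_name({"GEMINI_MODEL": "models/gemini-2.5-pro"}))
```

With this in place, the line in rag_service.py could become `geminiModel = genai.GenerativeModel(resolve_model_name())`, and a run with `GEMINI_MODEL=gemini-2.5-pro` set would switch models without a code change.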
🔑 API Quota Information
Your current error shows:
429 You exceeded your current quota
Please retry in 20.181832555s
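Since the error message states how long to wait, a caller can honor that delay instead of failing. The sketch below parses the "retry in ...s" hint and retries a few times. It matches on the message text so that it runs without the SDK. That is an assumption; in a real service it is probably better to catch the SDK's quota exception class (usually `google.api_core.exceptions.ResourceExhausted`). The names parse_retry_delay and call_with_retry are illustrative.

```python
import re
import time

_RETRY_RE = re.compile(r"retry in ([0-9]+(?:\.[0-9]+)?)s", re.IGNORECASE)


def parse_retry_delay(message, default=20.0):
    """Extract the suggested wait in seconds from a 429 message."""
    match = _RETRY_RE.search(message)
    return float(match.group(1)) if match else default


def call_with_retry(fn, max_attempts=3, sleep=time.sleep):
    """Call fn(), waiting and retrying when a quota error is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            text = str(exc)
            # Re-raise anything that is not a quota error, or the final failure.
            if "429" not in text or attempt == max_attempts:
                raise
            # Add one second of slack on top of the server's hint.
            sleep(parse_retry_delay(text) + 1.0)


print(parse_retry_delay("429 You exceeded your current quota\nPlease retry in 20.181832555s"))
```

A generation call could then be wrapped as `call_with_retry(lambda: geminiModel.generate_content(prompt))`.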
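Free Tier Limits (these vary by model and change over time; check the usage page below for the values that apply to your key):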
- ✓ 15 requests per minute
- ✓ 1500 requests per day
- ✓ 1M tokens per day (input)
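A limit of 15 requests per minute works out to one call every 4 seconds, so spacing calls on the client side avoids most 429 errors. The class below is a minimal sketch of that idea. MinIntervalLimiter is a hypothetical helper, and the default of 15 is taken from the figure above, so adjust it if your quota differs.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between calls to stay under a per-minute cap."""

    def __init__(self, requests_per_minute=15, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


limiter = MinIntervalLimiter()
print(limiter.min_interval)
```

Calling `limiter.wait()` immediately before each request is enough for a single-process backend. It does not coordinate across several workers and does not track the daily limits.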
To Monitor Usage:
Visit: https://ai.dev/usage?tab=rate-limit
To Increase Limits:
Visit: https://ai.google.dev/pricing
✅ Updated Configuration
Your rag_service.py is now configured with:
geminiModel = genai.GenerativeModel("gemini-2.5-flash")
This should work once your quota resets (wait ~20 seconds).
🧪 Testing Gemini Integration
Run this to test:
cd /home/neginegi/Desktop/rag/legal-rag-backend
python3 check_gemini_models.py
Or test the full pipeline:
python3 test_inference.py
🎉 Summary
✅ SDK upgraded: google-generativeai 0.8.5
✅ Model updated: gemini-2.5-flash (stable)
✅ All 40+ models discovered: Access confirmed
⏳ Quota limit reached: Wait ~20 seconds and retry
Your Legal RAG backend is fully configured and ready!
Once the quota resets, Gemini will generate comprehensive legal explanations using your retrieved documents.
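For illustration, the step that turns retrieved documents into an explanation might look like the sketch below. It is not the code in rag_service.py: build_prompt and explain are hypothetical names, the prompt wording is an assumption, and the only SDK call relied on is `generate_content(...)` with its `.text` result.

```python
def build_prompt(question, documents):
    """Combine retrieved passages and the question into a single prompt."""
    numbered = "\n\n".join(
        f"[{i}] {doc.strip()}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the legal question using only the numbered sources below. "
        "Cite sources by number.\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question.strip()}"
    )


def explain(question, documents, model):
    """Send the assembled prompt to a model object and return its text."""
    # `model` is expected to behave like genai.GenerativeModel.
    return model.generate_content(build_prompt(question, documents)).text


print(build_prompt("What applies?", ["First passage.", "Second passage."]))
```

Numbering the sources makes it easier to check which retrieved passage the generated explanation relies on.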