---
title: LegalRagBackend
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_file: main.py
---

Legal RAG Backend

A FastAPI-based backend for a Legal RAG (Retrieval-Augmented Generation) system that returns legal verdict predictions together with explanations grounded in constitutional provisions, IPC sections, case law, and statutes.

Overview

This system combines:

  • LegalBERT fine-tuned model for verdict prediction
  • FAISS vector search for legal document retrieval
  • Sentence Transformers (BGE-Large) for semantic embeddings
  • Google Gemini (optional) for generating detailed explanations
  • HuggingFace Hub for model and dataset management

Project Structure

legal-rag-backend/
│
├── main.py              # FastAPI application with REST endpoints
├── model_loader.py      # LegalBERT model loading and inference
├── rag_loader.py        # FAISS indices and chunk loading from HuggingFace
├── rag_service.py       # Core service orchestrating prediction and RAG
├── prompt_builder.py    # Constructs prompts for LLM with legal context
├── utils.py             # Helper utilities for chunk processing
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
├── .gitignore          # Git ignore patterns
├── README.md           # This file
└── start.sh            # Launch script

Features

API Endpoints

GET /health

Health check endpoint.

Response:

{
  "status": "ok"
}

POST /predict

Get a quick verdict prediction with confidence score.

Request:

{
  "text": "Case description and facts..."
}

Response:

{
  "verdict": "guilty",
  "confidence": 0.8734
}
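
For example, a minimal client call (assuming the server is running locally on port 7860, as in the setup instructions below) could look like this:

```python
# Minimal sketch of a /predict client; assumes the server runs on localhost:7860.
import requests

resp = requests.post(
    "http://localhost:7860/predict",
    json={"text": "Case description and facts..."},
    timeout=30,
)
resp.raise_for_status()
result = resp.json()
print(result["verdict"], result["confidence"])  # e.g. guilty 0.8734
```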

POST /explain

Get comprehensive legal analysis with retrieved supporting documents.

Request:

{
  "text": "Case description and facts..."
}

Response:

{
  "verdict": "guilty",
  "confidence": 0.8734,
  "explanation": "Detailed legal analysis...",
  "retrievedChunks": {
    "constitution": [...],
    "ipc": [...],
    "ipcCase": [...],
    "statute": [...],
    "qa": [...],
    "case": [...]
  }
}
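
A sketch of a client that consumes this response, assuming the server runs on localhost:7860 and the category keys shown above:

```python
# Sketch of an /explain client that summarizes the retrieved chunks by category.
import requests

resp = requests.post(
    "http://localhost:7860/explain",
    json={"text": "Case description and facts..."},
    timeout=60,
)
resp.raise_for_status()
data = resp.json()

print(f"Verdict: {data['verdict']} (confidence {data['confidence']:.2f})")
print(data["explanation"][:500])
for category, chunks in data["retrievedChunks"].items():
    print(f"{category}: {len(chunks)} supporting chunk(s)")
```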

Installation & Setup

Local Development

  1. Clone or create the project:

    cd legal-rag-backend
    
  2. Create a virtual environment:

    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Configure environment (optional): Create a .env file for Gemini API integration (a loading sketch follows these steps):

    GEMINI_API_KEY=your_api_key_here
    
  5. Run the server:

    chmod +x start.sh
    ./start.sh
    

    Or directly:

    uvicorn main:app --host 0.0.0.0 --port 7860
    
  6. Access the API at http://localhost:7860 (interactive docs at http://localhost:7860/docs).
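
How the Gemini key from step 4 is read is an implementation detail of main.py; a minimal sketch of one way to load it, assuming python-dotenv is installed, looks like this (variable names are illustrative):

```python
# Hypothetical sketch of how GEMINI_API_KEY could be read from .env;
# the actual loading code in main.py may differ.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory, if present
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")  # None -> fall back to the structured prompt
```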

Docker Deployment

  1. Build the image:

    docker build -t legal-rag-backend .
    
  2. Run the container:

    docker run -p 7860:7860 -e GEMINI_API_KEY=your_key legal-rag-backend
    
  3. Access at: http://localhost:7860

How It Works

  1. Model Loading: On startup, the system loads:

    • LegalBERT model (negi2725/LegalBertNew)
    • 6 FAISS indices (Constitution, IPC, IPC Cases, Statutes, QA, Cases)
    • Corresponding text chunks
    • BGE-Large sentence transformer for embeddings
  2. Prediction Flow:

    • Input text is tokenized and passed through LegalBERT
    • Softmax applied to get "guilty" or "not guilty" verdict
    • Confidence score extracted from probabilities
  3. RAG Retrieval:

    • Query text embedded using BGE-Large
    • Top-K similar chunks retrieved from each FAISS index
    • Results organized by legal category (see the retrieval sketch after this list)
  4. Explanation Generation:

    • Structured prompt built with case facts, verdict, and retrieved context
    • Optional Gemini API call for natural language explanation
    • Fallback to prompt template if API not configured
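
A minimal sketch of the retrieval step (step 3), assuming L2-normalized BGE-Large embeddings and one FAISS index plus chunk list per category; the function and variable names are illustrative, not the actual ones in rag_loader.py or rag_service.py:

```python
# Illustrative retrieval sketch: embed the query with BGE-Large and search one
# FAISS index per legal category. All names here are hypothetical.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-large-en-v1.5")

def retrieve(query: str, index: faiss.Index, chunks: list[str], top_k: int = 5) -> list[str]:
    vec = embedder.encode([query], normalize_embeddings=True)  # shape (1, 1024)
    scores, ids = index.search(np.asarray(vec, dtype="float32"), top_k)
    return [chunks[i] for i in ids[0] if i != -1]

# e.g. results = {name: retrieve(case_text, idx, chunks)
#                 for name, (idx, chunks) in category_indices.items()}
```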

Models & Datasets

  • LegalBERT Model: negi2725/LegalBertNew (HuggingFace)
  • RAG Dataset: negi2725/dataRag (HuggingFace)
  • Embedding Model: BAAI/bge-large-en-v1.5 (Sentence Transformers)
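
As an illustration, the verdict model can be loaded and queried roughly as follows; this is a sketch only, and the label order and inference details in model_loader.py may differ:

```python
# Sketch of verdict prediction with the fine-tuned LegalBERT checkpoint.
# The label order ("not guilty" vs "guilty") is an assumption, not confirmed.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("negi2725/LegalBertNew")
model = AutoModelForSequenceClassification.from_pretrained("negi2725/LegalBertNew")
model.eval()

def predict_verdict(text: str) -> tuple[str, float]:
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    labels = ["not guilty", "guilty"]  # assumed index-to-label mapping
    idx = int(probs.argmax())
    return labels[idx], float(probs[idx])
```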

Performance Notes

  • All models and indices are preloaded at import time for fast inference
  • Async endpoints ensure non-blocking I/O operations
  • FAISS similarity search runs over L2-normalized embeddings for efficient matching
  • Typical response time: 1-3 seconds for /explain endpoint

Requirements

  • Python 3.10+
  • 4GB+ RAM (8GB+ recommended for smooth operation)
  • Internet connection for first-time model/dataset downloads

Troubleshooting

Models not downloading:

  • Ensure internet connectivity
  • Check HuggingFace Hub access
  • Models cache in ~/.cache/huggingface/

Out of memory:

  • Reduce batch size or top-K retrieval count
  • Use CPU-only torch installation
  • Consider using smaller embedding models

Gemini API errors:

  • Verify API key in .env file
  • System works without Gemini (returns structured prompt)
  • Check API quota and rate limits

Development

The codebase follows these conventions:

  • camelCase for variable names
  • Minimal inline comments (self-documenting code)
  • Async/await for all FastAPI endpoints
  • Type hints for function signatures
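
For illustration, an endpoint following these conventions might look like the sketch below; the actual signatures and helper names in main.py may differ:

```python
# Illustrative endpoint sketch; actual signatures and helpers in main.py may differ.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class CaseInput(BaseModel):
    text: str

def run_model(text: str) -> tuple[str, float]:
    # Placeholder for the real LegalBERT inference in model_loader.py.
    return "guilty", 0.87

@app.post("/predict")
async def predict(payload: CaseInput) -> dict:
    verdict, confidence = run_model(payload.text)
    return {"verdict": verdict, "confidence": confidence}
```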

License

This project is for educational and research purposes.

Support

For issues or questions, please refer to the HuggingFace model (negi2725/LegalBertNew) and dataset (negi2725/dataRag) pages.