---
title: LegalRagBackend
emoji: "⚖️"
colorFrom: "blue"
colorTo: "purple"
sdk: "docker"
pinned: false
app_file: main.py
---

# Legal RAG Backend

A FastAPI-based backend for a legal RAG (Retrieval-Augmented Generation) system that predicts legal verdicts and generates explanations grounded in constitutional provisions, IPC sections, case law, and statutes.

## Overview

This system combines:
- **LegalBERT** fine-tuned model for verdict prediction
- **FAISS** vector search for legal document retrieval
- **Sentence Transformers** (BGE-Large) for semantic embeddings
- **Google Gemini** (optional) for generating detailed explanations
- **HuggingFace Hub** for model and dataset management

## Project Structure

```
legal-rag-backend/
├── main.py              # FastAPI application with REST endpoints
├── model_loader.py      # LegalBERT model loading and inference
├── rag_loader.py        # FAISS indices and chunk loading from HuggingFace
├── rag_service.py       # Core service orchestrating prediction and RAG
├── prompt_builder.py    # Constructs prompts for LLM with legal context
├── utils.py             # Helper utilities for chunk processing
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
├── .gitignore           # Git ignore patterns
├── README.md            # This file
└── start.sh             # Launch script
```

## Features

### API Endpoints

#### `GET /health`
Health check endpoint.

**Response:**
```json
{
  "status": "ok"
}
```

#### `POST /predict`
Get a quick verdict prediction with confidence score.

**Request:**
```json
{
  "text": "Case description and facts..."
}
```

**Response:**
```json
{
  "verdict": "guilty",
  "confidence": 0.8734
}
```

#### `POST /explain`
Get comprehensive legal analysis with retrieved supporting documents.

**Request:**
```json
{
  "text": "Case description and facts..."
}
```

**Response:**
```json
{
  "verdict": "guilty",
  "confidence": 0.8734,
  "explanation": "Detailed legal analysis...",
  "retrievedChunks": {
    "constitution": [...],
    "ipc": [...],
    "ipcCase": [...],
    "statute": [...],
    "qa": [...],
    "case": [...]
  }
}
```
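
As a quick smoke test, the endpoints can be exercised with a short Python client. This is a minimal sketch assuming the server is running locally on port 7860 and the `requests` package is installed:

```python
import requests

BASE = "http://localhost:7860"
case = {"text": "Case description and facts..."}

# Quick verdict with confidence score
pred = requests.post(f"{BASE}/predict", json=case, timeout=30).json()
print(pred["verdict"], pred["confidence"])

# Full analysis with retrieved supporting documents
full = requests.post(f"{BASE}/explain", json=case, timeout=120).json()
print(full["explanation"])
for category, chunks in full["retrievedChunks"].items():
    print(f"{category}: {len(chunks)} chunks")
```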

## Installation & Setup

### Local Development

1. **Enter the project directory:**
   ```bash
   cd legal-rag-backend
   ```

2. **Create a virtual environment:**
   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

4. **Configure environment (optional):**
   Create a `.env` file for Gemini API integration:
   ```bash
   GEMINI_API_KEY=your_api_key_here
   ```

5. **Run the server:**
   ```bash
   chmod +x start.sh
   ./start.sh
   ```
   
   Or directly:
   ```bash
   uvicorn main:app --host 0.0.0.0 --port 7860
   ```

6. **Access the API:**
   - API Documentation: http://localhost:7860/docs
   - Health Check: http://localhost:7860/health

### Docker Deployment

1. **Build the image:**
   ```bash
   docker build -t legal-rag-backend .
   ```

2. **Run the container:**
   ```bash
   docker run -p 7860:7860 -e GEMINI_API_KEY=your_key legal-rag-backend
   ```

3. **Access at:**
   http://localhost:7860

## How It Works

1. **Model Loading**: On startup, the system loads:
   - LegalBERT model (`negi2725/LegalBertNew`)
   - 6 FAISS indices, one per `retrievedChunks` category (constitution, IPC, IPC case law, case law, statutes, QA)
   - Corresponding text chunks
   - BGE-Large sentence transformer for embeddings

2. **Prediction Flow** (sketched below):
   - Input text is tokenized and passed through LegalBERT
   - Softmax over the logits yields class probabilities; the higher-probability class becomes the "guilty" or "not guilty" verdict
   - The confidence score is that class's probability

3. **RAG Retrieval** (sketched below):
   - Query text embedded using BGE-Large
   - Top-K similar chunks retrieved from each FAISS index
   - Results organized by legal category

4. **Explanation Generation** (sketched below):
   - Structured prompt built with case facts, verdict, and retrieved context
   - Optional Gemini API call for natural language explanation
   - Fallback to the structured prompt itself if the API is not configured
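
For concreteness, step 2 can be sketched as follows. This is a minimal illustration, not the repository's exact code; in particular, the label order is an assumption, since the real mapping lives in the model config's `id2label`:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["not guilty", "guilty"]  # assumed order; check the model's id2label

tokenizer = AutoTokenizer.from_pretrained("negi2725/LegalBertNew")
model = AutoModelForSequenceClassification.from_pretrained("negi2725/LegalBertNew")
model.eval()

def predict(text: str) -> dict:
    # Tokenize, truncating to the model's maximum input length
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    idx = int(torch.argmax(probs))
    return {"verdict": LABELS[idx], "confidence": round(float(probs[idx]), 4)}
```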
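
Step 3 reduces to one embedding call plus one FAISS lookup per index. In this sketch the index and chunk list are assumed to be loaded already (that is `rag_loader.py`'s job); normalizing the query embedding makes the search equivalent to cosine-similarity ranking:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-large-en-v1.5")

def retrieve(query: str, index: faiss.Index, chunks: list[str], top_k: int = 5) -> list[str]:
    # Embed and L2-normalize the query so the search matches cosine ranking
    vec = embedder.encode([query], normalize_embeddings=True).astype(np.float32)
    _, ids = index.search(vec, top_k)
    return [chunks[i] for i in ids[0] if i != -1]
```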
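
And step 4's optional Gemini call with its fallback might look like this; the model name is illustrative, not necessarily the one the service uses:

```python
import os
import google.generativeai as genai

def explain(prompt: str) -> str:
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        # Fallback: no API key configured, so return the structured prompt itself
        return prompt
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model choice
    return model.generate_content(prompt).text
```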

## Models & Datasets

- **LegalBERT Model**: `negi2725/LegalBertNew` (HuggingFace)
- **RAG Dataset**: `negi2725/dataRag` (HuggingFace)
- **Embedding Model**: `BAAI/bge-large-en-v1.5` (Sentence Transformers)
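
All three download automatically on first use and cache under `~/.cache/huggingface/`. To pre-fetch them (for example, inside a Docker build), something like the following works; it is a sketch, since the service's own loading code lives in `model_loader.py` and `rag_loader.py`:

```python
from huggingface_hub import snapshot_download
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Verdict classifier
AutoTokenizer.from_pretrained("negi2725/LegalBertNew")
AutoModelForSequenceClassification.from_pretrained("negi2725/LegalBertNew")

# Embedding model
SentenceTransformer("BAAI/bge-large-en-v1.5")

# FAISS indices and text chunks (entire dataset repo)
snapshot_download(repo_id="negi2725/dataRag", repo_type="dataset")
```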

## Performance Notes

- All models and indices are preloaded at import time for fast inference
- Async endpoints ensure non-blocking I/O operations
- FAISS searches over L2-normalized embeddings, so distance ranking is equivalent to cosine similarity
- Typical response time: 1-3 seconds for `/explain` endpoint

## Requirements

- Python 3.10+
- 4GB+ RAM (8GB+ recommended for smooth operation)
- Internet connection for first-time model/dataset downloads

## Troubleshooting

**Models not downloading:**
- Ensure internet connectivity
- Check HuggingFace Hub access
- Models cache in `~/.cache/huggingface/`

**Out of memory:**
- Reduce batch size or top-K retrieval count
- Use CPU-only torch installation
- Consider using smaller embedding models

**Gemini API errors:**
- Verify API key in `.env` file
- System works without Gemini (returns structured prompt)
- Check API quota and rate limits

## Development

The codebase follows these conventions:
- camelCase for variable names (e.g. `retrievedChunks`)
- Minimal inline comments (self-documenting code)
- Async/await for all FastAPI endpoints
- Type hints for function signatures

## License

This project is for educational and research purposes.

## Support

For issues or questions, please refer to the HuggingFace model and dataset pages:
- https://huggingface.co/negi2725/LegalBertNew
- https://huggingface.co/datasets/negi2725/dataRag