---
title: LegalRagBackend
emoji: "⚖️"
colorFrom: "blue"
colorTo: "purple"
sdk: "docker"
pinned: false
app_file: main.py
---

# Legal RAG Backend

A FastAPI-based backend for a legal RAG (Retrieval-Augmented Generation) system that predicts legal verdicts and generates explanations grounded in constitutional provisions, IPC sections, case law, and statutes.

## Overview

This system combines:
- **LegalBERT** fine-tuned model for verdict prediction
- **FAISS** vector search for legal document retrieval
- **Sentence Transformers** (BGE-Large) for semantic embeddings
- **Google Gemini** (optional) for generating detailed explanations
- **HuggingFace Hub** for model and dataset management

## Project Structure

```
legal-rag-backend/
├── main.py              # FastAPI application with REST endpoints
├── model_loader.py      # LegalBERT model loading and inference
├── rag_loader.py        # FAISS indices and chunk loading from HuggingFace
├── rag_service.py       # Core service orchestrating prediction and RAG
├── prompt_builder.py    # Constructs prompts for LLM with legal context
├── utils.py             # Helper utilities for chunk processing
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
├── .gitignore           # Git ignore patterns
├── README.md            # This file
└── start.sh             # Launch script
```

## Features

### API Endpoints

#### `GET /health`
Health check endpoint.

**Response:**
```json
{
  "status": "ok"
}
```

#### `POST /predict`
Get a quick verdict prediction with confidence score.

**Request:**
```json
{
  "text": "Case description and facts..."
}
```

**Response:**
```json
{
  "verdict": "guilty",
  "confidence": 0.8734
}
```

#### `POST /explain`
Get comprehensive legal analysis with retrieved supporting documents.

**Request:**
```json
{
  "text": "Case description and facts..."
}
```

**Response:**
```json
{
  "verdict": "guilty",
  "confidence": 0.8734,
  "explanation": "Detailed legal analysis...",
  "retrievedChunks": {
    "constitution": [...],
    "ipc": [...],
    "ipcCase": [...],
    "statute": [...],
    "qa": [...],
    "case": [...]
  }
}
```
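
As a quick smoke test, the endpoints can be exercised with a short Python client. This is a minimal sketch assuming the server is running locally on port 7860 and the `requests` package is installed:

```python
import requests

BASE = "http://localhost:7860"
case = {"text": "Case description and facts..."}

# Quick verdict with confidence score
pred = requests.post(f"{BASE}/predict", json=case, timeout=30).json()
print(pred["verdict"], pred["confidence"])

# Full analysis with retrieved supporting documents
full = requests.post(f"{BASE}/explain", json=case, timeout=120).json()
print(full["explanation"])
for category, chunks in full["retrievedChunks"].items():
    print(f"{category}: {len(chunks)} chunks")
```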

## Installation & Setup

### Local Development

1. **Enter the project directory:**
   ```bash
   cd legal-rag-backend
   ```

2. **Create a virtual environment:**
   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

4. **Configure environment (optional):**
   Create a `.env` file for Gemini API integration:
   ```bash
   GEMINI_API_KEY=your_api_key_here
   ```

5. **Run the server:**
   ```bash
   chmod +x start.sh
   ./start.sh
   ```
   
   Or directly:
   ```bash
   uvicorn main:app --host 0.0.0.0 --port 7860
   ```

6. **Access the API:**
   - API Documentation: http://localhost:7860/docs
   - Health Check: http://localhost:7860/health

### Docker Deployment

1. **Build the image:**
   ```bash
   docker build -t legal-rag-backend .
   ```

2. **Run the container:**
   ```bash
   docker run -p 7860:7860 -e GEMINI_API_KEY=your_key legal-rag-backend
   ```

3. **Access at:**
   http://localhost:7860

## How It Works

1. **Model Loading**: On startup, the system loads:
   - LegalBERT model (`negi2725/LegalBertNew`)
   - 6 FAISS indices, one per `retrievedChunks` category (constitution, IPC, IPC case law, case law, statutes, QA)
   - Corresponding text chunks
   - BGE-Large sentence transformer for embeddings

2. **Prediction Flow** (sketched below):
   - Input text is tokenized and passed through LegalBERT
   - Softmax over the logits yields class probabilities; the higher-probability class becomes the "guilty" or "not guilty" verdict
   - The confidence score is that class's probability

3. **RAG Retrieval** (sketched below):
   - Query text embedded using BGE-Large
   - Top-K similar chunks retrieved from each FAISS index
   - Results organized by legal category

4. **Explanation Generation** (sketched below):
   - Structured prompt built with case facts, verdict, and retrieved context
   - Optional Gemini API call for natural language explanation
   - Fallback to the structured prompt itself if the API is not configured
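
For concreteness, step 2 can be sketched as follows. This is a minimal illustration, not the repository's exact code; in particular, the label order is an assumption, since the real mapping lives in the model config's `id2label`:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["not guilty", "guilty"]  # assumed order; check the model's id2label

tokenizer = AutoTokenizer.from_pretrained("negi2725/LegalBertNew")
model = AutoModelForSequenceClassification.from_pretrained("negi2725/LegalBertNew")
model.eval()

def predict(text: str) -> dict:
    # Tokenize, truncating to the model's maximum input length
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    idx = int(torch.argmax(probs))
    return {"verdict": LABELS[idx], "confidence": round(float(probs[idx]), 4)}
```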
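
Step 3 reduces to one embedding call plus one FAISS lookup per index. In this sketch the index and chunk list are assumed to be loaded already (that is `rag_loader.py`'s job); normalizing the query embedding makes the search equivalent to cosine-similarity ranking:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-large-en-v1.5")

def retrieve(query: str, index: faiss.Index, chunks: list[str], top_k: int = 5) -> list[str]:
    # Embed and L2-normalize the query so the search matches cosine ranking
    vec = embedder.encode([query], normalize_embeddings=True).astype(np.float32)
    _, ids = index.search(vec, top_k)
    return [chunks[i] for i in ids[0] if i != -1]
```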
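
And step 4's optional Gemini call with its fallback might look like this; the model name is illustrative, not necessarily the one the service uses:

```python
import os
import google.generativeai as genai

def explain(prompt: str) -> str:
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        # Fallback: no API key configured, so return the structured prompt itself
        return prompt
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model choice
    return model.generate_content(prompt).text
```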

## Models & Datasets

- **LegalBERT Model**: `negi2725/LegalBertNew` (HuggingFace)
- **RAG Dataset**: `negi2725/dataRag` (HuggingFace)
- **Embedding Model**: `BAAI/bge-large-en-v1.5` (Sentence Transformers)
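
All three download automatically on first use and cache under `~/.cache/huggingface/`. To pre-fetch them (for example, inside a Docker build), something like the following works; it is a sketch, since the service's own loading code lives in `model_loader.py` and `rag_loader.py`:

```python
from huggingface_hub import snapshot_download
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Verdict classifier
AutoTokenizer.from_pretrained("negi2725/LegalBertNew")
AutoModelForSequenceClassification.from_pretrained("negi2725/LegalBertNew")

# Embedding model
SentenceTransformer("BAAI/bge-large-en-v1.5")

# FAISS indices and text chunks (entire dataset repo)
snapshot_download(repo_id="negi2725/dataRag", repo_type="dataset")
```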

## Performance Notes

- All models and indices are preloaded at import time for fast inference
- Async endpoints ensure non-blocking I/O operations
- FAISS searches over L2-normalized embeddings, so distance ranking is equivalent to cosine similarity
- Typical response time: 1-3 seconds for `/explain` endpoint

## Requirements

- Python 3.10+
- 4GB+ RAM (8GB+ recommended for smooth operation)
- Internet connection for first-time model/dataset downloads

## Troubleshooting

**Models not downloading:**
- Ensure internet connectivity
- Check HuggingFace Hub access
- Models cache in `~/.cache/huggingface/`

**Out of memory:**
- Reduce batch size or top-K retrieval count
- Use CPU-only torch installation
- Consider using smaller embedding models

**Gemini API errors:**
- Verify API key in `.env` file
- System works without Gemini (returns structured prompt)
- Check API quota and rate limits

## Development

The codebase follows these conventions:
- camelCase for variable names (e.g. `retrievedChunks`)
- Minimal inline comments (self-documenting code)
- Async/await for all FastAPI endpoints
- Type hints for function signatures

## License

This project is for educational and research purposes.

## Support

For issues or questions, please refer to the HuggingFace model and dataset pages:
- https://huggingface.co/negi2725/LegalBertNew
- https://huggingface.co/datasets/negi2725/dataRag