Experimental Models
hasankursun/Lumees-3.8B-Reasoning • Text Generation • 4B • Updated Nov 23, 2025 • 5 • 2
Global Corpus
hasankursun/turkish-corpus-100b • Viewer • Updated about 12 hours ago • 107M • 1.21k • 7
hasankursun/multilingual-safety-classification-dataset • Viewer • Updated about 11 hours ago • 213k • 6.1k • 4
hasankursun/bulgarian-corpus-33b • Viewer • Updated about 11 hours ago • 34.9M • 271 • 4
hasankursun/dutch-corpus-200b • Viewer • Updated about 12 hours ago • 170M • 452 • 4
Turkish Retrieval Datasets
lumees/ms-marco-tr-hard-negatives • Viewer • Updated Nov 27, 2025 • 786k • 22 • 2
hasankursun/wikipedia-turkish-synthetic-query • Viewer • Updated Nov 28, 2025 • 19.8k • 10 • 3
Retrieval Models
hasankursun/matryoshka-embedding-v1 • Sentence Similarity • 0.6B • Updated about 12 hours ago • 224 • 3
lumees/lumees-matryoshka-vision-embedding-v1 • Feature Extraction • Updated Nov 26, 2025 • 4 • 3
lumees/aethel-reranker-en-v1 • Text Ranking • 0.1B • Updated Nov 20, 2025 • 6 • 3
Code Retrieval Datasets
hasankursun/codesearchnet-hard-negatives • Viewer • Updated Nov 28, 2025 • 955k • 40 • 3
Safety & Moderation Datasets
Comprehensive collection of high-quality multilingual datasets for NLP research and production.
hasankursun/multilingual-safety-classification-dataset • Viewer • Updated about 11 hours ago • 213k • 6.1k • 4
hasankursun/age-specific-text-simplification • Viewer • Updated Aug 13, 2025 • 17.2k • 48 • 2