metadata
title: LEMM Training Data & LoRA Storage
tags:
- music-generation
- audio
- lora
- training-data
- diffrhythm2
license: mit
LEMM Dataset Storage
This dataset repository stores training data and LoRA adapters for LEMM (Let Everyone Make Music), an advanced AI music generation system.
Purpose
This repository serves as persistent storage for:
- LoRA Adapters: Fine-tuned music generation models
- Prepared Datasets: Training data extracted from various music datasets
- Cross-rebuild Persistence: Data survives HuggingFace Space rebuilds
Repository Structure
lemm-dataset/
├── loras/                      # LoRA adapter storage
│   ├── {lora_name}/            # Each LoRA in its own folder
│   │   ├── final_model.pt      # Trained LoRA weights
│   │   └── config.yaml         # Training configuration
│   └── ...
│
└── datasets/                   # Prepared training datasets
    ├── {dataset_key}/          # Each dataset in its own folder
    │   ├── train/              # Training samples
    │   ├── val/                # Validation samples
    │   └── metadata.json       # Dataset metadata
    └── ...
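The layout above can be checked against a local copy of the repository. The following is a minimal sketch, assuming the repository has already been downloaded to a local directory; the function name `scan_storage` and the idea that an entry is "complete" when every item shown in the tree is present are illustrative assumptions, not part of LEMM itself.

```python
from pathlib import Path

# Items the tree above shows for each kind of entry.
LORA_ITEMS = ("final_model.pt", "config.yaml")
DATASET_ITEMS = ("train", "val", "metadata.json")


def scan_storage(root):
    """Map each LoRA/dataset folder to the expected items it is missing.

    An empty list means the entry matches the documented layout.
    (Hypothetical helper, not part of LEMM.)
    """
    root = Path(root)
    report = {"loras": {}, "datasets": {}}
    for kind, expected in (("loras", LORA_ITEMS), ("datasets", DATASET_ITEMS)):
        base = root / kind
        if not base.is_dir():
            continue  # nothing stored for this kind yet
        for entry in sorted(p for p in base.iterdir() if p.is_dir()):
            missing = [name for name in expected if not (entry / name).exists()]
            report[kind][entry.name] = missing
    return report
```

Calling `scan_storage("path/to/local/copy")` returns a dictionary with a `loras` and a `datasets` section, which makes incomplete uploads easy to spot.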
Automatic Sync
The LEMM Space automatically:
- Downloads all LoRAs and datasets on startup
- Uploads newly trained LoRAs after training completes
- Uploads newly prepared datasets after preparation
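The Space's own sync code is not shown in this README. As a rough sketch of how the three steps above could be implemented with `huggingface_hub`, assuming the folder layout documented earlier: the helper names `repo_path_for`, `download_all`, and `upload_entry` are hypothetical, while `snapshot_download` and `HfApi.upload_folder` are existing library calls. Uploading requires a token with write access to the repository.

```python
REPO_ID = "Gamahea/lemm-dataset"


def repo_path_for(kind, name):
    """Map an entry to its folder inside the repo (hypothetical helper)."""
    if kind not in ("loras", "datasets"):
        raise ValueError(f"unknown kind: {kind!r}")
    if not name or "/" in name:
        raise ValueError(f"invalid name: {name!r}")
    return f"{kind}/{name}"


def download_all(local_dir):
    """Startup step: pull every LoRA and dataset into local_dir."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=["loras/*", "datasets/*"],
        local_dir=local_dir,
    )


def upload_entry(kind, name, folder_path, token=None):
    """After training or preparation: push one folder to the repo."""
    from huggingface_hub import HfApi  # imported lazily

    api = HfApi(token=token)
    return api.upload_folder(
        folder_path=folder_path,
        path_in_repo=repo_path_for(kind, name),
        repo_id=REPO_ID,
        repo_type="dataset",
        commit_message=f"Add {kind}/{name}",
    )
```

For example, `upload_entry("loras", "your_lora_name", "/path/to/output")` would place the folder's contents under `loras/your_lora_name/` in the repository.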
Access Control
- Visibility: Public (anyone can view)
- Access Requests: Enabled with automatic approval
- Purpose: Allows LEMM Space to read/write data
Usage
From LEMM Space
Data syncs automatically - no manual intervention needed.
From Your Own Code
from huggingface_hub import snapshot_download

# Download a specific LoRA. snapshot_download returns the local snapshot root;
# the adapter files are under loras/your_lora_name/ inside that directory.
lora_path = snapshot_download(
    repo_id="Gamahea/lemm-dataset",
    repo_type="dataset",
    allow_patterns="loras/your_lora_name/*",
)

# Download all datasets (they land under datasets/ in the snapshot root).
datasets_path = snapshot_download(
    repo_id="Gamahea/lemm-dataset",
    repo_type="dataset",
    allow_patterns="datasets/*",
)
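`allow_patterns` also accepts a list of patterns, so several LoRAs and datasets can be fetched in one call. The sketch below builds such a list and then locates one LoRA's files inside the downloaded snapshot, assuming the folder layout documented in this README; `build_patterns` and `lora_files` are hypothetical helper names.

```python
from pathlib import Path


def build_patterns(lora_names=(), dataset_keys=()):
    """Build allow_patterns entries for the selected LoRAs and datasets."""
    patterns = [f"loras/{name}/*" for name in lora_names]
    patterns += [f"datasets/{key}/*" for key in dataset_keys]
    return patterns


def lora_files(snapshot_root, lora_name):
    """Return (weights, config) paths for one LoRA inside a snapshot.

    Raises FileNotFoundError if either file is absent.
    """
    lora_dir = Path(snapshot_root) / "loras" / lora_name
    weights = lora_dir / "final_model.pt"
    config = lora_dir / "config.yaml"
    for path in (weights, config):
        if not path.is_file():
            raise FileNotFoundError(path)
    return weights, config


# Usage (requires network access):
# root = snapshot_download(
#     repo_id="Gamahea/lemm-dataset",
#     repo_type="dataset",
#     allow_patterns=build_patterns(["your_lora_name"], ["your_dataset_key"]),
# )
# weights, config = lora_files(root, "your_lora_name")
```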
Supported Datasets
LEMM can prepare and train on:
- GTZAN: Music genre classification dataset
- MusicCaps: Google's music captioning dataset
- Free Music Archive (FMA): Large-scale music dataset
- Custom datasets: Upload your own music collections
LoRA Training
LoRA (Low-Rank Adaptation) allows efficient fine-tuning of DiffRhythm2 for:
- Specific music styles
- Genre specialization
- Artist emulation
- Custom sound aesthetics
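The efficiency comes from the standard LoRA formulation: a frozen weight matrix of shape d × k receives a trainable update formed by two small matrices of shapes d × r and r × k, where the rank r is much smaller than d and k. The sketch below only counts parameters to show the scale of the saving; the layer size and rank are illustrative values, not DiffRhythm2's actual configuration.

```python
def lora_param_counts(d, k, r):
    """Compare trainable parameters for full fine-tuning vs. a rank-r LoRA.

    d, k: shape of the frozen weight matrix; r: LoRA rank.
    Returns (full, lora, lora / full).
    """
    full = d * k          # every entry of the weight matrix is trainable
    lora = r * (d + k)    # B is d x r, A is r x k
    return full, lora, lora / full


# Illustrative values only (assumed, not taken from DiffRhythm2):
full, lora, ratio = lora_param_counts(d=1024, k=1024, r=8)
print(full, lora, ratio)  # 1048576 16384 0.015625
```

With these example numbers the adapter trains about 1.6% of the parameters of the full matrix, which is why the resulting `final_model.pt` files stay small enough to store and sync per LoRA.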
Related Projects
- LEMM Space: Gamahea/lemm-test-100
- DiffRhythm2: Advanced music generation with built-in vocals
License
MIT License - Feel free to use and modify
Contributing
This is a storage repository. To contribute to LEMM:
- Visit the LEMM Space
- Train your own LoRAs
- Share your results with the community
Notes
- Data is organized for LEMM's automatic sync system
- Manual edits may be overwritten by Space operations
- Each LoRA/dataset includes configuration metadata
- Storage persists across Space rebuilds