A modular Retrieval-based Voice Conversion framework with Gradio UI, training capabilities, and audio processing tools
- Voice Conversion: High-quality voice conversion with multiple pitch extraction methods
- Model Training: Complete training pipeline for creating custom RVC models
- Real-time Processing: Low-latency real-time voice conversion support
- Web UI: Intuitive Gradio-based web interface
- CLI Support: Command-line interface for scripting and automation
- API Access: Python API for programmatic access
- Audio Separation: Built-in tools for vocal/instrument separation
- Text-to-Speech: Integration with edge-tts for TTS-based voice conversion
pip install git+https://github.com/ArkanDash/Advanced-RVC-Inference.git

For CUDA-enabled GPUs:

pip install git+https://github.com/ArkanDash/Advanced-RVC-Inference.git#egg=advanced-rvc-inference[gpu]

To install from source:

git clone https://github.com/ArkanDash/Advanced-RVC-Inference.git
cd Advanced-RVC-Inference
pip install -e .

To install with development dependencies:

pip install git+https://github.com/ArkanDash/Advanced-RVC-Inference.git#egg=advanced-rvc-inference[dev]

Launch the Gradio web UI:
rvc-gui
# or
python -m advanced_rvc_inference.gui

The web interface will be available at http://localhost:7860
Run voice conversion from the command line:
rvc-cli infer --model path/to/model.pth --input audio.wav --output converted.wav --pitch 0

View help:
rvc-cli --help
rvc-cli infer --help

Use the Python API:

from advanced_rvc_inference import RVCInference
# Initialize the inference engine
rvc = RVCInference(device="cuda:0")
# Load a model
rvc.load_model("path/to/model.pth")
# Run inference
audio = rvc.infer("input.wav", pitch_change=0, output_path="output.wav")
# Or use batch processing
audio_files = rvc.infer_batch(
    input_dir="input_folder",
    output_dir="output_folder",
    pitch_change=2,
    format="wav",
)
# Cleanup
rvc.unload_model()

| Command | Description |
|---|---|
| `rvc-cli infer` | Run voice conversion inference |
| `rvc-cli train` | Train RVC models (use web UI) |
| `rvc-cli serve` | Launch the web interface |
| `rvc-cli version` | Show version information |
| `rvc-cli info` | Show system information |
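For scripting and automation, an `rvc-cli infer` call can be assembled from Python. The sketch below only builds the argument list, using the flags that appear in this README's examples (`--model`, `--input`, `--output`, `--pitch`, `--format`, `--index`). The helper name `build_infer_command` is hypothetical and not part of the package.

```python
import shlex
from typing import List, Optional


def build_infer_command(
    model: str,
    input_path: str,
    output_path: str,
    pitch: int = 0,
    fmt: Optional[str] = None,
    index: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `rvc-cli infer` (hypothetical helper)."""
    cmd = [
        "rvc-cli", "infer",
        "--model", model,
        "--input", input_path,
        "--output", output_path,
        "--pitch", str(pitch),
    ]
    # Optional flags are appended only when a value is given.
    if fmt is not None:
        cmd += ["--format", fmt]
    if index is not None:
        cmd += ["--index", index]
    return cmd


if __name__ == "__main__":
    cmd = build_infer_command(
        "MODEL.pth", "input.wav", "output.wav", fmt="wav", index="INDEX.index"
    )
    # Print a shell-safe rendering of the command for inspection.
    print(shlex.join(cmd))
    # To execute it (requires rvc-cli on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.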
rvc-cli infer \
--model MODEL.pth \
--input input.wav \
--output output.wav \
--pitch 0 \
--format wav \
--index INDEX.index

| Variable | Description | Default |
|---|---|---|
| `ARVC_ASSETS_PATH` | Path to asset directory | Package assets folder |
| `ARVC_CONFIGS_PATH` | Path to configs directory | Package configs folder |
| `ARVC_WEIGHTS_PATH` | Path to model weights | `assets/weights` |
| `ARVC_LOGS_PATH` | Path to logs directory | `assets/logs` |
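To illustrate how the environment variables above could map to directories, here is a minimal sketch of an override-with-default lookup. It is not the package's actual resolution code. The function names, the `package_root` argument, and the choice to derive the weights and logs defaults from the resolved assets directory are all assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Dict


def resolve_path(var_name: str, default: Path) -> Path:
    """Return the path from the environment variable, or the default if unset or empty."""
    value = os.environ.get(var_name)
    return Path(value).expanduser() if value else default


def resolve_arvc_paths(package_root: Path) -> Dict[str, Path]:
    """Resolve the four ARVC_* directories (hypothetical helper)."""
    assets = resolve_path("ARVC_ASSETS_PATH", package_root / "assets")
    return {
        "assets": assets,
        "configs": resolve_path("ARVC_CONFIGS_PATH", package_root / "configs"),
        # Assumption: these defaults live under the resolved assets directory.
        "weights": resolve_path("ARVC_WEIGHTS_PATH", assets / "weights"),
        "logs": resolve_path("ARVC_LOGS_PATH", assets / "logs"),
    }


if __name__ == "__main__":
    for name, path in resolve_arvc_paths(Path.cwd()).items():
        print(f"{name}: {path}")
```

Setting, for example, `ARVC_WEIGHTS_PATH` before launching would then redirect only the weights directory while the other paths keep their defaults.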
Configuration is managed through advanced_rvc_inference/configs/config.json:
{
  "device": "cuda:0",
  "fp16": true,
  "app_port": 7860,
  "language": "vi-VN",
  "theme": "NoCrypt/miku"
}

- Python 3.10+
- PyTorch 2.3.1+
- torchaudio 2.3.1+
- NumPy, SciPy
- librosa (audio processing)
- Gradio (web UI)
- onnxruntime-gpu (GPU inference acceleration)
- faiss-gpu (vector similarity search)
- tensorboard (training visualization)
See pyproject.toml for the complete dependency list.
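Before installing, it can help to see which of the listed requirements are already present. The following sketch uses only the standard library to check the Python version and report installed package versions. The helper names and the package selection are illustrative, and it does not compare versions against the minimums above.

```python
import sys
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple

# Distribution names taken from the requirements list above.
PACKAGES = ("torch", "torchaudio", "numpy", "scipy", "librosa", "gradio")


def python_ok(minimum: Tuple[int, int] = (3, 10)) -> bool:
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def installed_versions(names: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```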
Ensure you have CUDA installed and PyTorch with CUDA support:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Reduce batch size or use CPU mode:

rvc = RVCInference(device="cpu")

Contributions are welcome! Please read our Contributing Guide for details.
This project is licensed under the MIT License - see the LICENSE file for details.
The use of the converted voice for the following purposes is prohibited:
- Criticizing or attacking individuals
- Advocating for or opposing specific political positions, religions, or ideologies
- Publicly displaying strongly stimulating expressions without proper zoning
- Selling of voice models and generated voice clips
- Impersonation of the original owner of the voice with malicious intentions
- Fraudulent purposes that lead to identity theft or fraudulent phone calls
| Repository | Owner |
|---|---|
| Vietnamese-RVC | PhamHuynhAnh16 |
| Applio | IAHispano |
For issues and feature requests, please use the GitHub Issues page.
Made by ArkanDash