AI Learning: Vocal Removal with UVR (Ultimate Vocal Remover)

Recently, I came across a YouTube video demonstrating voice replacement. Although the replacement itself was not very convincing, the vocal separation worked well, so I decided to keep notes on it. Since UVR is a GUI-based program, I run it in Docker and connect to it over VNC.

For GPU acceleration in NVIDIA Docker, after several attempts I found that the image nvcr.io/nvidia/pytorch:22.12-py3 works, while 23.08-py3 does not. I haven’t tried other versions. Besides GPU acceleration, you also need to install a desktop environment and a VNC server inside the container. I will explain this later.
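
Before installing anything else, it can help to confirm that PyTorch inside the container actually sees the GPU. The following is a minimal sketch: the function name `gpu_status` is hypothetical, and the only assumption is that the `torch` package shipped with the NVIDIA PyTorch image is importable.

```python
import importlib.util


def gpu_status() -> str:
    """Describe whether PyTorch can see a CUDA device in this environment."""
    # Avoid an ImportError when this is run outside the container.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return "CUDA available: " + torch.cuda.get_device_name(0)
    return "torch is installed, but no CUDA device is visible"


print(gpu_status())
```

If the last message appears inside the container, the GPU was most likely not passed through when the container was started.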

Basic Environment Setup

Part of the basic environment (such as Anaconda and the shared scripts) has already been set up in the "Shared Operations" article. Refer to it to make sure that all the instructions here work correctly.

Create conda environment

Since each project has different dependencies, a separate conda environment is created for each one.

UVR – Ultimate Vocal Remover

The UVR project separates vocals from the accompaniment and other background sounds, and it is the main subject of this article.

Install the following package.

Start UVR and model download

Enter the GUI environment and start the UVR program.



Click the icon to the left of “Start Processing” to start downloading the models. First, download the VR model.

Next, download the Demucs model.

 

Speech separation

Afterward, go back to the main screen, select the music file you want to separate, and select the model.

Choose the input file and the output directory. Set the process method to Demucs and the model to UVR_Model_1, enable GPU acceleration, and then start the separation. The separation produces two files, one with the instrumental track and one with the vocals.
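
To check the result from a script, the two output files can be picked out of the output directory by name. This is a minimal sketch under an assumption that is not stated in this article: that the output file names contain the labels `(Vocals)` and `(Instrumental)`. The helper `find_stems` is hypothetical; adjust the labels if your files are named differently.

```python
from pathlib import Path


def find_stems(output_dir):
    """Group files in output_dir by the stem label found in their names."""
    labels = ("Vocals", "Instrumental")  # assumed naming, adjust if needed
    stems = {label: [] for label in labels}
    for path in sorted(Path(output_dir).iterdir()):
        for label in labels:
            if path.is_file() and f"({label})" in path.name:
                stems[label].append(path.name)
    return stems
```

Calling `find_stems` on the output directory chosen in the GUI should then list one file under each label per processed song.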

Further separation

Next, remove the residual echo from the vocal track, using the settings shown in the following figure.

Installation inside Docker

Here is how to build the environment inside the NVIDIA Docker image “nvcr.io/nvidia/pytorch:22.12-py3”. The installation steps are the same as before. Note that you need tigervnc-standalone-server rather than tightvnc, because tightvnc does not support xrandr.
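
One way to start such a container is sketched below. The image tag comes from this article; the `--gpus all` option and the published port 5901 (VNC display :1) are assumptions about the setup, and `docker_run_command` is a hypothetical helper that only assembles and prints the command rather than running it.

```python
import shlex

IMAGE = "nvcr.io/nvidia/pytorch:22.12-py3"  # the tag reported to work in this article


def docker_run_command(image=IMAGE, vnc_port=5901):
    """Build a docker run command that exposes the GPUs and the VNC port."""
    return [
        "docker", "run",
        "--gpus", "all",                 # make the host GPUs visible in the container
        "-it",                           # keep an interactive shell attached
        "-p", f"{vnc_port}:{vnc_port}",  # publish the VNC port to the host
        image,
    ]


print(shlex.join(docker_run_command()))
```

The printed line can be pasted into a shell on the host. Publishing the port is what later allows a VNC client outside the container to connect.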

Put the following content in ~/.vnc/xstartup:
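
A minimal sketch of such a file is shown here, written out by a small script. The desktop environment is not named in this article, so the Xfce session command `startxfce4` is an assumption; replace it with the session command of whichever desktop you installed. The helper `write_xstartup` is hypothetical.

```python
from pathlib import Path

# Assumed contents: start an Xfce session when the VNC server launches.
XSTARTUP = """#!/bin/sh
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
exec startxfce4
"""


def write_xstartup(home=None):
    """Write an executable .vnc/xstartup under home and return its path."""
    base = Path(home) if home else Path.home()
    vnc_dir = base / ".vnc"
    vnc_dir.mkdir(parents=True, exist_ok=True)
    target = vnc_dir / "xstartup"
    target.write_text(XSTARTUP)
    target.chmod(0o755)  # the VNC server has to be able to execute the script
    return target
```

Calling `write_xstartup()` with no argument writes to the current user's home directory and overwrites an existing xstartup file, so back it up first if you already have one.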

Then run the command to start the VNC Server

You can now connect to the container with your VNC client.
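
If the client cannot connect, a quick check of whether the VNC port is reachable narrows down the problem. The sketch below relies on the convention that VNC display :N listens on TCP port 5900 + N; the display number 1 and the helper name `vnc_port_open` are assumptions for illustration.

```python
import socket


def vnc_port_open(host="127.0.0.1", display=1, timeout=2.0):
    """Return True if a TCP connection to the VNC port for `display` succeeds."""
    port = 5900 + display  # VNC display :N listens on TCP port 5900 + N
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(vnc_port_open())
```

A result of False from the host usually means either the VNC server is not running or the port was not published when the container was started.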

Conclusion

This is the prerequisite step for voice replacement: the vocals have to be separated first. Since the results of the voice replacement itself were not very good, it may not be necessary to explain that part in detail.

 

 
