UltraSinger uppercase letter sentencing and batch GUI's #117

Open
1Larsolof opened this issue Jan 12, 2024 · 6 comments

Comments

@1Larsolof

1Larsolof commented Jan 12, 2024

I've been getting odd sentence breaks from UltraSinger: when too many words run together, it puts them all into one sentence.

I've "made" a really simple Python script, using tkinter for the GUI, that checks whether a line contains a capital letter and whether a "-" has already been placed a couple of lines above. If one has, it won't add a break; if not, it will. There is also an option to delete every line starting with "-" so you can start out fresh. You can input multiple files at the same time, and each modified file is saved as "Modified_name.txt". (EDIT: Forgot to mention that it gets the break timing from the previous line's start time + duration.)

The batch GUI simply runs the command "src/UltraSinger.py" -i -o and waits for the process to end. Make sure to place it in the folder that contains the src folder.

Cap and Batch.zip

CAP2:

#Cap V2
import tkinter as tk
from tkinter import filedialog, messagebox
from pathlib import Path

def calculate_time_variable(line):
    _, time_info = line.split(" ", 1)
    time_parts = time_info.split()
    if len(time_parts) >= 2:
        start_time, duration = map(int, time_parts[:2])
        return start_time + duration
    return 0  # Return 0 if there are not enough values

def process_file(file_path, delete_lines_flag):
    with open(file_path, 'r') as file:
        lines = file.readlines()

    modified_lines = []
    check_next_line = False
    last_hyphen_index = -3  # Initialize to a value that won't interfere with the first check

    for i, line in enumerate(lines):
        if delete_lines_flag and line.startswith('-'):
            continue  # Skip lines starting with "-" if the checkbox is selected

        if check_next_line and '-' not in line and i - last_hyphen_index >= 3:
            if any(char.isupper() for char in line.strip()):
                modified_lines.append('- ' + str(calculate_time_variable(modified_lines[-1])) + "\n")
                last_hyphen_index = i

        modified_lines.append(line)
        check_next_line = False

        if line.startswith(':'):
            check_next_line = True

    output_path = Path(file_path).parent / ('Modified_' + Path(file_path).name)
    with open(output_path, 'w') as modified_file:
        modified_file.write(''.join(modified_lines))

    return output_path

def browse_files():
    file_paths = filedialog.askopenfilenames(filetypes=[("Text files", "*.txt")])
    for file_path in file_paths:
        files_listbox.insert(tk.END, file_path)

def show_notification(output_path):
    messagebox.showinfo("Processing Completed", f"Processing completed. Output file: {output_path}")

def process_files():
    selected_files = files_listbox.get(0, tk.END)
    delete_lines_flag = delete_lines_var.get()  # Get the state of the checkbox

    for file_path in selected_files:
        output_path = process_file(file_path, delete_lines_flag)
        show_notification(output_path)

# GUI setup
root = tk.Tk()
root.title("Karaoke Song Processor")

# Listbox to display selected files
files_listbox = tk.Listbox(root, selectmode=tk.MULTIPLE, width=50)
files_listbox.pack(pady=10)

# Checkbox to enable or disable deletion of lines starting with "-"
delete_lines_var = tk.BooleanVar()
delete_lines_checkbox = tk.Checkbutton(root, text="Delete lines starting with '-'", variable=delete_lines_var)
delete_lines_checkbox.pack()

# Browse button to select files
browse_button = tk.Button(root, text="Browse Files", command=browse_files)
browse_button.pack()

# Process button to modify files
process_button = tk.Button(root, text="Process Files", command=process_files)
process_button.pack(pady=10)

# Start the Tkinter event loop
root.mainloop()
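
For anyone who wants to try the idea without the GUI, here is a minimal sketch of the same break-insertion logic as a pure function. The names `add_breaks`, `break_beat` and `min_gap` are illustrative only and are not part of the script above or of UltraSinger. The sketch also assumes note lines look like `: start duration pitch text`, and, unlike the script above, it only inspects the lyric text of `:` notes, so header lines and the closing `E` line never trigger a break.

```python
def break_beat(note_line):
    """Return start + duration for a note line such as ': 10 4 60 Word'."""
    parts = note_line.split()
    return int(parts[1]) + int(parts[2])


def add_breaks(lines, min_gap=3):
    """Insert '- <beat>' before a note whose lyric contains an uppercase letter.

    No break is added when the previous output line is not a ':' note, or when
    the most recent break is fewer than min_gap output lines back.
    """
    out = []
    last_break = -min_gap  # output index of the most recent break line
    for line in lines:
        if line.startswith('-'):
            last_break = len(out)
        elif line.startswith(':') and out and out[-1].startswith(':'):
            # Assumed layout: ': start duration pitch text'
            lyric = line.split(' ', 4)[-1]
            if any(c.isupper() for c in lyric) and len(out) - last_break >= min_gap:
                last_break = len(out)
                out.append(f"- {break_beat(out[-1])}\n")
        out.append(line)
    return out


if __name__ == "__main__":
    sample = [
        "#TITLE:Demo\n",
        ": 0 2 60 Hello \n",
        ": 2 2 60 world \n",
        ": 5 3 62 Next \n",
        ": 8 2 62 line \n",
        "E\n",
    ]
    print("".join(add_breaks(sample)), end="")
```

With the sample above, a single `- 4` line (start 2 + duration 2 of the preceding note) is placed before the `Next` note. A malformed note line would still raise in `break_beat`, so real files may need extra checks.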


Batch:

#LiteGUIbatch V3
import tkinter as tk
from tkinter import filedialog, ttk, messagebox
import os
import subprocess
import threading

def run_ultra_singer():
    def process_files():
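        # Skip blank lines: the Text widget content always ends with a newline
        input_files = [
            line.strip()
            for line in input_file_text.get("1.0", tk.END).splitlines()
            if line.strip()
        ]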
        output_folder = output_folder_entry.get()

        # Configure the determinate progress bar
        progress_bar["value"] = 0
        progress_bar["maximum"] = len(input_files)

        # Configure the indeterminate progress bar
        indeterminate_progress_bar.start()

        for index, input_file in enumerate(input_files, start=1):
            command = [
                "python3",
                "src/UltraSinger.py",
                "-i",
                input_file,
                "-o",
                output_folder
            ]

            process = subprocess.Popen(command, shell=False)
            process.wait()

            # Remove the processed file from the text box
            input_file_text.delete("1.0", "2.0")

            # Update the determinate progress bar
            progress_bar["value"] = index  # step(1) would wrap the bar back to 0 on the last file

            # Update the percentage label
            percentage_text.set(f"{(index / len(input_files)) * 100:.2f}%")

        # All files processed, stop both progress bars
        indeterminate_progress_bar.stop()
        progress_bar.stop()

        # Reset the percentage label
        percentage_text.set("")

        # Show notification
        show_notification("Processing Complete", "All files have been processed.")

    # Create a separate thread for file processing
    processing_thread = threading.Thread(target=process_files)
    processing_thread.start()

def show_notification(title, message):
    messagebox.showinfo(title, message)

def browse_input_file():
    file_paths = filedialog.askopenfilenames(
        initialdir=os.getcwd(),
        title="Select Input Files",
        filetypes=(("Audio Files", ".mp3 .wav .ogg"),)
    )
    
    for file_path in file_paths:
        input_file_text.insert(tk.END, file_path + "\n")

def browse_output_folder():
    folder_path = filedialog.askdirectory(
        initialdir=os.getcwd(),
        title="Select Output Folder"
    )
    output_folder_entry.delete(0, tk.END)
    output_folder_entry.insert(0, folder_path)

root = tk.Tk()
root.title("UltraSinger GUI Lite Batch")

# Set the window size with a 16:10 aspect ratio
window_width = 800
window_height = int(window_width * 10 / 16)
root.geometry(f"{window_width}x{window_height}")

# Determinate Progress Bar
progress_bar = ttk.Progressbar(root, mode="determinate", length=780)
progress_bar.grid(row=0, column=0, padx=10, pady=10, columnspan=3, sticky="w")  # Adjusted row

# Percentage Label
Percentageadd_label = tk.Label(root, 
	text="              Finished", 
	font=("Helvetica", 16)
)
Percentageadd_label.grid(row=1, column=1, padx=0, pady=10, sticky="w")
percentage_text = tk.StringVar()
percentage_text.set("0.00%")
percentage_label = tk.Label(
    root,
    textvariable=percentage_text,
    font=("Helvetica", 16),  # Change the font size and family as needed
    width=5  # Adjust width to center the label
)
percentage_label.grid(row=1, column=1, padx=10, pady=10, sticky="w")  # Adjusted properties

# Input File
input_file_label = tk.Label(root, text="Input Files/URLs:")
input_file_label.grid(row=2, column=0, padx=10, pady=5, sticky="nw")
input_file_text = tk.Text(root, wrap="none", height=20, width=59)
input_file_text.grid(row=2, column=1, padx=0, pady=0, sticky="w")
browse_input_button = tk.Button(root, text="Browse", command=browse_input_file, width=15, height=4)
browse_input_button.grid(row=2, column=2, padx=30, pady=0, sticky="w")

# Output Folder
output_folder_label = tk.Label(root, text="Output Folder:")
output_folder_label.grid(row=3, column=0, padx=10, pady=5, sticky="w")
output_folder_entry = tk.Entry(root, width=45)
output_folder_entry.grid(row=3, column=1, padx=0, pady=0, sticky="w")
browse_output_button = tk.Button(root, text="Browse", command=browse_output_folder, width=15, height=4)
browse_output_button.grid(row=3, column=2, padx=30, pady=20, sticky="w")

# Indeterminate Progress Bar
indeterminate_progress_bar = ttk.Progressbar(root, mode="indeterminate", length=500)
indeterminate_progress_bar.grid(row=4, column=0, padx=0, pady=0, columnspan=2, sticky="ns")  # Adjusted row

# Run Button
run_button = tk.Button(root, text="Run UltraSinger", command=run_ultra_singer, width=20, height=2)
run_button.grid(row=4, column=2, padx=30, pady=0, columnspan=2, sticky="e")  # Adjusted row

# Start the Tkinter main loop
root.mainloop()
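
The core of the batch tool can also be sketched without any GUI, which makes it easier to script or test. This is only a sketch under a few assumptions: `build_command` and `run_batch` are hypothetical helper names, `sys.executable` is swapped in for the hard-coded `python3` so the current interpreter (for example the active conda environment) is reused, and the script path and the `-i`/`-o` flags are taken from the command used above.

```python
import subprocess
import sys


def build_command(input_file, output_folder, script="src/UltraSinger.py"):
    """Build the argument list for one UltraSinger run (flags as used above)."""
    return [sys.executable, script, "-i", input_file, "-o", output_folder]


def run_batch(input_files, output_folder, runner=subprocess.run):
    """Run UltraSinger once per non-empty input and collect the exit codes.

    `runner` is injectable so the loop can be exercised without starting
    real processes; by default it is subprocess.run, which blocks until
    each process has finished.
    """
    results = {}
    for path in input_files:
        path = path.strip()
        if not path:
            continue  # ignore blank entries, e.g. a trailing empty line
        completed = runner(build_command(path, output_folder))
        results[path] = completed.returncode
    return results


if __name__ == "__main__":
    # Usage: python batch.py <output_folder> <file1> [<file2> ...]
    codes = run_batch(sys.argv[2:], sys.argv[1])
    for name, code in codes.items():
        print(f"{name}: exit code {code}")
```

Like the GUI version, this needs to be started from the folder that contains `src`. Collecting the return codes makes it possible to report which files failed instead of only showing a final "complete" message.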


@rakuri255
Owner

rakuri255 commented Jan 13, 2024

Thanks for the GUI and your work!

Can you give an example so I can better understand the problem? For example, you could add a snippet from the plot. Just use --plot True

@1Larsolof
Author

1Larsolof commented Jan 13, 2024

> Thanks for the GUI and your work!
>
> Can you give an example so I can better understand the problem? For example, you could add a snippet from the plot. Just use --plot True

01 - Lace It.txt
plot
[https://drive.google.com/file/d/163ZLa46EB27Rfss4Q7PDYlT8_Pve1EFm/view?usp=sharing](https://drive.google.com/file/d/163ZLa46EB27Rfss4Q7PDYlT8_Pve1EFm/view?usp=drive_link)

I'm using an M1 Mac, by the way.

@1Larsolof
Author

Also here's the output from the console:

(ultrasinger) albinandreasson@albins-MBP UltraSinger % python3 src/ultrasinger.py -i input/"01 - Lace It.mp3" --plot True
[UltraSinger] 0:00:05.131 - Initialized...
/Users/albinandreasson/anaconda3/envs/ultrasinger/lib/python3.10/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
torchaudio.set_audio_backend("soundfile")
/Users/albinandreasson/anaconda3/envs/ultrasinger/lib/python3.10/site-packages/torch_audiomentations/utils/io.py:27: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
torchaudio.set_audio_backend("soundfile")
torchvision is not available - cannot save figures

[UltraSinger] *****************************
[UltraSinger] UltraSinger Version: 0.0.3
[UltraSinger] *****************************
[UltraSinger] Checking GPU support for tensorflow and pytorch.
[UltraSinger] tensorflow - there are no cuda devices available -> Using cpu.
[UltraSinger] pytorch - there are no cuda devices available -> Using cpu.
[UltraSinger] full automatic mode
[UltraSinger] Searching song in musicbrainz
[UltraSinger] cant find title lace in 01 it
[UltraSinger] No match found
[UltraSinger] Creating output folder. -> input/output/01 - Lace It
[UltraSinger] Creating output folder. -> input/output/01 - Lace It/cache
[UltraSinger] Separating vocals from audio with demucs and cpu as worker.
Important: the default model was recently changed to htdemucs the latest Hybrid Transformer Demucs model. In some cases, this model can actually perform worse than previous models. To get back the old default model use -n mdx_extra_q.
Selected model is a bag of 1 models. You will see that many progress bars per track.
Separated tracks will be stored in /Users/albinandreasson/Documents/Melody Mania Songs/UltraSinger/separated/htdemucs
Separating track input/output/01 - Lace It/01 - Lace It.mp3
100%|██████████████████████████████████████████████| 222.29999999999998/222.29999999999998 [01:48<00:00, 2.05seconds/s]
[UltraSinger] Converting audio for AI
[UltraSinger] Reduce noise from vocal audio with ffmpeg.
[UltraSinger] Loading whisper with model large-v2 and cpu as worker
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.1.2. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.0. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.1.1. Bad things might happen unless you revert torch to 1.x.
[UltraSinger] Transcribing input/output/01 - Lace It/cache/01 - Lace It_denoised.wav
Detected language: en (0.94) in first 30s of audio...
[UltraSinger] Removing silent start and ending, from transcription data
[UltraSinger] Hyphenate using language code: en_IL
662it [00:00, 536132.31it/s]
[UltraSinger] Pitching with crepe and model full and cpu as worker
2024-01-13 11:52:31.766392: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
680/680 [==============================] - 143s 211ms/step
[UltraSinger] Creating midi notes from pitched data
[UltraSinger] Creating Ultrastar notes from midi data
[UltraSinger] Creating plot: Spectrogram
[UltraSinger] Creating plot
[UltraSinger] BPM is 107.67
[UltraSinger] Creating input/output/01 - Lace It/01 - Lace It.txt from transcription.
[UltraSinger] Converting wav to mp3
[UltraSinger] Creating input/output/01 - Lace It/01 - Lace It [Karaoke].txt from transcription.
[UltraSinger] Parse ultrastar txt -> input/output/01 - Lace It/01 - Lace It.txt
[UltraSinger] Calculating Ultrastar Points
[UltraSinger] Simple (octave high ignored) points
[UltraSinger] Total: 7911, notes: 7157, line bonus: 754, golden notes: 0
[UltraSinger] Accurate (octave high matches) points:
[UltraSinger] Total: 7884, notes: 7134, line bonus: 750, golden notes: 0
[UltraSinger] Creating Midi with pretty_midi
[UltraSinger] Creating midi instrument from Ultrastar txt
[UltraSinger] Creating midi file -> input/output/01 - Lace It/01 - Lace It.mid

[UltraSinger] Do you like UltraSinger? Want it to be even better? Then help with your support!
[UltraSinger] See project page -> https://github.com/rakuri255/UltraSinger
[UltraSinger] This will help a lot to keep this project alive and improved.
[UltraSinger] 0:27:56.069 - End Program
(ultrasinger) albinandreasson@albins-MBP UltraSinger %

@rakuri255
Owner

You are using an old version. You are on 0.0.3 while the source is on 0.0.8.
Please update your sources.

@rakuri255
Owner

We already have an issue regarding too many words: #45.
The problem is a little more complicated here. Yes, you could look for capitalization in English, but what about Asian languages, where there are no capital letters?

It might be better to check the total duration of the sentence. In other words, specify a fixed maximum duration after which a separation should take place. You would then also need to check the word times in order to scale the number of words: rap, for example, has a shorter duration per word than pop.
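
A rough sketch of how such a duration-based split could look is shown below. It is not code from UltraSinger: the function name `split_by_duration`, the `(start, end, text)` word tuples in seconds, and all default values are assumptions made for illustration. A line is closed once adding the next word would push it past `max_duration`, and the word cap is derived from the median word duration so that fast material such as rap is allowed more words per line than slow material.

```python
from statistics import median


def split_by_duration(words, max_duration=6.0, min_words=3, max_words=12):
    """Group (start, end, text) word tuples into lines.

    words must be in chronological order, with times in seconds.
    """
    if not words:
        return []

    # Typical word length decides how many words fit into one line.
    typical = median(end - start for start, end, _ in words)
    if typical > 0:
        cap = max(min_words, min(max_words, int(max_duration / typical)))
    else:
        cap = max_words

    lines, current = [], []
    for word in words:
        if current:
            line_start = current[0][0]
            too_long = word[1] - line_start > max_duration
            if too_long or len(current) >= cap:
                lines.append(current)
                current = []
        current.append(word)
    lines.append(current)
    return lines


if __name__ == "__main__":
    demo = [(0, 1, "a"), (1, 2, "b"), (2, 3, "c"), (3, 4, "d"), (4, 5, "e")]
    for line in split_by_duration(demo, max_duration=2.0, min_words=1):
        print(" ".join(text for _, _, text in line))
```

Because the rule only looks at timing, it does not depend on capitalization and would behave the same for languages without capital letters. The thresholds would still need tuning per genre.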

@1Larsolof
Author

> You are using an old version. You are on 0.0.3 while the source is on 0.0.8. Please update your sources.

Done, sorry. I just started looking at this again recently.

> We already have an issue regarding too many words: #45. The problem is a little more complicated here. Yes, you could look for capitalization in English, but what about Asian languages, where there are no capital letters?
>
> It might be better to check the total duration of the sentence. In other words, specify a fixed maximum duration after which a separation should take place. You would then also need to check the word times in order to scale the number of words: rap, for example, has a shorter duration per word than pop.

Yes, I get that. This was just a quick and dirty fix to make editing the song in other software a bit easier for me.
In fact, my first attempt was to go through the lines, count to 10 or so, and check whether a break had already been placed ahead, but sadly that requires a lot of tinkering for each song/language.
