Incorrect segregation of voiced and unvoiced segments #31

@tumul-80

Description

Hello,

I would like to extract the voiced segments from an audio file (.wav format) and plot them against the time series of the original audio. I modified your code slightly and ran it on a simple recording containing only my voice, but the detection is wrong: most of the actual speech is classified as "unvoiced" segments.

What should I do?

import audiosegment
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

filename = 'try_voice.wav'

# Plot the waveform of the original recording.
audio_data, sampling_rate = librosa.load(filename)
plt.figure(figsize=(14, 5))
librosa.display.waveplot(audio_data, sr=sampling_rate)

# Resample to a rate webrtcvad supports (8/16/32/48 kHz), 16-bit mono,
# then split into voiced ('v') and unvoiced ('u') segments.
audio = audiosegment.from_file(filename)
seg = audio.resample(sample_rate_Hz=32000, sample_width=2, channels=1)
results = seg.detect_voice()
voiced = [tup[1] for tup in results if tup[0] == 'v']
unvoiced = [tup[1] for tup in results if tup[0] == 'u']

# Concatenate the voiced segments into one clip and export it.
voiced_segment = voiced[0].reduce(voiced[1:])
voiced_segment.export("voiced.wav", format="WAV")

# Reload the voiced-only audio and plot it against its own time axis.
voiced_data, sampling_rate_v = librosa.load('voiced.wav')
duration = len(voiced_data) / sampling_rate_v
time = np.arange(0, duration, 1 / sampling_rate_v)  # time vector
plt.figure()
librosa.display.waveplot(voiced_data, sr=sampling_rate_v)
plt.show()
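For reference, the segmentation step above amounts to labelling short fixed-length frames as voiced or unvoiced and then grouping consecutive frames with the same label. The sketch below illustrates that idea with a crude energy threshold instead of webrtcvad, so it can be run without any audio files; the function name `energy_vad` and the `threshold_ratio` parameter are my own inventions for illustration, not part of any library.

```python
import numpy as np

def energy_vad(samples, sample_rate, frame_ms=30, threshold_ratio=0.1):
    """Label each frame 'v' (voiced) or 'u' (unvoiced) by short-time energy.

    A toy stand-in for a real VAD: frames whose RMS energy exceeds
    threshold_ratio * (max frame RMS) are marked voiced.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=1))
    threshold = threshold_ratio * rms.max()
    return ['v' if e > threshold else 'u' for e in rms]

# Synthetic check: 1 s of silence followed by 1 s of a 440 Hz tone.
sr = 16000
t = np.arange(sr) / sr
signal = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 440 * t)])
labels = energy_vad(signal, sr)
print(labels[0], labels[-1])  # silent frames come out 'u', tone frames 'v'
```

If a real VAD mislabels clean speech this way, the usual knobs are the frame labelling thresholds (webrtcvad exposes an aggressiveness mode from 0 to 3) and making sure the audio actually reaches the detector at a supported sample rate and width.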
