LocalVocal: Local Live Captions & Translation On-the-Go

LocalVocal: Local Live Captions & Translation On-the-Go v0.3.6

Danielca90

New Member
Hi, I'm new. I tried to install the plugin and went the hard way, because I needed to install Xcode and such. I succeeded in installing it through the terminal (on Mac), but when I opened the plugin, OBS crashed. I deleted everything and started over, but now I don't have the Application Support/obs-studio/plugins folder. I'm confused about how it is actually installed. I watched your video and you said you just had to install the .pkg, but that didn't work either. Can you shed some light on this, please?
 

royshilkrot

Member
Happy to help debug and get it up and running. Please make contact on https://discord.gg/N9N3WCrp9y so you can share logs etc.
 
I have an application where I would like to run this in a web browser, or at least use the speech recognition engine, as the built-in Chrome speech recognition (based on "Dictation") has gone south on me, and I don't want to rely on it anymore.

Thoughts?
 

royshilkrot

Member
For a standalone tool (outside OBS) I've created LexiSynth: https://github.com/occ-ai/lexisynth. It's not complete yet, but it might work well enough for your uses...
 
In OBS, using your plugin, should I expect to be able to see captions recorded from more than one audio source? (For example, I have a mic and also "desktop audio", and want to be able to capture captions from both sources while in a group meeting or a live virtual training session.)

Also, is there a way to minimize the long delay in capturing captions, so the SRT timing matches the video better?
 

petesavage

New Member
Thanks, Roy, for making this available. I have been trying to set up LocalVocal transcription (English audio --> English text) on a Mac instance of OBS. Installation and configuration are apparently fine (your YouTube video is very clear), but OBS keeps crashing, apparently whenever there is any input into the mic. If I am reading the crash report correctly, the crash occurs on the thread that initiates the Whisper.cpp interaction.

I have tried on 3 Macs (2 Intel and 1 M1), and with previous versions of OBS as well, all with the same result. Are there any pointers? I'm wondering if it's a permissions thing. Any advice happily received!
 

royshilkrot

Member
Sorry to hear about the crashing. I'm happy to look at logs if you can share. Join us on https://discord.gg/WdsVPfjXNv
 

royshilkrot

Member
royshilkrot updated LocalVocal: Local Live Captions & Translation On-the-Go with a new update entry:

v0.3.0 Big release! Far better experience, more translation, bugs squashed!

Big release! Many improvements, solving a lot of bugs and introducing far better performance.
In points:
  • VAD-based segmentation - no more "3 seconds segments"! When speech is detected, the transcription kicks in
  • Incremental output: The captions will appear continuously as you speak, a live transcription effect
  • Bumping the version of whisper.cpp
  • Many more options for real-time translation
  • A lot of bug fixing...
Download
  • ...

 

adamesek

New Member
@adamesek strange - I thought I took care of those UTF-8 problems! Do you want to open another issue and provide some examples? You can also set the logging level to INFO and send over the log, which provides a lot more information.
..I know, I know.. $1,200 an hour.
Anyway... there is a new version, but unfortunately nothing has changed for us. It looks bad.
 

royshilkrot

Member
Did you try the latest version? It should not disturb the UTF-8 output now.
Before, I was running "to lowercase" on the output, but I'm not doing that anymore.
For any language that's not English, it should decode the UTF-8 characters correctly now.
If this still persists with v0.3.0, I'm glad to look at logs etc.
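
For anyone wondering how a "to lowercase" step can disturb UTF-8 output at all, here is a minimal sketch. It is not the plugin's code. It assumes the lowercasing is applied byte by byte with Latin-1 style rules, which is one plausible way this kind of bug can happen; `latin1_tolower_bytes` is a hypothetical helper written only for this illustration.

```python
def latin1_tolower_bytes(data: bytes) -> bytes:
    """Hypothetical helper: lowercase each byte as if it were a Latin-1 character.

    This mimics a byte-wise 'to lowercase' pass that knows nothing about
    multi-byte UTF-8 sequences. It is an assumption for illustration only.
    """
    out = bytearray()
    for b in data:
        is_ascii_upper = 0x41 <= b <= 0x5A
        is_latin1_upper = 0xC0 <= b <= 0xDE and b != 0xD7  # 0xD7 is the multiplication sign
        if is_ascii_upper or is_latin1_upper:
            b += 0x20
        out.append(b)
    return bytes(out)


text = "Żołnierze"
raw = text.encode("utf-8")           # 'Ż' becomes the two bytes 0xC5 0xBB
mangled = latin1_tolower_bytes(raw)  # the lead byte 0xC5 is turned into 0xE5

try:
    mangled.decode("utf-8")
    decodes_cleanly = True
except UnicodeDecodeError:
    decodes_cleanly = False

print("byte-wise lowercase still valid UTF-8:", decodes_cleanly)
print("Unicode-aware lowercase:", text.lower())
```

The byte-wise pass changes the lead byte of a multi-byte character, so the result no longer decodes as UTF-8, while a Unicode-aware lowercase (or no lowercasing at all) leaves the text intact.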
 

adamesek

New Member
MacPRO M1, VERSION 0.3.0
..I threw away previous versions.. and installed 0.3.0
logs in the attachment
I use the videojs player for the test. I will add that the Google transcription works very nicely in another project.
With LocalVocal, this problem has been there from the beginning.
After so many weeks, I don't know... problem with OSA(?) - sometimes it dies; problem on input (LocalVocal); problem on output (player).
If you can't find a solution... I will understand. Maybe I have a bad configuration, I don't know...
 

Attachments

  • 2024-06-06 21-11-04.txt
    25.5 KB · Views: 12

royshilkrot

Member
The log unfortunately doesn't show the detections. Can you please set the "Internal Log level" advanced setting to INFO?
Then send another log with some spoken output.
 

royshilkrot

Member
I can see the special characters, e.g.

18:01:48.668: [obs-localvocal] Decoded sentence: ' Żołnierze zatrzymań po akcji przy granicy z Białorusią według śledczych bez uzasadnienia strzelali w kierunku migrantów, którzy usiłowali sforsować zaporę na granicy'
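
If you want to check a whole log for this yourself, a small script along these lines could pull out the decoded sentences. It is only a sketch: the line format is assumed from the single log line quoted above, and `decoded_sentences` is a hypothetical helper, not part of the plugin or OBS.

```python
import re

# Pattern assumed from the one log line quoted in this thread:
#   HH:MM:SS.mmm: [obs-localvocal] Decoded sentence: '...'
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3}): \[obs-localvocal\] Decoded sentence: '(.*)'\s*$"
)


def decoded_sentences(log_text: str) -> list[tuple[str, str]]:
    """Return (timestamp, sentence) pairs for every 'Decoded sentence' line."""
    results = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            results.append((match.group(1), match.group(2).strip()))
    return results


# The first line is copied from this thread; the second is a placeholder
# that should simply be ignored by the parser.
sample = (
    "18:01:48.668: [obs-localvocal] Decoded sentence: ' Żołnierze zatrzymań po akcji "
    "przy granicy z Białorusią według śledczych bez uzasadnienia strzelali w kierunku "
    "migrantów, którzy usiłowali sforsować zaporę na granicy'\n"
    "this line is not a decoded sentence\n"
)

found = decoded_sentences(sample)
for timestamp, sentence in found:
    print(timestamp, sentence)
```

For a real log file, read it with `open(path, encoding="utf-8")` so the special characters survive; if they look fine here but wrong on screen, the problem is more likely on the display side than in the transcription.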

One thing I suggest you do is bring the VAD threshold up to 0.95 (but not 1.0).

Do you see the captions properly on screen?
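
To give a feel for what raising the VAD threshold does, here is a rough sketch. It is not LocalVocal's actual VAD code: it assumes the VAD produces one speech probability per audio frame and that a frame counts as speech when its probability is at or above the threshold. The `speech_segments` function and the probabilities are made up for illustration.

```python
def speech_segments(probs: list[float], threshold: float) -> list[tuple[int, int]]:
    """Group consecutive frames with prob >= threshold into (start, end) index pairs.

    The end index is exclusive. Purely illustrative; a real VAD usually adds
    smoothing and padding around each segment.
    """
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                      # speech begins
        elif p < threshold and start is not None:
            segments.append((start, i))    # speech ends
            start = None
    if start is not None:
        segments.append((start, len(probs)))
    return segments


# Made-up per-frame speech probabilities.
probs = [0.10, 0.60, 0.97, 0.99, 0.96, 0.70, 0.20, 0.98, 0.55]

print(speech_segments(probs, 0.50))  # [(1, 6), (7, 9)]
print(speech_segments(probs, 0.95))  # [(2, 5), (7, 8)]
print(speech_segments(probs, 1.00))  # []
```

Under these assumptions, a higher threshold keeps only the frames the detector is very sure about, so less noise gets sent for transcription. At 1.0 essentially nothing would qualify, which fits the advice to stop at 0.95.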
 