LocalVocal: Local Live Captions & Translation On-the-Go

Supported Bit Versions

64-bit

Source Code URL: https://github.com/occ-ai/obs-localvocal

Minimum OBS Studio Version: 29.0.0

Supported Platforms

Windows
Mac OS X
Linux

The LocalVocal plugin lets you transcribe and translate speech into text locally on your machine, in real time.

No GPU required*, no cloud costs, no network and minimal lag! Privacy first - all data stays on your machine.
(* GPU acceleration via CUDA or AMD is supported!)

If this plugin has been valuable to you, consider adding a star to the GH repo, rating it here on OBS, subscribing to my YouTube channel, and supporting my work: GitHub, Patreon or OpenCollective. Check out the Home for Open Source Content Creators AI.
Need help setting up? Contact live support: https://discord.gg/J5RgzMmPqM

Do more with LocalVocal:

https://youtu.be/Q34LQsx-nlg | https://youtu.be/4BTmoKr0YMw | https://youtu.be/E7HKbO6CP_c
Realtime Translation with DeepL | Translate Apps and Videos

The plugin adds an Audio Filter. Use it on a speech source (mic, video) to get a transcription, then send the captions to a Text Source to show them in the scene.

Current Features:

Transcribe audio to text in real time in 100 languages
Display captions on screen using text sources
Send captions to a .txt or .srt file (to be read by external sources or video playback), with or without the aggregation option
Sync captions with OBS recording timestamps
Send captions on an RTMP stream to e.g. YouTube, Twitch
Bring your own Whisper model (any GGML)
Translate captions in real time to major languages (using either Whisper's built-in translation or NMT models with CTranslate2)
CUDA, OpenCL, Apple Arm64, AVX & SSE acceleration support
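
For anyone who wants to consume the .srt output from an external tool, here is a minimal sketch of reading SubRip cues in Python using only the standard library. It assumes the file follows the usual SRT layout (a numeric index, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, then the caption text, with blank lines between cues). The names `Cue` and `parse_srt` are made up for this illustration and are not part of the plugin.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption entry (hypothetical helper type, not a plugin API)."""
    index: int
    start_ms: int
    end_ms: int
    text: str


# Matches an SRT timestamp such as 00:01:02,345
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_ms(stamp: str) -> int:
    """Convert an SRT timestamp to milliseconds."""
    m = _TIME.fullmatch(stamp.strip())
    if m is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in m.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into a list of cues, skipping malformed blocks."""
    text = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->")
        cues.append(
            Cue(int(lines[0].strip()), _to_ms(start), _to_ms(end), "\n".join(lines[2:]))
        )
    return cues


# Example with an inline sample; in practice the text would be read from
# whatever output path you configured in the filter settings.
sample = "1\n00:00:01,000 --> 00:00:02,500\nHello world\n"
for cue in parse_srt(sample):
    print(cue.start_ms, cue.end_ms, cue.text)
```

Parsing into millisecond offsets makes it straightforward to line captions up against a recording timeline in an editor or script.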

Roadmap:

More robust built-in translation options
Additional output options: .vtt, .ssa, .sub, etc.
Speaker diarization (detecting speakers in a multi-person audio stream)
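
Until .vtt output is available, an existing .srt file can be converted outside the plugin. The sketch below is a workaround, not something the plugin provides: it relies on the fact that WebVTT needs a `WEBVTT` header line and uses a period instead of a comma before the milliseconds in timing lines. The function name `srt_to_vtt` is made up for this example, and it assumes plain SRT input without styling tags.

```python
import re

# SRT writes 00:00:01,000 while WebVTT expects 00:00:01.000
_SRT_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT caption text to WebVTT text (illustrative helper)."""
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    out = []
    for line in body.split("\n"):
        # Only rewrite timing lines, so commas inside captions are kept.
        if "-->" in line:
            line = _SRT_TIMING.sub(r"\1.\2", line)
        out.append(line)
    # WebVTT files start with the WEBVTT signature followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(out) + "\n"


print(srt_to_vtt("1\n00:00:01,000 --> 00:00:02,500\nHello, world\n"))
```

The numeric SRT indices are kept as-is, since WebVTT allows an optional identifier line before each cue's timing.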

Internally, the plugin runs a neural network (OpenAI Whisper) locally to transcribe speech in real time and produce captions.

It uses the Whisper.cpp project from ggerganov to run the Whisper network efficiently on CPUs and GPUs. For translation, it uses CTranslate2 and the M2M100 model.

If you use this plugin - let us know! We would love to feature your work/vids and showcase your success.

Check out our other plugins:

Background Removal removes the background from a webcam feed without a green screen.
Detect detects and tracks more than 80 types of objects in real time inside OBS.
URL/API Source fetches live data from an API and displays it in OBS.

If you are a broadcasting company or service looking to integrate local AI technology into your pipelines - reach out to inquire about our enterprise services.



More resources from royshilkrot


Latest updates

v0.3.6 Whisper Large v3 Turbo! Super-fast and powerful model

v0.3.5 - Upgrades, improvements - the good stuff!

v0.3.4 AMD GPU! CoreML and Metal! Accelerate!
