18:00:07.250: [macOS] Permission for audio device access granted. 18:00:07.253: [macOS] Permission for video device access granted. 18:00:07.259: [macOS] Permission for accessibility denied. 18:00:07.264: [macOS] Permission for screen capture granted. 18:00:07.264: CPU Name: Apple M1 Pro 18:00:07.264: Physical Cores: 10, Logical Cores: 10 18:00:07.264: Physical Memory: 16384MB Total 18:00:07.264: Model Identifier: MacBookPro18,3 18:00:07.264: OS Name: macOS 18:00:07.264: OS Version: Version 14.5 (Build 23F79) 18:00:07.264: Rosetta translation used: false 18:00:07.264: Kernel Version: 23.5.0 18:00:07.265: hotkeys-cocoa: Using layout 'com.apple.keylayout.USInternational-PC' 18:00:07.265: Current Date/Time: 2024-06-07, 18:00:07 18:00:07.265: Browser Hardware Acceleration: true 18:00:07.265: Qt Version: 6.6.2 (runtime), 6.6.2 (compiled) 18:00:07.265: Portable mode: false 18:00:07.368: OBS 30.1.2 (mac) 18:00:07.368: --------------------------------- 18:00:07.369: --------------------------------- 18:00:07.369: audio settings reset: 18:00:07.369: samples per sec: 48000 18:00:07.369: speakers: 2 18:00:07.369: max buffering: 960 milliseconds 18:00:07.369: buffering type: dynamically increasing 18:00:07.369: --------------------------------- 18:00:07.369: Initializing OpenGL... 18:00:07.402: Loading up OpenGL on adapter Apple Apple M1 Pro 18:00:07.402: OpenGL loaded successfully, version 4.1 Metal - 88.1, shading language 4.10 18:00:07.580: --------------------------------- 18:00:07.580: video settings reset: 18:00:07.580: base resolution: 1920x1080 18:00:07.580: output resolution: 1920x1080 18:00:07.580: downscale filter: Bicubic 18:00:07.580: fps: 60/1 18:00:07.580: format: NV12 18:00:07.580: YUV mode: Rec. 709/Partial 18:00:07.580: NV12 texture support not available 18:00:07.580: P010 texture support not available 18:00:07.582: Audio monitoring device: 18:00:07.582: name: Domyślne 18:00:07.582: id: default 18:00:07.582: --------------------------------- 18:00:07.584: No AJA devices found, skipping loading AJA UI plugin 18:00:07.584: Failed to initialize module 'aja-output-ui' 18:00:07.586: No AJA devices found, skipping loading AJA plugin 18:00:07.586: Failed to initialize module 'aja' 18:00:07.587: Failed to load 'en-US' text for module: 'decklink-captions' 18:00:07.588: Failed to load 'en-US' text for module: 'decklink-output-ui' 18:00:07.592: A DeckLink iterator could not be created. The DeckLink drivers may not be installed 18:00:07.592: Failed to initialize module 'decklink' 18:00:07.704: [obs-browser]: Version 2.23.4 18:00:07.704: [obs-browser]: CEF Version 103.0.5060.134 (runtime), 103.61.26+g3630089+chromium-103.0.5060.134 (compiled) 18:00:07.714: [obs-websocket] [obs_module_load] you can haz websockets (Version: 5.4.2 | RPC Version: 1) 18:00:07.714: [obs-websocket] [obs_module_load] Qt version (compile-time): 6.6.2 | Qt version (run-time): 6.6.2 18:00:07.714: [obs-websocket] [obs_module_load] Linked ASIO Version: 102900 18:00:07.716: [obs-websocket] [obs_module_load] Module loaded. 18:00:07.722: [vlc-video]: VLC 3.0.20 Vetinari found, VLC video source enabled 18:00:07.725: google_s2t_caption_plugin 0.28 obs_module_load 1273606138 18:00:07.734: [obs-localvocal] plugin loaded successfully (version 0.3.0) 18:00:07.735: QLayout: Attempting to add QLayout "" to QWidget "", which already has a layout 18:00:07.735: [obs-multi-rtmp] Load 4 targets, 0 video configs, 0 audio configs 18:00:07.735: [obs-multi-rtmp] Load config from /Users/stream360/Library/Application Support/obs-studio/basic/profiles/Bez tytułu/obs-multi-rtmp.json 18:00:07.790: [obs-multi-rtmp] version: 0.5.0.4 by SoraYuki https://github.com/sorayuki/obs-multi-rtmp/ 18:00:07.790: --------------------------------- 18:00:07.790: Loaded Modules: 18:00:07.790: obs-multi-rtmp 18:00:07.790: obs-localvocal 18:00:07.790: cloud-closed-captions 18:00:07.790: vlc-video 18:00:07.790: text-freetype2 18:00:07.790: rtmp-services 18:00:07.790: obs-x264 18:00:07.790: obs-websocket 18:00:07.790: obs-webrtc 18:00:07.790: obs-vst 18:00:07.790: obs-transitions 18:00:07.790: obs-outputs 18:00:07.790: obs-filters 18:00:07.790: obs-ffmpeg 18:00:07.790: obs-browser 18:00:07.790: mac-virtualcam 18:00:07.790: mac-videotoolbox 18:00:07.790: mac-syphon 18:00:07.790: mac-capture 18:00:07.790: mac-avcapture 18:00:07.790: mac-avcapture-legacy 18:00:07.790: image-source 18:00:07.790: frontend-tools 18:00:07.790: decklink-output-ui 18:00:07.790: decklink-captions 18:00:07.790: coreaudio-encoder 18:00:07.790: --------------------------------- 18:00:07.790: google_s2t_caption_plugin 0.28 obs_module_post_load 18:00:07.790: [VideoToolbox encoder]: Added VideoToolbox encoders 18:00:07.791: QWidget::setTabOrder: 'first' and 'second' must be in the same window 18:00:07.791: ==== Startup complete =============================================== 18:00:07.821: All scene data cleared 18:00:07.821: ------------------------------------------------ 18:00:07.897: Downmix enabled: 1 to 2 channels. 18:00:07.934: coreaudio: Device 'MacBook Pro Microphone' [48000 Hz] initialized 18:00:07.934: [Loaded global audio device]: 'Mikrofon/Wejście liniowe' 18:00:08.064: [obs-localvocal] filter defaults 18:00:08.064: [obs-localvocal] LocalVocal filter create 18:00:08.065: [obs-localvocal] LocalVocal filter update 18:00:08.065: [obs-localvocal] buffered_output disable 18:00:08.065: [obs-localvocal] Checking if model 'Whisper Tiny En' exists in data... 18:00:08.065: [obs-localvocal] Model folder found in data: /Users/stream360/Library/Application Support/obs-studio/plugins/obs-localvocal.plugin/Contents/Resources/models/ggml-model-whisper-tiny-en 18:00:08.065: [obs-localvocal] Model bin file found in folder: /Users/stream360/Library/Application Support/obs-studio/plugins/obs-localvocal.plugin/Contents/Resources/models/ggml-model-whisper-tiny-en/ggml-model-whisper-tiny.en.bin 18:00:08.108: [obs-localvocal] Loading whisper model from /Users/stream360/Library/Application Support/obs-studio/plugins/obs-localvocal.plugin/Contents/Resources/models/ggml-model-whisper-tiny-en/ggml-model-whisper-tiny.en.bin 18:00:08.108: [obs-localvocal] Using CPU for inference 18:00:08.108: [obs-localvocal] DTW token timestamps disabled 18:00:08.161: [obs-localvocal] Whisper model loaded: AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 18:00:08.161: [obs-localvocal] starting whisper thread 18:00:08.161: [obs-localvocal] filter defaults 18:00:08.171: [Media Source 'wrocradio']: settings: 18:00:08.171: input: https://1808506050.rsc.cdn77.org/live/studiorw/playlist.m3u8 18:00:08.171: input_format: 18:00:08.171: speed: 100 18:00:08.171: is_looping: no 18:00:08.171: is_linear_alpha: no 18:00:08.171: is_hw_decoding: yes 18:00:08.171: is_clear_on_media_end: yes 18:00:08.171: restart_on_activate: yes 18:00:08.171: close_when_inactive: no 18:00:08.171: full_decode: no 18:00:08.171: ffmpeg_options: 18:00:08.181: [Media Source 'szczradio']: settings: 18:00:08.181: input: https://1604878388.rsc.cdn77.org/live/live.sdp/playlist.m3u8 18:00:08.181: input_format: 18:00:08.181: speed: 100 18:00:08.181: is_looping: no 18:00:08.181: is_linear_alpha: no 18:00:08.181: is_hw_decoding: yes 18:00:08.181: is_clear_on_media_end: yes 18:00:08.181: restart_on_activate: yes 18:00:08.181: close_when_inactive: no 18:00:08.181: full_decode: no 18:00:08.181: ffmpeg_options: 18:00:08.192: [Media Source 'TOK']: settings: 18:00:08.192: input: https://radiostream.pl/tuba10-1.mp3 18:00:08.192: input_format: 18:00:08.192: speed: 100 18:00:08.192: is_looping: no 18:00:08.192: is_linear_alpha: no 18:00:08.192: is_hw_decoding: yes 18:00:08.192: is_clear_on_media_end: yes 18:00:08.192: restart_on_activate: yes 18:00:08.192: close_when_inactive: no 18:00:08.192: full_decode: no 18:00:08.192: ffmpeg_options: 18:00:08.203: [obs-localvocal] filter defaults 18:00:08.203: [obs-localvocal] LocalVocal filter create 18:00:08.203: [obs-localvocal] channels 2, frames 528000, sample_rate 48000 18:00:08.203: [obs-localvocal] setup audio resampler 18:00:08.203: [obs-localvocal] clear text source data 18:00:08.203: [obs-localvocal] clear paths and whisper context 18:00:08.203: [obs-localvocal] run update 18:00:08.203: [obs-localvocal] LocalVocal filter update 18:00:08.203: [obs-localvocal] buffered_output disable 18:00:08.203: [obs-localvocal] update text source 18:00:08.203: [obs-localvocal] update whisper model 18:00:08.203: [obs-localvocal] model path changed from to Whisper Large q5 (1Gb) 18:00:08.203: [obs-localvocal] shutdown_whisper_thread 18:00:08.203: [obs-localvocal] Checking if model 'Whisper Large q5' exists in data... 18:00:08.203: [obs-localvocal] Model not found in data: /Users/stream360/Library/Application Support/obs-studio/plugins/obs-localvocal.plugin/Contents/Resources/models/ggml-model-whisper-large-q5_0 18:00:08.203: [obs-localvocal] Checking if model 'Whisper Large q5' exists in config... 18:00:08.203: [obs-localvocal] Model path in config: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-large-q5_0 18:00:08.203: [obs-localvocal] Model exists in config folder: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-large-q5_0 18:00:08.203: [obs-localvocal] Model bin file found in folder: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-large-q5_0/ggml-model-whisper-large-q5_0.bin 18:00:08.204: [obs-localvocal] start_whisper_thread_with_path: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-large-q5_0/ggml-model-whisper-large-q5_0.bin 18:00:08.239: [obs-localvocal] Loading whisper model from /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-large-q5_0/ggml-model-whisper-large-q5_0.bin 18:00:08.239: [obs-localvocal] Using CPU for inference 18:00:08.239: [obs-localvocal] DTW token timestamps disabled 18:00:08.239: [obs-localvocal] Whisper: whisper_init_from_file_with_params_no_state: loading model from '/Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-large-q5_0/ggml-model-whisper-large-q5_0.bin' 18:00:08.239: [obs-localvocal] Whisper: whisper_init_with_params_no_state: use gpu = 0 18:00:08.239: [obs-localvocal] Whisper: whisper_init_with_params_no_state: flash attn = 0 18:00:08.239: [obs-localvocal] Whisper: whisper_init_with_params_no_state: gpu_device = 0 18:00:08.239: [obs-localvocal] Whisper: whisper_init_with_params_no_state: dtw = 0 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: loading model 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_vocab = 51865 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_audio_ctx = 1500 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_audio_state = 1280 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_audio_head = 20 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_audio_layer = 32 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_text_ctx = 448 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_text_state = 1280 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_text_head = 20 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_text_layer = 32 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: n_mels = 80 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: ftype = 8 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: qntvr = 1 18:00:08.239: [obs-localvocal] Whisper: whisper_model_load: type = 5 (large) 18:00:08.260: [obs-localvocal] Whisper: whisper_model_load: adding 1608 extra tokens 18:00:08.261: [obs-localvocal] Whisper: whisper_model_load: n_langs = 99 18:00:08.261: [obs-localvocal] Whisper: whisper_model_load: CPU total size = 1080.10 MB 18:00:08.585: [obs-localvocal] Whisper: whisper_model_load: model size = 1080.10 MB 18:00:08.602: [obs-localvocal] Whisper: whisper_init_state: kv self size = 251.66 MB 18:00:08.618: [obs-localvocal] Whisper: whisper_init_state: kv cross size = 251.66 MB 18:00:08.618: [obs-localvocal] Whisper: whisper_init_state: kv pad size = 7.86 MB 18:00:08.618: [obs-localvocal] Whisper: whisper_init_state: compute buffer (conv) = 34.82 MB 18:00:08.619: [obs-localvocal] Whisper: whisper_init_state: compute buffer (encode) = 926.66 MB 18:00:08.619: [obs-localvocal] Whisper: whisper_init_state: compute buffer (cross) = 9.38 MB 18:00:08.620: [obs-localvocal] Whisper: whisper_init_state: compute buffer (decode) = 213.19 MB 18:00:08.620: [obs-localvocal] Whisper model loaded: AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 18:00:08.620: [obs-localvocal] update whisper params 18:00:08.620: [obs-localvocal] filter created. 18:00:08.620: [obs-localvocal] starting whisper thread 18:00:08.620: [obs-localvocal] filter defaults 18:00:08.620: Source ID 'url_source' not found 18:00:08.620: Failed to create source 'URLtrans'! 18:00:08.623: [obs-localvocal] filter activated 18:00:08.624: Switched to scene 'Scena' 18:00:08.624: save_or_load_event_callback 0, 1273606138 18:00:08.635: unknown mute when setting: '' 18:00:08.638: dock: 260 260 fs: 12 18:00:08.638: target: 254 0 18:00:08.639: ------------------------------------------------ 18:00:08.639: Loaded scenes: 18:00:08.639: - scene 'Scena': 18:00:08.639: - source: 'TOK' (ffmpeg_source) 18:00:08.639: - monitoring: monitor and output 18:00:08.639: - filter: 'LocalVocal' (transcription_filter_audio_filter) 18:00:08.639: - source: 'YT01' (browser_source) 18:00:08.639: - monitoring: monitor and output 18:00:08.639: - filter: 'TranskrypcjaLV' (transcription_filter_audio_filter) 18:00:08.639: - source: 'YT03' (browser_source) 18:00:08.639: - monitoring: monitor and output 18:00:08.639: - source: 'wrocradio' (ffmpeg_source) 18:00:08.639: - monitoring: monitor and output 18:00:08.639: - source: 'szczradio' (ffmpeg_source) 18:00:08.639: - monitoring: monitor and output 18:00:08.639: - source: 'CC1213' (browser_source) 18:00:08.639: - source: 'OVER1' (browser_source) 18:00:08.639: - source: 'URLtrans' (url_source) 18:00:08.639: - source: 'LocalVocal Subtitles' (text_ft2_source_v2) 18:00:08.639: ------------------------------------------------ 18:00:08.709: OBS_FRONTEND_EVENT_FINISHED_LOADING, plugin_manager loaded: 1, 6.6.2 18:00:08.709: enabled: 0, is_streaming 0, streaming_output_enabled 1, streaming_transcripts_enabled 0, is_streaming_relevant: 0, is_recording 0, recording_output_enabled 0, recording_transcripts_enabled 0, is_recording_relevant: 0, is_virtualcam_on 0, virtualcam_transcripts_enabled 0, is_virtualcam_relevant 0, is_preview_open 0, is_text_output_relevant 0, scene_collection_name: , source: 'TOK', equal_settings 1, do_captioning 0 18:00:08.709: settings changed, disabling captioning 18:00:08.828: [mac-virtualcam] macOS Camera Extension activated successfully. 18:00:13.447: [obs-localvocal] vad based segmentation. currently 105260 bytes in the audio input buffer 18:00:13.448: [obs-localvocal] found 26315 frames from info buffer. 0 in overlap 18:00:13.449: [obs-localvocal] resampled: 2 channels, 8755 frames, 547.187500 ms 18:00:13.451: [obs-localvocal] VAD segment 0. pushed 448 to 8755 (8307 frames / 519 ms). current size: 33228 bytes / 8307 frames / 519 ms 18:00:13.451: [obs-localvocal] end not reached. vad state: start ts: 18446742355933548174, end ts: 18446742355933548668 18:00:13.985: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:13.985: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:13.986: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:13.989: [obs-localvocal] VAD detected no speech in 8777 frames 18:00:13.989: [obs-localvocal] Last VAD was ON: segment end -> send to inference 18:00:13.989: [obs-localvocal] run_whisper_inference: processing 8627 samples, 0.539 sec, 4 threads 18:00:13.989: [obs-localvocal] Speech segment is less than 1 second, padding with zeros to 1 second 18:00:18.003: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:00:18.003: [obs-localvocal] S 0, Token 1: 42185 Nast p: 1.000 [keep: 1] 18:00:18.003: [obs-localvocal] S 0, Token 2: 64 a p: 1.000 [keep: 1] 18:00:18.003: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 1.010. Ratio: 1.960. 18:00:18.003: [obs-localvocal] S 0, Token 3: 50464 [_TT_100] p: 0.438 [keep: 0] 18:00:18.003: [obs-localvocal] Decoded sentence: ' Nasta' 18:00:18.058: [obs-localvocal] vad based segmentation. currently 782420 bytes in the audio input buffer 18:00:18.058: [obs-localvocal] found 195605 frames from info buffer. 0 in overlap 18:00:18.061: [obs-localvocal] resampled: 2 channels, 65202 frames, 4075.125244 ms 18:00:18.069: [obs-localvocal] VAD segment 0. pushed 2496 to 65202 (62706 frames / 3919 ms). current size: 250824 bytes / 62706 frames / 3919 ms 18:00:18.069: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933553292 18:00:18.596: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:18.596: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:18.597: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:18.599: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 284260 bytes / 71065 frames / 4441 ms 18:00:18.599: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933553814 18:00:19.132: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:19.133: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:19.134: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:19.137: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 319368 bytes / 79842 frames / 4990 ms 18:00:19.137: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933554363 18:00:19.382: User is ignoring service bitrate limits. 18:00:19.461: [VideoToolbox simple_video_stream: 'h264']: session created with hardware encoding 18:00:19.482: [VideoToolbox simple_video_stream: 'h264']: settings: 18:00:19.482: vt_encoder_id com.apple.videotoolbox.videoencoder.ave.avc 18:00:19.482: rate_control: CBR 18:00:19.482: bitrate: 4000 (kbps) 18:00:19.482: quality: 0.600000 18:00:19.482: fps_num: 60 18:00:19.482: fps_den: 1 18:00:19.482: width: 1920 18:00:19.482: height: 1080 18:00:19.482: keyint: 2 (s) 18:00:19.482: limit_bitrate: off 18:00:19.482: rc_max_bitrate: 2500 (kbps) 18:00:19.482: rc_max_bitrate_window: 1.500000 (s) 18:00:19.482: hw_enc: on 18:00:19.482: profile: high 18:00:19.482: codec_type: h264 18:00:19.482: 18:00:19.485: [CoreAudio AAC: 'simple_aac']: settings: 18:00:19.485: mode: AAC 18:00:19.485: bitrate: 160 18:00:19.485: sample rate: 48000 18:00:19.485: cbr: on 18:00:19.485: output buffer: 1536 18:00:19.485: [rtmp stream: 'multi-output'] Connecting to RTMP URL rtmp://7001.szczecin.pl/0live... 18:00:19.485: [rtmp stream: 'simple_stream'] Connecting to RTMP URL rtmp://7001.szczecin.pl/0live... 18:00:19.486: save_or_load_event_callback 1, 1273606138 18:00:19.486: obs save event 18:00:19.676: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:19.676: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:19.676: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:19.677: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 354476 bytes / 88619 frames / 5538 ms 18:00:19.677: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933554911 18:00:19.758: [rtmp stream: 'simple_stream'] Connection to rtmp://7001.szczecin.pl/0live (57.128.199.81) successful 18:00:19.758: [rtmp stream: 'multi-output'] Connection to rtmp://7001.szczecin.pl/0live (57.128.199.81) successful 18:00:19.760: stream_started_event 18:00:19.760: enabled: 0, is_streaming 1, streaming_output_enabled 1, streaming_transcripts_enabled 0, is_streaming_relevant: 1, is_recording 0, recording_output_enabled 0, recording_transcripts_enabled 0, is_recording_relevant: 0, is_virtualcam_on 0, virtualcam_transcripts_enabled 0, is_virtualcam_relevant 0, is_preview_open 0, is_text_output_relevant 0, scene_collection_name: , source: 'TOK', equal_settings 1, do_captioning 0 18:00:19.760: settings changed, disabling captioning 18:00:19.760: caption_output_writer_loop streaming starting 18:00:19.766: ==== Streaming Start =============================================== 18:00:20.216: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:20.216: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:20.217: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:00:20.219: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 387916 bytes / 96979 frames / 6061 ms 18:00:20.219: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933555434 18:00:20.746: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:20.746: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:20.747: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:20.751: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 423024 bytes / 105756 frames / 6609 ms 18:00:20.751: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933555982 18:00:21.286: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:21.286: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:21.287: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:21.290: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 456460 bytes / 114115 frames / 7132 ms 18:00:21.290: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933556505 18:00:21.819: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:21.819: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:21.820: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:21.823: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 491568 bytes / 122892 frames / 7680 ms 18:00:21.823: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933557053 18:00:22.360: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:22.360: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:22.361: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:22.364: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 525004 bytes / 131251 frames / 8203 ms 18:00:22.364: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933557576 18:00:22.888: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:22.888: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:22.889: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:22.892: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 560112 bytes / 140028 frames / 8751 ms 18:00:22.892: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933558124 18:00:23.427: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:23.427: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:23.428: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:00:23.431: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 593552 bytes / 148388 frames / 9274 ms 18:00:23.431: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933558647 18:00:23.961: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:23.962: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:23.962: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:23.966: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 628660 bytes / 157165 frames / 9822 ms 18:00:23.966: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933559195 18:00:24.512: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:24.512: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:24.512: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:24.514: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 663768 bytes / 165942 frames / 10371 ms 18:00:24.514: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933559744 18:00:25.048: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:25.048: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:25.049: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:25.052: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 697204 bytes / 174301 frames / 10893 ms 18:00:25.052: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933560266 18:00:25.590: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:25.590: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:25.591: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:25.595: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 732312 bytes / 183078 frames / 11442 ms 18:00:25.595: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933560815 18:00:26.125: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:26.125: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:26.126: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:26.129: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 765748 bytes / 191437 frames / 11964 ms 18:00:26.129: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933561337 18:00:26.671: [obs-localvocal] vad based segmentation. currently 110340 bytes in the audio input buffer 18:00:26.671: [obs-localvocal] found 27585 frames from info buffer. 0 in overlap 18:00:26.672: [obs-localvocal] resampled: 2 channels, 9195 frames, 574.687500 ms 18:00:26.673: [obs-localvocal] VAD segment 0. pushed 0 to 9195 (9195 frames / 574 ms). current size: 802528 bytes / 200632 frames / 12539 ms 18:00:26.673: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933561912 18:00:27.208: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:27.208: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:27.209: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:00:27.213: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 835968 bytes / 208992 frames / 13062 ms 18:00:27.213: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933562435 18:00:27.747: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:27.747: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:27.748: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:27.751: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 871076 bytes / 217769 frames / 13610 ms 18:00:27.751: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933562983 18:00:28.288: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:28.289: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:28.290: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:28.297: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 904512 bytes / 226128 frames / 14133 ms 18:00:28.298: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933563506 18:00:28.835: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:28.835: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:28.836: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:28.838: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 939620 bytes / 234905 frames / 14681 ms 18:00:28.838: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933564054 18:00:29.376: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:29.376: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:29.381: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:29.384: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 974728 bytes / 243682 frames / 15230 ms 18:00:29.384: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933564603 18:00:29.916: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:29.916: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:29.917: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:29.920: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1009836 bytes / 252459 frames / 15778 ms 18:00:29.920: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933565151 18:00:30.462: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:30.462: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:30.463: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:00:30.464: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 1044948 bytes / 261237 frames / 16327 ms 18:00:30.465: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933565700 18:00:30.996: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:30.996: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:30.997: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:31.000: [obs-localvocal] VAD segment 0. pushed 448 to 8359 (7911 frames / 494 ms). current size: 1076592 bytes / 269148 frames / 16821 ms 18:00:31.000: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933566222 18:00:31.533: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:31.533: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:31.534: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:31.537: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1111700 bytes / 277925 frames / 17370 ms 18:00:31.537: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933566771 18:00:32.086: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:32.087: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:32.087: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:32.090: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1146808 bytes / 286702 frames / 17918 ms 18:00:32.090: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933567319 18:00:32.626: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:32.626: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:32.630: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:32.634: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 1180244 bytes / 295061 frames / 18441 ms 18:00:32.634: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933567842 18:00:33.180: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:33.181: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:33.182: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:33.184: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1215352 bytes / 303838 frames / 18989 ms 18:00:33.184: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933568390 18:00:33.719: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:33.719: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:33.720: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:33.723: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1250460 bytes / 312615 frames / 19538 ms 18:00:33.723: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933568939 18:00:34.251: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:34.251: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:34.251: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:00:34.254: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 1285572 bytes / 321393 frames / 20087 ms 18:00:34.254: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933569488 18:00:34.793: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:34.793: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:34.794: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:34.796: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 1319008 bytes / 329752 frames / 20609 ms 18:00:34.796: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933570010 18:00:35.335: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:35.335: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:35.336: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:35.339: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1354116 bytes / 338529 frames / 21158 ms 18:00:35.339: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933570559 18:00:35.883: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:35.883: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:35.884: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:35.887: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1389224 bytes / 347306 frames / 21706 ms 18:00:35.887: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933571107 18:00:36.416: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:36.416: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:36.417: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:36.420: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 1422660 bytes / 355665 frames / 22229 ms 18:00:36.420: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933571630 18:00:36.961: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:36.962: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:36.963: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:36.967: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1457768 bytes / 364442 frames / 22777 ms 18:00:36.967: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933572178 18:00:37.502: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:37.502: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:37.503: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:00:37.506: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 1492880 bytes / 373220 frames / 23326 ms 18:00:37.506: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933572727 18:00:38.043: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:38.043: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:38.043: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:38.046: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1527988 bytes / 381997 frames / 23874 ms 18:00:38.046: [obs-localvocal] end not reached. vad state: start ts: 18446742355933549399, end ts: 18446742355933573275 18:00:38.580: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:38.580: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:38.581: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:38.583: [obs-localvocal] VAD segment 0. pushed 0 to 6720 (6720 frames / 420 ms). current size: 1554868 bytes / 388717 frames / 24294 ms 18:00:38.583: [obs-localvocal] VAD segment end -> send to inference 18:00:38.584: [obs-localvocal] run_whisper_inference: processing 389037 samples, 24.315 sec, 4 threads 18:00:42.592: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:00:42.592: [obs-localvocal] S 0, Token 1: 8232 TO p: 0.978 [keep: 1] 18:00:42.592: [obs-localvocal] S 0, Token 2: 42 K p: 1.000 [keep: 1] 18:00:42.592: [obs-localvocal] S 0, Token 3: 13898 360 p: 1.000 [keep: 1] 18:00:42.592: [obs-localvocal] S 0, Token 4: 13 . p: 0.984 [keep: 1] 18:00:42.592: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 24.314. Ratio: 12.280. 18:00:42.592: [obs-localvocal] Time token ratio too high, skipping 18:00:42.648: [obs-localvocal] vad based segmentation. currently 782420 bytes in the audio input buffer 18:00:42.648: [obs-localvocal] found 195605 frames from info buffer. 0 in overlap 18:00:42.650: [obs-localvocal] resampled: 2 channels, 65202 frames, 4075.125244 ms 18:00:42.658: [obs-localvocal] VAD segment 0. pushed 2496 to 65202 (62706 frames / 3919 ms). current size: 250824 bytes / 62706 frames / 3919 ms 18:00:42.658: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933577873 18:00:43.190: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:43.190: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:43.191: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:43.192: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 285932 bytes / 71483 frames / 4467 ms 18:00:43.192: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933578421 18:00:43.733: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:43.733: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:43.733: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:43.735: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 321040 bytes / 80260 frames / 5016 ms 18:00:43.735: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933578970 18:00:44.274: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:44.274: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:44.275: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:44.278: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 354476 bytes / 88619 frames / 5538 ms 18:00:44.278: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933579492 18:00:44.815: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:44.815: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:44.816: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:44.819: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 389584 bytes / 97396 frames / 6087 ms 18:00:44.819: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933580041 18:00:45.353: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:45.353: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:45.353: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:45.355: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 424692 bytes / 106173 frames / 6635 ms 18:00:45.355: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933580590 18:00:45.888: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:45.888: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:45.889: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:45.892: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 458128 bytes / 114532 frames / 7158 ms 18:00:45.892: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933581112 18:00:46.431: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:46.432: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:46.432: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:00:46.435: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 493240 bytes / 123310 frames / 7706 ms 18:00:46.435: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933581661 18:00:46.975: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:46.975: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:46.976: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:46.979: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 528348 bytes / 132087 frames / 8255 ms 18:00:46.979: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933582209 18:00:47.525: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:47.525: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:47.526: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:47.529: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 563456 bytes / 140864 frames / 8804 ms 18:00:47.529: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933582758 18:00:48.079: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:48.079: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:48.081: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:48.084: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 598564 bytes / 149641 frames / 9352 ms 18:00:48.084: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933583306 18:00:48.614: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:48.614: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:48.615: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:48.616: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 633672 bytes / 158418 frames / 9901 ms 18:00:48.616: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933583855 18:00:49.154: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:49.155: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:49.155: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:49.158: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 667108 bytes / 166777 frames / 10423 ms 18:00:49.158: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933584377 18:00:49.695: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:49.695: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:49.697: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:00:49.700: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 702220 bytes / 175555 frames / 10972 ms 18:00:49.700: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933584926 18:00:50.245: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:50.246: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:50.246: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:50.250: [obs-localvocal] VAD segment 0. pushed 5568 to 8777 (3209 frames / 200 ms). current size: 715056 bytes / 178764 frames / 11172 ms 18:00:50.250: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933585475 18:00:50.782: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:50.782: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:50.783: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:50.786: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 748492 bytes / 187123 frames / 11695 ms 18:00:50.786: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933585997 18:00:51.331: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:51.331: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:51.333: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:51.336: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 783600 bytes / 195900 frames / 12243 ms 18:00:51.336: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933586546 18:00:51.879: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:51.879: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:51.880: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:51.883: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 818708 bytes / 204677 frames / 12792 ms 18:00:51.883: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933587094 18:00:52.425: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:52.425: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:52.428: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:52.431: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 853816 bytes / 213454 frames / 13340 ms 18:00:52.431: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933587643 18:00:52.974: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:52.974: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:52.975: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:52.978: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 888924 bytes / 222231 frames / 13889 ms 18:00:52.978: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933588191 18:00:53.516: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:53.516: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:53.517: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:00:53.520: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 924036 bytes / 231009 frames / 14438 ms 18:00:53.520: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933588740 18:00:54.049: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:54.050: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:54.050: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:54.053: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 957472 bytes / 239368 frames / 14960 ms 18:00:54.053: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933589262 18:00:54.593: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:54.595: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:54.596: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:54.599: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 992580 bytes / 248145 frames / 15509 ms 18:00:54.599: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933589811 18:00:55.097: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:00:55.097: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:00:55.099: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:55.105: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 1026016 bytes / 256504 frames / 16031 ms 18:00:55.105: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933590333 18:00:55.638: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:00:55.639: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:00:55.639: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:00:55.643: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 1059452 bytes / 264863 frames / 16553 ms 18:00:55.643: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933590856 18:00:56.184: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:00:56.184: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:00:56.185: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:00:56.188: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1094560 bytes / 273640 frames / 17102 ms 18:00:56.188: [obs-localvocal] end not reached. vad state: start ts: 18446742355933573980, end ts: 18446742355933591404 18:00:56.731: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:00:56.731: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:00:56.732: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:00:56.735: [obs-localvocal] VAD segment 0. pushed 0 to 8768 (8768 frames / 548 ms). current size: 1129632 bytes / 282408 frames / 17650 ms 18:00:56.735: [obs-localvocal] VAD segment end -> send to inference 18:00:56.736: [obs-localvocal] run_whisper_inference: processing 282728 samples, 17.670 sec, 4 threads 18:01:02.532: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:02.532: [obs-localvocal] S 0, Token 1: 5041 Tom p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 2: 19601 asz p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 3: 5637 Ber p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 4: 74 k p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 5: 1684 iet p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 6: 64 a p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 7: 1714 aw p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 8: 599 ans p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 9: 30105 ował p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 10: 360 do p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 11: 962 fin p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 12: 64 a p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 13: 24066 łu p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 14: 8156 jun p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 15: 9337 iors p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 16: 42349 kiego p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 17: 5522 French p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 18: 7238 Open p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 19: 11 , p: 0.916 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 20: 17335 mamy p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 21: 16677 więc p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 22: 274 d p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 23: 6120 wo p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 24: 2884 je p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 25: 45002 naszych p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 26: 1085 rep p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 27: 265 re p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 28: 14185 zent p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 29: 394 ant p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 30: 3901 ów p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 31: 261 w p: 1.000 [keep: 1] 18:01:02.532: [obs-localvocal] S 0, Token 32: 962 fin p: 1.000 [keep: 1] 18:01:02.532: Last log entry repeated for 60 more lines 18:01:02.532: [obs-localvocal] Time token found 51244 -> 17.580. Duration: 17.670. Ratio: 1.005. 18:01:02.532: [obs-localvocal] S 0, Token 93: 51244 [_TT_880] p: 0.816 [keep: 0] 18:01:02.532: [obs-localvocal] Decoded sentence: ' Tomasz Berkieta awansował do finału juniorskiego French Open, mamy więc dwoje naszych reprezentantów w finałach i o tym prosto z kortów Rolanda Garosa w Sporcie 360. Po 19 dziś Przemysław Pozowski, a to jest Tok 360. Podsumowanie dnia w Radiu Tok FM, pierwszym radiu informacyjnym' 18:01:02.588: [obs-localvocal] vad based segmentation. currently 1123472 bytes in the audio input buffer 18:01:02.588: [obs-localvocal] found 280868 frames from info buffer. 0 in overlap 18:01:02.591: [obs-localvocal] resampled: 2 channels, 93622 frames, 5851.375000 ms 18:01:02.602: [obs-localvocal] VAD segment 0. pushed 448 to 21504 (21056 frames / 1316 ms). current size: 84224 bytes / 21056 frames / 1316 ms 18:01:02.602: [obs-localvocal] VAD segment end -> send to inference 18:01:02.602: [obs-localvocal] run_whisper_inference: processing 21376 samples, 1.336 sec, 4 threads 18:01:06.606: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:06.606: [obs-localvocal] S 0, Token 1: 7938 Adam p: 1.000 [keep: 1] 18:01:06.606: [obs-localvocal] S 0, Token 2: 29843 Oz p: 1.000 [keep: 1] 18:01:06.606: [obs-localvocal] S 0, Token 3: 3680 ga p: 1.000 [keep: 1] 18:01:06.606: [obs-localvocal] S 0, Token 4: 11 , p: 1.000 [keep: 1] 18:01:06.606: [obs-localvocal] S 0, Token 5: 47568 dzień p: 1.000 [keep: 1] 18:01:06.606: [obs-localvocal] S 0, Token 6: 35884 dobry p: 1.000 [keep: 1] 18:01:06.606: [obs-localvocal] S 0, Token 7: 13 . p: 1.000 [keep: 0] 18:01:06.606: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 1.336. Ratio: 1.482. 18:01:06.606: [obs-localvocal] S 0, Token 8: 50464 [_TT_100] p: 1.000 [keep: 0] 18:01:06.606: [obs-localvocal] Decoded sentence: ' Adam Ozga, dzień dobry' 18:01:06.606: [obs-localvocal] VAD segment 1. pushed 21504 to 34816 (13312 frames / 832 ms). current size: 53248 bytes / 13312 frames / 832 ms 18:01:06.606: [obs-localvocal] VAD segment end -> send to inference 18:01:06.606: [obs-localvocal] run_whisper_inference: processing 13632 samples, 0.852 sec, 4 threads 18:01:06.606: [obs-localvocal] Speech segment is less than 1 second, padding with zeros to 1 second 18:01:19.881: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 0.999 [keep: 0] 18:01:19.881: [obs-localvocal] S 0, Token 1: 220 p: 0.193 [keep: 1] 18:01:19.881: [obs-localvocal] S 0, Token 2: 172 p: 0.453 [keep: 1] 18:01:19.881: [obs-localvocal] S 0, Token 3: 253 p: 1.000 [keep: 1] 18:01:19.881: [obs-localvocal] S 0, Token 4: 97 p: 0.235 [keep: 1] 18:01:19.881: [obs-localvocal] S 0, Token 5: 105 p: 0.495 [keep: 1] 18:01:19.881: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 1.010. Ratio: 1.960. 18:01:19.881: [obs-localvocal] S 0, Token 6: 50464 [_TT_100] p: 0.808 [keep: 0] 18:01:19.881: [obs-localvocal] Decoded sentence: ' 🤬' 18:01:19.881: [obs-localvocal] VAD segment 2. pushed 34816 to 93622 (58806 frames / 3675 ms). current size: 235224 bytes / 58806 frames / 3675 ms 18:01:19.881: [obs-localvocal] VAD segment end -> send to inference 18:01:19.881: [obs-localvocal] run_whisper_inference: processing 59126 samples, 3.695 sec, 4 threads 18:01:24.577: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:24.577: [obs-localvocal] S 0, Token 1: 426 N p: 0.998 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 2: 4715 AP p: 0.946 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 3: 2343 IS p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 4: 56 Y p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 5: 413 D p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 6: 11435 LA p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 7: 18482 NI p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 8: 2358 ES p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 9: 129 p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 10: 223 p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 11: 56 Y p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 12: 50 S p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 13: 57 Z p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 14: 128 p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 15: 226 p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 16: 34 C p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 17: 56 Y p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 18: 5462 CH p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 19: 11 , p: 0.999 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 20: 316 A p: 0.997 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 21: 343 W p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 22: 40 I p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 23: 128 p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 24: 246 p: 0.998 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 25: 4969 CE p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 26: 41 J p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 27: 430 P p: 0.965 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 28: 3750 RA p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 29: 54 W p: 1.000 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 30: 34891 DY p: 0.982 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 31: 11 , p: 0.998 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 32: 591 K p: 0.848 [keep: 1] 18:01:24.577: Last log entry repeated for 4 more lines 18:01:24.577: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 3.695. Ratio: 1.866. 18:01:24.577: [obs-localvocal] S 0, Token 37: 50464 [_TT_100] p: 0.993 [keep: 0] 18:01:24.577: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 3.695. Ratio: 1.866. 18:01:24.577: [obs-localvocal] S 0, Token 38: 50464 [_TT_100] p: 0.571 [keep: 0] 18:01:24.577: [obs-localvocal] S 0, Token 39: 479 F p: 0.972 [keep: 1] 18:01:24.577: [obs-localvocal] S 0, Token 40: 9443 OK p: 0.999 [keep: 1] 18:01:24.577: [obs-localvocal] Time token found 50564 -> 3.980. Duration: 3.695. Ratio: 1.077. 18:01:24.577: [obs-localvocal] S 0, Token 41: 50564 [_TT_200] p: 0.698 [keep: 0] 18:01:24.577: [obs-localvocal] Decoded sentence: ' NAPISY DLA NIESŁYSZĄCYCH, A WIĘCEJ PRAWDY, KONIEC! FOK' 18:01:24.632: [obs-localvocal] vad based segmentation. currently 4233092 bytes in the audio input buffer 18:01:24.633: [obs-localvocal] found 478981 frames from info buffer. 0 in overlap 18:01:24.641: [obs-localvocal] resampled: 2 channels, 159661 frames, 9978.812500 ms 18:01:24.659: [obs-localvocal] VAD segment 0. pushed 1472 to 22528 (21056 frames / 1316 ms). current size: 84224 bytes / 21056 frames / 1316 ms 18:01:24.659: [obs-localvocal] VAD segment end -> send to inference 18:01:24.659: [obs-localvocal] run_whisper_inference: processing 21376 samples, 1.336 sec, 4 threads 18:01:28.561: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:28.561: [obs-localvocal] S 0, Token 1: 13898 360 p: 1.000 [keep: 1] 18:01:28.561: [obs-localvocal] S 0, Token 2: 13 . p: 1.000 [keep: 0] 18:01:28.561: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 1.336. Ratio: 1.482. 18:01:28.561: [obs-localvocal] S 0, Token 3: 50464 [_TT_100] p: 1.000 [keep: 0] 18:01:28.561: [obs-localvocal] Decoded sentence: ' 360' 18:01:28.561: [obs-localvocal] VAD segment 1. pushed 22528 to 111616 (89088 frames / 5568 ms). current size: 356352 bytes / 89088 frames / 5568 ms 18:01:28.561: [obs-localvocal] VAD segment end -> send to inference 18:01:28.562: [obs-localvocal] run_whisper_inference: processing 89408 samples, 5.588 sec, 4 threads 18:01:33.117: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:33.117: [obs-localvocal] S 0, Token 1: 6001 Pre p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 2: 1229 zy p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 3: 67 d p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 4: 317 ent p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 5: 400 And p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 6: 13503 rze p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 7: 73 j p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 8: 413 D p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 9: 11152 uda p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 10: 261 w p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 11: 8107 tym p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 12: 40883 momencie p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 13: 2838 nie p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 14: 27486 widz p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 15: 8908 iał p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 16: 44585 zasad p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 17: 77 n p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 18: 18811 ienia p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 19: 12285 dla p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 20: 3611 od p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 21: 6120 wo p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 22: 1221 ł p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 23: 5609 ania p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 24: 447 pro p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 25: 33503 kur p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 26: 1639 ator p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 27: 64 a p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 28: 5041 Tom p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 29: 296 as p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 30: 2394 za p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 31: 13048 Jane p: 1.000 [keep: 1] 18:01:33.117: [obs-localvocal] S 0, Token 32: 3689 cz p: 1.000 [keep: 1] 18:01:33.117: Last log entry repeated for 2 more lines 18:01:33.117: [obs-localvocal] Time token found 50644 -> 5.580. Duration: 5.588. Ratio: 1.001. 18:01:33.117: [obs-localvocal] S 0, Token 35: 50644 [_TT_280] p: 0.615 [keep: 0] 18:01:33.117: [obs-localvocal] Decoded sentence: ' Prezydent Andrzej Duda w tym momencie nie widział zasadnienia dla odwołania prokuratora Tomasza Janeczka' 18:01:33.117: [obs-localvocal] VAD segment 2. pushed 111616 to 159661 (48045 frames / 3002 ms). current size: 192180 bytes / 48045 frames / 3002 ms 18:01:33.117: [obs-localvocal] end not reached. vad state: start ts: 18446742355933604806, end ts: 0 18:01:33.169: [obs-localvocal] vad based segmentation. currently 3957240 bytes in the audio input buffer 18:01:33.170: [obs-localvocal] found 478982 frames from info buffer. 0 in overlap 18:01:33.176: [obs-localvocal] resampled: 2 channels, 159660 frames, 9978.750000 ms 18:01:33.194: [obs-localvocal] VAD segment 0. pushed 0 to 63488 (63488 frames / 3968 ms). current size: 446132 bytes / 111533 frames / 6970 ms 18:01:33.194: [obs-localvocal] VAD segment end -> send to inference 18:01:33.194: [obs-localvocal] run_whisper_inference: processing 111853 samples, 6.991 sec, 4 threads 18:01:38.145: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:38.145: [obs-localvocal] S 0, Token 1: 1346 De p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 2: 1344 cy p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 3: 89 z p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 4: 11115 ję p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 5: 277 o p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 6: 3611 od p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 7: 6120 wo p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 8: 1221 ł p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 9: 25849 aniu p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 10: 5360 og p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 11: 1221 ł p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 12: 21521 osi p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 13: 1221 ł p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 14: 31981 dzi p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 15: 1788 ś p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 16: 12689 premier p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 17: 8632 Donald p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 18: 42026 Tus p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 19: 74 k p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 20: 26949 reag p: 0.708 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 21: 44733 ując p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 22: 1667 na p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 23: 22734 spraw p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 24: 1274 ę p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 25: 35802 zat p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 26: 13047 rzy p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 27: 37268 mania p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 28: 741 i p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 29: 2183 post p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 30: 1607 aw p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 31: 18811 ienia p: 1.000 [keep: 1] 18:01:38.145: [obs-localvocal] S 0, Token 32: 22675 zar p: 1.000 [keep: 1] 18:01:38.145: Last log entry repeated for 21 more lines 18:01:38.145: [obs-localvocal] Time token found 50714 -> 6.980. Duration: 6.990. Ratio: 1.001. 18:01:38.145: [obs-localvocal] S 0, Token 54: 50714 [_TT_350] p: 1.000 [keep: 0] 18:01:38.145: [obs-localvocal] Decoded sentence: ' Decyzję o odwołaniu ogłosił dziś premier Donald Tusk reagując na sprawę zatrzymania i postawienia zarzutów żołnierzom na polsko-białoruskiej granicy' 18:01:38.145: [obs-localvocal] VAD segment 1. pushed 63488 to 159660 (96172 frames / 6010 ms). current size: 384688 bytes / 96172 frames / 6010 ms 18:01:38.145: [obs-localvocal] end not reached. vad state: start ts: 18446742355933611777, end ts: 0 18:01:38.200: [obs-localvocal] vad based segmentation. currently 3009304 bytes in the audio input buffer 18:01:38.200: [obs-localvocal] found 478981 frames from info buffer. 0 in overlap 18:01:38.206: [obs-localvocal] resampled: 2 channels, 159661 frames, 9978.812500 ms 18:01:38.224: [obs-localvocal] VAD segment 0. pushed 0 to 129024 (129024 frames / 8064 ms). current size: 900784 bytes / 225196 frames / 14074 ms 18:01:38.224: [obs-localvocal] VAD segment end -> send to inference 18:01:38.224: [obs-localvocal] run_whisper_inference: processing 225516 samples, 14.095 sec, 4 threads 18:01:43.569: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:43.569: [obs-localvocal] S 0, Token 1: 12646 Pod p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 2: 8555 ją p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 3: 11126 łem p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 4: 979 dec p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 5: 37433 yz p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 6: 11115 ję p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 7: 1667 na p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 8: 261 w p: 0.783 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 9: 3722 ni p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 10: 541 ose p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 11: 74 k p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 12: 16182 minist p: 0.968 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 13: 424 ra p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 14: 22734 spraw p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 15: 1091 ied p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 16: 2081 li p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 17: 36476 wości p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 18: 277 o p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 19: 3611 od p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 20: 6120 wo p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 21: 1221 ł p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 22: 25849 aniu p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 23: 447 pro p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 24: 33503 kur p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 25: 1639 ator p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 26: 64 a p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 27: 5041 Tom p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 28: 296 as p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 29: 2394 za p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 30: 13048 Jane p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 31: 3689 cz p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 32: 2330 ka p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 33: 11 , p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 34: 36746 zast p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 35: 18085 ęp p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 36: 37965 cę p: 0.996 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 37: 447 pro p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 38: 33503 kur p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 39: 1639 ator p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 40: 64 a p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 41: 2674 general p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 42: 11858 nego p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 43: 11 , p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 44: 24314 odpow p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 45: 15338 iedz p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 46: 831 ial p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 47: 11858 nego p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 48: 7949 za p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 49: 447 pro p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 50: 33503 kur p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 51: 19493 atur p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 52: 1274 ę p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 53: 6020 wo p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 54: 32625 jsk p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 55: 30297 ową p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 56: 13 . p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 57: 1407 To p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 58: 3611 od p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 59: 6120 wo p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 60: 1221 ł p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 61: 7155 anie p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 62: 29764 wym p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 63: 9286 aga p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 64: 14168 jeszcze p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 65: 40948 zg p: 0.999 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 66: 843 ody p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 67: 47296 pana p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 68: 659 pre p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 69: 1229 zy p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 70: 67 d p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 71: 8938 enta p: 1.000 [keep: 1] 18:01:43.569: [obs-localvocal] S 0, Token 72: 13 . p: 1.000 [keep: 0] 18:01:43.569: [obs-localvocal] Time token found 51064 -> 13.980. Duration: 14.094. Ratio: 1.008. 18:01:43.569: [obs-localvocal] S 0, Token 73: 51064 [_TT_700] p: 1.000 [keep: 0] 18:01:43.569: [obs-localvocal] Decoded sentence: ' Podjąłem decyzję na wniosek ministra sprawiedliwości o odwołaniu prokuratora Tomasza Janeczka, zastępcę prokuratora generalnego, odpowiedzialnego za prokuraturę wojskową. To odwołanie wymaga jeszcze zgody pana prezydenta' 18:01:43.569: [obs-localvocal] VAD segment 1. pushed 129024 to 159661 (30637 frames / 1914 ms). current size: 122548 bytes / 30637 frames / 1914 ms 18:01:43.569: [obs-localvocal] end not reached. vad state: start ts: 18446742355933625852, end ts: 0 18:01:43.624: [obs-localvocal] vad based segmentation. currently 2131592 bytes in the audio input buffer 18:01:43.624: [obs-localvocal] found 478981 frames from info buffer. 0 in overlap 18:01:43.631: [obs-localvocal] resampled: 2 channels, 159660 frames, 9978.750000 ms 18:01:43.649: [obs-localvocal] VAD segment 0. pushed 0 to 102400 (102400 frames / 6400 ms). current size: 532148 bytes / 133037 frames / 8314 ms 18:01:43.649: [obs-localvocal] VAD segment end -> send to inference 18:01:43.649: [obs-localvocal] run_whisper_inference: processing 133357 samples, 8.335 sec, 4 threads 18:01:48.667: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:48.667: [obs-localvocal] S 0, Token 1: 29804 Ż p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 2: 78 o p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 3: 1221 ł p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 4: 19165 nier p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 5: 1381 ze p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 6: 35802 zat p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 7: 13047 rzy p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 8: 1696 ma p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 9: 5248 ń p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 10: 714 po p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 11: 9308 ak p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 12: 19649 cji p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 13: 6501 przy p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 14: 9370 gran p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 15: 2632 icy p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 16: 710 z p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 17: 363 B p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 18: 8908 iał p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 19: 26867 orus p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 20: 11404 ią p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 21: 6393 wed p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 22: 34077 ług p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 23: 8299 ś p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 24: 1493 led p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 25: 6522 czy p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 26: 339 ch p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 27: 10782 bez p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 28: 16851 uz p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 29: 296 as p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 30: 345 ad p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 31: 77 n p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 32: 18811 ienia p: 1.000 [keep: 1] 18:01:48.667: Last log entry repeated for 6 more lines 18:01:48.667: [obs-localvocal] S 0, Token 39: 38547 migrant p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 40: 3901 ów p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 41: 11 , p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 42: 25382 którzy p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 43: 505 us p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 44: 40622 ił p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 45: 305 ow p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 46: 5103 ali p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 47: 262 s p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 48: 2994 for p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 49: 539 so p: 0.982 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 50: 25234 wać p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 51: 710 z p: 0.670 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 52: 13345 apor p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 53: 1274 ę p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 54: 1667 na p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 55: 9370 gran p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 56: 2632 icy p: 1.000 [keep: 1] 18:01:48.667: [obs-localvocal] S 0, Token 57: 13 . p: 1.000 [keep: 0] 18:01:48.667: [obs-localvocal] Time token found 50764 -> 7.980. Duration: 8.334. Ratio: 1.044. 18:01:48.668: [obs-localvocal] S 0, Token 58: 50764 [_TT_400] p: 0.514 [keep: 0] 18:01:48.668: [obs-localvocal] Decoded sentence: ' Żołnierze zatrzymań po akcji przy granicy z Białorusią według śledczych bez uzasadnienia strzelali w kierunku migrantów, którzy usiłowali sforsować zaporę na granicy' 18:01:48.668: [obs-localvocal] VAD segment 1. pushed 102400 to 159660 (57260 frames / 3578 ms). current size: 229040 bytes / 57260 frames / 3578 ms 18:01:48.668: [obs-localvocal] end not reached. vad state: start ts: 18446742355933634167, end ts: 0 18:01:48.723: [obs-localvocal] vad based segmentation. currently 1198708 bytes in the audio input buffer 18:01:48.723: [obs-localvocal] found 299677 frames from info buffer. 0 in overlap 18:01:48.727: [obs-localvocal] resampled: 2 channels, 99892 frames, 6243.250000 ms 18:01:48.738: [obs-localvocal] VAD segment 0. pushed 0 to 52224 (52224 frames / 3264 ms). current size: 437936 bytes / 109484 frames / 6842 ms 18:01:48.738: [obs-localvocal] VAD segment end -> send to inference 18:01:48.738: [obs-localvocal] run_whisper_inference: processing 109804 samples, 6.863 sec, 4 threads 18:01:53.554: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:53.554: [obs-localvocal] S 0, Token 1: 10223 Post p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 2: 1274 ę p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 3: 14701 pow p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 4: 7155 anie p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 5: 41518 osob p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 6: 468 ist p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 7: 4199 ym p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 8: 12617 nad p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 9: 89 z p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 10: 37956 orem p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 11: 1111 ob p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 12: 8555 ją p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 13: 1221 ł p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 14: 14234 właśnie p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 15: 447 pro p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 16: 33503 kur p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 17: 1639 ator p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 18: 5041 Tom p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 19: 19601 asz p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 20: 13048 Jane p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 21: 3689 cz p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 22: 916 ek p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 23: 11 , p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 24: 3388 pow p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 25: 78 o p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 26: 1221 ł p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 27: 1325 any p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 28: 2604 tu p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 29: 1427 ż p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 30: 18334 przed p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 31: 7401 odd p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 32: 282 an p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 33: 4907 iem p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 34: 261 w p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 35: 10358 ład p: 1.000 [keep: 1] 18:01:53.554: [obs-localvocal] S 0, Token 36: 1229 zy p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 37: 14064 przez p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 38: 1176 Z p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 39: 65 b p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 40: 788 ign p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 41: 1093 iew p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 42: 64 a p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 43: 26190 Zi p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 44: 996 ob p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 45: 81 r p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 46: 1274 ę p: 1.000 [keep: 1] 18:01:53.555: [obs-localvocal] S 0, Token 47: 13 . p: 1.000 [keep: 0] 18:01:53.555: [obs-localvocal] Time token found 50708 -> 6.860. Duration: 6.862. Ratio: 1.000. 18:01:53.555: [obs-localvocal] S 0, Token 48: 50708 [_TT_344] p: 0.398 [keep: 0] 18:01:53.555: [obs-localvocal] Decoded sentence: ' Postępowanie osobistym nadzorem objął właśnie prokurator Tomasz Janeczek, powołany tuż przed oddaniem władzy przez Zbigniewa Ziobrę' 18:01:53.555: [obs-localvocal] VAD segment 1. pushed 52224 to 99892 (47668 frames / 2979 ms). current size: 190672 bytes / 47668 frames / 2979 ms 18:01:53.555: [obs-localvocal] end not reached. vad state: start ts: 18446742355933641010, end ts: 0 18:01:53.606: [obs-localvocal] vad based segmentation. currently 932884 bytes in the audio input buffer 18:01:53.606: [obs-localvocal] found 233221 frames from info buffer. 0 in overlap 18:01:53.609: [obs-localvocal] resampled: 2 channels, 77741 frames, 4858.812500 ms 18:01:53.619: [obs-localvocal] VAD segment 0. pushed 0 to 77741 (77741 frames / 4858 ms). current size: 501636 bytes / 125409 frames / 7838 ms 18:01:53.619: [obs-localvocal] end not reached. vad state: start ts: 18446742355933641010, end ts: 18446742355933648821 18:01:54.152: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:01:54.152: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:01:54.153: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:01:54.156: [obs-localvocal] VAD segment 0. pushed 0 to 8768 (8768 frames / 548 ms). current size: 536708 bytes / 134177 frames / 8386 ms 18:01:54.156: [obs-localvocal] VAD segment end -> send to inference 18:01:54.156: [obs-localvocal] run_whisper_inference: processing 134497 samples, 8.406 sec, 4 threads 18:01:59.271: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:01:59.271: [obs-localvocal] S 0, Token 1: 1176 Z p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 2: 67 d p: 0.973 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 3: 3782 ani p: 0.999 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 4: 443 em p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 5: 400 And p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 6: 13503 rze p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 7: 2938 ja p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 8: 42622 Dud p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 9: 88 y p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 10: 367 r p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 11: 23876 ząd p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 12: 261 w p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 13: 32686 ostat p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 14: 77 n p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 15: 480 ich p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 16: 3044 god p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 17: 23584 zin p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 18: 608 ach p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 19: 45369 kamp p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 20: 3782 ani p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 21: 72 i p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 22: 4628 wy p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 23: 3918 bor p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 24: 9680 cze p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 25: 73 j p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 26: 7870 sz p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 27: 13599 uka p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 28: 8384 ko p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 29: 89 z p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 30: 5024 ła p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 31: 295 of p: 1.000 [keep: 1] 18:01:59.271: [obs-localvocal] S 0, Token 32: 9448 iar p: 0.986 [keep: 1] 18:01:59.271: Last log entry repeated for 28 more lines 18:01:59.271: [obs-localvocal] Time token found 50764 -> 7.980. Duration: 8.406. Ratio: 1.053. 18:01:59.271: [obs-localvocal] S 0, Token 61: 50764 [_TT_400] p: 0.725 [keep: 0] 18:01:59.271: [obs-localvocal] Decoded sentence: ' Zdaniem Andrzeja Dudy rząd w ostatnich godzinach kampanii wyborczej szuka kozła ofiarnego. Prezydent zapowiedział, że zanim podejmie jakąkolwiek decyzję w tej sprawie...' 18:01:59.327: [obs-localvocal] vad based segmentation. currently 993072 bytes in the audio input buffer 18:01:59.327: [obs-localvocal] found 248268 frames from info buffer. 0 in overlap 18:01:59.330: [obs-localvocal] resampled: 2 channels, 82756 frames, 5172.250000 ms 18:01:59.340: [obs-localvocal] VAD segment 0. pushed 0 to 53248 (53248 frames / 3328 ms). current size: 212992 bytes / 53248 frames / 3328 ms 18:01:59.340: [obs-localvocal] VAD segment end -> send to inference 18:01:59.340: [obs-localvocal] run_whisper_inference: processing 53568 samples, 3.348 sec, 4 threads 18:02:03.561: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:03.561: [obs-localvocal] S 0, Token 1: 37587 musi p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 2: 21281 poz p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 3: 629 na p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 4: 2162 ć p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 5: 45864 dokład p: 0.999 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 6: 716 ne p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 7: 16851 uz p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 8: 296 as p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 9: 345 ad p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 10: 77 n p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 11: 27385 ienie p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 12: 5277 ze p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 13: 32406 strony p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 14: 367 r p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 15: 8925 zą p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 16: 769 du p: 1.000 [keep: 1] 18:02:03.561: [obs-localvocal] S 0, Token 17: 13 . p: 1.000 [keep: 0] 18:02:03.561: [obs-localvocal] Time token found 50530 -> 3.300. Duration: 3.348. Ratio: 1.015. 18:02:03.561: [obs-localvocal] S 0, Token 18: 50530 [_TT_166] p: 0.532 [keep: 0] 18:02:03.561: [obs-localvocal] Decoded sentence: ' musi poznać dokładne uzasadnienie ze strony rządu' 18:02:03.561: [obs-localvocal] VAD segment 1. pushed 53248 to 82756 (29508 frames / 1844 ms). current size: 118032 bytes / 29508 frames / 1844 ms 18:02:03.561: [obs-localvocal] end not reached. vad state: start ts: 18446742355933652724, end ts: 0 18:02:03.616: [obs-localvocal] vad based segmentation. currently 827560 bytes in the audio input buffer 18:02:03.616: [obs-localvocal] found 206890 frames from info buffer. 0 in overlap 18:02:03.619: [obs-localvocal] resampled: 2 channels, 68963 frames, 4310.187500 ms 18:02:03.627: [obs-localvocal] VAD segment 0. pushed 0 to 68963 (68963 frames / 4310 ms). current size: 393884 bytes / 98471 frames / 6154 ms 18:02:03.627: [obs-localvocal] end not reached. vad state: start ts: 18446742355933652724, end ts: 18446742355933658852 18:02:04.162: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:02:04.163: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:02:04.163: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:04.166: [obs-localvocal] VAD segment 0. pushed 0 to 3072 (3072 frames / 192 ms). current size: 406172 bytes / 101543 frames / 6346 ms 18:02:04.166: [obs-localvocal] VAD segment end -> send to inference 18:02:04.166: [obs-localvocal] run_whisper_inference: processing 101863 samples, 6.366 sec, 4 threads 18:02:08.887: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:08.887: [obs-localvocal] S 0, Token 1: 3530 Ja p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 2: 277 o p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 3: 3689 cz p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 4: 916 ek p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 5: 18258 uję p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 6: 4628 wy p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 7: 2938 ja p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 8: 12221 śnie p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 9: 5248 ń p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 10: 261 w p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 11: 12573 tej p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 12: 42035 kwest p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 13: 5597 ii p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 14: 11 , p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 15: 748 bo p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 16: 3611 od p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 17: 6120 wo p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 18: 1221 ł p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 19: 7155 anie p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 20: 47296 pana p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 21: 13048 Jane p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 22: 3689 cz p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 23: 2330 ka p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 24: 29764 wym p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 25: 9286 aga p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 26: 40948 zg p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 27: 843 ody p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 28: 659 pre p: 0.940 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 29: 1229 zy p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 30: 67 d p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 31: 8938 enta p: 1.000 [keep: 1] 18:02:08.887: [obs-localvocal] S 0, Token 32: 497 R p: 1.000 [keep: 1] 18:02:08.887: Last log entry repeated for 8 more lines 18:02:08.887: [obs-localvocal] Time token found 50684 -> 6.380. Duration: 6.366. Ratio: 1.002. 18:02:08.887: [obs-localvocal] S 0, Token 41: 50684 [_TT_320] p: 0.342 [keep: 0] 18:02:08.887: [obs-localvocal] Decoded sentence: ' Ja oczekuję wyjaśnień w tej kwestii, bo odwołanie pana Janeczka wymaga zgody prezydenta Rzeczypospolitej' 18:02:08.887: [obs-localvocal] VAD segment 1. pushed 3072 to 8359 (5287 frames / 330 ms). current size: 21148 bytes / 5287 frames / 330 ms 18:02:08.887: [obs-localvocal] end not reached. vad state: start ts: 18446742355933659071, end ts: 0 18:02:08.942: [obs-localvocal] vad based segmentation. currently 917836 bytes in the audio input buffer 18:02:08.942: [obs-localvocal] found 229459 frames from info buffer. 0 in overlap 18:02:08.945: [obs-localvocal] resampled: 2 channels, 76487 frames, 4780.437500 ms 18:02:08.954: [obs-localvocal] VAD segment 0. pushed 0 to 67584 (67584 frames / 4224 ms). current size: 291484 bytes / 72871 frames / 4554 ms 18:02:08.954: [obs-localvocal] VAD segment end -> send to inference 18:02:08.955: [obs-localvocal] run_whisper_inference: processing 73191 samples, 4.574 sec, 4 threads 18:02:13.402: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:13.402: [obs-localvocal] S 0, Token 1: 4042 Ma p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 2: 15069 być p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 3: 25984 wyd p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 4: 2095 ana p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 5: 14064 przez p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 6: 659 pre p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 7: 1229 zy p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 8: 67 d p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 9: 8938 enta p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 10: 40948 zg p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 11: 13449 oda p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 12: 11 , p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 13: 281 to p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 14: 37587 musi p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 15: 15069 być p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 16: 281 to p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 17: 979 dec p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 18: 37433 yz p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 19: 2938 ja p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 20: 11 , p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 21: 19456 która p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 22: 3492 jest p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 23: 16851 uz p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 24: 296 as p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 25: 345 ad p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 26: 77 n p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 27: 21758 iona p: 1.000 [keep: 1] 18:02:13.402: [obs-localvocal] S 0, Token 28: 13 . p: 1.000 [keep: 0] 18:02:13.402: [obs-localvocal] Time token found 50614 -> 4.980. Duration: 4.574. Ratio: 1.089. 18:02:13.402: [obs-localvocal] S 0, Token 29: 50614 [_TT_250] p: 0.971 [keep: 0] 18:02:13.402: [obs-localvocal] Decoded sentence: ' Ma być wydana przez prezydenta zgoda, to musi być to decyzja, która jest uzasadniona' 18:02:13.402: [obs-localvocal] VAD segment 1. pushed 67584 to 76487 (8903 frames / 556 ms). current size: 35612 bytes / 8903 frames / 556 ms 18:02:13.402: [obs-localvocal] end not reached. vad state: start ts: 18446742355933663625, end ts: 0 18:02:13.457: [obs-localvocal] vad based segmentation. currently 867684 bytes in the audio input buffer 18:02:13.457: [obs-localvocal] found 216921 frames from info buffer. 0 in overlap 18:02:13.460: [obs-localvocal] resampled: 2 channels, 72307 frames, 4519.187500 ms 18:02:13.468: [obs-localvocal] VAD segment 0. pushed 0 to 72307 (72307 frames / 4519 ms). current size: 324840 bytes / 81210 frames / 5075 ms 18:02:13.468: [obs-localvocal] end not reached. vad state: start ts: 18446742355933663625, end ts: 18446742355933668675 18:02:14.003: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:14.003: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:14.005: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:14.009: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 359948 bytes / 89987 frames / 5624 ms 18:02:14.009: [obs-localvocal] end not reached. vad state: start ts: 18446742355933663625, end ts: 18446742355933669223 18:02:14.535: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:14.535: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:14.536: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:14.539: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 395056 bytes / 98764 frames / 6172 ms 18:02:14.539: [obs-localvocal] end not reached. vad state: start ts: 18446742355933663625, end ts: 18446742355933669772 18:02:15.074: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:02:15.075: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:02:15.075: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:15.078: [obs-localvocal] VAD segment 0. pushed 0 to 4096 (4096 frames / 256 ms). current size: 411440 bytes / 102860 frames / 6428 ms 18:02:15.078: [obs-localvocal] VAD segment end -> send to inference 18:02:15.078: [obs-localvocal] run_whisper_inference: processing 103180 samples, 6.449 sec, 4 threads 18:02:19.612: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:19.612: [obs-localvocal] S 0, Token 1: 39448 Dz p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 2: 22356 isiaj p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 3: 261 w p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 4: 48569 moim p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 5: 29785 przek p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 6: 266 on p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 7: 25849 aniu p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 8: 16851 uz p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 9: 296 as p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 10: 345 ad p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 11: 77 n p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 12: 18811 ienia p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 13: 12285 dla p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 14: 979 dec p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 15: 37433 yz p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 16: 4013 ji p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 17: 277 o p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 18: 3611 od p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 19: 6120 wo p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 20: 1221 ł p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 21: 25849 aniu p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 22: 8627 tego p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 23: 36500 konkret p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 24: 11858 nego p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 25: 26476 funk p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 26: 45677 cjon p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 27: 27440 arius p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 28: 2394 za p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 29: 28757 polsk p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 30: 12200 iego p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 31: 43289 państwa p: 1.000 [keep: 1] 18:02:19.612: [obs-localvocal] S 0, Token 32: 485 ... p: 0.999 [keep: 1] 18:02:19.612: [obs-localvocal] Time token found 50686 -> 6.420. Duration: 6.448. Ratio: 1.004. 18:02:19.612: [obs-localvocal] S 0, Token 33: 50686 [_TT_322] p: 0.926 [keep: 0] 18:02:19.612: [obs-localvocal] Decoded sentence: ' Dzisiaj w moim przekonaniu uzasadnienia dla decyzji o odwołaniu tego konkretnego funkcjonariusza polskiego państwa...' 18:02:19.612: [obs-localvocal] VAD segment 1. pushed 4096 to 8359 (4263 frames / 266 ms). current size: 17052 bytes / 4263 frames / 266 ms 18:02:19.612: [obs-localvocal] end not reached. vad state: start ts: 18446742355933670054, end ts: 0 18:02:19.663: [obs-localvocal] vad based segmentation. currently 882732 bytes in the audio input buffer 18:02:19.663: [obs-localvocal] found 220683 frames from info buffer. 0 in overlap 18:02:19.666: [obs-localvocal] resampled: 2 channels, 73561 frames, 4597.562500 ms 18:02:19.675: [obs-localvocal] VAD segment 0. pushed 0 to 28672 (28672 frames / 1792 ms). current size: 131740 bytes / 32935 frames / 2058 ms 18:02:19.675: [obs-localvocal] VAD segment end -> send to inference 18:02:19.675: [obs-localvocal] run_whisper_inference: processing 33255 samples, 2.078 sec, 4 threads 18:02:23.738: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:23.738: [obs-localvocal] S 0, Token 1: 8627 tego p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 2: 447 pro p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 3: 33503 kur p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 4: 1639 ator p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 5: 64 a p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 6: 261 w p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 7: 12573 tej p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 8: 22734 spraw p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 9: 414 ie p: 1.000 [keep: 1] 18:02:23.738: [obs-localvocal] S 0, Token 10: 13 . p: 1.000 [keep: 0] 18:02:23.738: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 2.078. Ratio: 1.049. 18:02:23.738: [obs-localvocal] S 0, Token 11: 50464 [_TT_100] p: 0.984 [keep: 0] 18:02:23.738: [obs-localvocal] Decoded sentence: ' tego prokuratora w tej sprawie' 18:02:23.738: [obs-localvocal] VAD segment 1. pushed 28672 to 51200 (22528 frames / 1408 ms). current size: 90112 bytes / 22528 frames / 1408 ms 18:02:23.738: [obs-localvocal] VAD segment end -> send to inference 18:02:23.738: [obs-localvocal] run_whisper_inference: processing 22848 samples, 1.428 sec, 4 threads 18:02:27.723: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:27.723: [obs-localvocal] S 0, Token 1: 14286 Kom p: 1.000 [keep: 1] 18:02:27.723: [obs-localvocal] S 0, Token 2: 14657 plet p: 1.000 [keep: 1] 18:02:27.723: [obs-localvocal] S 0, Token 3: 2766 nie p: 1.000 [keep: 1] 18:02:27.723: [obs-localvocal] S 0, Token 4: 1548 bra p: 1.000 [keep: 1] 18:02:27.723: [obs-localvocal] S 0, Token 5: 74 k p: 1.000 [keep: 1] 18:02:27.723: [obs-localvocal] S 0, Token 6: 0 ! p: 0.894 [keep: 1] 18:02:27.723: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 1.428. Ratio: 1.387. 18:02:27.723: [obs-localvocal] S 0, Token 7: 50464 [_TT_100] p: 1.000 [keep: 0] 18:02:27.723: [obs-localvocal] Decoded sentence: ' Kompletnie brak!' 18:02:27.724: [obs-localvocal] VAD segment 2. pushed 51200 to 73561 (22361 frames / 1397 ms). current size: 89444 bytes / 22361 frames / 1397 ms 18:02:27.724: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 0 18:02:27.779: [obs-localvocal] vad based segmentation. currently 1559824 bytes in the audio input buffer 18:02:27.779: [obs-localvocal] found 389956 frames from info buffer. 0 in overlap 18:02:27.784: [obs-localvocal] resampled: 2 channels, 129985 frames, 8124.062500 ms 18:02:27.799: [obs-localvocal] VAD segment 0. pushed 0 to 129985 (129985 frames / 8124 ms). current size: 609384 bytes / 152346 frames / 9521 ms 18:02:27.799: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933683016 18:02:28.332: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:28.332: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:28.333: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:28.336: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 644492 bytes / 161123 frames / 10070 ms 18:02:28.336: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933683564 18:02:28.875: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:28.876: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:28.876: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:02:28.879: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 679604 bytes / 169901 frames / 10618 ms 18:02:28.879: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933684113 18:02:29.406: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:02:29.406: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:02:29.407: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:29.410: [obs-localvocal] VAD segment 0. pushed 2496 to 8359 (5863 frames / 366 ms). current size: 703056 bytes / 175764 frames / 10985 ms 18:02:29.410: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933684635 18:02:29.953: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:29.953: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:29.954: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:29.957: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 738164 bytes / 184541 frames / 11533 ms 18:02:29.957: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933685184 18:02:30.498: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:30.498: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:30.499: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:30.501: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 773272 bytes / 193318 frames / 12082 ms 18:02:30.501: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933685732 18:02:31.040: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:02:31.040: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:02:31.041: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:31.044: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 806708 bytes / 201677 frames / 12604 ms 18:02:31.044: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933686255 18:02:31.581: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:31.582: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:31.582: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:31.586: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 841816 bytes / 210454 frames / 13153 ms 18:02:31.586: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933686803 18:02:32.122: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:32.122: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:32.123: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:32.126: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 876924 bytes / 219231 frames / 13701 ms 18:02:32.127: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933687352 18:02:32.658: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:02:32.658: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:02:32.658: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:02:32.661: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 910364 bytes / 227591 frames / 14224 ms 18:02:32.661: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933687875 18:02:33.202: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:33.202: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:33.203: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:33.205: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 945472 bytes / 236368 frames / 14773 ms 18:02:33.205: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933688423 18:02:33.746: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:33.746: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:33.747: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:33.750: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 980580 bytes / 245145 frames / 15321 ms 18:02:33.750: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933688972 18:02:34.279: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:02:34.279: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:02:34.280: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:34.282: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 1014016 bytes / 253504 frames / 15844 ms 18:02:34.282: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933689494 18:02:34.821: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:34.821: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:34.822: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:34.825: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1049124 bytes / 262281 frames / 16392 ms 18:02:34.825: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933690043 18:02:35.362: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:35.362: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:35.363: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:35.366: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 1084232 bytes / 271058 frames / 16941 ms 18:02:35.366: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933690591 18:02:35.899: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:35.899: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:35.900: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:02:35.903: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 1119344 bytes / 279836 frames / 17489 ms 18:02:35.903: [obs-localvocal] end not reached. vad state: start ts: 18446742355933673520, end ts: 18446742355933691140 18:02:36.438: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:02:36.439: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:02:36.439: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:36.442: [obs-localvocal] VAD segment 0. pushed 0 to 7744 (7744 frames / 484 ms). current size: 1150320 bytes / 287580 frames / 17973 ms 18:02:36.442: [obs-localvocal] VAD segment end -> send to inference 18:02:36.443: [obs-localvocal] run_whisper_inference: processing 287900 samples, 17.994 sec, 4 threads 18:02:39.466: save_or_load_event_callback 1, 1273606138 18:02:39.466: obs save event 18:02:39.523: [rtmp stream: 'simple_stream'] User stopped the stream 18:02:39.523: [rtmp stream: 'multi-output'] User stopped the stream 18:02:39.524: Output 'simple_stream': stopping 18:02:39.524: Output 'simple_stream': Total frames output: 8383 18:02:39.524: Output 'simple_stream': Total drawn frames: 8402 18:02:39.524: stream_stopped_event 18:02:39.524: enabled: 0, is_streaming 0, streaming_output_enabled 1, streaming_transcripts_enabled 0, is_streaming_relevant: 0, is_recording 0, recording_output_enabled 0, recording_transcripts_enabled 0, is_recording_relevant: 0, is_virtualcam_on 0, virtualcam_transcripts_enabled 0, is_virtualcam_relevant 0, is_preview_open 0, is_text_output_relevant 0, scene_collection_name: , source: 'TOK', equal_settings 1, do_captioning 0 18:02:39.524: caption_output_writer_loop streaming done 18:02:39.524: settings changed, disabling captioning 18:02:39.524: Output 'multi-output': stopping 18:02:39.524: Output 'multi-output': Total frames output: 8383 18:02:39.524: Output 'multi-output': Total drawn frames: 8402 18:02:39.524: [obs-multi-rtmp] Release output while it is active. 18:02:39.527: ==== Streaming Stop ================================================ 18:02:42.660: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:42.661: [obs-localvocal] S 0, Token 1: 6001 Pre p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 2: 1229 zy p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 3: 67 d p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 4: 317 ent p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 5: 22675 zar p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 6: 11728 zu p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 7: 537 ci p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 8: 1221 ł p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 9: 9516 też p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 10: 12689 premier p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 11: 24503 owi p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 12: 11 , p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 13: 3561 że p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 14: 277 o p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 15: 22734 spraw p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 16: 414 ie p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 17: 35802 zat p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 18: 13047 rzy p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 19: 37268 mania p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 20: 19625 ż p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 21: 78 o p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 22: 1221 ł p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 23: 19165 nier p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 24: 1229 zy p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 25: 9459 dow p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 26: 15338 iedz p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 27: 8908 iał p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 28: 3244 się p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 29: 710 z p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 30: 17269 medi p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 31: 3901 ów p: 1.000 [keep: 1] 18:02:42.661: [obs-localvocal] S 0, Token 32: 13 . p: 1.000 [keep: 1] 18:02:42.661: Last log entry repeated for 82 more lines 18:02:42.661: [obs-localvocal] Time token found 51264 -> 17.980. Duration: 17.993. Ratio: 1.001. 18:02:42.661: [obs-localvocal] S 0, Token 115: 51264 [_TT_900] p: 1.000 [keep: 0] 18:02:42.661: [obs-localvocal] Decoded sentence: ' Prezydent zarzucił też premierowi, że o sprawie zatrzymania żołnierzy dowiedział się z mediów. Najwyraźniej nie on jeden dowiedział się w ten sposób, bo byliśmy wczoraj świadkami długiego przepychania się. Kto wiedział, a kto nie wiedział o zatrzymaniu żołnierzy, które to przepychanka świadczy o ogromnym chaosie informacyjnym w polskim aparacie władzy' 18:02:42.715: [obs-localvocal] vad based segmentation. currently 1203724 bytes in the audio input buffer 18:02:42.715: [obs-localvocal] found 300931 frames from info buffer. 0 in overlap 18:02:42.719: [obs-localvocal] resampled: 2 channels, 100310 frames, 6269.375000 ms 18:02:42.730: [obs-localvocal] VAD segment 0. pushed 0 to 100310 (100310 frames / 6269 ms). current size: 401240 bytes / 100310 frames / 6269 ms 18:02:42.730: [obs-localvocal] end not reached. vad state: start ts: 18446742355933691688, end ts: 18446742355933697932 18:02:43.258: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:43.258: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:43.259: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:43.262: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 436348 bytes / 109087 frames / 6817 ms 18:02:43.262: [obs-localvocal] end not reached. vad state: start ts: 18446742355933691688, end ts: 18446742355933698480 18:02:43.488: [obs-localvocal] whisper_model_path_external modified 18:02:43.489: [obs-localvocal] LocalVocal filter update 18:02:43.489: [obs-localvocal] buffered_output disable 18:02:43.489: [obs-localvocal] update text source 18:02:43.489: [obs-localvocal] update whisper model 18:02:43.489: [obs-localvocal] Model path did not change: Whisper Large q5 (1Gb) == Whisper Large q5 (1Gb) 18:02:43.489: [obs-localvocal] update whisper params 18:02:43.669: Populating font family aliases took 160 ms. Replace uses of missing font family "MS Shell Dlg" with one that exists to avoid this cost. 18:02:43.703: save_or_load_event_callback 1, 1273606138 18:02:43.703: obs save event 18:02:43.807: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:43.807: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:43.807: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:43.809: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 471456 bytes / 117864 frames / 7366 ms 18:02:43.809: [obs-localvocal] end not reached. vad state: start ts: 18446742355933691688, end ts: 18446742355933543743 18:02:44.341: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:44.341: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:44.342: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:44.344: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 506564 bytes / 126641 frames / 7915 ms 18:02:44.344: [obs-localvocal] end not reached. vad state: start ts: 18446742355933691688, end ts: 18446742355933544291 18:02:44.882: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:02:44.883: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:02:44.884: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:02:44.887: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 540004 bytes / 135001 frames / 8437 ms 18:02:44.887: [obs-localvocal] end not reached. vad state: start ts: 18446742355933691688, end ts: 18446742355933544814 18:02:45.434: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:45.434: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:45.435: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:45.438: [obs-localvocal] VAD segment 0. pushed 0 to 6720 (6720 frames / 420 ms). current size: 566884 bytes / 141721 frames / 8857 ms 18:02:45.438: [obs-localvocal] VAD segment end -> send to inference 18:02:45.438: [obs-localvocal] run_whisper_inference: processing 142041 samples, 8.878 sec, 4 threads 18:02:50.695: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:50.696: [obs-localvocal] S 0, Token 1: 343 W p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 2: 6655 yd p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 3: 289 ar p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 4: 14320 zenia p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 5: 360 do p: 0.753 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 6: 4207 jak p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 7: 480 ich p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 8: 35045 dwa p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 9: 41543 mies p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 10: 11404 ią p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 11: 384 ce p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 12: 33346 temu p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 13: 4491 dos p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 14: 89 z p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 15: 5249 ło p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 16: 1667 na p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 17: 9370 gran p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 18: 299 ic p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 19: 1274 ę p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 20: 11 , p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 21: 257 a p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 22: 23306 także p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 23: 261 w p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 24: 3689 cz p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 25: 284 or p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 26: 1805 aj p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 27: 82 s p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 28: 2394 za p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 29: 46991 śm p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 30: 811 ier p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 31: 2162 ć p: 1.000 [keep: 1] 18:02:50.696: [obs-localvocal] S 0, Token 32: 28757 polsk p: 1.000 [keep: 1] 18:02:50.696: Last log entry repeated for 37 more lines 18:02:50.696: [obs-localvocal] Time token found 50796 -> 8.620. Duration: 8.877. Ratio: 1.030. 18:02:50.696: [obs-localvocal] S 0, Token 70: 50796 [_TT_432] p: 0.504 [keep: 0] 18:02:50.696: [obs-localvocal] Decoded sentence: ' Wydarzenia do jakich dwa miesiące temu doszło na granicę, a także wczorajsza śmierć polskiego żołnierza ugodzonego nożem mają zostać umówione na poniedziałkowej Radzie Bezpieczeństwa Narodowego' 18:02:50.750: [obs-localvocal] vad based segmentation. currently 1023164 bytes in the audio input buffer 18:02:50.750: [obs-localvocal] found 255791 frames from info buffer. 0 in overlap 18:02:50.753: [obs-localvocal] resampled: 2 channels, 85263 frames, 5328.937500 ms 18:02:50.764: [obs-localvocal] VAD segment 0. pushed 0 to 85263 (85263 frames / 5328 ms). current size: 341052 bytes / 85263 frames / 5328 ms 18:02:50.764: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545389, end ts: 18446742355933550691 18:02:51.299: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:51.299: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:51.299: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:02:51.301: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 376164 bytes / 94041 frames / 5877 ms 18:02:51.301: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545389, end ts: 18446742355933551240 18:02:51.829: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:02:51.830: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:02:51.830: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:51.834: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 409600 bytes / 102400 frames / 6400 ms 18:02:51.834: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545389, end ts: 18446742355933551762 18:02:52.374: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:52.374: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:52.375: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:52.378: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 444708 bytes / 111177 frames / 6948 ms 18:02:52.378: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545389, end ts: 18446742355933552311 18:02:52.911: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:52.911: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:52.912: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:52.915: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 479816 bytes / 119954 frames / 7497 ms 18:02:52.915: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545389, end ts: 18446742355933552860 18:02:53.456: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:02:53.456: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:02:53.457: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:02:53.460: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 513252 bytes / 128313 frames / 8019 ms 18:02:53.460: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545389, end ts: 18446742355933553382 18:02:54.004: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:02:54.004: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:02:54.005: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:02:54.007: [obs-localvocal] VAD segment 0. pushed 0 to 3072 (3072 frames / 192 ms). current size: 525540 bytes / 131385 frames / 8211 ms 18:02:54.007: [obs-localvocal] VAD segment end -> send to inference 18:02:54.008: [obs-localvocal] run_whisper_inference: processing 131705 samples, 8.232 sec, 4 threads 18:02:58.592: [obs-localvocal] LocalVocal filter update 18:02:58.592: [obs-localvocal] buffered_output disable 18:02:58.592: [obs-localvocal] update text source 18:02:58.592: [obs-localvocal] update whisper model 18:02:58.592: [obs-localvocal] model path changed from Whisper Large q5 (1Gb) to Whisper Tiny (74Mb) 18:02:58.592: [obs-localvocal] shutdown_whisper_thread 18:02:59.050: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:02:59.050: [obs-localvocal] S 0, Token 1: 6056 Na p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 2: 11212 naj p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 3: 32117 bli p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 4: 1427 ż p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 5: 7706 szy p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 6: 76 m p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 7: 1366 pos p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 8: 1091 ied p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 9: 39651 zeniu p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 10: 1100 Se p: 0.943 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 11: 73 j p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 12: 20140 mu p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 13: 26064 mają p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 14: 34430 zac p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 15: 8925 zą p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 16: 2162 ć p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 17: 3244 się p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 18: 582 pr p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 19: 617 ace p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 20: 12617 nad p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 21: 659 pre p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 22: 1229 zy p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 23: 1556 den p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 24: 547 ck p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 25: 332 im p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 26: 447 pro p: 0.432 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 27: 2884 je p: 0.561 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 28: 2320 kt p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 29: 443 em p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 30: 26189 ust p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 31: 41961 awy p: 1.000 [keep: 1] 18:02:59.050: [obs-localvocal] S 0, Token 32: 277 o p: 1.000 [keep: 1] 18:02:59.050: Last log entry repeated for 27 more lines 18:02:59.050: [obs-localvocal] Time token found 50764 -> 7.980. Duration: 8.231. Ratio: 1.031. 18:02:59.050: [obs-localvocal] S 0, Token 60: 50764 [_TT_400] p: 0.540 [keep: 0] 18:02:59.050: [obs-localvocal] Decoded sentence: ' Na najbliższym posiedzeniu Sejmu mają zacząć się prace nad prezydenckim projektem ustawy o działaniach na wypadek zewnętrznego zagrożenia bezpieczeństwa państwa. Propozycja...' 18:02:59.051: [obs-localvocal] VAD segment 1. pushed 3072 to 8777 (5705 frames / 356 ms). current size: 22820 bytes / 5705 frames / 356 ms 18:02:59.051: [obs-localvocal] end not reached. vad state: start ts: 18446742355933553600, end ts: 0 18:02:59.129: [obs-localvocal] Whisper context is null, exiting thread 18:02:59.129: [obs-localvocal] exiting whisper thread 18:02:59.129: [obs-localvocal] Checking if model 'Whisper Tiny' exists in data... 18:02:59.129: [obs-localvocal] Model not found in data: /Users/stream360/Library/Application Support/obs-studio/plugins/obs-localvocal.plugin/Contents/Resources/models/ggml-model-whisper-tiny 18:02:59.129: [obs-localvocal] Checking if model 'Whisper Tiny' exists in config... 18:02:59.129: [obs-localvocal] Model path in config: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-tiny 18:02:59.129: [obs-localvocal] Model exists in config folder: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-tiny 18:02:59.130: [obs-localvocal] Model bin file found in folder: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-tiny/ggml-model-whisper-tiny.bin 18:02:59.130: [obs-localvocal] start_whisper_thread_with_path: /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-tiny/ggml-model-whisper-tiny.bin 18:02:59.170: [obs-localvocal] Loading whisper model from /Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-tiny/ggml-model-whisper-tiny.bin 18:02:59.170: [obs-localvocal] Using CPU for inference 18:02:59.170: [obs-localvocal] DTW token timestamps disabled 18:02:59.170: [obs-localvocal] Whisper: whisper_init_from_file_with_params_no_state: loading model from '/Users/stream360/Library/Application Support/obs-studio/plugin_config/obs-localvocal/models/ggml-model-whisper-tiny/ggml-model-whisper-tiny.bin' 18:02:59.170: [obs-localvocal] Whisper: whisper_init_with_params_no_state: use gpu = 0 18:02:59.170: [obs-localvocal] Whisper: whisper_init_with_params_no_state: flash attn = 0 18:02:59.170: [obs-localvocal] Whisper: whisper_init_with_params_no_state: gpu_device = 0 18:02:59.170: [obs-localvocal] Whisper: whisper_init_with_params_no_state: dtw = 0 18:02:59.170: [obs-localvocal] Whisper: whisper_model_load: loading model 18:02:59.170: [obs-localvocal] Whisper: whisper_model_load: n_vocab = 51865 18:02:59.170: [obs-localvocal] Whisper: whisper_model_load: n_audio_ctx = 1500 18:02:59.170: [obs-localvocal] Whisper: whisper_model_load: n_audio_state = 384 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: n_audio_head = 6 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: n_audio_layer = 4 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: n_text_ctx = 448 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: n_text_state = 384 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: n_text_head = 6 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: n_text_layer = 4 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: n_mels = 80 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: ftype = 1 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: qntvr = 0 18:02:59.171: [obs-localvocal] Whisper: whisper_model_load: type = 1 (tiny) 18:02:59.191: [obs-localvocal] Whisper: whisper_model_load: adding 1608 extra tokens 18:02:59.191: [obs-localvocal] Whisper: whisper_model_load: n_langs = 99 18:02:59.191: [obs-localvocal] Whisper: whisper_model_load: CPU total size = 77.11 MB 18:02:59.216: [obs-localvocal] Whisper: whisper_model_load: model size = 77.11 MB 18:02:59.217: [obs-localvocal] Whisper: whisper_init_state: kv self size = 9.44 MB 18:02:59.218: [obs-localvocal] Whisper: whisper_init_state: kv cross size = 9.44 MB 18:02:59.218: [obs-localvocal] Whisper: whisper_init_state: kv pad size = 2.36 MB 18:02:59.218: [obs-localvocal] Whisper: whisper_init_state: compute buffer (conv) = 13.32 MB 18:02:59.218: [obs-localvocal] Whisper: whisper_init_state: compute buffer (encode) = 85.66 MB 18:02:59.219: [obs-localvocal] Whisper: whisper_init_state: compute buffer (cross) = 4.01 MB 18:02:59.219: [obs-localvocal] Whisper: whisper_init_state: compute buffer (decode) = 96.02 MB 18:02:59.219: [obs-localvocal] Whisper model loaded: AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 18:02:59.219: [obs-localvocal] update whisper params 18:02:59.219: [obs-localvocal] starting whisper thread 18:02:59.219: [obs-localvocal] vad based segmentation. currently 978024 bytes in the audio input buffer 18:02:59.219: [obs-localvocal] found 244506 frames from info buffer. 0 in overlap 18:02:59.220: [obs-localvocal] LocalVocal filter update 18:02:59.220: [obs-localvocal] buffered_output disable 18:02:59.220: [obs-localvocal] update text source 18:02:59.220: [obs-localvocal] update whisper model 18:02:59.220: [obs-localvocal] Model path did not change: Whisper Tiny (74Mb) == Whisper Tiny (74Mb) 18:02:59.220: [obs-localvocal] update whisper params 18:02:59.220: [obs-localvocal] LocalVocal filter update 18:02:59.220: [obs-localvocal] buffered_output disable 18:02:59.220: [obs-localvocal] update text source 18:02:59.220: [obs-localvocal] update whisper model 18:02:59.220: [obs-localvocal] Model path did not change: Whisper Tiny (74Mb) == Whisper Tiny (74Mb) 18:02:59.220: [obs-localvocal] update whisper params 18:02:59.222: [obs-localvocal] resampled: 2 channels, 81502 frames, 5093.875000 ms 18:02:59.230: [obs-localvocal] VAD segment 0. pushed 0 to 81502 (81502 frames / 5093 ms). current size: 348828 bytes / 87207 frames / 5450 ms 18:02:59.230: [obs-localvocal] end not reached. vad state: start ts: 18446742355933538226, end ts: 18446742355933543293 18:02:59.759: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:02:59.760: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:02:59.760: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:02:59.762: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 383940 bytes / 95985 frames / 5999 ms 18:02:59.762: [obs-localvocal] end not reached. vad state: start ts: 18446742355933538226, end ts: 18446742355933543973 18:03:00.292: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:00.292: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:00.293: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:00.295: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 417376 bytes / 104344 frames / 6521 ms 18:03:00.295: [obs-localvocal] end not reached. vad state: start ts: 18446742355933538226, end ts: 18446742355933544495 18:03:00.834: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:00.834: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:00.835: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:00.838: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 452484 bytes / 113121 frames / 7070 ms 18:03:00.838: [obs-localvocal] end not reached. vad state: start ts: 18446742355933538226, end ts: 18446742355933545044 18:03:00.974: save_or_load_event_callback 1, 1273606138 18:03:00.974: obs save event 18:03:01.381: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:01.381: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:01.382: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:01.385: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 487592 bytes / 121898 frames / 7618 ms 18:03:01.385: [obs-localvocal] end not reached. vad state: start ts: 18446742355933538226, end ts: 18446742355933545592 18:03:01.930: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:01.930: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:01.931: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:01.934: [obs-localvocal] VAD segment 0. pushed 0 to 4096 (4096 frames / 256 ms). current size: 503976 bytes / 125994 frames / 7874 ms 18:03:01.934: [obs-localvocal] VAD segment end -> send to inference 18:03:01.934: [obs-localvocal] run_whisper_inference: processing 126314 samples, 7.895 sec, 4 threads 18:03:02.175: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:02.175: [obs-localvocal] S 0, Token 1: 343 W p: 0.976 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 2: 5024 ła p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 3: 2394 za p: 0.995 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 4: 26470 koń p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 5: 33967 czyć p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 6: 47576 rozp p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 7: 905 oc p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 8: 11052 zę p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 9: 1353 to p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 10: 8290 reform p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 11: 88 y p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 12: 1185 system p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 13: 84 u p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 14: 360 do p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 15: 47751 wod p: 0.995 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 16: 14320 zenia p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 17: 294 in p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 18: 74 k p: 0.661 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 19: 811 ier p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 20: 21308 owania p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 21: 594 ar p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 22: 3057 mi p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 23: 84 u p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 24: 11 , p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 25: 257 a p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 26: 281 to p: 0.997 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 27: 6430 czy p: 0.994 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 28: 33964 między p: 0.983 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 29: 294 in p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 30: 3722 ni p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 31: 6682 proced p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 32: 374 ur p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 33: 34097 uży p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 34: 2755 cia p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 35: 16586 bron p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 36: 72 i p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 37: 1667 na p: 0.998 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 38: 9370 gran p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 39: 2632 icy p: 1.000 [keep: 1] 18:03:02.175: [obs-localvocal] S 0, Token 40: 13 . p: 1.000 [keep: 0] 18:03:02.175: [obs-localvocal] Time token found 50764 -> 7.980. Duration: 7.894. Ratio: 1.011. 18:03:02.175: [obs-localvocal] S 0, Token 41: 50764 [_TT_400] p: 0.996 [keep: 0] 18:03:02.175: [obs-localvocal] Decoded sentence: ' Właza kończyć rozpoczęto reformy systemu do wodzenia inkierowania armiu, a to czy między inni procedur użycia broni na granicy' 18:03:02.175: [obs-localvocal] VAD segment 1. pushed 4096 to 8777 (4681 frames / 292 ms). current size: 18724 bytes / 4681 frames / 292 ms 18:03:02.175: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 0 18:03:02.490: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:02.490: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:02.490: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:02.492: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 53832 bytes / 13458 frames / 841 ms 18:03:02.492: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933546689 18:03:03.024: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:03.025: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:03.025: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:03.027: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 88944 bytes / 22236 frames / 1389 ms 18:03:03.027: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933547238 18:03:03.565: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:03.565: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:03.566: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:03.569: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 124052 bytes / 31013 frames / 1938 ms 18:03:03.569: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933547786 18:03:03.803: User is ignoring service bitrate limits. 18:03:03.811: [VideoToolbox simple_video_stream: 'h264']: session created with hardware encoding 18:03:03.841: [VideoToolbox simple_video_stream: 'h264']: settings: 18:03:03.841: vt_encoder_id com.apple.videotoolbox.videoencoder.ave.avc 18:03:03.841: rate_control: CBR 18:03:03.841: bitrate: 4000 (kbps) 18:03:03.841: quality: 0.600000 18:03:03.841: fps_num: 60 18:03:03.841: fps_den: 1 18:03:03.841: width: 1920 18:03:03.841: height: 1080 18:03:03.841: keyint: 2 (s) 18:03:03.841: limit_bitrate: off 18:03:03.841: rc_max_bitrate: 2500 (kbps) 18:03:03.841: rc_max_bitrate_window: 1.500000 (s) 18:03:03.841: hw_enc: on 18:03:03.841: profile: high 18:03:03.841: codec_type: h264 18:03:03.841: 18:03:03.844: [CoreAudio AAC: 'simple_aac']: settings: 18:03:03.844: mode: AAC 18:03:03.844: bitrate: 160 18:03:03.844: sample rate: 48000 18:03:03.844: cbr: on 18:03:03.844: output buffer: 1536 18:03:03.845: [rtmp stream: 'multi-output'] Connecting to RTMP URL rtmp://7001.szczecin.pl/0live... 18:03:03.845: [rtmp stream: 'simple_stream'] Connecting to RTMP URL rtmp://7001.szczecin.pl/0live... 18:03:03.847: save_or_load_event_callback 1, 1273606138 18:03:03.847: obs save event 18:03:04.109: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:04.109: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:04.110: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:04.111: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 157488 bytes / 39372 frames / 2460 ms 18:03:04.111: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933548309 18:03:04.120: [rtmp stream: 'multi-output'] Connection to rtmp://7001.szczecin.pl/0live (57.128.199.81) successful 18:03:04.178: [rtmp stream: 'simple_stream'] Connection to rtmp://7001.szczecin.pl/0live (57.128.199.81) successful 18:03:04.179: stream_started_event 18:03:04.179: enabled: 0, is_streaming 1, streaming_output_enabled 1, streaming_transcripts_enabled 0, is_streaming_relevant: 1, is_recording 0, recording_output_enabled 0, recording_transcripts_enabled 0, is_recording_relevant: 0, is_virtualcam_on 0, virtualcam_transcripts_enabled 0, is_virtualcam_relevant 0, is_preview_open 0, is_text_output_relevant 0, scene_collection_name: , source: 'TOK', equal_settings 1, do_captioning 0 18:03:04.179: caption_output_writer_loop streaming starting 18:03:04.179: settings changed, disabling captioning 18:03:04.182: ==== Streaming Start =============================================== 18:03:04.659: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:04.659: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:04.660: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:04.662: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 192596 bytes / 48149 frames / 3009 ms 18:03:04.662: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933548858 18:03:05.200: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:05.201: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:05.202: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:05.205: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 227704 bytes / 56926 frames / 3557 ms 18:03:05.205: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933549406 18:03:05.749: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:05.749: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:05.750: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:05.753: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 262812 bytes / 65703 frames / 4106 ms 18:03:05.753: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933549955 18:03:06.293: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:06.294: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:06.295: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:06.298: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 297920 bytes / 74480 frames / 4655 ms 18:03:06.298: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933550503 18:03:06.844: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:06.844: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:06.845: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:06.849: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 333032 bytes / 83258 frames / 5203 ms 18:03:06.849: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933551052 18:03:07.377: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:07.377: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:07.378: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:07.381: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 366468 bytes / 91617 frames / 5726 ms 18:03:07.381: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933551574 18:03:07.930: [obs-localvocal] vad based segmentation. currently 110344 bytes in the audio input buffer 18:03:07.930: [obs-localvocal] found 27586 frames from info buffer. 0 in overlap 18:03:07.932: [obs-localvocal] resampled: 2 channels, 9195 frames, 574.687500 ms 18:03:07.934: [obs-localvocal] VAD segment 0. pushed 0 to 9195 (9195 frames / 574 ms). current size: 403248 bytes / 100812 frames / 6300 ms 18:03:07.934: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933552149 18:03:08.473: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:08.473: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:08.474: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:08.477: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 436684 bytes / 109171 frames / 6823 ms 18:03:08.477: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933552671 18:03:09.018: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:09.018: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:09.019: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:09.022: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 471792 bytes / 117948 frames / 7371 ms 18:03:09.022: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933553220 18:03:09.557: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:09.557: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:09.558: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:09.561: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 506900 bytes / 126725 frames / 7920 ms 18:03:09.561: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933553769 18:03:10.085: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:10.085: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:10.086: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:03:10.089: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 540340 bytes / 135085 frames / 8442 ms 18:03:10.089: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933554291 18:03:10.628: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:10.628: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:10.628: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:10.630: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 575448 bytes / 143862 frames / 8991 ms 18:03:10.630: [obs-localvocal] end not reached. vad state: start ts: 18446742355933545874, end ts: 18446742355933554840 18:03:11.161: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:11.162: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:11.162: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:11.165: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 608884 bytes / 152221 frames / 9513 ms 18:03:11.165: [obs-localvocal] VAD segment end -> send to inference 18:03:11.165: [obs-localvocal] run_whisper_inference: processing 152541 samples, 9.534 sec, 4 threads 18:03:11.448: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:11.448: [obs-localvocal] S 0, Token 1: 1176 Z p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 2: 22620 wiąz p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 3: 5279 ku p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 4: 36746 zast p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 5: 267 at p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 6: 77 n p: 0.993 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 7: 10121 imi p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 8: 4628 wy p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 9: 20327 dar p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 10: 2904 zen p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 11: 15568 iami p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 12: 2838 nie p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 13: 13219 tylko p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 14: 2183 post p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 15: 1274 ę p: 0.947 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 16: 14701 pow p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 17: 282 an p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 18: 4907 iem p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 19: 19625 ż p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 20: 78 o p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 21: 1221 ł p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 22: 19165 nier p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 23: 89 z p: 0.984 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 24: 5103 ali p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 25: 19625 ż p: 0.997 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 26: 474 and p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 27: 4452 arm p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 28: 3901 ów p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 29: 11 , p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 30: 25382 którzy p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 31: 23810 zak p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 32: 425 ul p: 0.999 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 33: 590 iz p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 34: 267 at p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 35: 19390 rz p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 36: 2226 my p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 37: 86 w p: 0.997 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 38: 34644 anych p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 39: 6020 wo p: 0.993 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 40: 32625 jsk p: 1.000 [keep: 1] 18:03:11.448: [obs-localvocal] S 0, Token 41: 19605 owych p: 1.000 [keep: 1] 18:03:11.448: Last log entry repeated for 19 more lines 18:03:11.448: [obs-localvocal] Time token found 50832 -> 9.340. Duration: 9.533. Ratio: 1.021. 18:03:11.448: [obs-localvocal] S 0, Token 61: 50832 [_TT_468] p: 0.557 [keep: 0] 18:03:11.448: [obs-localvocal] Decoded sentence: ' Związku zastatnimi wydarzeniami nie tylko postępowaniem żołnierzali żandarmów, którzy zakulizatrzmywanych wojskowych w Kaidanki w Monm powstaje listę osob do dymisji' 18:03:11.714: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:11.714: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:11.715: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:11.717: [obs-localvocal] VAD segment 0. pushed 448 to 8777 (8329 frames / 520 ms). current size: 33316 bytes / 8329 frames / 520 ms 18:03:11.717: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933555911 18:03:12.246: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:12.246: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:12.247: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:12.250: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 68424 bytes / 17106 frames / 1069 ms 18:03:12.250: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933556459 18:03:12.784: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:12.784: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:12.785: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:12.788: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 101860 bytes / 25465 frames / 1591 ms 18:03:12.788: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933556982 18:03:13.323: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:13.323: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:13.324: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:13.327: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 136968 bytes / 34242 frames / 2140 ms 18:03:13.327: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933557530 18:03:13.862: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:13.863: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:13.865: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:13.868: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 172080 bytes / 43020 frames / 2688 ms 18:03:13.868: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933558079 18:03:14.402: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:14.402: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:14.403: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:14.406: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 205516 bytes / 51379 frames / 3211 ms 18:03:14.406: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933558601 18:03:14.934: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:14.934: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:14.934: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:14.937: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 240624 bytes / 60156 frames / 3759 ms 18:03:14.937: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933559150 18:03:15.475: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:15.475: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:15.476: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:15.479: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 274060 bytes / 68515 frames / 4282 ms 18:03:15.479: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933559672 18:03:16.019: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:16.019: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:16.020: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:16.023: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 309168 bytes / 77292 frames / 4830 ms 18:03:16.023: [obs-localvocal] end not reached. vad state: start ts: 18446742355933555416, end ts: 18446742355933560221 18:03:16.555: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:16.555: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:16.556: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:16.559: [obs-localvocal] VAD segment 0. pushed 0 to 3648 (3648 frames / 228 ms). current size: 323760 bytes / 80940 frames / 5058 ms 18:03:16.559: [obs-localvocal] VAD segment end -> send to inference 18:03:16.559: [obs-localvocal] run_whisper_inference: processing 81260 samples, 5.079 sec, 4 threads 18:03:16.778: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:16.778: [obs-localvocal] S 0, Token 1: 17296 Radio p: 0.802 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 2: 369 se p: 0.807 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 3: 17836 pin p: 0.999 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 4: 1254 form p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 5: 13008 uje p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 6: 11 , p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 7: 3561 że p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 8: 297 n p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 9: 5103 ali p: 0.999 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 10: 9815 ście p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 11: 9015 są p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 12: 22696 nawet p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 13: 1337 gener p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 14: 64 a p: 0.606 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 15: 1221 ł p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 16: 24503 owi p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 17: 11 , p: 0.966 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 18: 257 a p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 19: 14584 dy p: 0.855 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 20: 3057 mi p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 21: 9815 ście p: 0.928 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 22: 30854 możli p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 23: 826 we p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 24: 9015 są p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 25: 10678 już p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 26: 261 w p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 27: 9224 pon p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 28: 15338 iedz p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 29: 8908 iał p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 30: 916 ek p: 1.000 [keep: 1] 18:03:16.778: [obs-localvocal] S 0, Token 31: 13 . p: 1.000 [keep: 0] 18:03:16.778: [obs-localvocal] Time token found 50614 -> 4.980. Duration: 5.078. Ratio: 1.020. 18:03:16.778: [obs-localvocal] S 0, Token 32: 50614 [_TT_250] p: 0.979 [keep: 0] 18:03:16.778: [obs-localvocal] Decoded sentence: ' Radio sepin formuje, że naliście są nawet generałowi, a dymiście możliwe są już w poniedziałek' 18:03:17.100: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:17.100: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:17.101: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:17.103: [obs-localvocal] VAD segment 0. pushed 3520 to 8778 (5258 frames / 328 ms). current size: 21032 bytes / 5258 frames / 328 ms 18:03:17.103: [obs-localvocal] end not reached. vad state: start ts: 18446742355933561015, end ts: 18446742355933561318 18:03:17.632: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:17.632: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:17.632: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:17.635: [obs-localvocal] VAD detected no speech in 8359 frames 18:03:17.635: [obs-localvocal] Last VAD was ON: segment end -> send to inference 18:03:17.635: [obs-localvocal] run_whisper_inference: processing 5578 samples, 0.349 sec, 4 threads 18:03:17.635: [obs-localvocal] Speech segment is less than 1 second, padding with zeros to 1 second 18:03:17.799: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:17.799: [obs-localvocal] S 0, Token 1: 343 W p: 0.986 [keep: 1] 18:03:17.799: [obs-localvocal] S 0, Token 2: 83 t p: 0.953 [keep: 1] 18:03:17.799: [obs-localvocal] S 0, Token 3: 6038 edy p: 1.000 [keep: 1] 18:03:17.799: [obs-localvocal] S 0, Token 4: 281 to p: 0.406 [keep: 1] 18:03:17.799: [obs-localvocal] S 0, Token 5: 2838 nie p: 0.988 [keep: 1] 18:03:17.799: [obs-localvocal] S 0, Token 6: 3492 jest p: 0.999 [keep: 1] 18:03:17.799: [obs-localvocal] S 0, Token 7: 281 to p: 0.998 [keep: 1] 18:03:17.799: [obs-localvocal] S 0, Token 8: 13 . p: 0.984 [keep: 0] 18:03:17.799: [obs-localvocal] Time token found 50464 -> 1.980. Duration: 1.010. Ratio: 1.960. 18:03:17.799: [obs-localvocal] S 0, Token 9: 50464 [_TT_100] p: 0.979 [keep: 0] 18:03:17.799: [obs-localvocal] Decoded sentence: ' Wtedy to nie jest to' 18:03:18.171: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:18.172: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:18.172: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:18.175: [obs-localvocal] VAD detected no speech in 8777 frames 18:03:18.715: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:18.715: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:18.715: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:18.717: [obs-localvocal] VAD detected no speech in 8359 frames 18:03:19.260: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:19.260: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:19.261: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:19.264: [obs-localvocal] VAD detected no speech in 8777 frames 18:03:19.795: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:19.795: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:19.796: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:19.798: [obs-localvocal] VAD detected no speech in 8777 frames 18:03:20.334: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:20.334: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:20.335: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:20.338: [obs-localvocal] VAD segment 0. pushed 2496 to 8777 (6281 frames / 392 ms). current size: 25124 bytes / 6281 frames / 392 ms 18:03:20.338: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933564557 18:03:20.866: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:20.866: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:20.867: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:03:20.870: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 58564 bytes / 14641 frames / 915 ms 18:03:20.870: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933565080 18:03:21.403: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:21.403: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:21.404: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:21.407: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 92000 bytes / 23000 frames / 1437 ms 18:03:21.407: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933565602 18:03:21.938: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:21.938: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:21.939: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:21.941: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 127108 bytes / 31777 frames / 1986 ms 18:03:21.941: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933566151 18:03:22.480: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:22.480: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:22.481: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:22.485: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 162216 bytes / 40554 frames / 2534 ms 18:03:22.485: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933566699 18:03:23.020: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:23.020: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:23.021: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:23.024: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 195652 bytes / 48913 frames / 3057 ms 18:03:23.024: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933567222 18:03:23.557: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:23.557: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:23.558: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:23.561: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 230760 bytes / 57690 frames / 3605 ms 18:03:23.561: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933567770 18:03:24.095: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:24.095: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:24.095: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:03:24.098: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 264200 bytes / 66050 frames / 4128 ms 18:03:24.098: [obs-localvocal] end not reached. vad state: start ts: 18446742355933564191, end ts: 18446742355933568293 18:03:24.628: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:24.628: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:24.633: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:24.636: [obs-localvocal] VAD segment 0. pushed 0 to 7744 (7744 frames / 484 ms). current size: 295176 bytes / 73794 frames / 4612 ms 18:03:24.636: [obs-localvocal] VAD segment end -> send to inference 18:03:24.636: [obs-localvocal] run_whisper_inference: processing 74114 samples, 4.632 sec, 4 threads 18:03:24.816: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:24.817: [obs-localvocal] S 0, Token 1: 318 S p: 0.736 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 2: 1221 ł p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 3: 305 ow p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 4: 7155 anie p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 5: 274 d p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 6: 12679 nia p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 7: 261 w p: 0.576 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 8: 2843 rad p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 9: 5951 iu p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 10: 281 to p: 0.867 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 11: 15724 kę p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 12: 4010 fem p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] S 0, Token 13: 13 . p: 1.000 [keep: 1] 18:03:24.817: [obs-localvocal] Time token found 50514 -> 2.980. Duration: 4.632. Ratio: 1.554. 18:03:24.817: [obs-localvocal] S 0, Token 14: 50514 [_TT_150] p: 0.999 [keep: 0] 18:03:24.817: [obs-localvocal] S 0, Token 15: 50257 [_EOT_] p: 1.000 [keep: 0] 18:03:24.817: [obs-localvocal] Decoded sentence: ' Słowanie dnia w radiu tokę fem.' 18:03:25.184: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:25.184: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:25.185: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:25.187: [obs-localvocal] VAD detected no speech in 8777 frames 18:03:25.717: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:25.717: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:25.718: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:25.721: [obs-localvocal] VAD detected no speech in 8777 frames 18:03:26.264: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:26.264: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:26.265: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:26.268: [obs-localvocal] VAD detected no speech in 8777 frames 18:03:26.809: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:26.810: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:26.811: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:26.814: [obs-localvocal] VAD segment 0. pushed 3520 to 8359 (4839 frames / 302 ms). current size: 19356 bytes / 4839 frames / 302 ms 18:03:26.814: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933571009 18:03:27.347: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:27.347: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:27.347: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:27.350: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 54468 bytes / 13617 frames / 851 ms 18:03:27.350: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933571558 18:03:27.877: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:27.879: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:27.880: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:27.883: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 87904 bytes / 21976 frames / 1373 ms 18:03:27.883: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933572080 18:03:28.419: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:28.419: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:28.420: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:28.423: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 123012 bytes / 30753 frames / 1922 ms 18:03:28.423: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933572629 18:03:28.972: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:28.972: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:28.973: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:28.976: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 158120 bytes / 39530 frames / 2470 ms 18:03:28.976: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933573178 18:03:29.504: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:29.504: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:29.505: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:29.508: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 193228 bytes / 48307 frames / 3019 ms 18:03:29.508: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933573726 18:03:30.044: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:30.044: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:30.045: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:30.048: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 226664 bytes / 56666 frames / 3541 ms 18:03:30.048: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933574249 18:03:30.594: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:30.595: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:30.597: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:30.601: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 261772 bytes / 65443 frames / 4090 ms 18:03:30.601: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933574797 18:03:31.131: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:31.131: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:31.132: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:31.135: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 296884 bytes / 74221 frames / 4638 ms 18:03:31.135: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933575346 18:03:31.665: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:31.665: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:31.666: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:31.669: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 330320 bytes / 82580 frames / 5161 ms 18:03:31.669: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933575868 18:03:32.200: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:32.200: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:32.201: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:32.203: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 365428 bytes / 91357 frames / 5709 ms 18:03:32.203: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933576417 18:03:32.739: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:32.740: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:32.741: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:32.745: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 398864 bytes / 99716 frames / 6232 ms 18:03:32.745: [obs-localvocal] end not reached. vad state: start ts: 18446742355933570733, end ts: 18446742355933576939 18:03:33.281: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:33.281: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:33.282: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:33.285: [obs-localvocal] VAD segment 0. pushed 0 to 5120 (5120 frames / 320 ms). current size: 419344 bytes / 104836 frames / 6552 ms 18:03:33.285: [obs-localvocal] VAD segment end -> send to inference 18:03:33.285: [obs-localvocal] run_whisper_inference: processing 105156 samples, 6.572 sec, 4 threads 18:03:33.528: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:33.528: [obs-localvocal] S 0, Token 1: 376 M p: 0.999 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 2: 20642 ija p: 0.976 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 3: 1221 ł p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 4: 274 d p: 0.643 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 5: 28321 wor p: 0.999 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 6: 6522 czy p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 7: 74 k p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 8: 16586 bron p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 9: 72 i p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 10: 3244 się p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 11: 11 , p: 0.585 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 12: 39628 żad p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 13: 716 ne p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 14: 9459 dow p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 15: 843 ody p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 16: 2838 nie p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 17: 26366 były p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 18: 297 n p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 19: 23848 isz p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 20: 3689 cz p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 21: 546 one p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 22: 281 to p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 23: 710 z p: 0.933 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 24: 19601 asz p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 25: 1221 ł p: 0.998 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 26: 3611 od p: 0.998 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 27: 9726 wy p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 28: 89 z p: 0.970 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 29: 66 c p: 0.997 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 30: 1929 ane p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 31: 637 sp p: 1.000 [keep: 1] 18:03:33.528: [obs-localvocal] S 0, Token 32: 304 al p: 1.000 [keep: 1] 18:03:33.528: Last log entry repeated for 10 more lines 18:03:33.528: [obs-localvocal] Time token found 50688 -> 6.460. Duration: 6.572. Ratio: 1.017. 18:03:33.528: [obs-localvocal] S 0, Token 43: 50688 [_TT_324] p: 0.810 [keep: 0] 18:03:33.528: [obs-localvocal] Decoded sentence: ' Mijał dworczyk broni się, żadne dowody nie były niszczone to zaszł odwyzcane spalca polityczna ze msta jata' 18:03:33.528: [obs-localvocal] VAD segment 1. pushed 5120 to 8777 (3657 frames / 228 ms). current size: 14628 bytes / 3657 frames / 228 ms 18:03:33.528: [obs-localvocal] end not reached. vad state: start ts: 18446742355933577285, end ts: 0 18:03:33.798: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:33.798: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:33.799: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:33.801: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 48064 bytes / 12016 frames / 751 ms 18:03:33.801: [obs-localvocal] end not reached. vad state: start ts: 18446742355933577285, end ts: 18446742355933578010 18:03:34.332: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:34.332: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:34.333: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:03:34.335: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 81504 bytes / 20376 frames / 1273 ms 18:03:34.335: [obs-localvocal] VAD segment end -> send to inference 18:03:34.335: [obs-localvocal] run_whisper_inference: processing 20696 samples, 1.293 sec, 4 threads 18:03:34.499: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:34.499: [obs-localvocal] S 0, Token 1: 2604 tu p: 0.504 [keep: 1] 18:03:34.499: [obs-localvocal] S 0, Token 2: 1427 ż p: 1.000 [keep: 1] 18:03:34.499: [obs-localvocal] S 0, Token 3: 18334 przed p: 1.000 [keep: 1] 18:03:34.499: [obs-localvocal] S 0, Token 4: 4628 wy p: 0.968 [keep: 1] 18:03:34.499: [obs-localvocal] S 0, Token 5: 3918 bor p: 1.000 [keep: 1] 18:03:34.499: [obs-localvocal] S 0, Token 6: 4526 ami p: 1.000 [keep: 1] 18:03:34.499: [obs-localvocal] S 0, Token 7: 13 . p: 1.000 [keep: 0] 18:03:34.499: [obs-localvocal] Time token found 50414 -> 0.980. Duration: 1.293. Ratio: 1.319. 18:03:34.499: [obs-localvocal] S 0, Token 8: 50414 [_TT_50] p: 0.780 [keep: 0] 18:03:34.499: [obs-localvocal] Decoded sentence: ' tuż przed wyborami' 18:03:34.873: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:34.873: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:34.873: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:34.876: [obs-localvocal] VAD segment 0. pushed 3520 to 8777 (5257 frames / 328 ms). current size: 21028 bytes / 5257 frames / 328 ms 18:03:34.876: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933579081 18:03:35.411: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:35.411: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:35.412: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:35.415: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 56136 bytes / 14034 frames / 877 ms 18:03:35.415: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933579630 18:03:35.945: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:35.945: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:35.946: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:35.949: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 89572 bytes / 22393 frames / 1399 ms 18:03:35.949: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933580152 18:03:36.483: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:36.483: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:36.484: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:36.487: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 124680 bytes / 31170 frames / 1948 ms 18:03:36.487: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933580701 18:03:37.035: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:37.035: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:37.036: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:37.038: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 159788 bytes / 39947 frames / 2496 ms 18:03:37.038: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933581249 18:03:37.573: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:37.573: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:37.574: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:37.576: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 194900 bytes / 48725 frames / 3045 ms 18:03:37.577: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933581798 18:03:38.113: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:38.114: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:38.115: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:38.118: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 228336 bytes / 57084 frames / 3567 ms 18:03:38.118: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933582320 18:03:38.660: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:38.661: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:38.661: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:38.664: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 263444 bytes / 65861 frames / 4116 ms 18:03:38.664: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933582869 18:03:39.203: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:39.204: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:39.204: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:39.207: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 298552 bytes / 74638 frames / 4664 ms 18:03:39.207: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933583418 18:03:39.749: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:39.749: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:39.750: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:39.752: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 333660 bytes / 83415 frames / 5213 ms 18:03:39.752: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933583966 18:03:40.296: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:40.296: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:40.297: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:40.300: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 368768 bytes / 92192 frames / 5762 ms 18:03:40.300: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933584515 18:03:40.844: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:40.845: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:40.848: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:40.851: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 403876 bytes / 100969 frames / 6310 ms 18:03:40.852: [obs-localvocal] end not reached. vad state: start ts: 18446742355933578779, end ts: 18446742355933585063 18:03:41.382: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:41.383: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:41.383: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:03:41.386: [obs-localvocal] VAD segment 0. pushed 0 to 7744 (7744 frames / 484 ms). current size: 434852 bytes / 108713 frames / 6794 ms 18:03:41.387: [obs-localvocal] VAD segment end -> send to inference 18:03:41.387: [obs-localvocal] run_whisper_inference: processing 109033 samples, 6.815 sec, 4 threads 18:03:41.657: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:41.657: [obs-localvocal] S 0, Token 1: 9118 Tak p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 2: 26366 były p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 3: 7870 sz p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 4: 1023 ew p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 5: 2330 ka p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 6: 5248 ń p: 0.997 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 7: 66 c p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 8: 1663 era p: 0.956 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 9: 375 li p: 0.865 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 10: 29403 prem p: 0.612 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 11: 10609 iera p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 12: 714 po p: 0.993 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 13: 9015 są p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 14: 22630 pi p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 15: 459 su p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 16: 74 k p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 17: 474 and p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 18: 88 y p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 19: 11 , p: 0.939 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 20: 3546 dad p: 0.871 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 21: 89 z p: 0.864 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 22: 360 do p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 23: 12201 Europ p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 24: 289 ar p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 25: 4326 lam p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 26: 268 en p: 0.803 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 27: 11 , p: 0.866 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 28: 2604 tu p: 0.999 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 29: 2752 mi p: 1.000 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 30: 417 ch p: 0.999 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 31: 378 od p: 0.781 [keep: 1] 18:03:41.657: [obs-localvocal] S 0, Token 32: 86 w p: 0.974 [keep: 1] 18:03:41.657: Last log entry repeated for 23 more lines 18:03:41.657: [obs-localvocal] Time token found 50700 -> 6.700. Duration: 6.814. Ratio: 1.017. 18:03:41.657: [obs-localvocal] S 0, Token 56: 50700 [_TT_336] p: 0.497 [keep: 0] 18:03:41.657: [obs-localvocal] Decoded sentence: ' Tak były szewkańcera lipremiera po sąpi sukandy, dadz do Europarlamen, tu mi chodworczek, komentuje wniosek o uchlenie mój imunitetu' 18:03:41.921: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:41.921: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:41.922: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:41.924: [obs-localvocal] VAD segment 0. pushed 1472 to 8777 (7305 frames / 456 ms). current size: 29220 bytes / 7305 frames / 456 ms 18:03:41.924: [obs-localvocal] end not reached. vad state: start ts: 18446742355933585704, end ts: 18446742355933586134 18:03:42.458: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:42.458: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:42.459: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:42.461: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 64328 bytes / 16082 frames / 1005 ms 18:03:42.461: [obs-localvocal] end not reached. vad state: start ts: 18446742355933585704, end ts: 18446742355933586683 18:03:42.997: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:42.997: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:42.998: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:43.001: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 97764 bytes / 24441 frames / 1527 ms 18:03:43.001: [obs-localvocal] end not reached. vad state: start ts: 18446742355933585704, end ts: 18446742355933587205 18:03:43.534: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:43.534: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:43.535: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:43.538: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 132872 bytes / 33218 frames / 2076 ms 18:03:43.538: [obs-localvocal] end not reached. vad state: start ts: 18446742355933585704, end ts: 18446742355933587754 18:03:44.069: [obs-localvocal] vad based segmentation. currently 100308 bytes in the audio input buffer 18:03:44.069: [obs-localvocal] found 25077 frames from info buffer. 0 in overlap 18:03:44.069: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:44.072: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 166308 bytes / 41577 frames / 2598 ms 18:03:44.072: [obs-localvocal] end not reached. vad state: start ts: 18446742355933585704, end ts: 18446742355933588276 18:03:44.616: [obs-localvocal] vad based segmentation. currently 105328 bytes in the audio input buffer 18:03:44.616: [obs-localvocal] found 26332 frames from info buffer. 0 in overlap 18:03:44.617: [obs-localvocal] resampled: 2 channels, 8778 frames, 548.625000 ms 18:03:44.620: [obs-localvocal] VAD segment 0. pushed 0 to 8778 (8778 frames / 548 ms). current size: 201420 bytes / 50355 frames / 3147 ms 18:03:44.620: [obs-localvocal] end not reached. vad state: start ts: 18446742355933585704, end ts: 18446742355933588825 18:03:44.649: save_or_load_event_callback 1, 1273606138 18:03:44.650: obs save event 18:03:44.731: [rtmp stream: 'simple_stream'] User stopped the stream 18:03:44.732: Output 'simple_stream': stopping 18:03:44.732: Output 'simple_stream': Total frames output: 2312 18:03:44.732: Output 'simple_stream': Total drawn frames: 2454 18:03:44.733: stream_stopped_event 18:03:44.733: enabled: 0, is_streaming 0, streaming_output_enabled 1, streaming_transcripts_enabled 0, is_streaming_relevant: 0, is_recording 0, recording_output_enabled 0, recording_transcripts_enabled 0, is_recording_relevant: 0, is_virtualcam_on 0, virtualcam_transcripts_enabled 0, is_virtualcam_relevant 0, is_preview_open 0, is_text_output_relevant 0, scene_collection_name: , source: 'TOK', equal_settings 1, do_captioning 0 18:03:44.733: settings changed, disabling captioning 18:03:44.733: caption_output_writer_loop streaming done 18:03:44.737: ==== Streaming Stop ================================================ 18:03:44.746: [rtmp stream: 'multi-output'] User stopped the stream 18:03:44.749: Output 'multi-output': stopping 18:03:44.749: Output 'multi-output': Total frames output: 2433 18:03:44.749: Output 'multi-output': Total drawn frames: 2455 18:03:44.749: [obs-multi-rtmp] Release output while it is active. 18:03:45.153: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:45.153: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:45.153: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:45.155: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 236528 bytes / 59132 frames / 3695 ms 18:03:45.155: [obs-localvocal] end not reached. vad state: start ts: 18446742355933585704, end ts: 18446742355933589373 18:03:45.691: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:45.692: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:45.692: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:45.696: [obs-localvocal] VAD segment 0. pushed 0 to 4096 (4096 frames / 256 ms). current size: 252912 bytes / 63228 frames / 3951 ms 18:03:45.696: [obs-localvocal] VAD segment end -> send to inference 18:03:45.696: [obs-localvocal] run_whisper_inference: processing 63548 samples, 3.972 sec, 4 threads 18:03:45.913: [obs-localvocal] S 0, Token 0: 50364 [_BEG_] p: 1.000 [keep: 0] 18:03:45.914: [obs-localvocal] S 0, Token 1: 413 D p: 0.507 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 2: 8699 wie p: 0.995 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 3: 322 on p: 0.999 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 4: 1110 sk p: 0.692 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 5: 812 ó p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 6: 89 z p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 7: 7468 pode p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 8: 73 j p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 9: 81 r p: 0.962 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 10: 2904 zen p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 11: 4907 iem p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 12: 2839 ut p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 13: 47130 rud p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 14: 77 n p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 15: 952 ian p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 16: 654 ia p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 17: 8299 ś p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 18: 1493 led p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 19: 2682 zt p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 20: 4151 wa p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 21: 261 w p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 22: 22734 spraw p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 23: 414 ie p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 24: 991 tak p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 25: 710 z p: 0.921 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 26: 7916 wan p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 27: 1274 ę p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 28: 2784 ja p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 29: 612 fer p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 30: 10071 mail p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 31: 21091 owej p: 1.000 [keep: 1] 18:03:45.914: [obs-localvocal] S 0, Token 32: 13 . p: 1.000 [keep: 0] 18:03:45.914: [obs-localvocal] Time token found 50564 -> 3.980. Duration: 3.971. Ratio: 1.002. 18:03:45.914: [obs-localvocal] S 0, Token 33: 50564 [_TT_200] p: 0.987 [keep: 0] 18:03:45.914: [obs-localvocal] Decoded sentence: ' Dwie on skóz podejrzeniem utrudniania śledztwa w sprawie tak zwanę jafer mailowej' 18:03:45.914: [obs-localvocal] VAD segment 1. pushed 4096 to 8359 (4263 frames / 266 ms). current size: 17052 bytes / 4263 frames / 266 ms 18:03:45.914: [obs-localvocal] end not reached. vad state: start ts: 18446742355933589656, end ts: 0 18:03:46.227: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:46.227: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:46.227: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:46.229: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 52160 bytes / 13040 frames / 815 ms 18:03:46.229: [obs-localvocal] end not reached. vad state: start ts: 18446742355933589656, end ts: 18446742355933590444 18:03:46.765: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:46.766: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:46.766: [obs-localvocal] resampled: 2 channels, 8359 frames, 522.437500 ms 18:03:46.769: [obs-localvocal] VAD segment 0. pushed 0 to 8359 (8359 frames / 522 ms). current size: 85596 bytes / 21399 frames / 1337 ms 18:03:46.769: [obs-localvocal] end not reached. vad state: start ts: 18446742355933589656, end ts: 18446742355933590967 18:03:47.294: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:47.294: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:47.295: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:47.298: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 120704 bytes / 30176 frames / 1886 ms 18:03:47.298: [obs-localvocal] end not reached. vad state: start ts: 18446742355933589656, end ts: 18446742355933591515 18:03:47.838: [obs-localvocal] vad based segmentation. currently 100312 bytes in the audio input buffer 18:03:47.838: [obs-localvocal] found 25078 frames from info buffer. 0 in overlap 18:03:47.839: [obs-localvocal] resampled: 2 channels, 8360 frames, 522.500000 ms 18:03:47.841: [obs-localvocal] VAD segment 0. pushed 0 to 8360 (8360 frames / 522 ms). current size: 154144 bytes / 38536 frames / 2408 ms 18:03:47.841: [obs-localvocal] end not reached. vad state: start ts: 18446742355933589656, end ts: 18446742355933592038 18:03:48.377: [obs-localvocal] vad based segmentation. currently 105324 bytes in the audio input buffer 18:03:48.377: [obs-localvocal] found 26331 frames from info buffer. 0 in overlap 18:03:48.377: [obs-localvocal] resampled: 2 channels, 8777 frames, 548.562500 ms 18:03:48.379: [obs-localvocal] VAD segment 0. pushed 0 to 8777 (8777 frames / 548 ms). current size: 189252 bytes / 47313 frames / 2957 ms 18:03:48.379: [obs-localvocal] end not reached. vad state: start ts: 18446742355933589656, end ts: 18446742355933592586 18:03:48.718: ==== Shutting down ================================================== 18:03:48.719: save_or_load_event_callback 1, 1273606138 18:03:48.719: obs save event 18:03:48.723: [obs-localvocal] filter remove 18:03:48.754: [obs-localvocal] filter destroy 18:03:48.754: [obs-localvocal] shutdown_whisper_thread 18:03:48.771: [obs-localvocal] Whisper context is null, exiting thread 18:03:48.771: [obs-localvocal] exiting whisper thread 18:03:48.795: [obs-localvocal] Whisper context is null, exiting thread 18:03:48.795: [obs-localvocal] exiting whisper thread 18:03:48.875: All scene data cleared 18:03:48.875: ------------------------------------------------ 18:03:48.875: [obs-multi-rtmp] Save 4 targets, 0 video configs, 0 audio configs 18:03:48.876: [obs-multi-rtmp] Save config into /Users/stream360/Library/Application Support/obs-studio/basic/profiles/Bez tytułu/obs-multi-rtmp.json 18:03:48.876: obs_frontent_exiting, stopping captioner 18:03:48.877: obs_frontent_exiting done 18:03:48.937: [obs-localvocal] plugin unloaded 18:03:48.937: google_s2t_caption_plugin 0.28 obs_module_unload 18:03:48.939: [obs-websocket] [obs_module_unload] Shutting down... 18:03:48.939: Tried to call obs_frontend_remove_event_callback with no callbacks! 18:03:48.939: [obs-websocket] [obs_module_unload] Finished shutting down. 18:03:48.992: [Scripting] Total detached callbacks: 0 18:03:48.992: Freeing OBS context data 18:03:49.003: == Profiler Results ============================= 18:03:49.003: run_program_init: 1567.91 ms 18:03:49.003: ┣OBSApp::AppInit: 7.412 ms 18:03:49.003: ┃ ┗OBSApp::InitLocale: 2.062 ms 18:03:49.003: ┗OBSApp::OBSInit: 1445.41 ms 18:03:49.003: ┣obs_startup: 1.279 ms 18:03:49.003: ┗OBSBasic::OBSInit: 1340.66 ms 18:03:49.003: ┣OBSBasic::InitBasicConfig: 0.14 ms 18:03:49.003: ┣OBSBasic::ResetAudio: 0.077 ms 18:03:49.003: ┣OBSBasic::ResetVideo: 213.086 ms 18:03:49.003: ┃ ┗obs_init_graphics: 211.303 ms 18:03:49.003: ┃ ┗shader compilation: 178.262 ms 18:03:49.003: ┣OBSBasic::InitOBSCallbacks: 0.002 ms 18:03:49.003: ┣OBSBasic::InitHotkeys: 0.017 ms 18:03:49.003: ┣obs_load_all_modules2: 208.531 ms 18:03:49.003: ┃ ┣obs_init_module(aja-output-ui): 0.06 ms 18:03:49.003: ┃ ┣obs_init_module(aja): 0.036 ms 18:03:49.003: ┃ ┣obs_init_module(coreaudio-encoder): 0 ms 18:03:49.003: ┃ ┣obs_init_module(decklink-captions): 0 ms 18:03:49.003: ┃ ┣obs_init_module(decklink-output-ui): 0.001 ms 18:03:49.003: ┃ ┣obs_init_module(decklink): 2.763 ms 18:03:49.003: ┃ ┣obs_init_module(frontend-tools): 1.548 ms 18:03:49.003: ┃ ┣obs_init_module(image-source): 0.006 ms 18:03:49.003: ┃ ┣obs_init_module(mac-avcapture-legacy): 76.727 ms 18:03:49.003: ┃ ┣obs_init_module(mac-avcapture): 0.001 ms 18:03:49.003: ┃ ┣obs_init_module(mac-capture): 0.01 ms 18:03:49.003: ┃ ┣obs_init_module(mac-syphon): 0 ms 18:03:49.003: ┃ ┣obs_init_module(mac-videotoolbox): 0.005 ms 18:03:49.003: ┃ ┣obs_init_module(mac-virtualcam): 0.006 ms 18:03:49.003: ┃ ┣obs_init_module(obs-browser): 23.652 ms 18:03:49.003: ┃ ┣obs_init_module(obs-ffmpeg): 0.042 ms 18:03:49.003: ┃ ┣obs_init_module(obs-filters): 0.01 ms 18:03:49.003: ┃ ┣obs_init_module(obs-outputs): 0.001 ms 18:03:49.003: ┃ ┣obs_init_module(obs-transitions): 0.003 ms 18:03:49.003: ┃ ┣obs_init_module(obs-vst): 0.001 ms 18:03:49.003: ┃ ┣obs_init_module(obs-webrtc): 0.004 ms 18:03:49.003: ┃ ┣obs_init_module(obs-websocket): 2.304 ms 18:03:49.003: ┃ ┣obs_init_module(obs-x264): 0.001 ms 18:03:49.003: ┃ ┣obs_init_module(rtmp-services): 0.81 ms 18:03:49.003: ┃ ┣obs_init_module(text-freetype2): 0.006 ms 18:03:49.003: ┃ ┣obs_init_module(vlc-video): 1.426 ms 18:03:49.003: ┃ ┣obs_init_module(cloud-closed-captions): 0.032 ms 18:03:49.003: ┃ ┣obs_init_module(obs-localvocal): 0.019 ms 18:03:49.003: ┃ ┗obs_init_module(obs-multi-rtmp): 55.635 ms 18:03:49.003: ┣OBSBasic::InitService: 0.258 ms 18:03:49.003: ┣OBSBasic::ResetOutputs: 7.261 ms 18:03:49.003: ┣OBSBasic::CreateHotkeys: 0.021 ms 18:03:49.003: ┣OBSBasic::InitPrimitives: 0.038 ms 18:03:49.003: ┗OBSBasic::Load: 846.968 ms 18:03:49.003: obs_hotkey_thread(25 ms): min=0 ms, median=0 ms, max=0.196 ms, 99th percentile=0.001 ms, 100% below 25 ms 18:03:49.003: audio_thread(Audio): min=0.003 ms, median=0.344 ms, max=12.18 ms, 99th percentile=3.86 ms 18:03:49.003: ┗receive_audio: min=0.001 ms, median=0.325 ms, max=12.086 ms, 99th percentile=3.231 ms, 0.814269 calls per parent call 18:03:49.003: ┣buffer_audio: min=0 ms, median=0.001 ms, max=3.702 ms, 99th percentile=0.004 ms 18:03:49.003: ┗do_encode: min=0.157 ms, median=0.323 ms, max=12.075 ms, 99th percentile=3.116 ms 18:03:49.003: ┣encode(simple_aac): min=0.155 ms, median=0.305 ms, max=12.042 ms, 99th percentile=3.022 ms 18:03:49.003: ┗send_packet: min=0 ms, median=0.009 ms, max=3.046 ms, 99th percentile=0.032 ms, 1.99976 calls per parent call 18:03:49.003: obs_graphics_thread(16.6667 ms): min=0.06 ms, median=2.039 ms, max=684.275 ms, 99th percentile=4.965 ms, 99.9924% below 16.667 ms 18:03:49.003: ┣tick_sources: min=0 ms, median=0.012 ms, max=684.08 ms, 99th percentile=0.061 ms 18:03:49.003: ┣output_frame: min=0.057 ms, median=1.32 ms, max=5.318 ms, 99th percentile=3.336 ms 18:03:49.003: ┃ ┣gs_context(video->graphics): min=0.057 ms, median=1.203 ms, max=4.875 ms, 99th percentile=3.018 ms 18:03:49.003: ┃ ┃ ┣render_video: min=0.009 ms, median=0.431 ms, max=4.522 ms, 99th percentile=2.599 ms 18:03:49.003: ┃ ┃ ┃ ┣render_main_texture: min=0.008 ms, median=0.181 ms, max=2.358 ms, 99th percentile=0.91 ms 18:03:49.003: ┃ ┃ ┃ ┣render_convert_texture: min=0.029 ms, median=0.047 ms, max=2.485 ms, 99th percentile=0.339 ms, 0.817461 calls per parent call 18:03:49.003: ┃ ┃ ┃ ┗stage_output_texture: min=0.098 ms, median=0.215 ms, max=3.422 ms, 99th percentile=1.535 ms, 0.817461 calls per parent call 18:03:49.003: ┃ ┃ ┣gs_flush: min=0.006 ms, median=0.019 ms, max=1.508 ms, 99th percentile=0.39 ms 18:03:49.003: ┃ ┃ ┗download_frame: min=0 ms, median=0.743 ms, max=3.4 ms, 99th percentile=1.295 ms, 0.817461 calls per parent call 18:03:49.003: ┃ ┗output_video_data: min=0.07 ms, median=0.13 ms, max=1.715 ms, 99th percentile=0.487 ms, 0.81731 calls per parent call 18:03:49.003: ┗render_displays: min=0 ms, median=0.776 ms, max=6.787 ms, 99th percentile=2.169 ms 18:03:49.003: video_thread(video): min=0.418 ms, median=0.756 ms, max=20.259 ms, 99th percentile=2.52 ms 18:03:49.003: ┗receive_video: min=0.417 ms, median=0.755 ms, max=20.258 ms, 99th percentile=2.513 ms 18:03:49.003: ┗do_encode: min=0.417 ms, median=0.755 ms, max=20.258 ms, 99th percentile=2.513 ms 18:03:49.003: ┣encode(simple_video_stream): min=0.404 ms, median=0.742 ms, max=20.254 ms, 99th percentile=2.49 ms 18:03:49.003: ┗send_packet: min=0 ms, median=0.006 ms, max=0.767 ms, 99th percentile=0.024 ms, 1.99889 calls per parent call 18:03:49.003: ================================================= 18:03:49.003: == Profiler Time Between Calls ================== 18:03:49.003: obs_hotkey_thread(25 ms): min=25.026 ms, median=29.349 ms, max=39.543 ms, 5.43577% within ±2% of 25 ms (0% lower, 94.5642% higher) 18:03:49.003: obs_graphics_thread(16.6667 ms): min=14.605 ms, median=16.666 ms, max=684.294 ms, 76.0121% within ±2% of 16.667 ms (12.0544% lower, 11.9335% higher) 18:03:49.003: ================================================= 18:03:49.033: Number of memory leaks: 48