diff --git a/examples/assets/ttfa_2_vs_4_threads_llm_with_stt.png b/examples/assets/ttfa_2_vs_4_threads_llm_with_stt.png new file mode 100644 index 0000000..1591cf2 Binary files /dev/null and b/examples/assets/ttfa_2_vs_4_threads_llm_with_stt.png differ diff --git a/examples/assets/ttfs_2_vs_4_threads_llm_with_tts.png b/examples/assets/ttfs_2_vs_4_threads_llm_with_tts.png new file mode 100644 index 0000000..ebe370e Binary files /dev/null and b/examples/assets/ttfs_2_vs_4_threads_llm_with_tts.png differ diff --git a/examples/experimentals/voice_engine/environment.log b/examples/experimentals/voice_engine/environment.log new file mode 100644 index 0000000..2dde523 --- /dev/null +++ b/examples/experimentals/voice_engine/environment.log @@ -0,0 +1,477 @@ +2024-11-13 14:01:53,569 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:01:58,586 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:01:58,596 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:01:58,598 - __main__ - INFO - STT process initialized with PID: 8415 +2024-11-13 14:01:58,599 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:01:58,602 - __main__ - ERROR - Failed to initialize LLM process: [Errno 13] Permission denied: '/data/data/com.termux/files/home/llama.cpp/llama-server' +2024-11-13 14:04:08,010 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:04:13,023 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:04:13,033 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:04:13,035 - __main__ - INFO - STT process initialized with PID: 8554 +2024-11-13 14:04:13,036 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:04:13,041 - __main__ - ERROR - Failed to initialize LLM process: [Errno 13] Permission denied: '/data/data/com.termux/files/home/llama.cpp/llama-server' +2024-11-13 14:24:33,136 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:24:38,165 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:24:38,175 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:24:38,178 - __main__ - INFO - STT process initialized with PID: 10850 +2024-11-13 14:24:38,178 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:24:48,265 - __main__ - INFO - LLM 
process initialized with PID: 10859 +2024-11-13 14:24:48,293 - __main__ - INFO - Process 10850 terminated gracefully. +2024-11-13 14:24:48,469 - __main__ - INFO - Process 10859 terminated gracefully. +2024-11-13 14:31:24,457 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:31:29,481 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:31:29,494 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:31:29,497 - __main__ - INFO - STT process initialized with PID: 16751 +2024-11-13 14:31:29,497 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:31:39,506 - __main__ - INFO - LLM process initialized with PID: 16760 +2024-11-13 14:31:39,535 - __main__ - INFO - Process 16751 terminated gracefully. +2024-11-13 14:31:39,699 - __main__ - INFO - Process 16760 terminated gracefully. +2024-11-13 14:35:13,569 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:35:18,593 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:35:18,604 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:35:18,606 - __main__ - INFO - STT process initialized with PID: 17760 +2024-11-13 14:35:18,606 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:35:28,622 - __main__ - INFO - LLM process initialized with PID: 17772 +2024-11-13 14:35:33,724 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:35:33,741 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 35 +2024-11-13 14:35:33,757 - __main__ - INFO - Process 17760 terminated gracefully. +2024-11-13 14:35:33,972 - __main__ - INFO - Process 17772 terminated gracefully. +2024-11-13 14:40:59,905 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:41:04,929 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:41:04,943 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:41:04,945 - __main__ - INFO - STT process initialized with PID: 22547 +2024-11-13 14:41:04,945 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:41:14,958 - __main__ - INFO - LLM process initialized with PID: 22563 +2024-11-13 14:41:17,573 - __main__ - INFO - Process 22547 terminated gracefully. 
+2024-11-13 14:41:17,739 - __main__ - INFO - Process 22563 terminated gracefully. +2024-11-13 14:42:04,177 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:42:09,203 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:42:09,211 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:42:09,213 - __main__ - INFO - STT process initialized with PID: 22741 +2024-11-13 14:42:09,213 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:42:19,223 - __main__ - INFO - LLM process initialized with PID: 22750 +2024-11-13 14:42:21,680 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:42:25,983 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:42:25,985 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:42:25,985 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:42:25,988 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:42:25,996 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:42:25,996 - __main__ - DEBUG - LLM response: +2024-11-13 14:42:27,441 - __main__ - INFO - TTFS: 5.778118371963501 +2024-11-13 14:42:35,438 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 14:44:10,904 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:44:15,222 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:44:15,231 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:44:15,236 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:44:15,242 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:44:15,254 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:44:15,261 - __main__ - DEBUG - LLM response: +2024-11-13 14:44:16,192 - __main__ - INFO - TTFS: 5.319575071334839 +2024-11-13 14:44:28,328 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 14:51:00,038 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:51:05,062 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:51:05,072 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:51:05,075 - __main__ - INFO - STT process initialized with PID: 24744 +2024-11-13 14:51:05,076 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 4 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:51:15,083 - __main__ - INFO - LLM process initialized with PID: 24753 +2024-11-13 14:51:16,671 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:51:20,765 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:51:20,766 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:51:20,767 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:51:20,772 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:51:20,781 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:51:20,782 - __main__ - DEBUG - LLM response: +2024-11-13 14:51:21,858 - __main__ - INFO - TTFS: 5.194319725036621 +2024-11-13 14:51:43,463 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 14:52:02,088 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:52:05,715 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:52:05,717 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:52:05,717 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:52:05,720 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:52:05,724 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:52:05,725 - __main__ - DEBUG - LLM response: +2024-11-13 14:52:06,926 - __main__ - INFO - TTFS: 4.842855453491211 +2024-11-13 14:52:29,029 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 14:55:20,037 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:55:25,059 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:55:25,070 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:55:25,073 - __main__ - INFO - STT process initialized with PID: 27355 +2024-11-13 14:55:25,073 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 4 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:55:35,084 - __main__ - INFO - LLM process initialized with PID: 27364 +2024-11-13 14:55:47,163 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:55:51,171 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:55:51,172 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:55:51,173 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:55:51,176 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:55:51,181 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:55:51,182 - __main__ - DEBUG - LLM response: +2024-11-13 14:55:52,203 - __main__ - INFO - TTFS: 5.0501768589019775 +2024-11-13 14:56:24,591 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 14:56:58,785 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:57:02,408 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:57:02,409 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:57:02,411 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:57:02,415 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:57:02,419 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:57:02,421 - __main__ - DEBUG - LLM response: +2024-11-13 14:57:03,427 - __main__ - INFO - TTFS: 4.6472227573394775 +2024-11-13 14:57:26,751 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 14:58:19,877 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 14:58:24,905 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:58:24,912 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 14:58:24,914 - __main__ - INFO - STT process initialized with PID: 28320 +2024-11-13 14:58:24,914 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 14:58:34,917 - __main__ - INFO - LLM process initialized with PID: 28329 +2024-11-13 14:58:44,587 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:58:48,728 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:58:48,733 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:58:48,736 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:58:48,740 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:58:48,745 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:58:48,746 - __main__ - DEBUG - LLM response: +2024-11-13 14:58:51,879 - __main__ - INFO - TTFS: 7.300859451293945 +2024-11-13 14:59:04,608 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 14:59:15,314 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 14:59:18,960 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 14:59:18,961 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 14:59:18,962 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 14:59:18,967 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 14:59:18,971 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 14:59:18,972 - __main__ - DEBUG - LLM response: +2024-11-13 14:59:19,879 - __main__ - INFO - TTFS: 4.58224630355835 +2024-11-13 14:59:33,417 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 15:00:03,367 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:00:07,011 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:00:07,013 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:00:07,014 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:00:07,018 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:00:07,025 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:00:07,026 - __main__ - DEBUG - LLM response: +2024-11-13 15:00:07,923 - __main__ - INFO - TTFS: 4.569139003753662 +2024-11-13 15:00:20,743 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 15:14:50,561 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 15:14:55,586 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:14:55,601 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 15:14:55,603 - __main__ - INFO - STT process initialized with PID: 7594 +2024-11-13 15:14:55,603 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 20 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 15:15:05,613 - __main__ - INFO - LLM process initialized with PID: 7610 +2024-11-13 15:15:11,880 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:15:16,129 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:15:16,133 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:15:16,135 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:15:16,142 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:15:16,168 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:15:16,169 - __main__ - DEBUG - LLM response: +2024-11-13 15:15:19,826 - __main__ - INFO - TTFS: 7.970261096954346 +2024-11-13 15:15:33,055 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 15:15:41,269 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:15:46,513 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:15:46,514 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:15:46,515 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:15:46,529 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:15:46,545 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:15:46,546 - __main__ - DEBUG - LLM response: +2024-11-13 15:15:47,750 - __main__ - INFO - TTFS: 6.495905160903931 +2024-11-13 15:16:07,495 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 15:19:53,329 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 15:19:58,353 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:19:58,369 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 15:19:58,372 - __main__ - INFO - STT process initialized with PID: 17073 +2024-11-13 15:19:58,372 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 15:20:08,382 - __main__ - INFO - LLM process initialized with PID: 17084 +2024-11-13 15:20:13,987 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:20:18,139 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:20:18,143 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:20:18,144 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:20:18,153 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:20:18,357 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:20:18,358 - __main__ - DEBUG - LLM response: +2024-11-13 15:20:19,192 - __main__ - INFO - TTFS: 5.216456651687622 +2024-11-13 15:20:46,262 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 15:20:51,005 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:20:54,997 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:20:55,003 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:20:55,005 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:20:55,010 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:20:55,020 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:20:55,021 - __main__ - DEBUG - LLM response: +2024-11-13 15:20:55,906 - __main__ - INFO - TTFS: 4.929144382476807 +2024-11-13 15:21:16,969 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 15:28:38,763 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 15:28:43,784 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:28:43,797 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 15:28:43,800 - __main__ - INFO - STT process initialized with PID: 19544 +2024-11-13 15:28:43,801 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 15:28:53,809 - __main__ - INFO - LLM process initialized with PID: 19555 +2024-11-13 15:28:53,837 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:28:57,890 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:28:57,891 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:28:57,892 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:28:57,895 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:28:57,899 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:28:57,899 - __main__ - DEBUG - LLM response: +2024-11-13 15:28:58,812 - __main__ - INFO - TTFS: 4.995412588119507 +2024-11-13 15:29:14,216 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 15:29:14,228 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:29:17,867 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:29:17,869 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:29:17,869 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:29:17,874 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:29:17,879 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:29:17,880 - __main__ - DEBUG - LLM response: +2024-11-13 15:29:18,787 - __main__ - INFO - TTFS: 4.568600654602051 +2024-11-13 15:29:30,814 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 15:29:30,827 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:29:34,473 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:29:34,475 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:29:34,476 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:29:34,480 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:29:34,486 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:29:34,487 - __main__ - DEBUG - LLM response: +2024-11-13 15:29:35,392 - __main__ - INFO - TTFS: 4.5740437507629395 +2024-11-13 15:29:46,359 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 15:29:50,053 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:29:54,103 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 46 +2024-11-13 15:29:54,105 - __main__ - DEBUG - STT response: STTResponse(text=' What is the value of 34 plus 53?\n') +2024-11-13 15:29:54,106 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:29:54,108 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:29:54,112 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:29:54,114 - __main__ - DEBUG - LLM response: +2024-11-13 15:29:55,016 - __main__ - INFO - TTFS: 4.9750142097473145 +2024-11-13 15:30:03,720 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 15:30:51,976 - __main__ - INFO - Process 19544 terminated gracefully. +2024-11-13 15:30:52,344 - __main__ - INFO - Process 19555 terminated gracefully. +2024-11-13 15:48:38,415 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 15:48:43,438 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:48:43,446 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 15:48:43,448 - __main__ - INFO - STT process initialized with PID: 31465 +2024-11-13 15:48:43,449 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 15:48:53,456 - __main__ - INFO - LLM process initialized with PID: 31481 +2024-11-13 15:48:56,556 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:49:00,762 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 42 +2024-11-13 15:49:00,766 - __main__ - DEBUG - STT response: STTResponse(text=' Give me 5 examples of colors\n') +2024-11-13 15:49:00,767 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:49:00,773 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:49:00,783 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:49:00,784 - __main__ - DEBUG - LLM response: +2024-11-13 15:49:01,575 - __main__ - INFO - TTFS: 5.037225723266602 +2024-11-13 15:49:27,828 - __main__ - DEBUG - Decode Thread Stopped. 
+2024-11-13 15:57:30,444 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 15:57:35,466 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:57:35,477 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 15:57:35,479 - __main__ - INFO - STT process initialized with PID: 4584 +2024-11-13 15:57:35,480 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 15:57:45,518 - __main__ - INFO - LLM process initialized with PID: 4597 +2024-11-13 15:57:52,592 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 15:57:57,229 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 42 +2024-11-13 15:57:57,235 - __main__ - DEBUG - STT response: STTResponse(text=' Give me 5 examples of colors\n') +2024-11-13 15:57:57,238 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 15:57:57,241 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 15:57:57,250 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 15:57:57,251 - __main__ - DEBUG - LLM response: +2024-11-13 15:57:58,101 - __main__ - INFO - TTFS: 5.519280195236206 +2024-11-13 15:58:20,259 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 16:01:23,932 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 16:56:38,215 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 16:56:43,240 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 16:56:43,249 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 16:56:43,252 - __main__ - INFO - STT process initialized with PID: 994 +2024-11-13 16:56:43,253 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 16:56:53,261 - __main__ - INFO - LLM process initialized with PID: 1033 +2024-11-13 16:57:38,944 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 16:57:42,571 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 42 +2024-11-13 16:57:42,576 - __main__ - DEBUG - STT response: STTResponse(text=' Give me 5 examples of colors\n') +2024-11-13 16:57:42,577 - __main__ - DEBUG - Decode Thread Started. 
+2024-11-13 16:57:42,578 - __main__ - DEBUG - TTS Thread Started +2024-11-13 16:57:42,583 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 16:57:42,596 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 16:57:42,597 - __main__ - DEBUG - LLM response: +2024-11-13 16:57:42,599 - __main__ - DEBUG - PIPER PID, 2253 +2024-11-13 16:57:44,621 - __main__ - INFO - TTFS: 5.689475059509277 +2024-11-13 16:57:45,908 - __main__ - DEBUG - Sending ----> Here are 5 examples of colors +2024-11-13 16:57:46,146 - __main__ - DEBUG - Sending ----> : + +2024-11-13 16:57:46,592 - __main__ - DEBUG - Sending ----> 1. +2024-11-13 16:57:47,474 - __main__ - DEBUG - Sending ----> Red +2. +2024-11-13 16:57:48,316 - __main__ - DEBUG - Sending ----> Blue +3. +2024-11-13 16:57:49,144 - __main__ - DEBUG - Sending ----> Yellow +4. +2024-11-13 16:57:49,958 - __main__ - DEBUG - Sending ----> Green +5. +2024-11-13 16:57:51,727 - __main__ - DEBUG - Sending ----> Purple + +I'm happy to give you more +2024-11-13 16:57:52,121 - __main__ - DEBUG - Sending ----> examples if +2024-11-13 16:57:52,923 - __main__ - DEBUG - Sending ----> you'd like! +2024-11-13 16:57:53,110 - __main__ - DEBUG - Received Stop Event At TTS +2024-11-13 16:57:53,112 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 17:01:28,067 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 17:01:31,681 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 42 +2024-11-13 17:01:31,682 - __main__ - DEBUG - STT response: STTResponse(text=' Give me 5 examples of colors\n') +2024-11-13 17:01:31,683 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 17:01:31,684 - __main__ - DEBUG - TTS Thread Started +2024-11-13 17:01:31,686 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 17:01:31,696 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 17:01:31,697 - __main__ - DEBUG - LLM response: +2024-11-13 17:01:31,697 - __main__ - DEBUG - PIPER PID, 3006 +2024-11-13 17:01:32,538 - __main__ - INFO - TTFS: 4.486324310302734 +2024-11-13 17:01:33,762 - __main__ - DEBUG - Sending ----> Here are 5 examples of colors +2024-11-13 17:01:36,713 - __main__ - DEBUG - Sending ----> +Blue +Red +Yellow +Green +Purple + +Let me know if +2024-11-13 17:01:38,077 - __main__ - DEBUG - Sending ----> you'd like me to provide more +2024-11-13 17:01:38,479 - __main__ - DEBUG - Sending ----> examples. +2024-11-13 17:01:39,057 - __main__ - DEBUG - Sending ----> + +Note: +2024-11-13 17:01:39,651 - __main__ - DEBUG - Sending ----> I can also +2024-11-13 17:01:40,259 - __main__ - DEBUG - Sending ----> give you information +2024-11-13 17:01:40,831 - __main__ - DEBUG - Sending ----> about the colors +2024-11-13 17:01:41,431 - __main__ - DEBUG - Sending ----> you choose if +2024-11-13 17:01:42,220 - __main__ - DEBUG - Sending ----> you'd like. 
+2024-11-13 17:01:42,404 - __main__ - DEBUG - Sending ----> For +2024-11-13 17:01:42,823 - __main__ - DEBUG - Sending ----> example, +2024-11-13 17:01:43,008 - __main__ - DEBUG - Sending ----> if +2024-11-13 17:01:43,811 - __main__ - DEBUG - Sending ----> you choose blue, +2024-11-13 17:01:44,610 - __main__ - DEBUG - Sending ----> I could provide information +2024-11-13 17:01:44,997 - __main__ - DEBUG - Sending ----> about different +2024-11-13 17:01:45,815 - __main__ - DEBUG - Sending ----> shades of blue, +2024-11-13 17:01:46,595 - __main__ - DEBUG - Sending ----> blue pigments, +2024-11-13 17:01:47,402 - __main__ - DEBUG - Sending ----> blue dyes, +2024-11-13 17:01:47,814 - __main__ - DEBUG - Sending ----> blue colors +2024-11-13 17:01:48,412 - __main__ - DEBUG - Sending ----> in nature, +2024-11-13 17:01:48,814 - __main__ - DEBUG - Sending ----> etc. +2024-11-13 17:01:49,622 - __main__ - DEBUG - Sending ----> Let me know if +2024-11-13 17:01:50,432 - __main__ - DEBUG - Sending ----> you'd like that +2024-11-13 17:01:50,651 - __main__ - DEBUG - Sending ----> information +2024-11-13 17:01:50,839 - __main__ - DEBUG - Sending ----> . +2024-11-13 17:01:51,851 - __main__ - DEBUG - Sending ----> + +Let me know if +2024-11-13 17:01:53,069 - __main__ - DEBUG - Sending ----> you have any other questions or +2024-11-13 17:01:53,271 - __main__ - DEBUG - Sending ----> if +2024-11-13 17:01:55,298 - __main__ - DEBUG - Sending ----> there's anything else I can help you with. +2024-11-13 17:01:56,543 - __main__ - DEBUG - Sending ----> + +Here are 5 more +2024-11-13 17:01:57,146 - __main__ - DEBUG - Sending ----> examples of colors +2024-11-13 17:01:57,949 - __main__ - DEBUG - Received Stop Event At TTS +2024-11-13 17:01:57,949 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 17:11:29,744 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-13 17:11:34,758 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 17:11:34,765 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-13 17:11:34,767 - __main__ - INFO - STT process initialized with PID: 9692 +2024-11-13 17:11:34,767 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Llama-3.2-3B-Instruct-Q4_0_4_4.gguf +2024-11-13 17:11:44,778 - __main__ - INFO - LLM process initialized with PID: 9728 +2024-11-13 17:11:47,775 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 17:11:52,281 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 42 +2024-11-13 17:11:52,283 - __main__ - DEBUG - STT response: STTResponse(text=' Give me 5 examples of colors\n') +2024-11-13 17:11:52,284 - __main__ - DEBUG - Decode Thread Started. 
+2024-11-13 17:11:52,285 - __main__ - DEBUG - TTS Thread Started +2024-11-13 17:11:52,287 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 17:11:52,351 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 17:11:52,353 - __main__ - DEBUG - LLM response: +2024-11-13 17:11:52,393 - __main__ - DEBUG - PIPER PID, 9901 +2024-11-13 17:11:54,178 - __main__ - INFO - TTFS: 6.445502042770386 +2024-11-13 17:11:54,559 - __main__ - DEBUG - Sending ----> 1. +2024-11-13 17:11:55,508 - __main__ - DEBUG - Sending ----> Red +2. +2024-11-13 17:11:56,447 - __main__ - DEBUG - Sending ----> Blue +3. +2024-11-13 17:11:57,295 - __main__ - DEBUG - Sending ----> Green +4. +2024-11-13 17:11:58,214 - __main__ - DEBUG - Sending ----> Yellow +5. +2024-11-13 17:12:00,611 - __main__ - DEBUG - Sending ----> Purple + +Here are 5 examples of colors +2024-11-13 17:12:00,814 - __main__ - DEBUG - Sending ----> : + + +2024-11-13 17:12:01,343 - __main__ - DEBUG - Sending ----> 1. +2024-11-13 17:12:02,340 - __main__ - DEBUG - Sending ----> Red +2. +2024-11-13 17:12:03,231 - __main__ - DEBUG - Sending ----> Blue +3. +2024-11-13 17:12:04,225 - __main__ - DEBUG - Sending ----> Green +4. +2024-11-13 17:12:05,043 - __main__ - DEBUG - Sending ----> Yellow +5. +2024-11-13 17:12:07,347 - __main__ - DEBUG - Sending ----> Purple + +The text is written in **bold text**, +2024-11-13 17:12:07,776 - __main__ - DEBUG - Sending ----> indicating that +2024-11-13 17:12:08,601 - __main__ - DEBUG - Sending ----> the examples of colors +2024-11-13 17:12:09,874 - __main__ - DEBUG - Sending ----> are in **bold**. +2024-11-13 17:12:11,158 - __main__ - DEBUG - Sending ----> Here is the revised text: + + + +2024-11-13 17:12:13,072 - __main__ - DEBUG - Sending ----> +1. +2024-11-13 17:12:14,966 - __main__ - DEBUG - Sending ----> Red +2. +2024-11-13 17:12:16,841 - __main__ - DEBUG - Sending ----> Blue +3. +2024-11-13 17:12:18,755 - __main__ - DEBUG - Sending ----> Green +4. +2024-11-13 17:12:20,643 - __main__ - DEBUG - Sending ----> Yellow +5. +2024-11-13 17:12:22,311 - __main__ - DEBUG - Decode Thread Stopped. +2024-11-13 17:12:22,311 - __main__ - DEBUG - Received Stop Event At TTS +2024-11-13 17:12:30,693 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-13 17:12:34,397 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /inference HTTP/11" 200 42 +2024-11-13 17:12:34,398 - __main__ - DEBUG - STT response: STTResponse(text=' Give me 5 examples of colors\n') +2024-11-13 17:12:34,399 - __main__ - DEBUG - Decode Thread Started. +2024-11-13 17:12:34,400 - __main__ - DEBUG - TTS Thread Started +2024-11-13 17:12:34,402 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8081 +2024-11-13 17:12:34,407 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8081 "POST /completion HTTP/11" 200 None +2024-11-13 17:12:34,408 - __main__ - DEBUG - LLM response: +2024-11-13 17:12:34,412 - __main__ - DEBUG - PIPER PID, 10312 +2024-11-13 17:12:35,239 - __main__ - INFO - TTFS: 4.553896427154541 +2024-11-13 17:12:36,490 - __main__ - DEBUG - Sending ----> Here are 5 examples of colors +2024-11-13 17:12:36,692 - __main__ - DEBUG - Sending ----> : + +2024-11-13 17:12:37,099 - __main__ - DEBUG - Sending ----> 1. +2024-11-13 17:12:37,950 - __main__ - DEBUG - Sending ----> Blue +2. +2024-11-13 17:12:38,772 - __main__ - DEBUG - Sending ----> Green +3. 
+2024-11-13 17:12:39,621 - __main__ - DEBUG - Sending ----> Yellow +4. +2024-11-13 17:12:40,442 - __main__ - DEBUG - Sending ----> Red +5. +2024-11-13 17:12:41,690 - __main__ - DEBUG - Sending ----> Purple + +Let me know if +2024-11-26 16:50:29,837 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-26 16:50:34,876 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-26 16:50:34,893 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-26 16:50:34,896 - __main__ - INFO - STT process initialized with PID: 9648 +2024-11-26 16:50:34,896 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Qwen2.5-3.1B-Q4_0_4_4.gguf +2024-11-26 16:50:44,908 - __main__ - INFO - LLM process initialized with PID: 9660 +2024-11-26 16:50:47,295 - __main__ - INFO - Process 9648 terminated gracefully. +2024-11-26 16:50:47,706 - __main__ - INFO - Process 9660 terminated gracefully. +2024-11-26 17:04:19,477 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-26 17:04:24,525 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-26 17:04:24,545 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-26 17:04:24,549 - __main__ - INFO - STT process initialized with PID: 11898 +2024-11-26 17:04:24,550 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Qwen2.5-3.1B-Q4_0_4_4.gguf +2024-11-26 17:04:34,563 - __main__ - INFO - LLM process initialized with PID: 11983 +2024-11-26 17:25:45,391 - __main__ - INFO - Process 11898 terminated gracefully. +2024-11-26 17:25:45,598 - __main__ - INFO - Process 11983 terminated gracefully. +2024-11-26 17:25:49,938 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/whisper.cpp/server -t 4 -p 1 -ng -fa --port 8080 -m /data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin +2024-11-26 17:25:54,983 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1:8080 +2024-11-26 17:25:54,998 - urllib3.connectionpool - DEBUG - http://127.0.0.1:8080 "POST /load HTTP/11" 200 43 +2024-11-26 17:25:55,001 - __main__ - INFO - STT process initialized with PID: 13518 +2024-11-26 17:25:55,002 - __main__ - INFO - Initializing environment with command: /data/data/com.termux/files/home/llama.cpp/llama-server -t 2 -b 8192 -ub 512 -n 128 -c 2048 -fa --port 8081 -m /data/data/com.termux/files/home/models/Qwen2.5-3.1B-Q4_0_4_4.gguf +2024-11-26 17:26:05,016 - __main__ - INFO - LLM process initialized with PID: 13573 +2024-11-26 17:28:59,360 - __main__ - INFO - Process 13518 terminated gracefully. +2024-11-26 17:28:59,468 - __main__ - INFO - Process 13573 terminated gracefully. 
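Note on configuration: both the patched `main.py` and the new `main_android.py` point `DEFAULT_CONFIG` at a recipe file (`recipe/rpi5.yaml`) that is not included in this diff. As a rough, hypothetical sketch of the structure that recipe is expected to have (field names are taken from the `EngineEnvironmentConfig`, `STTEnvironmentConfig`, `LLMEnvironmentConfig`, and `TTSEnvironmentConfig` dataclasses in the patch below; every path and value here is a placeholder, not the real recipe contents), the same configuration can be built programmatically via `from_dict`:

```python
# Hypothetical sketch of the recipe structure consumed by the voice engine.
# Field names come from the dataclasses added in this diff; every path and
# numeric value below is a placeholder, not the contents of the real rpi5.yaml.
from main import EngineEnvironmentConfig  # assumes running from the voice_engine directory

recipe = {
    "stt": {  # whisper.cpp server settings (STTEnvironmentConfig)
        "n_threads": 4,
        "n_procs": 1,
        "gpu": False,
        "flash_attn": True,
        "port": 8080,
        "model": "/path/to/ggml-tiny-q4_0.bin",        # placeholder
        "_executable": "/path/to/whisper.cpp/server",  # placeholder
    },
    "llm": {  # llama.cpp server settings (LLMEnvironmentConfig)
        "n_threads": 2,
        "batch_size": 8192,
        "ubatch_size": 512,
        "n_predict": 128,
        "stream": True,
        "port": 8081,
        "model": "/path/to/model.gguf",                    # placeholder
        "_executable": "/path/to/llama.cpp/llama-server",  # placeholder
    },
    "tts": {  # Piper settings (TTSEnvironmentConfig)
        "voice": True,
        "model": "en_US-lessac-medium",
        "length_scale": 1.5,
    },
    "log_path": "environment.log",
}

config = EngineEnvironmentConfig.from_dict(recipe)
```

`EngineEnvironmentConfig.from_yaml` loads the equivalent YAML file, with one top-level key per nested dataclass (`stt`, `llm`, `tts`) plus `log_path`.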
diff --git a/examples/experimentals/voice_engine/main.py b/examples/experimentals/voice_engine/main.py index 1861cf9..208781b 100644 --- a/examples/experimentals/voice_engine/main.py +++ b/examples/experimentals/voice_engine/main.py @@ -13,9 +13,12 @@ import requests import logging import time +import os +import re +import psutil LOGGER = None -DEFAULT_CONFIG = "recipe/default.yaml" +DEFAULT_CONFIG = "nyuntam/examples/experimentals/voice_engine/recipe/rpi5.yaml" def set_logger(*args, **kwargs): @@ -158,10 +161,22 @@ def get_options(self): return cmd +@dataclass +class TTSEnvironmentConfig(EnvironmentConfig, EnvironmentConfigMeta): + voice: bool = field(default=False) + model: str = field(default="en_US-lessac-medium") + length_scale: float = field(default=1.5) + + def get_options(self): + SPACE = " " + cmd = f"" + + @dataclass class EngineEnvironmentConfig(EnvironmentConfigMeta): stt: STTEnvironmentConfig = field(default_factory=STTEnvironmentConfig) llm: LLMEnvironmentConfig = field(default_factory=LLMEnvironmentConfig) + tts: TTSEnvironmentConfig = field(default_factory=TTSEnvironmentConfig) log_path: tp.Union[str, Path] = field(default="environment.log") def __post_init__(self): @@ -200,6 +215,11 @@ def parse_args(): ################################################## +class EnvironmentTypes(StrEnum): + STT = "stt" + LLM = "llm" + + @dataclass class STTInput: environment_config: STTEnvironmentConfig @@ -347,7 +367,7 @@ def init_handlers(self) -> None: # NOTE: When using chain of responsibility, initialize handlers here - def call(self, input: STTInput) -> EngineResponse: + def call(self, input: STTInput, ttsConfig) -> EngineResponse: assert isinstance(input, STTInput), "Input must be of type STTInput" tick = time.time() stt_response = STTResponse.from_response(call_stt_environment(input)) @@ -356,14 +376,30 @@ def call(self, input: STTInput) -> EngineResponse: llm_input = LLMInput.from_stt_response(self.config.llm, stt_response) if llm_input.stream: # implement stream response handling + tts_processing_queue = queue.Queue() decoded_streams = queue.Queue() stream_queue = queue.Queue() stop_event = threading.Event() + decode_thread = threading.Thread( target=decode_stream, - args=(stop_event, stream_queue, decoded_streams, True), + args=( + stop_event, + stream_queue, + decoded_streams, + tts_processing_queue, + True, + ), ) decode_thread.start() + + if ttsConfig.voice: + tts_processing_thread = threading.Thread( + target=create_tts_wav, + args=(stop_event, tts_processing_queue, ttsConfig), + ) + tts_processing_thread.start() + llm_input.data["stream"] = True ttfs = None response = call_llm_environment(llm_input) @@ -374,12 +410,16 @@ def call(self, input: STTInput) -> EngineResponse: if line: if ttfs is None: ttfs = time.time() - print("TTFS: ", ttfs - tick) + LOGGER.info(f"TTFS: {ttfs - tick}") stream_queue.put(line) tock = time.time() stop_event.set() + decode_thread.join() + if ttsConfig.voice: + tts_processing_thread.join() + llm_response = LLMResponse( text=decoded_streams_to_text(list(decoded_streams.queue)), streams=list(decoded_streams.queue), @@ -436,6 +476,7 @@ def initialize_environment(config: EnvironmentConfig): + config.get_options().split() + config.get_model_option().split() ) + LOGGER.info(f"Initializing environment with command: {' '.join(cmd)}") return subprocess.Popen(cmd) @@ -473,7 +514,9 @@ def decode_stream( stop_event: threading.Event, stream_queue: queue.Queue, decoded_streams: queue.Queue, + tts_processing_queue: queue.Queue, decode_and_print: bool = False, + 
decode_and_talk: bool = True, ): while not stop_event.is_set() or not stream_queue.empty(): try: @@ -483,21 +526,163 @@ def decode_stream( if line: json_response = json.loads(line.decode("utf-8").replace("data: ", "")) decoded_streams.put(json_response) - # if decode_and_print: - # # print decoded stream continously with flush - # print(json_response["content"], end="", flush=True) + if decode_and_talk: + tts_processing_queue.put(json_response["content"]) + except queue.Empty: pass # No data to process yet, continue +def create_tts_wav( + stop_event: threading.Event, + tts_processing_queue: queue.Queue, + ttsConfig, + # output_dir: str = "/home/piuser/voice/core/test-output", +): + + piper_process = subprocess.Popen( + [ + "piper", + "--model", + f"{ttsConfig.model}", + "--length-scale", + f"{ttsConfig.length_scale}", + "--output_raw", + ], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + universal_newlines=True, + ) + + piper_proc = psutil.Process(piper_process.pid) + + # Define FFmpeg command to stream the audio over HTTP + ffmpeg_command = [ + "ffmpeg", + "-f", + "s16le", # Input format (16-bit PCM, little-endian) + "-ar", + "22050", # Sample rate + "-ac", + "1", # Number of audio channels (mono) + "-i", + "-", # Input from stdin (output from Piper) + "-acodec", + "aac", # Audio codec (AAC) + "-ab", + "128k", # Audio bitrate + "-f", + "adts", # Output format + "-content_type", + "audio/aac", # Content type for the HTTP stream + "-listen", + "1", # Make FFmpeg act as a server + "http://0.0.0.0:8082/feed.aac", # Output URL + "-acodec", + "pcm_s16le", # Audio codec for WAV + # os.path.join(output_dir, "output.wav") # Output WAV file path + ] + + ffmpeg_process = subprocess.Popen( + ffmpeg_command, + stdin=piper_process.stdout, # Take input from Piper process + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + ) + + # Monitor FFmpeg HTTP stream for first byte + def monitor_stream(): + LOGGER.info("🟢 Starting FFMPEG 🟢") + stream_start_time = time.time() + url = "http://0.0.0.0:8082/feed.aac" + while True: + try: + with requests.get(url, stream=True, timeout=1) as response: + if response.status_code == 200: + # Record the time when the first byte is received + first_byte_time = time.time() - stream_start_time + LOGGER.info( + f"🔴 Time to receive first byte of audio: {first_byte_time:.2f} seconds" + ) + break + except requests.exceptions.RequestException as e: + time.sleep(1) + + threading.Thread(target=monitor_stream, daemon=True).start() + + buffer = "" + + try: + while not stop_event.is_set() or not tts_processing_queue.empty(): + if tts_processing_queue.qsize() > 0: + try: + # Get one item from the queue + text_part = tts_processing_queue.get(timeout=0.001) + buffer += text_part + # LOGGER.debug(f"BUFFER : {buffer}") + + # Check if the buffer contains a full sentence + if any( + delimiter in buffer + for delimiter in [".", "!", "?", ":", ";", ","] + ): + # Split the buffer into sentences + sentences = re.split(r"(?<=[.!?])\s+", buffer) + + # Keep the last partial sentence in the buffer + buffer = ( + sentences.pop() + if not re.search(r"[.!?]$", buffer) and len(sentences) > 1 + else "" + ) + + # Join the complete sentences and send to Piper + text = " ".join(sentences) + + # Measure peak memory usage of Piper process + if text: + try: + LOGGER.debug(f"Sending ----> {text}") + # Write the text to Piper's stdin + piper_process.stdin.write(f"{text}\n") + piper_process.stdin.flush() + + except BrokenPipeError: + LOGGER.error( + "BrokenPipeError: Piper process 
terminated unexpectedly." + ) + break + + except queue.Empty: + if stop_event.is_set(): + break + continue # No data to process yet, continue + + LOGGER.debug("Received Stop Event At TTS") + + except Exception as e: + LOGGER.debug(f"Unable to run TTS engine. Error message : {e}") + + finally: + piper_process.stdin.close() + for process in [piper_process, ffmpeg_process]: + if process and process.poll() is None: + process.terminate() + try: + process.wait(timeout=5) + except subprocess.TimeoutExpired: + process.kill() + + def decoded_streams_to_text(decoded_streams: tp.List[tp.Dict[str, tp.Any]]) -> str: return " ".join([stream["content"] for stream in decoded_streams]) - ################################################## ################################################## + def print_dict(d: dict, indent: int = 0): for k, v in d.items(): if isinstance(v, dict): @@ -528,7 +713,7 @@ def print_dict(d: dict, indent: int = 0): if user_input == "exit": break stt_input = STTInput(environment_config=config.stt, audio_path=user_input) - response = engine.call(stt_input) + response = engine.call(stt_input, config.tts) print_dict( { "latency": response.latency, @@ -539,4 +724,4 @@ def print_dict(d: dict, indent: int = 0): print(f"-" * 50) except Exception as e: engine.terminate() - raise e \ No newline at end of file + raise e diff --git a/examples/experimentals/voice_engine/main_android.py b/examples/experimentals/voice_engine/main_android.py new file mode 100644 index 0000000..db1159a --- /dev/null +++ b/examples/experimentals/voice_engine/main_android.py @@ -0,0 +1,690 @@ +from dataclasses import dataclass, field, asdict, is_dataclass, fields +from argparse import ArgumentParser +from contextlib import contextmanager +from enum import StrEnum +from pathlib import Path +import typing as tp +import subprocess +import queue +import threading +from abc import ABC, abstractmethod +import yaml +import json +import requests +import logging +import time +import os +import re +import psutil + +LOGGER = None +DEFAULT_CONFIG = "nyuntam/examples/experimentals/voice_engine/recipe/rpi5.yaml" + + +def set_logger(*args, **kwargs): + global LOGGER + logging.basicConfig(*args, **kwargs) + LOGGER = logging.getLogger(__name__) + + +################################################## +# Environment Configurations # +################################################## + + +class EnvironmentConfigMeta(ABC): + + @classmethod + def from_dict(cls, data: dict): + kwargs = {} + for field in fields(cls): + if field.name in data: + if is_dataclass(field.type): + kwargs[field.name] = field.type.from_dict(data[field.name]) + else: + kwargs[field.name] = data[field.name] + return cls(**kwargs) + + @classmethod + def from_yaml(cls, path: str): + with open(path, "r") as file: + data = yaml.safe_load(file) + return cls.from_dict(data) + + def to_dict(self): + return asdict(self) + + def to_yaml(self, path: str): + with open(path, "w") as file: + yaml.dump(self.to_dict(), file) + + +@dataclass +class EnvironmentConfig(ABC): + n_threads: int = field(default=4) + n_procs: int = field(default=1) + flash_attn: bool = field(default=True) + gpu: bool = field(default=False) + port: int = field(default=8080) + model: tp.Union[str, Path] = field(default="") + + _executable: tp.Union[str, Path] = field(default="") + _warmup: int = field(default=0) + + @property + def executable_path(self) -> str: + if isinstance(self._executable, Path): + return str(self._executable.absolute().resolve()) + return self._executable + + @property + def 
model_path(self) -> str: + if isinstance(self.model, Path): + return str(self.model.absolute().resolve()) + return self.model + + def get_model_option(self) -> str: + return f"-m {self.model_path}" + + @abstractmethod + def get_options(self) -> str: + pass + + def __post_init__(self): + assert self.n_threads in [4, 2, 1], "Number of threads must be 1, 2, or 4" + assert self.n_procs in [1, 2, 4], "Number of processes must be 1, 2, or 4" + assert self.port > 0, "Port number must be positive" + + +@dataclass +class STTEnvironmentConfig(EnvironmentConfig, EnvironmentConfigMeta): + """Environment configuration for whisper.cpp""" + + port: int = field(default=8080) + _warmup: int = field(default=5) + + def get_options(self): + SPACE = " " + flash_attn = "-fa" + gpu = "-ng" + threads = f"-t {self.n_threads}" + processes = f"-p {self.n_procs}" + port = f"--port {self.port}" + cmd = threads + cmd += SPACE + cmd += processes + cmd += SPACE + if not self.gpu: + cmd += gpu + cmd += SPACE + if self.flash_attn: + cmd += flash_attn + cmd += SPACE + cmd += port + cmd += SPACE + return cmd + + +@dataclass +class LLMEnvironmentConfig(EnvironmentConfig, EnvironmentConfigMeta): + """Environment configuration for llama.cpp""" + + batch_size: int = field(default=8192) + ubatch_size: int = field(default=512) + n_predict: int = field(default=-1) + stream: bool = field(default=True) + port: int = field(default=8081) + _warmup: int = field(default=15) + + def get_options(self): + SPACE = " " + flash_attn = "-fa" + threads = f"-t {self.n_threads}" + batch_size = f"-b {self.batch_size}" + ubatch_size = f"-ub {self.ubatch_size}" + port = f"--port {self.port}" + n_predict = f"-n {self.n_predict}" + context_length = f"-c 2048" + + cmd = threads + cmd += SPACE + cmd += batch_size + cmd += SPACE + cmd += ubatch_size + cmd += SPACE + cmd += n_predict + cmd += SPACE + cmd += context_length + cmd += SPACE + if self.flash_attn: + cmd += flash_attn + cmd += SPACE + cmd += port + cmd += SPACE + return cmd + + +@dataclass +class TTSEnvironmentConfig(EnvironmentConfig, EnvironmentConfigMeta): + voice: bool = field(default=False) + model: str = field(default="en_US-lessac-medium") + length_scale: float = field(default=1.5) + + def get_options(self): + SPACE = " " + cmd = f"" + + +@dataclass +class EngineEnvironmentConfig(EnvironmentConfigMeta): + stt: STTEnvironmentConfig = field(default_factory=STTEnvironmentConfig) + llm: LLMEnvironmentConfig = field(default_factory=LLMEnvironmentConfig) + tts: TTSEnvironmentConfig = field(default_factory=TTSEnvironmentConfig) + log_path: tp.Union[str, Path] = field(default="environment.log") + + def __post_init__(self): + set_logger( + filename=self.log_path, + level=logging.DEBUG, + format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", + ) + + +################################################## +# Argument Parsing Functions # +################################################## + + +def parse_args(): + # TODO: Add more arguments as necessary + parser = ArgumentParser(description="Environment Configuration Parser") + parser.add_argument( + "--config", + type=str, + default=None, + help="Path to the configuration file (.yaml)", + ) + parser.add_argument( + "--log", + type=str, + default="environment.log", + help="Path to the log file", + ) + return parser.parse_args() + + +################################################## +# Engine Class # +################################################## + + +class EnvironmentTypes(StrEnum): + STT = "stt" + LLM = "llm" + + +@dataclass +class 
STTInput: + environment_config: STTEnvironmentConfig + audio_path: tp.Union[str, Path] + data: tp.Optional[tp.Dict[str, tp.Any]] = None + + @property + def config(self) -> STTEnvironmentConfig: + return self.environment_config + + @property + def audio(self) -> str: + if isinstance(self.audio_path, Path): + return str(self.audio_path.absolute().resolve()) + return self.audio_path + + +@dataclass +class STTResponse: + text: str + + @classmethod + def from_response(cls, response: requests.Response): + if response.status_code != 200: + raise_exception_from_response(response) + return cls(response.json()["text"]) + + +def default_llm_input_data_factory(): + return { + "prompt": "", + "n_predict": -1, + "stream": True, + } + + +@dataclass +class LLMInput: + environment_config: LLMEnvironmentConfig + prompt: str + data: tp.Optional[tp.Dict[str, tp.Any]] = field( + default_factory=default_llm_input_data_factory + ) + + @property + def stream(self) -> bool: + if self.data is not None and "stream" in self.data: + return self.data["stream"] + return self.config.stream + + @stream.setter + def stream(self, value: bool): + if self.data is None: + self.data = {} + self.data["stream"] = value + + @property + def config(self) -> LLMEnvironmentConfig: + return self.environment_config + + @classmethod + def from_stt_response( + cls, environment_config: LLMEnvironmentConfig, stt_response: STTResponse + ): + return cls(environment_config, stt_response.text) + + def get_data(self): + return { + **self.data, + "prompt": self.prompt, + } + + +@dataclass +class LLMResponse: + text: str + streams: tp.List[tp.Dict[str, tp.Any]] = field(default_factory=list) + ttfs: float = 0.0 + stream: bool = False + + +@dataclass +class EngineInput: + stt_input: tp.Optional[STTInput] = None + llm_input: tp.Optional[LLMInput] = None + + +@dataclass +class EngineResponse: + stt_response: tp.Optional[STTResponse] = None + llm_response: tp.Optional[LLMResponse] = None + latency: float = 0.0 + stt_latency: float = 0.0 + + +class Engine: + def __init__(self, config: EngineEnvironmentConfig): + self.config = config + self.init_handlers() + + @property + def stt(self) -> tp.Optional[subprocess.Popen]: + if hasattr(self, "_stt_process"): + return self._stt_process + else: + return None + + @stt.setter + def stt(self, value: subprocess.Popen): + self._stt_process = value + + @property + def llm(self) -> tp.Optional[subprocess.Popen]: + if hasattr(self, "_llm_process"): + return self._llm_process + else: + return None + + @llm.setter + def llm(self, value: subprocess.Popen): + self._llm_process = value + + def init_handlers(self) -> None: + # Initialize stt + if self.stt is not None: + raise ValueError("STT process already initialized") + try: + self.stt = initialize_stt_environment(self.config.stt) + LOGGER.info(f"STT process initialized with PID: {self.stt.pid}") + except Exception as e: + LOGGER.error(f"Failed to initialize STT process: {e}") + raise e + + # Initialize llm + if self.llm is not None: + raise ValueError("LLM process already initialized") + try: + self.llm = initialize_llm_environment(self.config.llm) + if self.llm.poll() is not None: + raise Exception(f"LLM process failed to start: {self.llm.stderr}") + LOGGER.info(f"LLM process initialized with PID: {self.llm.pid}") + except Exception as e: + LOGGER.error(f"Failed to initialize LLM process: {e}") + raise e + + # NOTE: When using chain of responsibility, initialize handlers here + + def call(self, input: STTInput, ttsConfig) -> EngineResponse: + assert isinstance(input, 
STTInput), "Input must be of type STTInput" + tick = time.time() + stt_response = STTResponse.from_response(call_stt_environment(input)) + stt_latency = time.time() + LOGGER.debug(f"STT response: {stt_response}") + llm_input = LLMInput.from_stt_response(self.config.llm, stt_response) + if llm_input.stream: + # implement stream response handling + tts_processing_queue = queue.Queue() + decoded_streams = queue.Queue() + stream_queue = queue.Queue() + stop_event = threading.Event() + + decode_thread = threading.Thread( + target=decode_stream, + args=( + stop_event, + stream_queue, + decoded_streams, + tts_processing_queue, + True, + ), + ) + decode_thread.start() + + if ttsConfig.voice: + tts_processing_thread = threading.Thread( + target=create_tts_wav, + args=(stop_event, tts_processing_queue, ttsConfig), + ) + tts_processing_thread.start() + + llm_input.data["stream"] = True + ttfs = None + response = call_llm_environment(llm_input) + LOGGER.debug(f"LLM response: {response}") + if not response.ok or response.status_code != 200: + raise_exception_from_response(response) + for line in response.iter_lines(): + if line: + if ttfs is None: + ttfs = time.time() + LOGGER.info(f"TTFS: {ttfs - tick}") + + stream_queue.put(line) + tock = time.time() + stop_event.set() + + decode_thread.join() + if ttsConfig.voice: + tts_processing_thread.join() + + llm_response = LLMResponse( + text=decoded_streams_to_text(list(decoded_streams.queue)), + streams=list(decoded_streams.queue), + ttfs=ttfs - tick, + stream=True, + ) + return EngineResponse( + stt_response=stt_response, + llm_response=llm_response, + latency=tock - tick, + stt_latency=stt_latency - tick, + ) + else: + raise NotImplementedError("Non-streaming response handling not implemented") + # NOTE: When using chain of responsibility, call handlers here + + def terminate(self): + if self.stt is not None: + kill_process(self.stt) + if self.llm is not None: + kill_process(self.llm) + + +################################################## +# Utility Functions # +################################################## + + +@contextmanager +def warmup_environment(warmup_time: int): + yield + time.sleep(warmup_time) + + +def raise_exception_from_response(response: requests.Response): + LOGGER.error( + f"API call failed with status code: {response.status_code}, response: {response.text}" + ) + raise Exception( + f"API call failed with status code: {response.status_code}, response: {response.text}" + ) + + +def kill_process(process: subprocess.Popen): + process.terminate() + process.wait() + LOGGER.info(f"Process {process.pid} terminated gracefully.") + + +# Set CPU affinity for the entire process +def process_affinity(process_id, affinity_cores): + p = psutil.Process(process_id) + p.cpu_affinity(affinity_cores) + + +def initialize_environment(config: EnvironmentConfig): + with warmup_environment(config._warmup): + cmd: tp.List[str] = ( + [config.executable_path] + + config.get_options().split() + + config.get_model_option().split() + ) + + LOGGER.info(f"Initializing environment with command: {' '.join(cmd)}") + return subprocess.Popen(cmd) + + +def initialize_stt_environment(config: STTEnvironmentConfig): + proc = initialize_environment(config=config) + url = f"http://127.0.0.1:{config.port}/load" + data = {"model": config.model_path} + response = requests.post(url, json=data) + if not response.ok or response.status_code != 200: + LOGGER.error(f"Failed to load STT model.") + raise_exception_from_response(response) + return proc + + +initialize_llm_environment: 
tp.Callable[[LLMEnvironmentConfig], subprocess.Popen] = ( + initialize_environment +) + + +def call_stt_environment(input: STTInput): + url = f"http://127.0.0.1:{input.config.port}/inference" + files = {"file": open(input.audio, "rb")} + data = input.data + return requests.post(url, files=files, data=data) + + +def call_llm_environment(input: LLMInput): + url = f"http://127.0.0.1:{input.config.port}/completion" + data = input.get_data() + return requests.post(url, json=data, stream=input.stream) + + +def decode_stream( + stop_event: threading.Event, + stream_queue: queue.Queue, + decoded_streams: queue.Queue, + tts_processing_queue: queue.Queue, + decode_and_print: bool = False, + decode_and_talk: bool = True, +): + LOGGER.debug("Decode Thread Started.") + while not stop_event.is_set() or not stream_queue.empty(): + try: + line: bytearray = stream_queue.get( + timeout=0.001 + ) # Get stream data from queue + if line: + json_response = json.loads(line.decode("utf-8").replace("data: ", "")) + decoded_streams.put(json_response) + if decode_and_talk: + tts_processing_queue.put(json_response["content"]) + + except queue.Empty: + pass # No data to process yet, continue + LOGGER.debug("Decode Thread Stopped.") + + +def create_tts_wav( + stop_event: threading.Event, + tts_processing_queue: queue.Queue, + ttsConfig, + # output_dir: str = "/home/piuser/voice/core/test-output", +): + LOGGER.debug("TTS Thread Started") + piper_process = subprocess.Popen( + [ + "espeak" + ], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + universal_newlines=True, + ) + + LOGGER.debug(f"PIPER PID, {piper_process.pid}") + + piper_proc = psutil.Process(piper_process.pid) + + buffer = "" + + try: + while not stop_event.is_set() or not tts_processing_queue.empty(): + if tts_processing_queue.qsize() > 0: + try: + # Get one item from the queue + text_part = tts_processing_queue.get(timeout=0.001) + buffer += text_part + # LOGGER.debug(f"BUFFER : {buffer}") + + # Check if the buffer contains a full sentence + if any( + delimiter in buffer + for delimiter in [".", "!", "?", ":", ";", "," , "and", "but", "or", "nor", "for", "yet", "so", # Coordinating conjunctions + "after", "although", "as", "as if", "as long as", "as much as", "as soon as", "as though", + "because", "before", "by the time", "even if", "even though", "if", "if only", + "in case", "in order that", "lest", "once", "only if", "provided that", + "since", "so that", "than", "that", "though", "till", "unless", + "until", "when", "whenever", "where", "whereas", "wherever", "whether", + "while", # Subordinating conjunctions + "both", "either", "neither", "not only", "whether or not" # Correlative conjunctions + ] + ): + # Split the buffer into sentences + sentences = re.split(r"(?<=[.!?])\s+", buffer) + + # Keep the last partial sentence in the buffer + buffer = ( + sentences.pop() + if not re.search(r"[.!?]$", buffer) and len(sentences) > 1 + else "" + ) + + # Join the complete sentences and send to Piper + text = " ".join(sentences) + + # Measure peak memory usage of Piper process + if text: + try: + LOGGER.debug(f"Sending ----> {text}") + # Write the text to Piper's stdin + piper_process.stdin.write(f"{text}\n") + piper_process.stdin.flush() + + except BrokenPipeError: + LOGGER.error( + "BrokenPipeError: Piper process terminated unexpectedly." 
+ ) + break + + except queue.Empty: + if stop_event.is_set(): + break + continue # No data to process yet, continue + + LOGGER.debug("Received Stop Event At TTS") + + except Exception as e: + LOGGER.debug(f"Unable to run TTS engine. Error message : {e}") + + finally: + piper_process.stdin.close() + for process in [piper_process, ffmpeg_process]: + if process and process.poll() is None: + process.terminate() + try: + process.wait(timeout=5) + LOGGER.debug("TTS thread Stopped.") + except subprocess.TimeoutExpired: + process.kill() + + +def decoded_streams_to_text(decoded_streams: tp.List[tp.Dict[str, tp.Any]]) -> str: + return " ".join([stream["content"] for stream in decoded_streams]) + + +################################################## +################################################## + + +def print_dict(d: dict, indent: int = 0): + for k, v in d.items(): + if isinstance(v, dict): + print(" " * indent, f" - {k}:") + print_dict(v, indent + 4) + else: + print(" " * indent, f" - {k}: {v}") + + +if __name__ == "__main__": + args = parse_args() + if args.config: + config = EngineEnvironmentConfig.from_yaml(args.config) + print_dict(config.to_dict()) + + else: + config = EngineEnvironmentConfig() + config.to_yaml("/home/piuser/edge/recipe/default.yaml") + + engine = Engine(config) + + try: + while True: + # input an audio file path from the user + user_input = input("Enter the path to the audio file: ") + if user_input == "": + user_input = "/home/piuser/shwu/audio_samples/5sec/79833.wav" + if user_input == "exit": + break + stt_input = STTInput(environment_config=config.stt, audio_path=user_input) + response = engine.call(stt_input, config.tts) + print_dict( + { + "latency": response.latency, + "ttfs": response.llm_response.ttfs, + "stt_latency": response.stt_latency, + } + ) + print(f"-" * 50) + except Exception as e: + engine.terminate() + raise e diff --git a/examples/experimentals/voice_engine/main_android_continous.py b/examples/experimentals/voice_engine/main_android_continous.py new file mode 100644 index 0000000..78d5cda --- /dev/null +++ b/examples/experimentals/voice_engine/main_android_continous.py @@ -0,0 +1,721 @@ +from dataclasses import dataclass, field, asdict, is_dataclass, fields +from argparse import ArgumentParser +from contextlib import contextmanager +from enum import StrEnum +from pathlib import Path +import typing as tp +import subprocess +import queue +import threading +from abc import ABC, abstractmethod +import yaml +import json +import requests +import logging +import time +import os +import re +import psutil +from receive_audio import receive_audio + +LOGGER = None +DEFAULT_CONFIG = "nyuntam/examples/experimentals/voice_engine/recipe/rpi5.yaml" + + +def set_logger(*args, **kwargs): + global LOGGER + logging.basicConfig(*args, **kwargs) + LOGGER = logging.getLogger(__name__) + + +################################################## +# Environment Configurations # +################################################## + + +class EnvironmentConfigMeta(ABC): + + @classmethod + def from_dict(cls, data: dict): + kwargs = {} + for field in fields(cls): + if field.name in data: + if is_dataclass(field.type): + kwargs[field.name] = field.type.from_dict(data[field.name]) + else: + kwargs[field.name] = data[field.name] + return cls(**kwargs) + + @classmethod + def from_yaml(cls, path: str): + with open(path, "r") as file: + data = yaml.safe_load(file) + return cls.from_dict(data) + + def to_dict(self): + return asdict(self) + + def to_yaml(self, path: str): + with open(path, 
"w") as file: + yaml.dump(self.to_dict(), file) + + +@dataclass +class EnvironmentConfig(ABC): + n_threads: int = field(default=4) + n_procs: int = field(default=1) + flash_attn: bool = field(default=True) + gpu: bool = field(default=False) + port: int = field(default=8080) + model: tp.Union[str, Path] = field(default="") + + _executable: tp.Union[str, Path] = field(default="") + _warmup: int = field(default=0) + + @property + def executable_path(self) -> str: + if isinstance(self._executable, Path): + return str(self._executable.absolute().resolve()) + return self._executable + + @property + def model_path(self) -> str: + if isinstance(self.model, Path): + return str(self.model.absolute().resolve()) + return self.model + + def get_model_option(self) -> str: + return f"-m {self.model_path}" + + @abstractmethod + def get_options(self) -> str: + pass + + def __post_init__(self): + assert self.n_threads in [4, 2, 1], "Number of threads must be 1, 2, or 4" + assert self.n_procs in [1, 2, 4], "Number of processes must be 1, 2, or 4" + assert self.port > 0, "Port number must be positive" + + +@dataclass +class STTEnvironmentConfig(EnvironmentConfig, EnvironmentConfigMeta): + """Environment configuration for whisper.cpp""" + + port: int = field(default=8080) + _warmup: int = field(default=5) + + def get_options(self): + SPACE = " " + flash_attn = "-fa" + gpu = "-ng" + threads = f"-t {self.n_threads}" + processes = f"-p {self.n_procs}" + port = f"--port {self.port}" + cmd = threads + cmd += SPACE + cmd += processes + cmd += SPACE + if not self.gpu: + cmd += gpu + cmd += SPACE + if self.flash_attn: + cmd += flash_attn + cmd += SPACE + cmd += port + cmd += SPACE + return cmd + + +@dataclass +class LLMEnvironmentConfig(EnvironmentConfig, EnvironmentConfigMeta): + """Environment configuration for llama.cpp""" + + batch_size: int = field(default=8192) + ubatch_size: int = field(default=512) + n_predict: int = field(default=-1) + stream: bool = field(default=True) + port: int = field(default=8081) + _warmup: int = field(default=15) + + def get_options(self): + SPACE = " " + flash_attn = "-fa" + threads = f"-t {self.n_threads}" + batch_size = f"-b {self.batch_size}" + ubatch_size = f"-ub {self.ubatch_size}" + port = f"--port {self.port}" + n_predict = f"-n {self.n_predict}" + context_length = f"-c 2048" + + cmd = threads + cmd += SPACE + cmd += batch_size + cmd += SPACE + cmd += ubatch_size + cmd += SPACE + cmd += n_predict + cmd += SPACE + cmd += context_length + cmd += SPACE + if self.flash_attn: + cmd += flash_attn + cmd += SPACE + cmd += port + cmd += SPACE + return cmd + + +@dataclass +class TTSEnvironmentConfig(EnvironmentConfig, EnvironmentConfigMeta): + voice: bool = field(default=False) + model: str = field(default="en_US-lessac-medium") + length_scale: float = field(default=1.5) + + def get_options(self): + SPACE = " " + cmd = f"" + + +@dataclass +class EngineEnvironmentConfig(EnvironmentConfigMeta): + stt: STTEnvironmentConfig = field(default_factory=STTEnvironmentConfig) + llm: LLMEnvironmentConfig = field(default_factory=LLMEnvironmentConfig) + tts: TTSEnvironmentConfig = field(default_factory=TTSEnvironmentConfig) + log_path: tp.Union[str, Path] = field(default="environment.log") + + def __post_init__(self): + set_logger( + filename=self.log_path, + level=logging.DEBUG, + format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", + ) + + +################################################## +# Argument Parsing Functions # +################################################## + + +def 
parse_args(): + # TODO: Add more arguments as necessary + parser = ArgumentParser(description="Environment Configuration Parser") + parser.add_argument( + "--config", + type=str, + default=None, + help="Path to the configuration file (.yaml)", + ) + parser.add_argument( + "--log", + type=str, + default="environment.log", + help="Path to the log file", + ) + return parser.parse_args() + + +################################################## +# Engine Class # +################################################## + + +class EnvironmentTypes(StrEnum): + STT = "stt" + LLM = "llm" + + +@dataclass +class STTInput: + environment_config: STTEnvironmentConfig + audio_path: tp.Union[str, Path] + data: tp.Optional[tp.Dict[str, tp.Any]] = None + + @property + def config(self) -> STTEnvironmentConfig: + return self.environment_config + + @property + def audio(self) -> str: + if isinstance(self.audio_path, Path): + return str(self.audio_path.absolute().resolve()) + return self.audio_path + + +@dataclass +class STTResponse: + text: str + + @classmethod + def from_response(cls, response: requests.Response): + if response.status_code != 200: + raise_exception_from_response(response) + return cls(response.json()["text"]) + + +def default_llm_input_data_factory(): + return { + "prompt": "", + "n_predict": -1, + "stream": True, + } + + +@dataclass +class LLMInput: + environment_config: LLMEnvironmentConfig + prompt: str + data: tp.Optional[tp.Dict[str, tp.Any]] = field( + default_factory=default_llm_input_data_factory + ) + + @property + def stream(self) -> bool: + if self.data is not None and "stream" in self.data: + return self.data["stream"] + return self.config.stream + + @stream.setter + def stream(self, value: bool): + if self.data is None: + self.data = {} + self.data["stream"] = value + + @property + def config(self) -> LLMEnvironmentConfig: + return self.environment_config + + @classmethod + def from_stt_response( + cls, environment_config: LLMEnvironmentConfig, stt_response: STTResponse + ): + return cls(environment_config, stt_response.text) + + def get_data(self): + self.prompt_qwen = f"<|im_start|>system: You are Qwen, a smart and intelligent smart assistant who can give clear and crisp answer to user. 
You do not hallucinate at all <|im_end|> <|im_start|>user: {self.prompt} <|im_end|> <|im_start|>assistant " + print(self.prompt_qwen) + return { + **self.data, + "prompt": self.prompt_qwen, + } + + +@dataclass +class LLMResponse: + text: str + streams: tp.List[tp.Dict[str, tp.Any]] = field(default_factory=list) + ttfs: float = 0.0 + stream: bool = False + + +@dataclass +class EngineInput: + stt_input: tp.Optional[STTInput] = None + llm_input: tp.Optional[LLMInput] = None + + +@dataclass +class EngineResponse: + stt_response: tp.Optional[STTResponse] = None + llm_response: tp.Optional[LLMResponse] = None + latency: float = 0.0 + stt_latency: float = 0.0 + + +class Engine: + def __init__(self, config: EngineEnvironmentConfig): + self.config = config + self.init_handlers() + + @property + def stt(self) -> tp.Optional[subprocess.Popen]: + if hasattr(self, "_stt_process"): + return self._stt_process + else: + return None + + @stt.setter + def stt(self, value: subprocess.Popen): + self._stt_process = value + + @property + def llm(self) -> tp.Optional[subprocess.Popen]: + if hasattr(self, "_llm_process"): + return self._llm_process + else: + return None + + @llm.setter + def llm(self, value: subprocess.Popen): + self._llm_process = value + + def init_handlers(self) -> None: + # Initialize stt + if self.stt is not None: + raise ValueError("STT process already initialized") + try: + self.stt = initialize_stt_environment(self.config.stt) + LOGGER.info(f"STT process initialized with PID: {self.stt.pid}") + except Exception as e: + LOGGER.error(f"Failed to initialize STT process: {e}") + raise e + + # Initialize llm + if self.llm is not None: + raise ValueError("LLM process already initialized") + try: + self.llm = initialize_llm_environment(self.config.llm) + if self.llm.poll() is not None: + raise Exception(f"LLM process failed to start: {self.llm.stderr}") + LOGGER.info(f"LLM process initialized with PID: {self.llm.pid}") + except Exception as e: + LOGGER.error(f"Failed to initialize LLM process: {e}") + raise e + + # NOTE: When using chain of responsibility, initialize handlers here + + def call(self, input: STTInput, ttsConfig) -> EngineResponse: + assert isinstance(input, STTInput), "Input must be of type STTInput" + tick = time.time() + stt_response = STTResponse.from_response(call_stt_environment(input)) + stt_latency = time.time() + LOGGER.debug(f"STT response: {stt_response}") + if (stt_response.text == '{"text": " "}' ) or (stt_response.text == '{"text": " "}') or (stt_response.text == '{"text": " "}' ) or (stt_response.text is None ) : + LOGGER.debug("Could not find any STT output for LLM") + return EngineResponse( + stt_response=stt_response, + ) + llm_input = LLMInput.from_stt_response(self.config.llm, stt_response) + if llm_input.stream: + # implement stream response handling + tts_processing_queue = queue.Queue() + decoded_streams = queue.Queue() + stream_queue = queue.Queue() + stop_event = threading.Event() + + decode_thread = threading.Thread( + target=decode_stream, + args=( + stop_event, + stream_queue, + decoded_streams, + tts_processing_queue, + True, + ), + ) + decode_thread.start() + + if ttsConfig.voice: + tts_processing_thread = threading.Thread( + target=create_tts_wav, + args=(stop_event, tts_processing_queue, ttsConfig), + ) + tts_processing_thread.start() + + llm_input.data["stream"] = True + ttfs = None + response = call_llm_environment(llm_input) + LOGGER.debug(f"LLM response: {response}") + if not response.ok or response.status_code != 200: + 
raise_exception_from_response(response) + for line in response.iter_lines(): + if line: + if ttfs is None: + ttfs = time.time() + LOGGER.info(f"TTFS: {ttfs - tick}") + + stream_queue.put(line) + tock = time.time() + stop_event.set() + + decode_thread.join() + if ttsConfig.voice: + tts_processing_thread.join() + + llm_response = LLMResponse( + text=decoded_streams_to_text(list(decoded_streams.queue)), + streams=list(decoded_streams.queue), + ttfs=ttfs - tick, + stream=True, + ) + return EngineResponse( + stt_response=stt_response, + llm_response=llm_response, + latency=tock - tick, + stt_latency=stt_latency - tick, + ) + else: + raise NotImplementedError("Non-streaming response handling not implemented") + # NOTE: When using chain of responsibility, call handlers here + + def terminate(self): + if self.stt is not None: + kill_process(self.stt) + if self.llm is not None: + kill_process(self.llm) + + +################################################## +# Utility Functions # +################################################## + + +@contextmanager +def warmup_environment(warmup_time: int): + yield + time.sleep(warmup_time) + + +def raise_exception_from_response(response: requests.Response): + LOGGER.error( + f"API call failed with status code: {response.status_code}, response: {response.text}" + ) + raise Exception( + f"API call failed with status code: {response.status_code}, response: {response.text}" + ) + + +def kill_process(process: subprocess.Popen): + process.terminate() + process.wait() + LOGGER.info(f"Process {process.pid} terminated gracefully.") + + +# Set CPU affinity for the entire process +def process_affinity(process_id, affinity_cores): + p = psutil.Process(process_id) + p.cpu_affinity(affinity_cores) + + +def initialize_environment(config: EnvironmentConfig): + with warmup_environment(config._warmup): + cmd: tp.List[str] = ( + [config.executable_path] + + config.get_options().split() + + config.get_model_option().split() + ) + + LOGGER.info(f"Initializing environment with command: {' '.join(cmd)}") + return subprocess.Popen(cmd) + + +def initialize_stt_environment(config: STTEnvironmentConfig): + proc = initialize_environment(config=config) + url = f"http://127.0.0.1:{config.port}/load" + data = {"model": config.model_path} + response = requests.post(url, json=data) + if not response.ok or response.status_code != 200: + LOGGER.error(f"Failed to load STT model.") + raise_exception_from_response(response) + return proc + + +initialize_llm_environment: tp.Callable[[LLMEnvironmentConfig], subprocess.Popen] = ( + initialize_environment +) + + +def call_stt_environment(input: STTInput): + url = f"http://127.0.0.1:{input.config.port}/inference" + files = {"file": open(input.audio, "rb")} + data = input.data + return requests.post(url, files=files, data=data) + + +def call_llm_environment(input: LLMInput): + url = f"http://127.0.0.1:{input.config.port}/completion" + data = input.get_data() + return requests.post(url, json=data, stream=input.stream) + + +def decode_stream( + stop_event: threading.Event, + stream_queue: queue.Queue, + decoded_streams: queue.Queue, + tts_processing_queue: queue.Queue, + decode_and_print: bool = False, + decode_and_talk: bool = True, +): + LOGGER.debug("Decode Thread Started.") + while not stop_event.is_set() or not stream_queue.empty(): + try: + line: bytearray = stream_queue.get( + timeout=0.001 + ) # Get stream data from queue + if line: + json_response = json.loads(line.decode("utf-8").replace("data: ", "")) + decoded_streams.put(json_response) + if 
decode_and_talk: + tts_processing_queue.put(json_response["content"]) + + except queue.Empty: + pass # No data to process yet, continue + LOGGER.debug("Decode Thread Stopped.") + + +def create_tts_wav( + stop_event: threading.Event, + tts_processing_queue: queue.Queue, + ttsConfig, + # output_dir: str = "/home/piuser/voice/core/test-output", +): + LOGGER.debug("TTS Thread Started") + piper_process = subprocess.Popen( + [ + "espeak" + ], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + universal_newlines=True, + ) + + LOGGER.debug(f"PIPER PID, {piper_process.pid}") + + piper_proc = psutil.Process(piper_process.pid) + + buffer = "" + + try: + while not stop_event.is_set() or not tts_processing_queue.empty(): + if tts_processing_queue.qsize() > 0: + try: + # Get one item from the queue + text_part = tts_processing_queue.get(timeout=0.001) + buffer += text_part + # LOGGER.debug(f"BUFFER : {buffer}") + + # Check if the buffer contains a full sentence + if any( + delimiter in buffer + for delimiter in [".", "!", "?", ":", ";", "," , "and", "but", "or", "nor", "for", "yet", "so", # Coordinating conjunctions + ] + ): + # Split the buffer into sentences + sentences = re.split(r"(?<=[.!?])\s+", buffer) + + # Keep the last partial sentence in the buffer + buffer = ( + sentences.pop() + if not re.search(r"[.!?]$", buffer) and len(sentences) > 1 + else "" + ) + + # Join the complete sentences and send to Piper + text = " ".join(sentences) + + # Measure peak memory usage of Piper process + if text: + try: + LOGGER.debug(f"Sending ----> {text}") + # Write the text to Piper's stdin + piper_process.stdin.write(f"{text}\n") + piper_process.stdin.flush() + + except BrokenPipeError: + LOGGER.error( + "BrokenPipeError: Piper process terminated unexpectedly." + ) + break + + except queue.Empty: + if stop_event.is_set(): + break + continue # No data to process yet, continue + + LOGGER.debug("Received Stop Event At TTS") + + except Exception as e: + LOGGER.debug(f"Unable to run TTS engine. Error message : {e}") + + finally: + piper_process.stdin.close() + for process in [piper_process, ffmpeg_process]: + if process and process.poll() is None: + process.terminate() + try: + process.wait(timeout=5) + LOGGER.debug("TTS thread Stopped.") + except subprocess.TimeoutExpired: + process.kill() + + +def decoded_streams_to_text(decoded_streams: tp.List[tp.Dict[str, tp.Any]]) -> str: + return " ".join([stream["content"] for stream in decoded_streams]) + + +################################################## +################################################## + + +def print_dict(d: dict, indent: int = 0): + for k, v in d.items(): + if isinstance(v, dict): + print(" " * indent, f" - {k}:") + print_dict(v, indent + 4) + else: + print(" " * indent, f" - {k}: {v}") + + + +import os +import time +from pathlib import Path + +def wait_for_audio_file(directory): + """ + Continuously watch for an audio file in the specified directory. + Returns the path to the audio file once found. 
+ """ + print(f"Watching directory: {directory}") + while True: + files = [f for f in os.listdir(directory) if f.endswith(".wav")] + if files: + # Assuming you want to process the first found file + file_path = os.path.join(directory, files[0]) + print(f"Found audio file: {file_path}") + return file_path + time.sleep(0.5) # Wait for 1 second before checking again + + +if __name__ == "__main__": + args = parse_args() + if args.config: + config = EngineEnvironmentConfig.from_yaml(args.config) + print_dict(config.to_dict()) + else: + config = EngineEnvironmentConfig() + config.to_yaml("/home/piuser/edge/recipe/default.yaml") + + engine = Engine(config) + + try: + while True: + # Directory to watch for audio files + audio_file_dir = "./received_audio.wav" + receive_audio(audio_file_dir) + # Continuously search for an audio file + #user_input = wait_for_audio_file(audio_file_dir) + user_input = "./received_audio.wav" + + # Execute the processing once the audio file is found + stt_input = STTInput(environment_config=config.stt, audio_path=user_input) + response = engine.call(stt_input, config.tts) + + + # print_dict( + # { + # "latency": response.latency, + # "ttfs": response.llm_response.ttfs, + # "stt_latency": response.stt_latency, + # } + # ) + print(f"-" * 50) + try: + os.remove(user_input) + print(f"Deleted processed file: {user_input}") + except OSError as e: + print(f"Error deleting file: {user_input}, {e}") + + except Exception as e: + engine.terminate() + raise \ No newline at end of file diff --git a/examples/experimentals/voice_engine/receive_audio.py b/examples/experimentals/voice_engine/receive_audio.py new file mode 100644 index 0000000..5086c02 --- /dev/null +++ b/examples/experimentals/voice_engine/receive_audio.py @@ -0,0 +1,134 @@ +import socket +import numpy as np +import pyaudio +import wave +import time + +def receive_audio(path='./received_audio.wav', + HOST='192.168.1.24', # Pico W's IP address + PORT=5000, + SAMPLE_RATE=16000, + CHANNELS=1, + FORMAT=pyaudio.paInt16, + CHUNK_SIZE=1600, + GRACE_PERIOD=5): # Grace period in seconds + """ + Receives audio data from the Pico W over TCP and saves it to a WAV file. + Initially blocks to wait for data, then becomes non-blocking for termination. 
+ """ + # Each sample is 2 bytes (16 bits) + BYTES_PER_SAMPLE = 2 # FIXED: Changed from 1 to 2 for 16-bit audio + TOTAL_SAMPLES = SAMPLE_RATE * CHANNELS * 5 # 5 seconds of audio + TOTAL_BYTES = TOTAL_SAMPLES * BYTES_PER_SAMPLE + + # Initialize PyAudio + p = pyaudio.PyAudio() + + # Create a stream to play audio + stream = p.open(format=FORMAT, + channels=CHANNELS, + rate=SAMPLE_RATE, + output=True, + frames_per_buffer=CHUNK_SIZE) # Added frames_per_buffer + + frames = [] # List to store audio frames + received_bytes = 0 # Counter for total bytes received + first_byte_received = False + last_data_time = time.time() # Tracks time of last received data + + with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: + print(f"Connecting to {HOST}:{PORT}...") + try: + s.connect((HOST, PORT)) + print("Connected to Pico W.") + except Exception as e: + print(f"Failed to connect: {e}") + return + + try: + data_buffer = b'' + s.setblocking(True) # Start with blocking mode + print("Waiting for first byte...") + + while True: + try: + # Receive data + data = s.recv(CHUNK_SIZE * BYTES_PER_SAMPLE) # Adjusted receive size + if data: + if not first_byte_received: + first_byte_received = True + print("First byte received, switching to non-blocking mode.") + s.setblocking(False) # Switch to non-blocking mode + + received_bytes += len(data) + last_data_time = time.time() # Reset the timeout timer + data_buffer += data + + # Process data in CHUNK_SIZE increments + while len(data_buffer) >= CHUNK_SIZE * BYTES_PER_SAMPLE: # Adjusted chunk check + chunk = data_buffer[:CHUNK_SIZE * BYTES_PER_SAMPLE] + data_buffer = data_buffer[CHUNK_SIZE * BYTES_PER_SAMPLE:] + + # Convert bytes to NumPy array + audio_data = np.frombuffer(chunk, dtype=np.int16) + + # Remove DC offset (optional) + # dc_offset = np.mean(audio_data) + # audio_data = audio_data - int(dc_offset) + + # # Apply gain to amplify the audio + # gain_factor = 2.0 + # audio_data = audio_data * gain_factor + + # Ensure we don't exceed the int16 range + audio_data = np.clip(audio_data, -32768, 32767).astype(np.int16) + + # Convert back to bytes + processed_data = audio_data.tobytes() + + # Write data to audio stream + stream.write(processed_data) + + # Append data to frames list + frames.append(processed_data) + + # Check if we have received enough data + if received_bytes >= TOTAL_BYTES: + print("Received enough audio data.") + break + + else: + # Non-blocking termination if no data is received + if time.time() - last_data_time > GRACE_PERIOD: + print("Grace period exceeded, terminating.") + break + + except BlockingIOError: + # Non-blocking mode will raise this if no data is available + if time.time() - last_data_time > GRACE_PERIOD: + print("No more data available during grace period, terminating.") + break + + finally: + print("Saving audio...") + if frames: + save_segment(frames, path, p, CHANNELS, FORMAT, SAMPLE_RATE) + else: + print("No frames captured.") + stream.stop_stream() + stream.close() + p.terminate() + print("Connection closed.") + +def save_segment(frames, path, p, CHANNELS, FORMAT, SAMPLE_RATE): + wf = wave.open(path, 'wb') + wf.setnchannels(CHANNELS) + wf.setsampwidth(p.get_sample_size(FORMAT)) + wf.setframerate(SAMPLE_RATE) + wf.writeframes(b''.join(frames)) + wf.close() + print(f"Audio segment saved to {path}") + +# Example usage +if __name__ == '__main__': + receive_audio() \ No newline at end of file diff --git a/examples/experimentals/voice_engine/recipe/recipe_android.yaml b/examples/experimentals/voice_engine/recipe/recipe_android.yaml 
new file mode 100644 index 0000000..689477c --- /dev/null +++ b/examples/experimentals/voice_engine/recipe/recipe_android.yaml @@ -0,0 +1,27 @@ +llm: + _executable: '/data/data/com.termux/files/home/llama.cpp/llama-server' + _warmup: 10 + batch_size: 8192 + flash_attn: true + gpu: false + model: '/data/data/com.termux/files/home/models/Qwen2.5-3.1B-Q4_0_4_4.gguf' + n_predict: 128 + n_procs: 1 + n_threads: 2 + port: 8081 + stream: true + ubatch_size: 512 +stt: + _executable: '/data/data/com.termux/files/home/whisper.cpp/server' + _warmup: 5 + flash_attn: true + gpu: false + model: '/data/data/com.termux/files/home/models/ggml-tiny-q4_0.bin' + n_procs: 1 + n_threads: 4 + port: 8080 +tts: + voice: true + model: "en_US-lessac-medium" + length_scale: 1.5 +log_path: environment.log diff --git a/examples/experimentals/voice_engine/recipe/rpi5.yaml b/examples/experimentals/voice_engine/recipe/rpi5.yaml index 4a063a1..6f8c473 100644 --- a/examples/experimentals/voice_engine/recipe/rpi5.yaml +++ b/examples/experimentals/voice_engine/recipe/rpi5.yaml @@ -5,10 +5,10 @@ llm: batch_size: 8192 flash_attn: true gpu: false - model: 'llama.cpp/local_models/llama38B-Model-8.0B-Q4_0_4_4.gguff' # find the model here - https://huggingface.co/AbhrantaNYUN/meta-llama3-8B-Q4_0_4_4/tree/main + model: 'llama38B-Model-8.0B-Q4_0_4_4.gguff' # find the model here - https://huggingface.co/AbhrantaNYUN/meta-llama3-8B-Q4_0_4_4/tree/main n_predict: 128 n_procs: 1 - n_threads: 4 + n_threads: 2 port: 8081 stream: true ubatch_size: 512 @@ -17,8 +17,12 @@ stt: _warmup: 5 flash_attn: true gpu: false - model: 'whisper.cpp/models/ggml-tiny-q4_0.en.bin' + model: 'whisper.cpp/models/local_models/ggml-tiny-q4_0.en.bin' n_procs: 1 n_threads: 4 port: 8080 +tts: + voice: true + model: "en_US-lessac-medium" + length_scale: 1.5 log_path: environment.log \ No newline at end of file diff --git a/examples/experimentals/voice_engine/run_on_android.md b/examples/experimentals/voice_engine/run_on_android.md new file mode 100644 index 0000000..11190ad --- /dev/null +++ b/examples/experimentals/voice_engine/run_on_android.md @@ -0,0 +1,67 @@ +## Steps to setup on android device : + +
+1. Install the Termux apk: link
+
+**NOTE:** Termux might not work on the latest versions of Android, so it is advisable to use Android 9 (tested).
+
+2. In Termux, run: `apt update && apt upgrade`
+3. Install Python in Termux:
+   1. `pkg install tur-repo`
+   2. `pkg install python3.11`
+4. Install espeak: `pkg install espeak` (a quick sanity-check snippet follows this list)
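As a quick, optional sanity check (not part of the original steps), you can confirm that espeak accepts text on stdin the same way `create_tts_wav` in `main_android.py` drives it. This is a minimal sketch and assumes `espeak` is on PATH after the install above:

```python
import subprocess

# Minimal sketch: pipe one sentence to espeak, mirroring how create_tts_wav
# writes streamed text to the TTS process. Assumes `espeak` is installed.
proc = subprocess.Popen(
    ["espeak"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
proc.stdin.write("Text to speech is working.\n")
proc.stdin.flush()
proc.stdin.close()
proc.wait(timeout=30)
```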
+
+## Setup llama.cpp:
+Set up llama.cpp on Android using the following commands:
+
+1. `apt install git cmake`
+2. `git clone https://github.com/ggerganov/llama.cpp.git`
+3. `cd llama.cpp`
+4. `make GGML_NO_LLAMAFILE=1` (a minimal request sketch against the resulting llama-server follows this list)
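The `llama-server` binary built here is what `main_android.py` later queries over HTTP. A minimal sketch of that streaming request, mirroring `call_llm_environment` and `decode_stream` from this diff (the port, prompt, and `n_predict` value below are assumptions, not fixed requirements):

```python
import json
import requests

# Sketch of a streaming /completion call, mirroring call_llm_environment and
# decode_stream in main_android.py. Assumes llama-server is running on port 8081.
url = "http://127.0.0.1:8081/completion"
data = {"prompt": "Hello, who are you?", "n_predict": 128, "stream": True}

response = requests.post(url, json=data, stream=True)
response.raise_for_status()

for line in response.iter_lines():
    if not line:
        continue
    # Each streamed line looks like: data: {"content": "...", ...}
    chunk = json.loads(line.decode("utf-8").replace("data: ", ""))
    print(chunk["content"], end="", flush=True)
```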
+
+## Setup whisper.cpp:
+Set up whisper.cpp on Android using the following commands:
+
+1. `git clone https://github.com/ggerganov/whisper.cpp.git`
+2. `cd whisper.cpp`
+3. `make`
+
+## Getting the Llama model:
+
+1. Create a folder to store the Llama model: `mkdir llama-model`
+2. Download the Llama3.2-3B model from: here
+3. Move the model into the `llama-model` folder
+
+## Getting the whisper model:
+
+1. Create a folder to store the whisper model: `mkdir whisper-model`
+2. Download `ggml-tiny-fp16.bin`
+3. Move the model into `whisper-model`
+4. Quantize the model to 4 bit (if necessary) using the following command: `whisper.cpp/quantize whisper-model/ggml-tiny-fp16.bin whisper-model/ggml-tiny-q4_0.bin q4_0`
+5. Delete the fp16 model (if the Q4 model is being used) to save space (a sketch of loading the quantized model through the whisper.cpp server follows this list)
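For reference, a hedged sketch of how this quantized model is loaded and queried through the whisper.cpp server, mirroring `initialize_stt_environment` and `call_stt_environment` from this diff (the port and the model/audio paths below are assumptions):

```python
import requests

# Sketch, mirroring initialize_stt_environment / call_stt_environment in
# main_android.py. Assumes whisper.cpp's server is already running on port 8080
# and that the file paths below exist on the device.
base = "http://127.0.0.1:8080"

# Load the quantized model into the running server.
load = requests.post(f"{base}/load", json={"model": "whisper-model/ggml-tiny-q4_0.bin"})
load.raise_for_status()

# Transcribe a WAV file.
with open("sample.wav", "rb") as audio:
    result = requests.post(f"{base}/inference", files={"file": audio})
result.raise_for_status()
print(result.json()["text"])
```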
+
+## Setup the nyuntam code base:
+The code lives in nyuntam, so we need to get that.
+
+1. `git clone https://github.com/nyunAI/nyuntam.git`
+2. `cd nyuntam`
+3. The code is currently in the "tts" branch: `git checkout origin/tts`
+
+## Running the code:
+
+1. Move into the appropriate folder: `nyuntam/examples/experimentals/voice_engine`
+2. Put the correct executable (llama-server, whisper-server) paths for your system in the yaml file at `recipe/recipe_android.yaml`. **NOTE:** These servers are built inside the llama.cpp and whisper.cpp directories respectively.
+3. Put the correct model file paths in the recipe yaml file.
+4. Run `main_android.py` with the following command: `python3.11 main_android.py --config recipe/recipe_android.yaml` (a programmatic sketch of the same flow follows this list)
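For reference, a minimal sketch of driving the same flow from Python instead of the interactive prompt, using the classes defined in `main_android.py` in this diff (the audio path `question.wav` is a placeholder, and the streamed `llm_response` is only populated on the streaming path):

```python
# Sketch mirroring the __main__ block of main_android.py.
from main_android import Engine, EngineEnvironmentConfig, STTInput

config = EngineEnvironmentConfig.from_yaml("recipe/recipe_android.yaml")
engine = Engine(config)  # spawns the whisper.cpp and llama.cpp servers

try:
    stt_input = STTInput(environment_config=config.stt, audio_path="question.wav")
    response = engine.call(stt_input, config.tts)
    print("latency:", response.latency)
    print("ttfs:", response.llm_response.ttfs)
finally:
    engine.terminate()  # shut the servers down
```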
+
+**NOTE:** If you are running this for the first time, there may be a few packages missing. These can be installed using `pip3.11 install [package-name]`.

diff --git a/examples/experimentals/voice_engine/tts-blog.md b/examples/experimentals/voice_engine/tts-blog.md
new file mode 100644
index 0000000..a6b4735
--- /dev/null
+++ b/examples/experimentals/voice_engine/tts-blog.md
@@ -0,0 +1,11 @@
+# The speed evals for the text-to-speech engine on Raspberry Pi 5
+
+![](../../assets/ttfs_2_vs_4_threads_llm_with_tts.png)
+
+The above diagram shows Time to First Text Stream (TTFS) coming from the LLM.
+
+![](../../assets/ttfa_2_vs_4_threads_llm_with_stt.png)
+
+The above diagram shows Time to First Audio stream (TTFA) coming from the TTS engine.
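For context on how the TTFS numbers above are produced, here is a simplified sketch of the measurement done in `Engine.call` in this diff: a timestamp is taken before the pipeline starts, and TTFS is the delay until the first streamed line arrives from llama-server (the port and prompt below are assumptions):

```python
import time
import requests

# Simplified TTFS measurement, following Engine.call in main_android.py.
# Assumes llama-server is running on port 8081.
tick = time.time()
response = requests.post(
    "http://127.0.0.1:8081/completion",
    json={"prompt": "Hello", "n_predict": 128, "stream": True},
    stream=True,
)

ttfs = None
for line in response.iter_lines():
    if line and ttfs is None:
        ttfs = time.time() - tick  # time to first text stream
        break
print(f"TTFS: {ttfs:.3f} s")
```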