Commit 6ac36d

2026-04-15 07:50:35 Anonymous: update
/dev/null .. indra-swarm/api handles.md
@@ 0,0 1,33 @@
+ # API Handles
+
+ ## Gemma-4-26B-A4B-it
+ This is primarily used for VIP (for both RP and non-RP calls).
+ [Model Specs](https://huggingface.co/google/gemma-4-26B-A4B-it)
+
+ **Docker image's Current Sglang Config**
+ ``
+ python3 -m sglang.launch_server
+ --model-path google/gemma-4-26b-a4b-it
+ --tp 2
+ --port 3000
+ --host 0.0.0.0
+ --attention-backend triton
+ --mem-fraction-static 0.8
+ --max-running-requests 128
+ --chunked-prefill-size 4096
+ --context-length 32768
+ --trust-remote-code
+ --enable-piecewise-cuda-graph
+ --schedule-policy lpm
+ ``
+ **Test Curl**
+ ```
+ curl -X \
+ POST http://192.168.40.40:8002/v1/audio/speech -H \
+ "Content-Type: application/json" -d '{
+ "text": "This is a text-to-speech system check. Audio synthesis is functional on Indra.",
+ "voice_ref": "nona.wav",
+ "seed": 42
+ }' --output \
+ tts_test.wav
+ ```
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9