# Indra-Swarm API

## Gemma-4-26B-A4B-it

[Model info](https://huggingface.co/google/gemma-4-26B-A4B-it)

**Current SGLang config (Docker image)**


```
python3 -m sglang.launch_server \
  --model-path google/gemma-4-26b-a4b-it \
  --tp 2 \
  --port 3000 \
  --host 0.0.0.0 \
  --attention-backend triton \
  --mem-fraction-static 0.8 \
  --max-running-requests 128 \
  --chunked-prefill-size 4096 \
  --context-length 32768 \
  --trust-remote-code \
  --enable-piecewise-cuda-graph \
  --schedule-policy lpm
```
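The server can take a while to load the model after launch, so a readiness check before running the test below may be useful. The following is a minimal sketch, assuming the server answers `GET /v1/models` once it is ready; the function name `wait_for_server`, the attempt count, and the delay are hypothetical choices, not part of the deployed setup.

```shell
# Poll an OpenAI-compatible server until it responds or the attempts run out.
# Assumption: GET <base_url>/v1/models succeeds once the model is loaded.
wait_for_server() {
  local base_url="$1"        # e.g. http://192.168.40.40:3000
  local attempts="${2:-30}"  # number of probes before giving up (illustrative)
  local delay="${3:-2}"      # seconds between probes (illustrative)
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, --max-time: bound each probe
    if curl -fs --max-time 5 "$base_url/v1/models" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_server http://192.168.40.40:3000 && echo "ready"` would print `ready` only after a probe succeeds.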

**Test Curl**

```
curl http://192.168.40.40:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-4-26b-a4b-it",
    "messages": [{"role": "user", "content": "System check. Are you online?"}]
  }'
```
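To send other prompts without hand-editing the JSON, the request above can be wrapped in a small helper. This is a sketch under assumptions: the helper names (`json_escape`, `chat_body`, `chat`), the `BASE_URL` variable, and the default `max_tokens` of 128 are illustrative, and the escaping only covers backslashes and double quotes.

```shell
# Escape backslashes and double quotes so text can sit inside a JSON string.
# Newlines and other control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print a chat-completions request body: chat_body MODEL PROMPT [MAX_TOKENS]
chat_body() {
  local model="$1" prompt
  prompt="$(json_escape "$2")"
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": %d}' \
    "$model" "$prompt" "${3:-128}"
}

# Send one user prompt to the endpoint used in the test above.
# BASE_URL is a hypothetical override; the default matches this document.
chat() {
  curl -s "${BASE_URL:-http://192.168.40.40:3000}/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(chat_body "google/gemma-4-26b-a4b-it" "$1" "${2:-128}")"
}
```

With these defined, `chat 'Say "hello" in one word.' 16` would post the prompt with a 16-token cap and print the raw JSON response.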

## faster-Qwen3-tts

[Model info](https://github.com/andimarafioti/faster-qwen3-tts)

**Test Curl**

```
curl -X POST http://192.168.40.40:8002/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "This is a text-to-speech system check. Audio synthesis is functional on Indra.",
    "voice": "nona",
    "response_format": "wav",
    "seed": 42
  }' \
  --output tts_test.wav
```

To change voices, set `"voice"` to any of the following:
- aus-female-1
- aus-female-2
- aus-female-3
- aus-female-4
- aus-female-5
- aus-female-6
- aus-male-1
- aus-male-2
- aus-male-3
- aus-male-4
- aus-male-5
- aus-male-6
- aus-male-7
- charter
- gaius
- _gantry
- nona
- oni
- vulcan
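To compare the voices listed above, one sample per voice can be generated in a loop. The sketch below reuses the request fields from the test curl; the names `tts_body` and `synthesize_all`, the `TTS_URL` variable, the shortened sample sentence, and the output directory are assumptions for illustration.

```shell
# Voice names copied from the list above.
VOICES="aus-female-1 aus-female-2 aus-female-3 aus-female-4 aus-female-5 aus-female-6
aus-male-1 aus-male-2 aus-male-3 aus-male-4 aus-male-5 aus-male-6 aus-male-7
charter gaius _gantry nona oni vulcan"

# Fixed sample sentence (contains no characters that need JSON escaping).
SAMPLE_TEXT="This is a text-to-speech system check."

# Print the JSON request body for one voice: tts_body VOICE
tts_body() {
  printf '{"model": "tts-1", "input": "%s", "voice": "%s", "response_format": "wav", "seed": 42}' \
    "$SAMPLE_TEXT" "$1"
}

# Write one WAV file per voice: synthesize_all [OUT_DIR]
# TTS_URL is a hypothetical override; the default matches this document.
synthesize_all() {
  local out_dir="${1:-voice-samples}" voice
  mkdir -p "$out_dir" || return 1
  # Unquoted $VOICES relies on word splitting (sh/bash behaviour).
  for voice in $VOICES; do
    curl -fs -X POST "${TTS_URL:-http://192.168.40.40:8002}/v1/audio/speech" \
      -H "Content-Type: application/json" \
      -d "$(tts_body "$voice")" \
      --output "$out_dir/$voice.wav" || echo "failed: $voice" >&2
  done
}
```

Running `synthesize_all samples` would then leave files such as `samples/nona.wav` and `samples/aus-male-1.wav`, and report any voice whose request failed on stderr.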

 
 
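## faster-whisper-large-v3-turbo-ct2

[Model info](https://huggingface.co/deepdml/faster-whisper-large-v3-turbo-ct2)

**Test Curl**

This test uses a known audio file saved locally on the Indra machine. Because curl reads the `@` path on the machine where it runs, run the command on Indra or substitute a local file.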

```
curl http://192.168.40.40:8005/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/mnt/nvme3n1/swarm/voice-samples/aus-male-1.wav" \
  -F "model=deepdml/faster-whisper-large-v3-turbo-ct2" \
  -F "response_format=json"
```
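To transcribe an arbitrary local file rather than the fixed sample, the same request can be wrapped in a function that checks the path first. This is a sketch, assuming the endpoint and form fields shown above; the name `transcribe`, the `STT_URL` variable, and the optional second argument are hypothetical, and `response_format` values other than `json` are not confirmed here.

```shell
# Transcribe a local audio file: transcribe PATH [RESPONSE_FORMAT]
# STT_URL is a hypothetical override; the default matches this document.
transcribe() {
  local audio="$1"
  # Fail early, before any network call, if the file is missing.
  if [ ! -f "$audio" ]; then
    echo "transcribe: no such file: $audio" >&2
    return 2
  fi
  # -F builds the multipart/form-data body and sets its Content-Type header.
  # Paths containing commas or semicolons need extra quoting for curl's -F.
  curl -s "${STT_URL:-http://192.168.40.40:8005}/v1/audio/transcriptions" \
    -F "file=@$audio" \
    -F "model=deepdml/faster-whisper-large-v3-turbo-ct2" \
    -F "response_format=${2:-json}"
}
```

For instance, `transcribe tts_test.wav` would upload the file produced by the text-to-speech test, which gives a quick round-trip check of both services.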