deepgram-java-text-to-speech

Name: Deepgram Java Text To Speech
Author: deepgram

// Use when writing or reviewing Java code in this repo that calls Deepgram Text-to-Speech v1 (`/v1/speak`) for audio synthesis. Covers one-shot REST via `client.speak().v1().audio().generate(...)` and streaming synthesis via `client.speak().v1().v1WebSocket()`. Use `deepgram-java-voice-agent` for full-duplex assistants instead of one-way synthesis. Triggers include "tts", "text to speech", "speak", "aura", "streaming tts", and "speak websocket".

updated:April 24, 2026 at 15:45

SKILL.md

readonly

related-skills.json

same repository

deepgram-java-voice-agent.md

from "deepgram/deepgram-java-sdk"

Use when writing or reviewing Java code in this repo that builds an interactive voice agent over `agent.deepgram.com/v1/agent/converse`. Covers `client.agent().v1().v1WebSocket()`, `AgentV1Settings`, `sendSettings`, `sendMedia`, event handlers, provider configuration, and message injection. Use `deepgram-java-text-to-speech` for one-way synthesis or the STT skills for transcription-only flows. Triggers include "voice agent", "agent converse", "full duplex", "barge in", "function call", and "agent websocket".

2026-04-24

deepgram-java-management-api.md

from "deepgram/deepgram-java-sdk"

Use when writing or reviewing Java code in this repo that calls Deepgram Management APIs for projects, project models, API keys, members, invites, usage, and billing. Covers `client.manage().v1().*` plus related think-model discovery under `client.agent().v1().settings().think().models()`. Use `deepgram-java-voice-agent` for live agent conversations instead of admin APIs. Triggers include "management api", "list projects", "api keys", "members", "invites", "usage", "billing", and "models".

2026-04-24

deepgram-java-speech-to-text.md

from "deepgram/deepgram-java-sdk"

Use when writing or reviewing Java code in this repo that calls Deepgram Speech-to-Text v1 (`/v1/listen`) for prerecorded or live transcription. Covers `client.listen().v1().media().transcribeUrl` / `transcribeFile` (REST) and `client.listen().v1().v1WebSocket()` (WebSocket). Use `deepgram-java-audio-intelligence` for analytics overlays, `deepgram-java-conversational-stt` for Flux `/v2/listen`, and `deepgram-java-voice-agent` for full-duplex assistants. Triggers include "transcribe", "speech to text", "STT", "listen v1", "nova-3", "live transcription", and "websocket transcription".

2026-04-24

deepgram-java-audio-intelligence.md

from "deepgram/deepgram-java-sdk"

Use when writing or reviewing Java code in this repo that enables Deepgram intelligence overlays on `/v1/listen` audio transcription - diarization, entity detection, sentiment, summarize, topics, intents, language detection, and redaction. Same endpoint as plain STT, but with extra request fields on `ListenV1RequestUrl` or `MediaTranscribeRequestOctetStream`. Use `deepgram-java-speech-to-text` for plain transcripts and `deepgram-java-text-intelligence` for analysis on existing text. Triggers include "audio intelligence", "diarize", "summarize audio", "sentiment from audio", "topic detection", and "redact".

2026-04-24

deepgram-java-conversational-stt.md

from "deepgram/deepgram-java-sdk"

Use when writing or reviewing Java code in this repo that calls Deepgram Conversational STT v2 / Flux over `/v2/listen`. Covers `client.listen().v2().v2WebSocket()`, `V2ConnectOptions`, `onTurnInfo`, and turn-aware close handling. Use `deepgram-java-speech-to-text` for standard v1 transcription and `deepgram-java-voice-agent` for fully interactive assistants. Triggers include "flux", "conversational stt", "listen v2", "turn detection", "end of turn", and "eot".

2026-04-24

deepgram-java-text-intelligence.md

from "deepgram/deepgram-java-sdk"

Use when writing or reviewing Java code in this repo that calls Deepgram Text Intelligence / Read (`/v1/read`) for text analysis. Covers `client.read().v1().text().analyze(...)` with `ReadV1Request` or `TextAnalyzeRequest`. Use `deepgram-java-audio-intelligence` when the source is audio instead of text. Triggers include "read api", "text intelligence", "analyze text", "sentiment", "topics", "intents", and "summarize text".

2026-04-24

package.json

"author": "deepgram"

"repository": "deepgram/deepgram-java-sdk"

Using Deepgram Text-to-Speech (Java SDK)

Convert text to audio with REST or stream audio back incrementally over WebSocket via /v1/speak.

When to use this product

REST (audio().generate) — one-shot synthesis when you already have the full text.
WebSocket (v1WebSocket()) — lower-latency synthesis while text arrives in chunks.

Use a different skill when:

You need the system to listen, think, and speak in one session → deepgram-java-voice-agent.

Authentication

import com.deepgram.DeepgramClient;

DeepgramClient client = DeepgramClient.builder()
        .apiKey(System.getenv("DEEPGRAM_API_KEY"))
        .build();

API key auth uses Authorization: Token <apiKey>.
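
For readers who want to see what that header looks like on the wire, here is a minimal sketch that builds (but never sends) a request with the JDK's `java.net.http` API instead of the SDK. The host `https://api.deepgram.com`, the JSON body shape, and the class name `TokenAuthSketch` are assumptions made for illustration; only the `Authorization: Token <apiKey>` scheme and the `/v1/speak` path come from this skill.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class TokenAuthSketch {
    // Builds (but does not send) a request carrying the Token auth header.
    // The host and the JSON body are illustrative assumptions, not taken from the SDK.
    public static HttpRequest build(String apiKey) {
        return HttpRequest.newBuilder(URI.create("https://api.deepgram.com/v1/speak"))
                .header("Authorization", "Token " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"text\":\"Hello\"}"))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("example-key");
        // Print the header value exactly as it would be sent.
        System.out.println(request.headers().firstValue("Authorization").orElse("(missing)"));
    }
}
```

In normal use the SDK client sets this header for you; the sketch is only useful when debugging with a proxy or comparing against a raw HTTP call.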

Quick start — REST

import com.deepgram.resources.speak.v1.audio.requests.SpeakV1Request;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

SpeakV1Request request = SpeakV1Request.builder()
        .text("Hello! This is a text-to-speech example using the Deepgram Java SDK.")
        .build();

// try-with-resources closes the stream even if the copy fails.
try (InputStream audioStream = client.speak().v1().audio().generate(request)) {
    Files.copy(audioStream, Path.of("output.mp3"), StandardCopyOption.REPLACE_EXISTING);
}

REST returns an InputStream, not JSON.
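
If you want the audio in memory rather than on disk (for playback or to hand to another library), the stream can be drained into a `byte[]`. The sketch below uses only the JDK; the `ByteArrayInputStream` is a stand-in for the stream returned by `generate(request)`, and `ReadAudioSketch` is a hypothetical helper name, not part of the SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAudioSketch {
    // Drains the stream into memory and closes it, even if reading fails.
    public static byte[] readAll(InputStream audio) throws IOException {
        try (InputStream in = audio) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by generate(request).
        InputStream fake = new ByteArrayInputStream(new byte[] {0x49, 0x44, 0x33});
        byte[] audio = readAll(fake);
        System.out.println(audio.length + " audio bytes in memory");
    }
}
```

Buffering the whole response is fine for short utterances; for long text, copying straight to a file or player avoids holding everything in memory.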

Quick start — WebSocket

import com.deepgram.resources.speak.v1.types.SpeakV1Close;
import com.deepgram.resources.speak.v1.types.SpeakV1CloseType;
import com.deepgram.resources.speak.v1.types.SpeakV1Flush;
import com.deepgram.resources.speak.v1.types.SpeakV1FlushType;
import com.deepgram.resources.speak.v1.types.SpeakV1Text;
import com.deepgram.resources.speak.v1.websocket.V1WebSocketClient;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;

Logger logger = Logger.getLogger("StreamingTts");
V1WebSocketClient wsClient = client.speak().v1().v1WebSocket();
OutputStream audioOutput = new FileOutputStream("output_streaming.wav");

// Log write failures rather than throwing from the WebSocket callback thread
// (matches examples/speak/StreamingTts.java).
wsClient.onSpeakV1Audio(audioData -> {
    try {
        audioOutput.write(audioData.toByteArray());
    } catch (IOException e) {
        logger.log(Level.SEVERE, "Failed to write streaming audio to output file.", e);
    }
});

// Close the output stream when the server disconnects so we don't leak the file handle.
wsClient.onDisconnected(message -> {
    try {
        audioOutput.close();
    } catch (IOException e) {
        logger.log(Level.WARNING, "Failed to close streaming audio output file.", e);
    }
});

wsClient.connect().get(10, TimeUnit.SECONDS);
wsClient.sendText(SpeakV1Text.builder().text("Hello, this is streaming text to speech.").build());
wsClient.sendFlush(SpeakV1Flush.builder().type(SpeakV1FlushType.FLUSH).build());
wsClient.sendClose(SpeakV1Close.builder().type(SpeakV1CloseType.CLOSE).build());
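
The quick start sends Flush and Close back to back. If you would rather hold Close until the server has acknowledged the flush, one option is a small latch that the flushed handler releases. This is a sketch under assumptions: `FlushGateSketch` is a hypothetical helper, and wiring it up as `wsClient.onFlushed(msg -> gate.markFlushed())` assumes that handler accepts a one-argument lambda like the other handlers shown above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class FlushGateSketch {
    private final CountDownLatch flushed = new CountDownLatch(1);

    // Call this from the flushed handler (assumed wiring, see the lead-in).
    public void markFlushed() {
        flushed.countDown();
    }

    // Returns true if the flush acknowledgement arrived within the timeout.
    public boolean awaitFlushed(long timeout, TimeUnit unit) throws InterruptedException {
        return flushed.await(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        FlushGateSketch gate = new FlushGateSketch();
        // Simulates the WebSocket callback thread delivering the acknowledgement.
        new Thread(gate::markFlushed).start();
        boolean ok = gate.awaitFlushed(5, TimeUnit.SECONDS);
        System.out.println(ok ? "flushed, safe to send Close" : "timed out waiting for flush");
    }
}
```

With this in place, the main thread would call `awaitFlushed(...)` between `sendFlush(...)` and `sendClose(...)`, and fall back to closing anyway if the wait times out.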

Async equivalent

import com.deepgram.AsyncDeepgramClient;
import java.io.InputStream;
import java.util.concurrent.CompletableFuture;

AsyncDeepgramClient asyncClient = AsyncDeepgramClient.builder()
        .apiKey(System.getenv("DEEPGRAM_API_KEY"))
        .build();

// `request` is the SpeakV1Request built in the REST quick start.
CompletableFuture<InputStream> future = asyncClient.speak().v1().audio().generate(request);
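
The future resolves to an open stream, so whoever consumes it still has to close it. One way to keep that in a single place is to drain and close inside the continuation. The sketch below uses only the JDK, with a completed future standing in for the SDK call; `AsyncCloseSketch` and `drain` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;

public class AsyncCloseSketch {
    // Reads the resolved stream fully and closes it inside the continuation.
    public static CompletableFuture<byte[]> drain(CompletableFuture<InputStream> future) {
        return future.thenApply(stream -> {
            try (InputStream in = stream) {
                return in.readAllBytes();
            } catch (IOException e) {
                // Lambdas passed to thenApply cannot throw checked exceptions.
                throw new UncheckedIOException(e);
            }
        });
    }

    public static void main(String[] args) {
        // Stand-in for asyncClient.speak().v1().audio().generate(request).
        CompletableFuture<InputStream> fake =
                CompletableFuture.completedFuture(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(drain(fake).join().length + " bytes");
    }
}
```

A read failure surfaces as a `CompletionException` wrapping the `UncheckedIOException` when the caller joins the returned future.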

Key parameters / API surface

REST request builder: SpeakV1Request.builder().text(...)
Verified REST params: model, encoding, sampleRate, bitRate, container, callback, callbackMethod, mipOptOut, tag
REST methods: audio().generate(request) and audio().withRawResponse().generate(request)
WSS connect options: model, encoding, sampleRate, mipOptOut
WSS send methods: sendText(...), sendFlush(...), sendClear(...), sendClose(...)
WSS handlers: onSpeakV1Audio, onMetadata, onFlushed, onCleared, onWarning, plus generic connection/error hooks
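
When comparing SDK calls against raw HTTP traffic, it can help to see how a few of the REST params above might appear as query parameters. The sketch below is an assumption-heavy illustration using only the JDK: the host, the snake_case wire names (`sample_rate`), and the example values (`aura-2-thalia-en`, `linear16`, `24000`) are not taken from this repo, so check them against the OpenAPI spec before relying on them.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class SpeakQuerySketch {
    // Assumed base URL; the SDK normally chooses this for you.
    private static final String BASE = "https://api.deepgram.com/v1/speak";

    // Joins the params into a URL-encoded query string, preserving map order.
    public static URI speakUri(Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return URI.create(query.isEmpty() ? BASE : BASE + "?" + query);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("model", "aura-2-thalia-en"); // illustrative model id
        params.put("encoding", "linear16");      // illustrative encoding
        params.put("sample_rate", "24000");      // assumed wire name for sampleRate
        System.out.println(speakUri(params));
    }
}
```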

API reference (layered)

In-repo source of truth: src/main/java/com/deepgram/resources/speak/v1/ and examples/speak/. reference.md is not present in this checkout.
Canonical OpenAPI (REST): https://developers.deepgram.com/openapi.yaml
Canonical AsyncAPI (WSS): https://developers.deepgram.com/asyncapi.yaml
Context7: /llmstxt/developers_deepgram_llms_txt
Product docs:

Gotchas

REST returns audio bytes as InputStream. Save or consume it; do not try to deserialize JSON.
Flush before close on WebSocket. The example sends Flush before Close so the tail of the audio is not lost.
Streaming audio arrives as binary ByteString. Convert to bytes before writing or playback.
WebSocket options are narrower than REST. container and bitRate are REST request fields, not WebSocket connect options in this checkout.
TTS defaults are minimal unless you set them. The example only sets text; pick an explicit model/encoding when output format matters.
There is no Java TextBuilder helper in this repo. TextBuilder is a Python helper and has no counterpart here.
Async REST is CompletableFuture<InputStream>. You still need to close the stream after the future resolves.
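
Several of these gotchas meet in the streaming callbacks: chunks arrive on a WebSocket thread, need to be plain bytes before writing, and the output must be closed exactly once. A possible way to keep that tidy is a small sink that takes `byte[]` chunks (for example the result of `audioData.toByteArray()` from the quick start), ignores writes after close, and tolerates repeated close calls. `AudioSinkSketch` is a hypothetical helper written against the JDK only, not something shipped in this repo.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class AudioSinkSketch {
    private final OutputStream out;
    private boolean closed;
    private long bytesWritten;

    public AudioSinkSketch(OutputStream out) {
        this.out = out;
    }

    // Appends one chunk; chunks that arrive after close() are dropped.
    public synchronized void write(byte[] chunk) throws IOException {
        if (closed) {
            return;
        }
        out.write(chunk);
        bytesWritten += chunk.length;
    }

    // Safe to call more than once, e.g. from both disconnect and error hooks.
    public synchronized void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        out.close();
    }

    public synchronized long bytesWritten() {
        return bytesWritten;
    }

    public static void main(String[] args) throws IOException {
        AudioSinkSketch sink = new AudioSinkSketch(new ByteArrayOutputStream());
        sink.write(new byte[] {1, 2, 3});
        sink.close();
        sink.write(new byte[] {4}); // dropped: the sink is already closed
        System.out.println(sink.bytesWritten() + " bytes kept");
    }
}
```

The synchronized methods are a conservative choice in case audio and disconnect callbacks run on different threads; if the SDK guarantees a single callback thread, they are harmless but unnecessary.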

Example files in this repo

examples/speak/TextToSpeech.java
examples/speak/StreamingTts.java
examples/agent/ProviderCombinations.java — shows Aura model selection inside Agent configs

Central product skills

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

npx skills add deepgram/skills

This SDK ships language-idiomatic code skills; deepgram/skills ships cross-language product knowledge (see api, docs, recipes, examples, starters, setup-mcp).
