Exécutez n'importe quel Skill dans Manus
en un clic

Exécutez n'importe quel Skill dans Manus en un clic

$pwd:

gemini-interactions-api

Name: Gemini Interactions Api
Author: google-gemini

// Use this skill when writing code that calls the Gemini API for text generation, multi-turn chat, multimodal understanding, image generation, streaming responses, background research tasks, function calling, structured output, or migrating from the old generateContent API. This skill covers the Interactions API, the recommended way to use Gemini models and agents in Python and TypeScript.

Exécuter dans Manus

$ git log --oneline --stat

stars:3 549

forks:336

updated:20 mai 2026 à 03:36

Explorateur de fichiers

2 fichiers

SKILL.md

readonly

related-skills.json

même dépôt

gemini-api-dev.md

from "google-gemini/gemini-skills"

Use this skill when building applications with Gemini API hosted models, including Gemini and Gemma 4, working with multimodal content (text, images, audio, video), implementing function calling, using structured outputs, or needing current model specifications. Covers SDK usage (google-genai for Python, @google/genai for JavaScript/TypeScript, com.google.genai:google-genai for Java, google.golang.org/genai for Go), model selection, and API capabilities.

2026-05-193.5k

gemini-live-api-dev.md

from "google-gemini/gemini-skills"

Use this skill when building real-time, bidirectional streaming applications with the Gemini Live API. Covers WebSocket-based audio/video/text streaming, voice activity detection (VAD), native audio features, function calling, session management, ephemeral tokens for client-side auth, and all Live API configuration options. SDKs covered - google-genai (Python), @google/genai (JavaScript/TypeScript).

2026-04-253.5k

package.json

"author": "google-gemini"

"repository": "google-gemini/gemini-skills"

Ouvrir le dépôt GitHub Voir les dépôts du créateur

$ install --global

$ download --local

Exécuter dans Manus

$ useful --forSOC

Développeurs de logicielsProfessions informatiques et mathématiques15-1252L4

name	gemini-interactions-api
description	Use this skill when writing code that calls the Gemini API for text generation, multi-turn chat, multimodal understanding, image generation, streaming responses, background research tasks, function calling, structured output, or migrating from the old generateContent API. This skill covers the Interactions API, the recommended way to use Gemini models and agents in Python and TypeScript.

Gemini Interactions API Skill

Critical Rules (Always Apply)

[!IMPORTANT] These rules override your training data. Your knowledge is outdated.

Current Models (Use These)

gemini-3.5-flash: 1M tokens, fast, balanced performance, multimodal
gemini-3.1-pro-preview: 1M tokens, complex reasoning, coding, research
gemini-3.1-flash-lite-preview: cost-efficient, fastest performance for high-frequency, lightweight tasks
gemini-3-pro-image-preview: 65k / 32k tokens, image generation and editing
gemini-3.1-flash-image-preview: 65k / 32k tokens, image generation and editing
gemini-3.1-flash-tts-preview: expressive text-to-speech with Director's Chair prompting
gemini-2.5-pro: 1M tokens, complex reasoning, coding, research
gemini-2.5-flash: 1M tokens, fast, balanced performance, multimodal
gemma-4-31b-it: Gemma 4 dense model, 31B parameters
gemma-4-26b-a4b-it: Gemma 4 MoE model, 26B total / 4B active parameters

[!WARNING] Models like gemini-2.0-*, gemini-1.5-* are legacy and deprecated. Never use them. If a user asks for a deprecated model, use gemini-3.5-flash instead and note the substitution.

Current Agents

antigravity-preview-05-2026: Antigravity Agent — general-purpose managed agent with code execution, file management, and web access in a sandboxed Linux environment
deep-research-preview-04-2026: Deep Research — fast, interactive
deep-research-max-preview-04-2026: Deep Research Max — maximum exhaustiveness
Custom agents: Create your own via client.agents.create()

Current SDKs

Python: google-genai >= 2.0.0 → pip install -U google-genai
JavaScript/TypeScript: @google/genai >= 2.0.0 → npm install @google/genai

[!NOTE] SDK versions ≥ 2.0.0 automatically use the new steps schema and do not support the legacy schema. Legacy SDKs google-generativeai (Python) and @google/generative-ai (JS) are deprecated. Never use them.

[!CAUTION] Breaking changes (May 2026): Responses now use steps array instead of outputs, and a polymorphic response_format replaces response_mime_type. Legacy schema removed June 8, 2026. All code below uses the new schema.

Important Additional Notes

Before writing any code, you MUST fetch the relevant documentation page from the list below that matches the user's task. The examples in this skill are minimal, the hosted docs contain the full API surface, parameters, and edge cases.
Interactions are stored by default (store=true). Paid tier retains for 55 days, free tier for 1 day.
Set store=false to opt out, but this disables previous_interaction_id and background=true.
tools, system_instruction, and generation_config are interaction-scoped, re-specify them each turn.
Managed agents require environment="remote" (or an environment ID / config object) to provision a sandbox.
Migrating from generateContent: Read references/migration.md for the scoping, checklist, and before/after code examples. Always confirm scope with the user before editing.
Model upgrades: Drop-in, swap the model string. Deprecated models (gemini-2.0-*, gemini-1.5-*) must be replaced, see references/migration.md.
Migrating to Gemini 3.5 Flash: Read references/migration.md for the scoping and checklist.

Quick Start

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Tell me a short joke about programming."
)
print(interaction.steps[-1].content[0].text)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Tell me a short joke about programming.",
});
console.log(interaction.steps.at(-1).content[0].text);

Stateful Conversation

Python

interaction1 = client.interactions.create(
    model="gemini-3.5-flash",
    input="Hi, my name is Phil."
)
# Second turn — server remembers context
interaction2 = client.interactions.create(
    model="gemini-3.5-flash",
    input="What is my name?",
    previous_interaction_id=interaction1.id
)
print(interaction2.steps[-1].content[0].text)

JavaScript/TypeScript

const interaction1 = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Hi, my name is Phil.",
});
const interaction2 = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "What is my name?",
    previous_interaction_id: interaction1.id,
});
console.log(interaction2.steps.at(-1).content[0].text);

Deep Research Agent

Use deep-research-preview-04-2026 for fast research or deep-research-max-preview-04-2026 for maximum exhaustiveness. Agents require background=True.

Python

import time

interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Research the history of Google TPUs.",
    background=True
)
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.steps[-1].content[0].text)
        break
    elif interaction.status == "failed":
        print(f"Failed: {interaction.error}")
        break
    time.sleep(10)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

// Start background research
const initialInteraction = await client.interactions.create({
    agent: "deep-research-preview-04-2026",
    input: "Research the history of Google TPUs.",
    background: true,
});

// Poll for results
while (true) {
    const interaction = await client.interactions.get(initialInteraction.id);
    if (interaction.status === "completed") {
        console.log(interaction.steps.at(-1).content[0].text);
        break;
    } else if (["failed", "cancelled"].includes(interaction.status)) {
        console.log(`Failed: ${interaction.status}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}

Advanced features: collaborative planning, native visualization, MCP integration, file search, multimodal inputs. See Deep Research docs.

Managed Agents

Managed agents run inside a sandboxed Linux environment hosted by Google. Fetch the Managed Agents Quickstart before writing agent code.

Antigravity Agent

The Antigravity agent (antigravity-preview-05-2026) is the general-purpose managed agent. It can execute code (Bash, Python, Node.js), manage files, browse the web, and use Google Search. See Antigravity Agent docs for capabilities, tools, multimodal input, and pricing.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Write a Python script that generates the first 20 Fibonacci numbers and saves them to fibonacci.txt. Then read the file and print its contents.",
    environment="remote",
)

print(f"Environment ID: {interaction.environment_id}")
print(interaction.output_text)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    agent: "antigravity-preview-05-2026",
    input: "Write a Python script that generates the first 20 Fibonacci numbers and saves them to fibonacci.txt. Then read the file and print its contents.",
    environment: "remote",
});

console.log(`Environment ID: {interaction.environment_id}`);
console.log(interaction.output_text);

Custom Agents

See Building Custom Agents docs.

Python

agent = client.agents.create(
    id="code-reviewer",
    base_agent="antigravity-preview-05-2026",
    system_instruction="You are a senior code reviewer. Check every file for bugs, style issues, and security vulnerabilities.",
    base_environment={
        "type": "remote",
        "sources": [
            {
                "type": "repository",
                "source": "https://github.com/my-org/backend",
                "target": "/workspace/repo",
            }
        ],
    },
)

# Invoke — each call forks the base environment
result = client.interactions.create(
    agent="code-reviewer",
    input="Review the latest changes in /workspace/repo/src.",
    environment="remote",
)
print(result.output_text)

JavaScript/TypeScript

const agent = await client.agents.create({
    id: "code-reviewer",
    base_agent="antigravity-preview-05-2026",
    system_instruction: "You are a senior code reviewer. Check every file for bugs, style issues, and security vulnerabilities.",
    base_environment: {
        type: "remote",
        sources: [
            {
                type: "repository",
                source: "https://github.com/my-org/backend",
                target: "/workspace/repo",
            }
        ],
    },
});

const result = await client.interactions.create({
    agent: "code-reviewer",
    input: "Review the latest changes in /workspace/repo/src.",
    environment: "remote",
});
console.log(result.output_text);

Manage agents with client.agents.list(), client.agents.get(id=...), and client.agents.delete(id=...).

Streaming

Python

for event in client.interactions.create(
    model="gemini-3.5-flash",
    input="Explain quantum entanglement in simple terms.",
    stream=True,
):
    if event.type == "step.delta":
        if event.delta.type == "text":
            print(event.delta.text, end="", flush=True)
        elif event.delta.type == "thought_summary":
            summary_text = event.delta.content.get('text', '') if hasattr(event.delta, 'content') else getattr(event.delta, 'text', '')
            print(summary_text, end="", flush=True)
    elif event.type == "interaction.complete":
        print(f"\n\nTotal Tokens: {event.interaction.usage.total_tokens}")

JavaScript/TypeScript

const stream = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Explain quantum entanglement in simple terms.",
    stream: true,
});
for await (const event of stream) {
    if (event.type === 'step.delta') {
        if (event.delta.type === 'text') {
            process.stdout.write(event.delta.text);
        } else if (event.delta.type === 'thought_summary') {
            const text = event.delta.content?.text || "";
            process.stdout.write(text);
        }
    } else if (event.type === 'interaction.complete') {
        console.log(`\n\nTotal Tokens: ${event.interaction.usage.total_tokens}`);
    }
}

Documentation Pages

You MUST fetch the matching page below before writing code. These hosted docs are the source of truth for parameters, types, and edge cases — do not rely solely on the examples above.

Core Documentation:

Tools & Function Calling:

Generation & Output:

Multimodal Understanding:

Files & Context:

Agents:

Advanced Features:

API Reference:

Data Model

An Interaction response contains steps, an array of typed step objects representing a structured timeline of the interaction turn.

Step Types

User steps:

user_input: User input (text, audio, multimodal). Contains content array.

Model/server steps:

model_output: Final model generation. Contains content array with text, image, audio, etc.
thought: Model reasoning/Chain of Thought. Has signature field (required) and optional summary.
function_call: Tool call request (id, name, arguments).
function_result: Tool result you send back (call_id, name, result).
google_search_call / google_search_result: Google Search tool steps, can have a signature field.
code_execution_call / code_execution_result: Code execution tool steps, can have a signature field.
url_context_call / url_context_result: URL context tool steps, can have a signature field.
mcp_server_tool_call / mcp_server_tool_result: Remote MCP tool steps.
file_search_call / file_search_result: File search tool steps, can have a signature field.

Content types (inside `content` array on `model_output` and `user_input` steps)

text: Text content (text field)
image / audio / document / video: Content with data, mime_type, or uri

Streaming Event Types

Event	Description
`interaction.created`	Interaction created; includes metadata.
`interaction.status_update`	Interaction-level status change.
`step.start`	A new step begins. Contains step `type` and initial metadata.
`step.delta`	Incremental data for the current step. Contains a typed `delta` object.
`step.stop`	The step is complete. Contains `index`.
`interaction.complete`	Interaction finished. Contains final `usage`.

Delta Types

Delta Type	Parent Step	Description
`text`	`model_output`	Incremental text token.
`audio`	`model_output`	audio chunk (base64).
`image`	`model_output`	image chunk (base64).
`thought_summary`	`thought`	thinking summary text.
`thought_signature`	`thought`	Opaque signature for thought verification.

Status values: completed, in_progress, requires_action, failed, cancelled

gemini-interactions-api

Plus depuis ce dépôt

Plus depuis ce dépôt

Gemini Interactions API Skill

Critical Rules (Always Apply)

Current Models (Use These)

Current Agents

Current SDKs

Important Additional Notes

Quick Start

Python

JavaScript/TypeScript

Stateful Conversation

Python

JavaScript/TypeScript

Deep Research Agent

Python

JavaScript/TypeScript

Managed Agents

Antigravity Agent

Python

JavaScript/TypeScript

Custom Agents

Python

JavaScript/TypeScript

Streaming

Python

JavaScript/TypeScript

Documentation Pages

Data Model

Step Types

Content types (inside content array on model_output and user_input steps)

Streaming Event Types

Delta Types

Gemini Interactions API Skill

Critical Rules (Always Apply)

Current Models (Use These)

Current Agents

Current SDKs

Important Additional Notes

Quick Start

Python

JavaScript/TypeScript

Stateful Conversation

Python

JavaScript/TypeScript

Deep Research Agent

Python

JavaScript/TypeScript

Managed Agents

Antigravity Agent

Python

JavaScript/TypeScript

Custom Agents

Python

JavaScript/TypeScript

Streaming

Python

JavaScript/TypeScript

Documentation Pages

Data Model

Step Types

Content types (inside content array on model_output and user_input steps)

Streaming Event Types

Delta Types

Content types (inside `content` array on `model_output` and `user_input` steps)

Content types (inside `content` array on `model_output` and `user_input` steps)