mobile-ai-features

Integrate AI APIs (OpenAI, Anthropic, Google AI) into a React Native/Expo app. Covers vision, text generation, and audio transcription with secure API key management. Use when the user wants to add AI-powered features to their mobile app.

Updated: May 24, 2026 at 19:35

Source: TMHSDigital/Mobile-App-Developer-Tools

name	mobile-ai-features
description	Integrate AI APIs (OpenAI, Anthropic, Google AI) into a React Native/Expo app. Covers vision, text generation, and audio transcription with secure API key management. Use when the user wants to add AI-powered features to their mobile app.
standards-version	1.10.0

Mobile AI Features

Trigger

Use this skill when the user:

Wants to add AI-powered features to a mobile app
Needs to send images to a vision API (GPT-4o, Claude, Gemini)
Wants text generation or chatbot functionality
Needs audio transcription (Whisper)
Asks about securing API keys in a mobile app
Mentions "AI", "GPT", "Claude", "Gemini", "vision", "transcription", "chatbot", or "LLM"

Required Inputs

AI provider: OpenAI, Anthropic, or Google AI
Features needed: vision, text generation, audio transcription, or a combination
Backend strategy: edge function (Cloudflare Workers, Vercel Edge), server (Express, Fastify), or serverless (AWS Lambda, Supabase Edge Functions)

Workflow

Never bundle API keys in the app. Mobile app binaries can be decompiled. Anyone with the APK or IPA can extract hardcoded keys.

Instead, use a backend proxy:
```
Mobile App  -->  Your Backend  -->  AI Provider
                 (holds API key)
```
The backend holds the API key. The mobile app authenticates with your backend using user auth tokens.
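
The backend side of that authentication could be sketched as follows. This is a minimal sketch, not part of the skill: `extractBearerToken`, `authenticate`, and the `VerifyToken` signature are hypothetical names, and the actual token check (for example, JWT verification with your auth provider) is injected instead of implemented here.

```typescript
// Pulls the token out of an "Authorization: Bearer <token>" header value.
// Returns null when the header is missing or malformed.
export function extractBearerToken(header: string | null): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match ? match[1] : null;
}

// Hypothetical verifier signature: swap in your auth provider's check.
// It is injected so this sketch stays provider-agnostic.
type VerifyToken = (token: string) => Promise<{ userId: string } | null>;

// Returns the authenticated user id, or null if the request should get a 401.
export async function authenticate(
  header: string | null,
  verify: VerifyToken,
): Promise<string | null> {
  const token = extractBearerToken(header);
  if (!token) return null;
  const user = await verify(token);
  return user ? user.userId : null;
}
```

In a proxy handler, the header would come from `request.headers.get("Authorization")`, and a `null` result would be answered with a 401 before any provider call is made.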

Set up a minimal backend proxy. Example with Cloudflare Workers:

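```typescript
interface Env {
  // Stored as a Worker secret, e.g. `wrangler secret put OPENAI_API_KEY`.
  OPENAI_API_KEY: string;
}

export default {
  // Secrets arrive on the second argument; there is no global `env`.
  async fetch(request: Request, env: Env): Promise<Response> {
    // Verify the caller's auth token here before spending provider credits.
    const { prompt, image } = (await request.json()) as {
      prompt: string;
      image?: string; // data URL or https URL of the image
    };

    const response = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: "gpt-4o",
        messages: [
          {
            role: "user",
            // Vision requests send text and image parts; text-only requests send a string.
            content: image
              ? [
                  { type: "text", text: prompt },
                  { type: "image_url", image_url: { url: image } },
                ]
              : prompt,
          },
        ],
        max_tokens: 1024,
      }),
    });

    // Forwards the provider's raw response (status and body) to the app.
    return response;
  },
};
```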

Deploy this and point your app at its URL.
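
The Worker above targets OpenAI only, while this skill also lists Anthropic and Google AI as providers. The sketch below shows how the outgoing request might differ per provider for plain text generation. `buildProviderRequest` and its types are hypothetical names introduced for illustration, the model ID is left to the caller, and the endpoints, headers, and body shapes should be checked against each provider's current API reference before use.

```typescript
// Hypothetical helper: names and shapes here are illustrative, not part of the skill.
type Provider = "openai" | "anthropic" | "google";

interface ProviderRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds a text-generation request for the chosen provider.
// `model` is supplied by the caller; consult the provider docs for current model IDs.
export function buildProviderRequest(
  provider: Provider,
  model: string,
  prompt: string,
  apiKey: string,
): ProviderRequest {
  switch (provider) {
    case "openai":
      return {
        url: "https://api.openai.com/v1/chat/completions",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify({
          model,
          messages: [{ role: "user", content: prompt }],
          max_tokens: 1024,
        }),
      };
    case "anthropic":
      return {
        url: "https://api.anthropic.com/v1/messages",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": apiKey,
          "anthropic-version": "2023-06-01",
        },
        body: JSON.stringify({
          model,
          max_tokens: 1024,
          messages: [{ role: "user", content: prompt }],
        }),
      };
    case "google":
      return {
        url: `https://generativelanguage.googleapis.com/v1beta/models/${encodeURIComponent(model)}:generateContent`,
        headers: {
          "Content-Type": "application/json",
          "x-goog-api-key": apiKey,
        },
        body: JSON.stringify({
          contents: [{ parts: [{ text: prompt }] }],
        }),
      };
  }
}
```

Keeping the provider differences in one function on the backend means the mobile client can keep sending the same `{ prompt }` payload regardless of which provider is configured.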

Create an API client in the app. In lib/ai.ts:

```typescript
const API_BASE = "https://your-worker.your-domain.workers.dev";

interface AIResponse {
  text: string;
  usage: { prompt_tokens: number; completion_tokens: number };
}

export async function generateText(
  prompt: string,
  authToken: string,
): Promise<AIResponse> {
  const res = await fetch(`${API_BASE}/generate`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${authToken}`,
    },
    body: JSON.stringify({ prompt }),
  });

  if (!res.ok) {
    const error = await res.text();
    throw new Error(`AI request failed: ${res.status} ${error}`);
  }

  return res.json();
}

export async function analyzeImage(
  imageBase64: string,
  prompt: string,
  authToken: string,
): Promise<AIResponse> {
  const res = await fetch(`${API_BASE}/vision`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${authToken}`,
    },
    body: JSON.stringify({
      prompt,
      image: `data:image/jpeg;base64,${imageBase64}`,
    }),
  });

  if (!res.ok) throw new Error(`Vision request failed: ${res.status}`);
  return res.json();
}
```
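
The client expects an `AIResponse` of the form `{ text, usage }`, whereas a proxy that forwards OpenAI's chat completion response unchanged returns a larger object with the text nested under `choices`. One way to reconcile the two is a small mapping step on the backend. The sketch below is an assumption about how that could look: `toAIResponse` and `ChatCompletion` are hypothetical names, and only the response fields used here are modeled.

```typescript
// Shape the app-side client expects (mirrors AIResponse in lib/ai.ts).
interface AIResponse {
  text: string;
  usage: { prompt_tokens: number; completion_tokens: number };
}

// Subset of the OpenAI chat completion response used by this sketch.
interface ChatCompletion {
  choices: { message: { content: string | null } }[];
  usage?: { prompt_tokens: number; completion_tokens: number };
}

// Flattens a chat completion into the { text, usage } shape.
// Missing text or usage falls back to an empty string and zero counts.
export function toAIResponse(completion: ChatCompletion): AIResponse {
  const text = completion.choices[0]?.message.content ?? "";
  return {
    text,
    usage: {
      prompt_tokens: completion.usage?.prompt_tokens ?? 0,
      completion_tokens: completion.usage?.completion_tokens ?? 0,
    },
  };
}
```

On the backend this would run on the parsed provider response before replying, for example `Response.json(toAIResponse(await response.json()))`.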

Use vision with expo-camera. Capture a photo and send it to the vision API:

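```tsx
import { CameraView } from "expo-camera";
import { useRef } from "react";
import { Button, View } from "react-native";
import { analyzeImage } from "@/lib/ai";

// Assumes camera permission was already granted (for example via useCameraPermissions)
// and that the signed-in user's token is passed in as a prop.
export function CameraScreen({ userAuthToken }: { userAuthToken: string }) {
  // Hooks must be called inside a component, not at module level.
  const cameraRef = useRef<CameraView>(null);

  async function captureAndAnalyze() {
    if (!cameraRef.current) return;

    // Request base64 output at reduced JPEG quality to keep the upload small.
    const photo = await cameraRef.current.takePictureAsync({
      base64: true,
      quality: 0.5,
    });

    if (!photo?.base64) return;

    const result = await analyzeImage(
      photo.base64,
      "Describe what you see in this image.",
      userAuthToken,
    );

    console.log(result.text);
  }

  return (
    <View style={{ flex: 1 }}>
      <CameraView ref={cameraRef} style={{ flex: 1 }} />
      <Button title="Analyze" onPress={captureAndAnalyze} />
    </View>
  );
}
```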

Use quality: 0.5 or lower to reduce payload size. A full-resolution photo can be 5MB+ in base64.
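
The size figures follow from how base64 works: every 3 bytes become 4 characters, so a payload grows by about a third. For instance, a 3.75 MB JPEG encodes to 5,000,000 base64 characters. The sketch below does that arithmetic and adds a guard that could run before an upload. The function names and the `MAX_IMAGE_BYTES` limit are hypothetical and should be tuned to your backend and provider.

```typescript
// Base64 encodes every 3 bytes as 4 characters (with padding on the last group).
export function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Decoded size of a padded base64 string, accounting for trailing "=" characters.
export function decodedBytes(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

// Hypothetical limit; pick one that fits your backend and provider.
const MAX_IMAGE_BYTES = 1_000_000;

// True when the decoded image is small enough to send.
export function isWithinUploadLimit(
  base64: string,
  maxBytes: number = MAX_IMAGE_BYTES,
): boolean {
  return decodedBytes(base64) <= maxBytes;
}
```

Checking `isWithinUploadLimit(photo.base64)` before calling the vision endpoint gives the app a chance to recapture at a lower quality instead of failing on a slow connection.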

Streaming responses. For chat UIs, stream tokens as they arrive:

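```typescript
// React Native's built-in fetch does not expose res.body as a stream.
// On Expo SDK 52 and later, expo/fetch provides a streaming-capable fetch.
import { fetch } from "expo/fetch";

export async function streamText(
  prompt: string,
  authToken: string,
  onToken: (token: string) => void,
): Promise<void> {
  const res = await fetch(`${API_BASE}/stream`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${authToken}`,
    },
    body: JSON.stringify({ prompt, stream: true }),
  });

  if (!res.ok) throw new Error(`Stream failed: ${res.status}`);
  if (!res.body) throw new Error("No response body");

  const reader = res.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Assumes the backend sends plain text chunks, not raw SSE events.
    const chunk = decoder.decode(value, { stream: true });
    onToken(chunk);
  }

  // Flush any bytes the decoder buffered from a split multi-byte character.
  const tail = decoder.decode();
  if (tail) onToken(tail);
}
```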

Usage in a component:

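```tsx
import { useState } from "react";

// Inside your chat component; prompt and authToken come from its state or props.
const [response, setResponse] = useState("");

async function handleSend() {
  setResponse("");
  await streamText(prompt, authToken, (token) => {
    setResponse((prev) => prev + token);
  });
}
```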
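
The streaming code above passes each decoded chunk straight to `onToken`, which works when the backend emits plain text. If the proxy instead forwards OpenAI's streamed response unchanged, the chunks are server-sent events (`data: {...}` lines ending with `data: [DONE]`), and an event can be split across two chunks. The following is a rough sketch of an incremental parser for that case. `createSseTokenParser` is a hypothetical helper, and it reads only the `choices[0].delta.content` field.

```typescript
// Incremental parser for OpenAI-style server-sent events.
// Feed it decoded text chunks; each call returns the content deltas completed so far.
export function createSseTokenParser(): (chunk: string) => string[] {
  let buffer = "";

  return function parse(chunk: string): string[] {
    buffer += chunk;
    const lines = buffer.split("\n");
    // The last element may be an incomplete line; keep it for the next chunk.
    buffer = lines.pop() ?? "";

    const tokens: string[] = [];
    for (const rawLine of lines) {
      const line = rawLine.trim();
      if (!line.startsWith("data:")) continue; // skip blank lines and comments
      const payload = line.slice("data:".length).trim();
      if (payload === "[DONE]") continue; // end-of-stream marker
      try {
        const event = JSON.parse(payload);
        const delta = event.choices?.[0]?.delta?.content;
        if (typeof delta === "string" && delta.length > 0) tokens.push(delta);
      } catch {
        // Ignore malformed events rather than aborting the whole stream.
      }
    }
    return tokens;
  };
}
```

Inside the read loop, the call to `onToken(chunk)` would become `for (const token of parse(chunk)) onToken(token);`, with `parse` created once per request.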

Audio transcription. Record audio with expo-av and send to Whisper:

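```bash
npx expo install expo-av
```

```tsx
import { useState } from "react";
import { Audio } from "expo-av";

// Inside your component; API_BASE and authToken are assumed to be in scope.
const [recording, setRecording] = useState<Audio.Recording | null>(null);

async function startRecording() {
  // Stop early if the user denies microphone access.
  const { granted } = await Audio.requestPermissionsAsync();
  if (!granted) return;

  await Audio.setAudioModeAsync({ allowsRecordingIOS: true });
  const { recording } = await Audio.Recording.createAsync(
    Audio.RecordingOptionsPresets.HIGH_QUALITY,
  );
  setRecording(recording);
}

async function stopAndTranscribe() {
  if (!recording) return;
  await recording.stopAndUnloadAsync();
  setRecording(null);
  const uri = recording.getURI();
  if (!uri) return;

  // React Native's FormData accepts a { uri, type, name } object for file parts.
  const formData = new FormData();
  formData.append("file", {
    uri,
    type: "audio/m4a",
    name: "recording.m4a",
  } as any);

  // Do not set Content-Type manually; fetch adds the multipart boundary itself.
  const res = await fetch(`${API_BASE}/transcribe`, {
    method: "POST",
    headers: { Authorization: `Bearer ${authToken}` },
    body: formData,
  });

  if (!res.ok) throw new Error(`Transcription failed: ${res.status}`);
  const { text } = await res.json();
  console.log("Transcription:", text);
}
```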
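
The recording code hardcodes `audio/m4a` and `recording.m4a` for the upload. Other recording options or platforms may produce a different container, so the file part could instead be derived from the URI returned by `getURI()`. The helper below is a sketch under that assumption: `audioPartFromUri` and the extension map are hypothetical, and the map covers only a handful of common formats.

```typescript
// Extension-to-MIME map for common recording formats. Extend as needed.
const AUDIO_MIME_TYPES: Record<string, string> = {
  m4a: "audio/m4a",
  mp4: "audio/mp4",
  caf: "audio/x-caf",
  wav: "audio/wav",
  webm: "audio/webm",
  "3gp": "audio/3gpp",
};

export interface AudioPart {
  uri: string;
  name: string;
  type: string;
}

// Builds the { uri, name, type } object used as a multipart file part.
export function audioPartFromUri(uri: string): AudioPart {
  // Strip any query string, then take the text after the last dot in the file name.
  const path = uri.split("?")[0];
  const fileName = path.split("/").pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  // Unknown extensions fall back to a generic binary type.
  const type = AUDIO_MIME_TYPES[ext] ?? "application/octet-stream";
  return { uri, name: ext ? `recording.${ext}` : "recording", type };
}
```

The result can be passed to `formData.append("file", audioPartFromUri(uri) as any)` in place of the hardcoded object.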

Key References

Example Interaction

User: "I want to take a photo and have AI describe what's in it."

Agent:

Sets up expo-camera with mobile_installDependency
Adds camera permission with mobile_addPermission
Scaffolds an AI client with mobile_integrateAI (provider: openai, features: vision)
Creates a camera screen that captures a photo and sends base64 to the vision endpoint
Displays the AI description below the photo preview
Warns about not bundling API keys and recommends a backend proxy

MCP Usage

| Step | MCP Tool | Description |
| --- | --- | --- |
| Scaffold AI client | `mobile_integrateAI` | Create API client file with provider config and TypeScript types |
| Install expo-camera | `mobile_installDependency` | For vision features that use the camera |
| Install expo-av | `mobile_installDependency` | For audio recording and transcription |
| Add permissions | `mobile_addPermission` | Add camera/microphone permission rationale |
| Check build | `mobile_checkBuildHealth` | Verify project builds after adding native modules |

Common Pitfalls

Bundling API keys in the app - This is the most common and most dangerous mistake. Mobile apps can be reverse-engineered. Always proxy through a backend you control.
Sending full-resolution images - A 12MP photo is 5-10MB in base64. Use quality: 0.3-0.5 in takePictureAsync and resize before sending. This also reduces API costs.
No error handling for network failures - Mobile apps lose connectivity. Wrap AI calls in try/catch, show user-friendly errors, and implement retry with exponential backoff.
Ignoring costs - Vision API calls cost more than text. Show users an estimate or add rate limiting on your backend. GPT-4o vision is roughly $0.01-0.03 per image, though the figure depends on image size and provider pricing changes, so check current rates.
Blocking the UI during AI calls - AI responses can take 2-10 seconds. Show a loading indicator and use streaming for chat interfaces.
Not setting a timeout - Add AbortController with a 30-second timeout to prevent indefinite hangs on slow connections.
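
The retry and timeout advice in the list above could be combined along these lines. This is a minimal sketch: `backoffDelayMs`, `withTimeout`, and `withRetry` are hypothetical helpers, and the defaults (500 ms base delay, 8 s cap, 3 retries, 30 s timeout) are illustrative values, not recommendations from any provider.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
export function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `operation` with an AbortSignal that fires after `timeoutMs`.
// The operation must pass the signal to fetch for the abort to take effect.
export async function withTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs = 30_000,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await operation(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}

// Retries a failing operation up to `maxRetries` times with exponential backoff.
// Rethrows the last error once the retries are exhausted.
export async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 3,
  baseMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}
```

A call site might look like `withRetry(() => withTimeout((signal) => fetch(url, { ...init, signal })))`. In practice it is worth retrying only failures that are likely transient, such as network errors or 5xx responses, since repeating a rejected request (for example a 401) only adds latency and cost.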
