| name | add-connector |
| description | Add or update a Sim knowledge base connector for syncing documents from an external source, including auth mode, config fields, pagination, document mapping, tags, and registry wiring. Use when working in `apps/sim/connectors/{service}/` or adding a new external document source. |
Add Connector Skill
You are an expert at adding knowledge base connectors to Sim. A connector syncs documents from an external source (Confluence, Google Drive, Notion, etc.) into a knowledge base.
Your Task
When the user asks you to create a connector:
- Use Context7 or WebFetch to read the service's API documentation
- Determine the auth mode: OAuth (if Sim already has an OAuth provider for the service) or API key (if the service uses API key / Bearer token auth)
- Create the connector directory: a client-safe meta.ts (declarative metadata) plus the runtime module that spreads it
- Register it in BOTH the server registry and the client-safe meta registry
Hard Rule: No Guessed Response Or Document Schemas
If the service docs do not clearly show the document list response, document fetch response, pagination shape, or metadata fields, you MUST tell the user instead of guessing.
- Do NOT invent document fields
- Do NOT guess pagination cursors or next-page fields
- Do NOT infer metadata/tag mappings from unrelated endpoints
- Do NOT fabricate ExternalDocument content structure from partial docs
If the source schema is unknown, do one of these instead:
- Ask the user for sample API responses
- Ask the user for test credentials so you can verify live payloads
- Implement only the documented parts of the connector
- Leave the connector incomplete and explicitly say which fields remain unknown
Directory Structure
Each connector is split into a client-safe metadata file and a server-only runtime file. This mirrors the XBlockMeta / BLOCK_META_REGISTRY split in apps/sim/blocks — client components (the knowledge UI) only need the metadata (icon, name, auth, config fields), so the runtime functions (which pull server-only helpers like input-validation.server → undici → node:net) must stay out of the client bundle.
Create files in apps/sim/connectors/{service}/:
connectors/{service}/
├── index.ts # Barrel export (re-exports the runtime connector)
├── meta.ts # ConnectorMeta — client-safe declarative metadata
└── {service}.ts # ConnectorConfig — spreads the meta + adds runtime functions
meta.ts exports {service}ConnectorMeta: ConnectorMeta. It imports ONLY the icon from @/components/icons, import type { ConnectorMeta } from '@/connectors/types', and any pure-data constants. It must NEVER import server/runtime code.
{service}.ts exports {service}Connector: ConnectorConfig. It imports the meta via import { {service}ConnectorMeta } from '@/connectors/{service}/meta', spreads it as the first property, and holds the runtime functions (which may import server-only helpers like @/lib/knowledge/documents/utils).
Authentication
Connectors use a discriminated union for auth config (ConnectorAuthConfig in connectors/types.ts):
type ConnectorAuthConfig =
| { mode: 'oauth'; provider: OAuthService; requiredScopes?: string[] }
| { mode: 'apiKey'; label?: string; placeholder?: string }
OAuth mode
For services with existing OAuth providers in apps/sim/lib/oauth/types.ts. The provider must match an OAuthService. The modal shows a credential picker and handles token refresh automatically.
API key mode
For services that use API key / Bearer token auth. The modal shows a password input with the configured label and placeholder. The API key is encrypted at rest using AES-256-GCM and stored in a dedicated encryptedApiKey column on the connector record. The sync engine decrypts it automatically — connectors receive the raw access token in listDocuments, getDocument, and validateConfig.
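The two auth modes can be told apart by narrowing on the mode discriminant. The sketch below is a minimal illustration, not Sim's modal code: OAuthService is replaced by a local string alias, describeAuthInput is a hypothetical helper, and the 'API Key' fallback label is an assumption.

```typescript
// Local stand-in for the OAuthService union in apps/sim/lib/oauth/types.ts (assumption).
type OAuthService = string

type ConnectorAuthConfig =
  | { mode: 'oauth'; provider: OAuthService; requiredScopes?: string[] }
  | { mode: 'apiKey'; label?: string; placeholder?: string }

// Hypothetical helper: describes which credential input a modal would render.
function describeAuthInput(auth: ConnectorAuthConfig): string {
  switch (auth.mode) {
    case 'oauth': {
      // provider and requiredScopes are only accessible in this branch.
      const scopes = auth.requiredScopes ?? []
      return `credential picker for ${auth.provider} (${scopes.length} required scopes)`
    }
    case 'apiKey':
      // label and placeholder are only accessible in this branch.
      // The 'API Key' fallback is an assumption for illustration.
      return `password input labeled "${auth.label ?? 'API Key'}"`
  }
}
```

Because the union is discriminated, TypeScript rejects an apiKey config that carries a provider, which catches mixed-up auth blocks at compile time.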
Connector Structure (meta.ts + runtime)
The declarative metadata lives in meta.ts (ConnectorMeta). The runtime functions live in {service}.ts (ConnectorConfig), which spreads the meta as its first property.
meta.ts — client-safe metadata
import { {Service}Icon } from '@/components/icons'
import type { ConnectorMeta } from '@/connectors/types'
export const {service}ConnectorMeta: ConnectorMeta = {
id: '{service}',
name: '{Service}',
description: 'Sync documents from {Service} into your knowledge base',
version: '1.0.0',
icon: {Service}Icon,
auth: {
mode: 'oauth',
provider: '{service}',
requiredScopes: ['read:...'],
},
configFields: [
],
}
Keep meta.ts free of any server/runtime import. Only the icon, the ConnectorMeta type, and pure-data constants belong here.
{service}.ts — runtime (OAuth example)
import { createLogger } from '@sim/logger'
import { fetchWithRetry } from '@/lib/knowledge/documents/utils'
import { {service}ConnectorMeta } from '@/connectors/{service}/meta'
import type { ConnectorConfig, ExternalDocument, ExternalDocumentList } from '@/connectors/types'
const logger = createLogger('{Service}Connector')
export const {service}Connector: ConnectorConfig = {
...{service}ConnectorMeta,
listDocuments: async (accessToken, sourceConfig, cursor) => {
},
getDocument: async (accessToken, sourceConfig, externalId) => {
},
validateConfig: async (accessToken, sourceConfig) => {
},
mapTags: (metadata) => {
},
}
Only map fields in listDocuments, getDocument, validateConfig, and mapTags when the source payload shape is documented or live-verified. If not, tell the user and stop rather than guessing.
API key connector example
The split is identical — auth lives in meta.ts, runtime functions in {service}.ts.
export const {service}ConnectorMeta: ConnectorMeta = {
id: '{service}',
name: '{Service}',
description: 'Sync documents from {Service} into your knowledge base',
version: '1.0.0',
icon: {Service}Icon,
auth: {
mode: 'apiKey',
label: 'API Key',
placeholder: 'Enter your {Service} API key',
},
configFields: [ ],
}
export const {service}Connector: ConnectorConfig = {
...{service}ConnectorMeta,
listDocuments: async (accessToken, sourceConfig, cursor) => { },
getDocument: async (accessToken, sourceConfig, externalId) => { },
validateConfig: async (accessToken, sourceConfig) => { },
}
ConfigField Types
The add-connector modal renders these automatically — no custom UI needed.
Three field types are supported: short-input, dropdown, and selector.
{
id: 'domain',
title: 'Domain',
type: 'short-input',
placeholder: 'yoursite.example.com',
required: true,
}
{
id: 'contentType',
title: 'Content Type',
type: 'dropdown',
required: false,
options: [
{ label: 'Pages only', id: 'page' },
{ label: 'Blog posts only', id: 'blogpost' },
{ label: 'All content', id: 'all' },
],
}
Dynamic Selectors (Canonical Pairs)
Use type: 'selector' to fetch options dynamically from the existing selector registry (hooks/selectors/registry.ts). Selectors are always paired with a manual fallback input using the canonical pair pattern — a selector field (basic mode) and a short-input field (advanced mode) linked by canonicalParamId.
The user sees a toggle button (ArrowLeftRight) to switch between the selector dropdown and manual text input. On submit, the modal resolves each canonical pair to the active mode's value, keyed by canonicalParamId.
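The resolution step described above could look roughly like the following. This is a sketch under assumptions, not the modal's actual code: resolveSourceConfig, the activeModes map, and the trimmed-down ConfigField type are all hypothetical.

```typescript
type FieldMode = 'basic' | 'advanced'

// Trimmed-down field shape (assumption): only what resolution needs.
interface ConfigField {
  id: string
  canonicalParamId?: string
  mode?: FieldMode
}

// Hypothetical resolver: collapses each canonical pair to the active mode's value.
function resolveSourceConfig(
  fields: ConfigField[],
  values: Record<string, string>,
  activeModes: Record<string, FieldMode>
): Record<string, string> {
  const resolved: Record<string, string> = {}
  for (const field of fields) {
    const value = values[field.id]
    if (field.canonicalParamId === undefined) {
      // Plain field: stored under its own id.
      if (value !== undefined) resolved[field.id] = value
      continue
    }
    // Paired field: only the member whose mode is active contributes,
    // and the result is keyed by canonicalParamId. Defaults to basic mode.
    const active = activeModes[field.canonicalParamId] ?? 'basic'
    if (field.mode === active && value !== undefined) {
      resolved[field.canonicalParamId] = value
    }
  }
  return resolved
}

const pairFields: ConfigField[] = [
  { id: 'baseSelector', canonicalParamId: 'baseId', mode: 'basic' },
  { id: 'baseId', canonicalParamId: 'baseId', mode: 'advanced' },
  { id: 'maxRecords' },
]
const pairValues = { baseSelector: 'appFromPicker', baseId: 'appTypedByHand', maxRecords: '100' }
const resolvedAdvanced = resolveSourceConfig(pairFields, pairValues, { baseId: 'advanced' })
const resolvedBasic = resolveSourceConfig(pairFields, pairValues, {})
```

Either way the connector only ever reads sourceConfig.baseId, so runtime code does not need to know which input mode the user chose.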
Rules
- Every selector field MUST have a canonical pair: a corresponding short-input (or dropdown) field with the same canonicalParamId and mode: 'advanced'.
- required must be set identically on both fields in a pair. If the selector is required, the manual input must also be required.
- canonicalParamId must match the key the connector expects in sourceConfig (e.g. baseId, channel, teamId). The advanced field's id should typically match canonicalParamId.
- dependsOn references the selector field's id, not the canonicalParamId. The modal propagates dependency clearing across canonical siblings automatically, so changing either field in a parent pair clears dependent children.
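These rules are mechanical enough to lint. The checker below is a hypothetical sketch (checkCanonicalPairs and the reduced ConfigField type are not part of Sim); it only encodes the pairing, required, and dependsOn rules stated above.

```typescript
// Reduced field shape (assumption): only the properties the rules mention.
interface ConfigField {
  id: string
  type: 'short-input' | 'dropdown' | 'selector'
  canonicalParamId?: string
  mode?: 'basic' | 'advanced'
  required?: boolean
  dependsOn?: string[]
}

// Hypothetical lint: returns one message per violated rule, or [] when clean.
function checkCanonicalPairs(fields: ConfigField[]): string[] {
  const problems: string[] = []
  for (const selector of fields.filter((f) => f.type === 'selector')) {
    // Rule 1: every selector needs an advanced-mode manual partner.
    const partner = fields.find(
      (f) =>
        f !== selector &&
        f.type !== 'selector' &&
        f.mode === 'advanced' &&
        f.canonicalParamId !== undefined &&
        f.canonicalParamId === selector.canonicalParamId
    )
    if (!partner) {
      problems.push(`${selector.id}: no advanced-mode manual field shares its canonicalParamId`)
      continue
    }
    // Rule 2: required must match across the pair.
    if (Boolean(partner.required) !== Boolean(selector.required)) {
      problems.push(`${selector.id}: required differs from ${partner.id}`)
    }
  }
  for (const field of fields) {
    for (const dep of field.dependsOn ?? []) {
      const target = fields.find((f) => f.id === dep)
      if (!target) {
        problems.push(`${field.id}: dependsOn "${dep}" is not a field id`)
      } else if (target.canonicalParamId !== undefined && target.mode === 'advanced') {
        // Rule 4: point at the selector member of the pair, not the manual one.
        problems.push(`${field.id}: dependsOn "${dep}" should reference the selector field instead`)
      }
    }
  }
  return problems
}
```

A check like this could run in a unit test over every connector's configFields, so a broken pair is caught before it reaches the modal.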
Selector canonical pair example (Airtable base → table cascade)
configFields: [
{
id: 'baseSelector',
title: 'Base',
type: 'selector',
selectorKey: 'airtable.bases',
canonicalParamId: 'baseId',
mode: 'basic',
placeholder: 'Select a base',
required: true,
},
{
id: 'baseId',
title: 'Base ID',
type: 'short-input',
canonicalParamId: 'baseId',
mode: 'advanced',
placeholder: 'e.g. appXXXXXXXXXXXXXX',
required: true,
},
{
id: 'tableSelector',
title: 'Table',
type: 'selector',
selectorKey: 'airtable.tables',
canonicalParamId: 'tableIdOrName',
mode: 'basic',
dependsOn: ['baseSelector'],
placeholder: 'Select a table',
required: true,
},
{
id: 'tableIdOrName',
title: 'Table Name or ID',
type: 'short-input',
canonicalParamId: 'tableIdOrName',
mode: 'advanced',
placeholder: 'e.g. Tasks',
required: true,
},
{ id: 'maxRecords', title: 'Max Records', type: 'short-input', ... },
]
Selector with domain dependency (Jira/Confluence pattern)
When a selector depends on a plain short-input field (no canonical pair), dependsOn references that field's id directly. The domain field's value maps to SelectorContext.domain automatically via SELECTOR_CONTEXT_FIELDS.
configFields: [
{
id: 'domain',
title: 'Jira Domain',
type: 'short-input',
placeholder: 'yoursite.atlassian.net',
required: true,
},
{
id: 'projectSelector',
title: 'Project',
type: 'selector',
selectorKey: 'jira.projects',
canonicalParamId: 'projectKey',
mode: 'basic',
dependsOn: ['domain'],
placeholder: 'Select a project',
required: true,
},
{
id: 'projectKey',
title: 'Project Key',
type: 'short-input',
canonicalParamId: 'projectKey',
mode: 'advanced',
placeholder: 'e.g. ENG, PROJ',
required: true,
},
]
How dependsOn maps to SelectorContext
The connector selector field builds a SelectorContext from dependency values. For the mapping to work, each dependency's canonicalParamId (or field id for non-canonical fields) must exist in SELECTOR_CONTEXT_FIELDS (lib/workflows/subblocks/context.ts):
oauthCredential, domain, teamId, projectId, knowledgeBaseId, planId,
siteId, collectionId, spreadsheetId, fileId, baseId, datasetId, serviceDeskId
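To make the mapping concrete, here is a minimal sketch of how dependency values might be turned into a context object. The key list is copied from above; buildSelectorContext and the reduced ConfigField type are hypothetical, and the real field in lib/workflows/subblocks/context.ts may be shaped differently.

```typescript
// Key list copied from the document; its real declaration may differ.
const SELECTOR_CONTEXT_FIELDS = [
  'oauthCredential', 'domain', 'teamId', 'projectId', 'knowledgeBaseId', 'planId',
  'siteId', 'collectionId', 'spreadsheetId', 'fileId', 'baseId', 'datasetId', 'serviceDeskId',
] as const

type ContextKey = (typeof SELECTOR_CONTEXT_FIELDS)[number]
type SelectorContext = Partial<Record<ContextKey, string>>

// Reduced field shape (assumption).
interface ConfigField {
  id: string
  canonicalParamId?: string
}

function isContextKey(key: string): key is ContextKey {
  return (SELECTOR_CONTEXT_FIELDS as readonly string[]).includes(key)
}

// Hypothetical builder: maps each dependency's value onto its context key.
function buildSelectorContext(
  dependsOn: string[],
  fields: ConfigField[],
  values: Record<string, string>
): SelectorContext {
  const context: SelectorContext = {}
  for (const depId of dependsOn) {
    const dep = fields.find((f) => f.id === depId)
    if (!dep) continue
    // Canonical fields map by canonicalParamId, plain fields by their own id.
    const key = dep.canonicalParamId ?? dep.id
    const value = values[dep.id]
    // Keys outside SELECTOR_CONTEXT_FIELDS are dropped silently.
    if (isContextKey(key) && value) context[key] = value
  }
  return context
}
```

In this sketch a dependency whose key is not in the list (for example tableIdOrName) simply produces an empty context, which is why the dependency's canonicalParamId has to be one of the listed names for a child selector to receive it.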
Available selector keys
Check hooks/selectors/types.ts for the full SelectorKey union. Common ones for connectors:
| SelectorKey | Context Deps | Returns |
|---|---|---|
| airtable.bases | credential | Base ID + name |
| airtable.tables | credential, baseId | Table ID + name |
| slack.channels | credential | Channel ID + name |
| gmail.labels | credential | Label ID + name |
| google.calendar | credential | Calendar ID + name |
| linear.teams | credential | Team ID + name |
| linear.projects | credential, teamId | Project ID + name |
| jira.projects | credential, domain | Project key + name |
| confluence.spaces | credential, domain | Space key + name |
| notion.databases | credential | Database ID + name |
| asana.workspaces | credential | Workspace GID + name |
| microsoft.teams | credential | Team ID + name |
| microsoft.channels | credential, teamId | Channel ID + name |
| webflow.sites | credential | Site ID + name |
| outlook.folders | credential | Folder ID + name |
ExternalDocument Shape
Every document returned from listDocuments/getDocument must have this shape (fields marked ? are optional):
{
externalId: string
title: string
content: string
contentDeferred?: boolean
mimeType: 'text/plain'
contentHash: string
sourceUrl?: string
metadata?: Record<string, unknown>
}
Content Deferral (Required for file/content-download connectors)
All connectors that require per-document API calls to fetch content MUST use contentDeferred: true. This is the standard pattern — listDocuments returns lightweight metadata stubs, and content is fetched lazily by the sync engine via getDocument only for new/changed documents.
This pattern is critical for reliability: the sync engine processes documents in batches and enqueues each batch for processing immediately. If a sync times out, all previously-batched documents are already queued. Without deferral, content downloads during listing can exhaust the sync task's time budget before any documents are saved.
When to use contentDeferred: true
- The service's list API does NOT return document content (only metadata)
- Content requires a separate download/export API call per document
- Examples: Google Drive, OneDrive, SharePoint, Dropbox, Notion, Confluence, Gmail, Obsidian, Evernote, GitHub
When NOT to use contentDeferred
- The list API already returns the full content inline (e.g., Slack messages, Reddit posts, HubSpot notes)
- No per-document API call is needed to get content
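For the inline case, a list item can be mapped straight to a complete document with no deferral. The sketch below assumes a hypothetical service: the NoteItem shape, the notes.example.com URL, and the example: hash prefix are invented for illustration and must be replaced by fields the real API documents.

```typescript
// ExternalDocument shape as listed in this guide.
interface ExternalDocument {
  externalId: string
  title: string
  content: string
  contentDeferred?: boolean
  mimeType: 'text/plain'
  contentHash: string
  sourceUrl?: string
  metadata?: Record<string, unknown>
}

// Hypothetical list item that already carries its full body.
interface NoteItem {
  id: string
  subject?: string
  body: string
  updatedAt: string
}

// Inline mapping: content is filled in during listing, so contentDeferred is omitted.
function noteToDocument(note: NoteItem): ExternalDocument {
  return {
    externalId: note.id,
    title: note.subject?.trim() || 'Untitled',
    content: note.body,
    mimeType: 'text/plain',
    // Hypothetical URL pattern; always a full URL, never a relative path.
    sourceUrl: `https://notes.example.com/n/${note.id}`,
    // The hash comes from list metadata (the edit timestamp), not from the body.
    contentHash: `example:${note.id}:${note.updatedAt}`,
    metadata: { updatedAt: note.updatedAt },
  }
}
```

Even though the body is available here, the hash still comes from metadata so that listDocuments and getDocument produce the same value for the same version.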
Content Hash Strategy
Use a metadata-based contentHash — never a content-based hash. The hash must be derivable from the list response metadata alone, so the sync engine can detect changes without downloading content.
Good metadata hash sources:
- modifiedTime / lastModifiedDateTime (changes when the file is edited)
- Git blob SHA (unique per content version)
- API-provided content hash (e.g., Dropbox content_hash)
- Version number (e.g., Confluence page version)
Format: {service}:{id}:{changeIndicator}
contentHash: `gdrive:${file.id}:${file.modifiedTime ?? ''}`
contentHash: `gitsha:${item.sha}`
contentHash: `dropbox:${entry.id}:${entry.content_hash ?? entry.server_modified}`
contentHash: `confluence:${page.id}:${page.version.number}`
Critical invariant: The contentHash MUST be identical whether produced by listDocuments (stub) or getDocument (full doc). Both should use the same stub function to guarantee this.
Implementation Pattern
function fileToStub(file: ServiceFile): ExternalDocument {
return {
externalId: file.id,
title: file.name || 'Untitled',
content: '',
contentDeferred: true,
mimeType: 'text/plain',
sourceUrl: `https://service.com/file/${file.id}`,
contentHash: `service:${file.id}:${file.modifiedTime ?? ''}`,
metadata: { },
}
}
listDocuments: async (accessToken, sourceConfig, cursor) => {
const response = await fetchWithRetry(listUrl, { ... })
const files = (await response.json()).files
const documents = files.map(fileToStub)
return { documents, nextCursor, hasMore }
}
getDocument: async (accessToken, sourceConfig, externalId) => {
const metadata = await fetchWithRetry(metadataUrl, { ... })
const file = await metadata.json()
if (file.trashed) return null
try {
const content = await fetchContent(accessToken, file)
if (!content.trim()) return null
const stub = fileToStub(file)
return { ...stub, content, contentDeferred: false }
} catch (error) {
logger.warn(`Failed to fetch content for: ${file.name}`, { error })
return null
}
}
Reference Implementations
- Google Drive: connectors/google-drive/google-drive.ts (file download/export with modifiedTime hash)
- GitHub: connectors/github/github.ts (git blob SHA hash)
- Notion: connectors/notion/notion.ts (blocks API with last_edited_time hash)
- Confluence: connectors/confluence/confluence.ts (version number hash)
tagDefinitions — Declared Tag Definitions
Declare which tags the connector populates using semantic IDs. Shown in the add-connector modal as opt-out checkboxes.
On connector creation, slots are dynamically assigned via getNextAvailableSlot — connectors never hardcode slot names.
tagDefinitions: [
{ id: 'labels', displayName: 'Labels', fieldType: 'text' },
{ id: 'version', displayName: 'Version', fieldType: 'number' },
{ id: 'lastModified', displayName: 'Last Modified', fieldType: 'date' },
],
Each entry has:
id: Semantic key matching a key returned by mapTags (e.g. 'labels', 'version')
displayName: Human-readable name shown in the UI (e.g. "Labels", "Last Modified")
fieldType: 'text' | 'number' | 'date' | 'boolean' — determines which slot pool to draw from
Users can opt out of specific tags in the modal. Disabled IDs are stored in sourceConfig.disabledTagIds.
The assigned mapping (semantic id → slot) is stored in sourceConfig.tagSlotMapping.
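As a rough picture of what slot assignment involves, the sketch below draws one free slot per enabled tag from a pool chosen by fieldType. Everything here is an assumption: the slot names in SLOT_POOLS are invented, and assignTagSlots is a stand-in, not the real getNextAvailableSlot, whose signature is not shown in this guide.

```typescript
type FieldType = 'text' | 'number' | 'date' | 'boolean'

interface TagDefinition {
  id: string
  displayName: string
  fieldType: FieldType
}

// Invented slot names, for illustration only. The real pools live in Sim.
const SLOT_POOLS: Record<FieldType, string[]> = {
  text: ['text1', 'text2', 'text3'],
  number: ['number1', 'number2'],
  date: ['date1', 'date2'],
  boolean: ['boolean1', 'boolean2'],
}

// Hypothetical assignment: returns a semantic id -> slot mapping
// in the spirit of sourceConfig.tagSlotMapping.
function assignTagSlots(
  definitions: TagDefinition[],
  disabledTagIds: string[],
  usedSlots: Set<string>
): Record<string, string> {
  const taken = new Set(usedSlots)
  const mapping: Record<string, string> = {}
  for (const def of definitions) {
    // Tags the user opted out of never consume a slot.
    if (disabledTagIds.includes(def.id)) continue
    const slot = SLOT_POOLS[def.fieldType].find((candidate) => !taken.has(candidate))
    // Pool exhausted: this sketch leaves the tag unmapped.
    if (slot === undefined) continue
    taken.add(slot)
    mapping[def.id] = slot
  }
  return mapping
}
```

The point of the indirection is that a connector only names semantic ids such as labels or version; which physical slot each one lands in depends on what the knowledge base already uses.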
mapTags — Metadata to Semantic Keys
Maps source metadata to semantic tag keys. Required if tagDefinitions is set.
The sync engine calls this automatically and translates semantic keys to actual DB slots using the tagSlotMapping stored on the connector.
Return keys must match the id values declared in tagDefinitions.
mapTags: (metadata: Record<string, unknown>): Record<string, unknown> => {
const result: Record<string, unknown> = {}
const labels = Array.isArray(metadata.labels) ? (metadata.labels as string[]) : []
if (labels.length > 0) result.labels = labels.join(', ')
if (metadata.version != null) {
const num = Number(metadata.version)
if (!Number.isNaN(num)) result.version = num
}
if (typeof metadata.lastModified === 'string') {
const date = new Date(metadata.lastModified)
if (!Number.isNaN(date.getTime())) result.lastModified = date
}
return result
}
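The translation from semantic keys to slots could be sketched as below. toSlotValues is hypothetical and the slot names are invented; the sync engine's real implementation is not shown in this guide.

```typescript
// Hypothetical translation step: semantic tag keys -> slot columns.
function toSlotValues(
  tags: Record<string, unknown>,
  tagSlotMapping: Record<string, string>
): Record<string, unknown> {
  const slotValues: Record<string, unknown> = {}
  for (const [semanticId, value] of Object.entries(tags)) {
    const slot = tagSlotMapping[semanticId]
    // Keys without a mapping (undeclared or opted-out tags) are skipped.
    if (slot === undefined) continue
    slotValues[slot] = value
  }
  return slotValues
}

// Example mapTags output plus an invented mapping.
const mappedTags = { labels: 'docs, api', version: 3, undeclared: 'ignored' }
const slotValues = toSlotValues(mappedTags, { labels: 'text1', version: 'number1' })
```

This is why the keys returned by mapTags have to match the tagDefinitions ids exactly: in a scheme like this, a misspelled key has no mapping entry and its value never reaches a slot.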
External API Calls — Use fetchWithRetry
All external API calls must use fetchWithRetry from @/lib/knowledge/documents/utils instead of raw fetch(). This provides exponential backoff with retries on 429/502/503/504 errors. It returns a standard Response — all .ok, .json(), .text() checks work unchanged.
For validateConfig (user-facing, called on save), pass VALIDATE_RETRY_OPTIONS to cap wait time at ~7s. Background operations (listDocuments, getDocument) use the built-in defaults (5 retries, ~31s max).
import { VALIDATE_RETRY_OPTIONS, fetchWithRetry } from '@/lib/knowledge/documents/utils'
const response = await fetchWithRetry(url, {
method: 'GET',
headers: { Authorization: `Bearer ${accessToken}` },
})
const response = await fetchWithRetry(url, { ... }, VALIDATE_RETRY_OPTIONS)
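The stated wait budgets are consistent with a plain doubling backoff, which the arithmetic below illustrates. This is not the fetchWithRetry source: the 1-second initial delay and the 3-retry count for validation are assumptions chosen because they reproduce the ~31s and ~7s figures, and the function names are hypothetical.

```typescript
// Status codes the guide says are retried.
const RETRYABLE_STATUSES = new Set([429, 502, 503, 504])

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.has(status)
}

// Doubling schedule: initial, 2x, 4x, ... one delay per retry.
function backoffDelaysMs(maxRetries: number, initialDelayMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => initialDelayMs * 2 ** attempt)
}

function totalWaitMs(delays: number[]): number {
  return delays.reduce((sum, delay) => sum + delay, 0)
}

// Assumed parameters: 5 retries from 1s for background work, 3 retries for validation.
const backgroundDelays = backoffDelaysMs(5, 1000) // [1000, 2000, 4000, 8000, 16000]
const validateDelays = backoffDelaysMs(3, 1000) // [1000, 2000, 4000]
const backgroundTotal = totalWaitMs(backgroundDelays) // 31000 ms
const validateTotal = totalWaitMs(validateDelays) // 7000 ms
```

Whatever the exact parameters are, the practical consequence is the same: a user waiting on validateConfig should see a failure within seconds, while background syncs can afford to wait out a rate limit.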
sourceUrl
If ExternalDocument.sourceUrl is set, the sync engine stores it on the document record. Always construct the full URL (not a relative path).
Sync Engine Behavior (Do Not Modify)
The sync engine (lib/knowledge/connectors/sync-engine.ts) is connector-agnostic. It:
- Calls listDocuments with pagination until hasMore is false
- Compares contentHash to detect new/changed/unchanged documents
- Stores sourceUrl and calls mapTags on insert/update automatically
- Handles soft-delete of removed documents
- Resolves access tokens automatically: OAuth tokens are refreshed, API keys are decrypted from the encryptedApiKey column
You never need to modify the sync engine when adding a connector.
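To show why a stable contentHash matters to the connector author, here is a simplified model of the comparison step. It is not the sync engine's code: planSync, SyncPlan, and the in-memory Map of stored hashes are hypothetical stand-ins.

```typescript
interface ListedDocument {
  externalId: string
  contentHash: string
}

interface SyncPlan {
  toAdd: string[]
  toUpdate: string[]
  unchanged: string[]
  toRemove: string[]
}

// Hypothetical classification of listed stubs against previously stored hashes.
function planSync(listed: ListedDocument[], storedHashes: Map<string, string>): SyncPlan {
  const plan: SyncPlan = { toAdd: [], toUpdate: [], unchanged: [], toRemove: [] }
  const seen = new Set<string>()
  for (const doc of listed) {
    seen.add(doc.externalId)
    const stored = storedHashes.get(doc.externalId)
    if (stored === undefined) plan.toAdd.push(doc.externalId)
    else if (stored !== doc.contentHash) plan.toUpdate.push(doc.externalId)
    else plan.unchanged.push(doc.externalId)
  }
  // Anything stored but no longer listed is a candidate for soft-delete.
  storedHashes.forEach((_hash, externalId) => {
    if (!seen.has(externalId)) plan.toRemove.push(externalId)
  })
  return plan
}
```

In this model only toAdd and toUpdate entries would need a getDocument call. A hash that changes between runs for an unedited document would push every document into toUpdate on every sync.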
Icon
The icon field on ConnectorConfig is used throughout the UI — in the connector list, the add-connector modal, and as the document icon in the knowledge base table (replacing the generic file type icon for connector-sourced documents). The icon is read from CONNECTOR_REGISTRY[connectorType].icon at runtime — no separate icon map to maintain.
If the service already has an icon in apps/sim/components/icons.tsx (from a tool integration), reuse it. Otherwise, ask the user to provide the SVG.
Registering
Register in BOTH registries, keeping the same alphabetical-by-id ordering in each.
- Server registry in apps/sim/connectors/registry.server.ts (server-only full registry; holds full connectors with runtime functions, imported by the sync engine and knowledge API routes):
import { {service}Connector } from '@/connectors/{service}'
export const CONNECTOR_REGISTRY: ConnectorRegistry = {
{service}: {service}Connector,
}
- Client-safe meta registry in apps/sim/connectors/registry.ts (imports each connector's meta.ts only, so client components can use it without pulling server-only code; the metadata counterpart to BLOCK_META_REGISTRY):
import { {service}ConnectorMeta } from '@/connectors/{service}/meta'
export const CONNECTOR_META_REGISTRY: ConnectorMetaRegistry = {
{service}: {service}ConnectorMeta,
}
registry.ts exports CONNECTOR_META_REGISTRY: ConnectorMetaRegistry plus the helpers getConnectorMeta(id) and getAllConnectorMeta(), importing each @/connectors/{service}/meta directly — never the runtime module. registry.server.ts exports CONNECTOR_REGISTRY: ConnectorRegistry.
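The two helpers named above are thin lookups over the meta registry. The sketch below shows one plausible shape. The reduced ConnectorMeta type, the alpha and beta entries, and the return types are assumptions, since the real signatures are not shown in this guide.

```typescript
// Reduced meta shape (assumption): icon, auth and configFields omitted for brevity.
interface ConnectorMeta {
  id: string
  name: string
  description: string
  version: string
}

type ConnectorMetaRegistry = Record<string, ConnectorMeta>

// Placeholder entries, kept in alphabetical-by-id order.
const CONNECTOR_META_REGISTRY: ConnectorMetaRegistry = {
  alpha: { id: 'alpha', name: 'Alpha', description: 'Sync documents from Alpha', version: '1.0.0' },
  beta: { id: 'beta', name: 'Beta', description: 'Sync documents from Beta', version: '1.0.0' },
}

// Hypothetical lookup: own-property check avoids matching inherited keys like 'toString'.
function getConnectorMeta(id: string): ConnectorMeta | undefined {
  return Object.prototype.hasOwnProperty.call(CONNECTOR_META_REGISTRY, id)
    ? CONNECTOR_META_REGISTRY[id]
    : undefined
}

function getAllConnectorMeta(): ConnectorMeta[] {
  return Object.values(CONNECTOR_META_REGISTRY)
}
```

Because this module would only ever import meta.ts files, a client component can call these helpers without dragging fetchWithRetry or other server-only code into its bundle.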
Reference Implementations
- OAuth + contentDeferred: apps/sim/connectors/google-drive/google-drive.ts (file download with metadata-based hash, orderBy for deterministic pagination)
- OAuth + contentDeferred (blocks API): apps/sim/connectors/notion/notion.ts (complex block content extraction deferred to getDocument)
- OAuth + contentDeferred (git): apps/sim/connectors/github/github.ts (blob SHA hash, tree listing)
- OAuth + inline content: apps/sim/connectors/confluence/confluence.ts (multiple config field types, mapTags, label fetching)
- API key: apps/sim/connectors/fireflies/fireflies.ts (GraphQL API with Bearer token auth)
Checklist