Author a new Cartography intel module end-to-end (entry point, sync GET/TRANSFORM/LOAD/CLEANUP, declarative data model, integration test, schema docs). Use when the user asks to add a new provider, integration, intel module, or service ingestion to Cartography (e.g. "add a new module for service X", "integrate ServiceY", "create a sync for Z API").
create-module
Build a brand new Cartography intel module from scratch using the modern declarative data model. The module must follow the standard sync pattern (get -> transform -> load -> cleanup) and be exercised by an integration test.
Critical rules
Use the data model, not handwritten Cypher. Call load() / load_matchlinks() from cartography.client.core.tx, and GraphJob.from_node_schema() for cleanup.
Sub-resource relationships always point to a tenant-like node (AWSAccount, AzureSubscription, GCPProject, GitHubOrganization, your <Service>Tenant). Never to an infrastructure parent.
Required fields use direct dict access; optional fields use .get() with a None default. Do not silently swallow exceptions in get().
Only standard schema fields: any custom field added to a CartographyNodeSchema / CartographyRelSchema subclass is ignored. See the add-node-type and add-relationship skills.
Integration tests must call sync(), not individual load() calls. Mock only external boundaries (API clients, credentials).
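The required/optional field rule above can be sketched as a small transform(). The payload shape and the field names (id, email, display_name, last_login) are hypothetical and only illustrate the access pattern; they are not taken from any real API.

```python
from typing import Any


def transform(raw_users: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Shape raw API records for load(); all field names are hypothetical."""
    transformed = []
    for user in raw_users:
        transformed.append(
            {
                # Required fields: direct access, so a missing key raises
                # KeyError instead of quietly loading an incomplete node.
                "id": user["id"],
                "email": user["email"],
                # Optional fields: .get() yields None when the key is absent.
                "display_name": user.get("display_name"),
                "last_login": user.get("last_login"),
            }
        )
    return transformed
```

Failing loudly on a missing required field keeps malformed records out of the graph, while optional fields simply load as null properties.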
The entry point (__init__.py) reads from Config, validates required credentials, builds common_job_parameters, and dispatches to per-domain sync() functions. See references/sync-pattern.md for a copy-paste template.
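As a rough illustration of that entry-point flow, the sketch below uses a stand-in Config dataclass so it runs on its own. The field names (your_service_api_key, your_service_tenant_id), the function name, the domain_syncs parameter, and the boolean return value are assumptions made for the example; a real entry point would import its per-domain sync() functions and use cartography.config.Config.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


@dataclass
class Config:
    """Stand-in for cartography.config.Config; field names are hypothetical."""

    update_tag: int
    your_service_api_key: Optional[str] = None
    your_service_tenant_id: Optional[str] = None


def start_your_service_ingestion(
    neo4j_session: Any,
    config: Config,
    domain_syncs: list[Callable[..., None]],
) -> bool:
    """Run every per-domain sync; return False if the module was skipped."""
    # Validate required credentials and skip cleanly when they are missing.
    if not config.your_service_api_key or not config.your_service_tenant_id:
        logger.info("YourService credentials not configured - skipping module")
        return False

    # Parameters shared by all cleanup jobs of this module.
    common_job_parameters = {
        "UPDATE_TAG": config.update_tag,
        "TENANT_ID": config.your_service_tenant_id,
    }
    # Dispatch to each per-domain sync() with the same arguments.
    for sync in domain_syncs:
        sync(
            neo4j_session,
            config.your_service_api_key,
            config.your_service_tenant_id,
            config.update_tag,
            common_job_parameters,
        )
    return True
```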
Step 2: Wire CLI + Config
In cartography/cli.py:
Add PANEL_YOUR_SERVICE = "Your Service Options" and register it in MODULE_PANELS.
Add Typer options inside _build_app().run() (use Annotated[Optional[str], typer.Option(...)] with rich_help_panel=PANEL_YOUR_SERVICE).
Resolve secrets from os.environ and pass them into cartography.config.Config(...).
In cartography/config.py, extend Config.__init__ with the new fields. Then in your module entry point, validate them and short-circuit with logger.info("... not configured - skipping module") when missing.
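One way to resolve a secret from the environment is sketched below. It assumes the CLI option carries the name of an environment variable rather than the secret itself; the helper name resolve_secret and the variable name YOUR_SERVICE_API_KEY are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_secret(
    env_var_name: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the secret stored in the named env var, or None if unset."""
    # No option passed on the command line: nothing to resolve.
    if not env_var_name:
        return None
    # A missing variable also yields None, so the module entry point can
    # log its "not configured" message and skip.
    return environ.get(env_var_name)


# Hypothetical usage: the resolved value is what gets passed into Config(...).
api_key = resolve_secret("YOUR_SERVICE_API_KEY")
```

Passing the variable name instead of the value keeps the secret out of shell history and process listings.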
Step 3: Register the module in cartography/sync.py
Add one entry to TOP_LEVEL_MODULES using the lazy wrapper (_LazyStage). Do not add a top-level import cartography.intel.your_service to sync.py, because that defeats lazy SDK loading and reintroduces the slow-startup problem.
Pick a sensible position relative to neighbors (cloud providers grouped together, etc.). The provider's heavy SDK imports stay where they are; they only fire when this stage is selected and run.
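The idea behind a lazy stage can be illustrated with a few lines of importlib. The LazyStage class below is a hypothetical stand-in written for this example, not Cartography's own wrapper, and the stdlib json module plays the role of a heavy provider module.

```python
import importlib
from typing import Any, Callable


class LazyStage:
    """Hypothetical lazy sync-stage wrapper (not Cartography's own class)."""

    def __init__(self, module_name: str, function_name: str) -> None:
        # Only names are stored, so building the registry imports nothing.
        self._module_name = module_name
        self._function_name = function_name

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # The import is deferred until the stage is actually run.
        module = importlib.import_module(self._module_name)
        func: Callable[..., Any] = getattr(module, self._function_name)
        return func(*args, **kwargs)


# "json" / "dumps" stand in for a provider module and its entry point.
STAGES = {"json-demo": LazyStage("json", "dumps")}
```

A top-level import of the provider package in the registry file would run at startup regardless of which stages are selected, which is exactly the cost the wrapper avoids.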
Step 4: Implement the sync pattern
For each domain (users, devices, projects, ...):
@timeit
def sync(
    neo4j_session: neo4j.Session,
    api_key: str,
    tenant_id: str,
    update_tag: int,
    common_job_parameters: dict[str, Any],
) -> None:
    raw = get(api_key, tenant_id)  # 1. GET: dumb, raises on failure
    data = transform(raw)  # 2. TRANSFORM: shape for ingest
    load_users(neo4j_session, data, tenant_id, update_tag)  # 3. LOAD: data model
    cleanup(neo4j_session, common_job_parameters)  # 4. CLEANUP: GraphJob
get() should be minimal: set timeouts, call response.raise_for_status(), and let errors propagate. AWS get functions are wrapped with @aws_handle_regions. See references/sync-pattern.md for the long-form template, error-handling rules, and transform examples.
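A minimal get() along those lines might look like the sketch below. To stay self-contained it takes an HTTP session object with a requests-style interface as a parameter (a requests.Session would fit). The extra session and base_url parameters, the URL path, the bearer-token header, the "users" response key, and the timeout values are all assumptions made for illustration.

```python
from typing import Any, Protocol


class Response(Protocol):
    def raise_for_status(self) -> None: ...
    def json(self) -> Any: ...


class HttpSession(Protocol):
    def get(
        self, url: str, *, headers: dict[str, str], timeout: tuple[int, int]
    ) -> Response: ...


# (connect, read) timeouts in seconds; the values are illustrative.
_TIMEOUT = (60, 60)


def get(
    session: HttpSession, base_url: str, api_key: str, tenant_id: str
) -> list[dict[str, Any]]:
    """Fetch raw user records; the endpoint and payload shape are hypothetical."""
    response = session.get(
        f"{base_url}/tenants/{tenant_id}/users",
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=_TIMEOUT,
    )
    # No try/except: HTTP errors propagate to the caller and fail the sync.
    response.raise_for_status()
    return response.json()["users"]
```

Letting the error propagate matters because a get() that returns an empty list on failure would let cleanup delete every node that the failed request did not refresh.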
Step 5: Define the data model
Create dataclasses in cartography/models/your_service/. Required for every node:
For advanced node configurations (extra labels, conditional labels, scoped cleanup, one-to-many) see the add-node-type skill. For relationships, MatchLinks, and multi-module patterns see the add-relationship skill. See references/data-model.md for the full reference.
If you hand-write a Cypher write query during prototyping, use run_write_query() (managed transaction + retries), never neo4j_session.run().
Step 7: Integration test
In tests/integration/cartography/intel/your_service/test_users.py, patch only get() and call sync() end-to-end. Assert outcomes (nodes + relationships) using tests.integration.util.check_nodes / check_rels. Do not assert on mock call counts or internal parameters. See references/testing.md for a full template and the test boundary policy.
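The shape of such a test can be sketched without a database. In the example below an in-memory dict of sets stands in for Neo4j and the UsersModule class stands in for the module under test; both are hypothetical. A real test would use the neo4j_session fixture and assert with check_nodes / check_rels instead.

```python
from typing import Any
from unittest.mock import patch


class UsersModule:
    """Stand-in for cartography.intel.your_service.users (hypothetical)."""

    @staticmethod
    def get(api_key: str, tenant_id: str) -> list[dict[str, Any]]:
        raise RuntimeError("real API call; must be patched in tests")

    @staticmethod
    def sync(graph: dict[str, set], api_key: str, tenant_id: str) -> None:
        # get -> (transform) -> load, collapsed for brevity.
        for user in UsersModule.get(api_key, tenant_id):
            graph["nodes"].add((user["id"], user["email"]))
            graph["rels"].add((tenant_id, user["id"]))


def test_sync_users() -> None:
    graph: dict[str, set] = {"nodes": set(), "rels": set()}
    fake_users = [{"id": "u1", "email": "a@example.com"}]

    # Patch only the external boundary, then run the whole sync.
    with patch.object(UsersModule, "get", return_value=fake_users):
        UsersModule.sync(graph, "fake-key", "tenant-1")

    # Assert on what ended up in the graph, not on mock call counts.
    assert graph["nodes"] == {("u1", "a@example.com")}
    assert graph["rels"] == {("tenant-1", "u1")}
```

Because only get() is replaced, the test still exercises transform, load, and relationship creation exactly as production would.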
Step 8: Schema documentation
Add a page at docs/root/modules/your_service/schema.md. Use ### for node names, #### for the "Relationships" subsection, bold indexed/primary fields. If the node has a semantic label, add the standard ontology mapping blockquote (see the enrich-ontology skill).
Step 9: Optional: analysis jobs
If the module needs post-ingestion enrichment (internet exposure, permission inheritance, cross-resource linking), call run_analysis_job() / run_scoped_analysis_job() at the end of the entry point. See the analysis-jobs skill.
Step 10: Pre-submission checks
make lint
# integration test for the module:
pytest tests/integration/cartography/intel/your_service/ -x
Sign every commit: git commit -s -m "...". Update the PR description to match .github/pull_request_template.md.
Final checklist
Entry point validates config and skips cleanly when unconfigured
CLI panel + Config fields wired, secrets resolved from env vars
Module registered in cartography/sync.py:TOP_LEVEL_MODULES via _LazyStage, with no top-level import cartography.intel.<service> added to sync.py
Sync follows GET -> TRANSFORM -> LOAD -> CLEANUP
All schemas use only standard fields (label, properties, sub_resource_relationship, other_relationships, extra_node_labels, scoped_cleanup)
Sub-resource relationship targets a tenant-like node
Required fields use data["x"], optional use data.get("x") with None default
extra_index=True set on frequently queried fields
Integration test exercises sync(), asserts nodes + rels with check_nodes / check_rels
Schema doc added under docs/root/modules/your_service/schema.md
make lint clean, git commit -s used
Common issues
See the troubleshooting skill for ModuleNotFoundError, PropertyRef validation failed, missing relationships, cleanup misbehavior, and date-handling pitfalls.
References (load on demand)
references/sync-pattern.md: full templates for __init__.py, sync(), get(), transform(), error-handling rules.