Name: Analysis Jobs
Author: cartography-cncf

Add a post-ingestion analysis job (JSON Cypher file) to a Cartography module to enrich the graph after sync. Use when the user asks to compute internet exposure, propagate inherited permissions, link Human / canonical ontology nodes, score risk, or add cross-resource analysis after data is loaded.


Updated: May 12, 2026 at 17:09


Repository: cartography-cncf/cartography



analysis-jobs

Analysis jobs are post-ingestion Cypher queries (JSON files) that enrich the graph with computed relationships and properties. They run after data is loaded and perform cross-node work that cannot be done during the initial load.

When to use analysis jobs

Use them when you need to:

Compute properties that depend on multiple nodes / relationships.
Create relationships that span across resource types.
Perform transitive closure (e.g. inherited permissions).
Enrich data after all resources of a type are loaded.

Do NOT use analysis jobs for:

Simple node-to-node relationships (use the data model — see add-relationship).
Properties that can be computed during transform().
Relationships already present in the source data.

Critical rules

Pick the right scope. Global jobs run once, after all accounts / projects / tenants have synced (run_analysis_job). Scoped jobs run once per account / project / tenant (run_scoped_analysis_job). Use dependency checking (run_analysis_and_ensure_deps) when a job needs specific upstream modules.
Use iterative queries for large datasets. They must bound each batch with LIMIT $LIMIT_SIZE and return COUNT(*) AS TotalCompleted.
Document each query with __comment__.
Clean up stale data that the analysis job creates (don't leave orphan edges between syncs).
Order statements correctly to avoid read windows.
- Properties: clean up first (REMOVE n.attr), then SET. Cleanup of attributes can usually run in a single transaction.
- Relationships: MERGE first, then DELETE stale (WHERE r.lastupdated <> $UPDATE_TAG). Iterative DELETE commits per batch, so a leading DELETE of relationships exposes a graph with those edges missing to concurrent readers until the MERGE finishes. MERGE is idempotent and bumps r.lastupdated, so the trailing DELETE only targets edges that genuinely no longer have a current basis. Canonical example: cartography/data/jobs/analysis/aws_lambda_ecr.json.
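
The relationship ordering rule can be sketched as a small Python builder that emits a job in MERGE-then-DELETE order. This is an illustration only: the `ExampleSource` / `ExampleTarget` labels, the `USES` and `EXAMPLE_EXPOSES` relationships, and both helper names are hypothetical. Only the JSON keys come from the job format described in this skill.

```python
import json


def build_relationship_job(name: str) -> dict:
    """Build a job dict that refreshes edges first, then deletes stale ones."""
    merge_stmt = {
        "__comment__": "Create or refresh edges and bump lastupdated",
        "query": (
            "MATCH (a:ExampleSource)-[:USES]->(b:ExampleTarget) "
            "MERGE (a)-[r:EXAMPLE_EXPOSES]->(b) "
            "SET r.lastupdated = $UPDATE_TAG"
        ),
        "iterative": False,
    }
    delete_stmt = {
        "__comment__": "Delete edges that were not refreshed in this sync",
        "query": (
            "MATCH (:ExampleSource)-[r:EXAMPLE_EXPOSES]->(:ExampleTarget) "
            "WHERE r.lastupdated <> $UPDATE_TAG "
            "WITH r LIMIT $LIMIT_SIZE DELETE r "
            "RETURN COUNT(*) AS TotalCompleted"
        ),
        "iterative": True,
        "iterationsize": 1000,
    }
    # Order matters: MERGE first, so readers never see the edges missing.
    return {"name": name, "statements": [merge_stmt, delete_stmt]}


def merge_precedes_delete(job: dict) -> bool:
    """Return True if every MERGE statement comes before any DELETE statement."""
    queries = [stmt["query"].upper() for stmt in job["statements"]]
    merges = [i for i, q in enumerate(queries) if "MERGE" in q]
    deletes = [i for i, q in enumerate(queries) if "DELETE" in q]
    if not merges or not deletes:
        return True
    return max(merges) < min(deletes)


job = build_relationship_job("Example exposure edges")
print(json.dumps(job, indent=2))
```

A check like `merge_precedes_delete` could run in a unit test over the job files a module owns, which catches an accidental reordering before it reaches a live graph.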

Instructions

Step 1 — Pick global vs scoped

| Type | Runs | Location | Helper |
| --- | --- | --- | --- |
| Global | Once after all accounts / projects | `cartography/data/jobs/analysis/` | `run_analysis_job()` |
| Scoped | Once per account / project / tenant | `cartography/data/jobs/scoped_analysis/` | `run_scoped_analysis_job()` |

Examples:

Internet exposure that needs to see all security groups across all accounts -> global.
IAM instance profile analysis that runs per AWS account -> scoped.
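
The difference between the two scopes comes down to where the call sits relative to the per-account loop. The sketch below records the order of events with plain strings instead of calling Cartography. `run_sync` and the event names are hypothetical, and the account IDs are made up.

```python
from typing import Callable, List


def run_sync(accounts: List[str], record: Callable[[str], None]) -> None:
    """Record the order in which sync and analysis steps would run."""
    for account in accounts:
        record(f"sync:{account}")
        # A scoped job would be invoked here, once per account.
        record(f"scoped_analysis:{account}")
    # A global job would be invoked here, once, after the loop.
    record("global_analysis")


events: List[str] = []
run_sync(["111", "222"], events.append)
print(events)
```

The global step appears exactly once and last, which is why a global job can see data from every account while a scoped job sees only the account that was just synced.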

Step 2 — Author the JSON file

{
  "name": "Human-readable name for logging",
  "statements": [
    {
      "__comment__": "Optional comment explaining this query",
      "query": "MATCH (n:NodeType) WHERE ... SET n.property = value",
      "iterative": false
    },
    {
      "__comment__": "Iterative queries for large datasets",
      "query": "MATCH (n:NodeType) WHERE n.property IS NULL WITH n LIMIT $LIMIT_SIZE SET n.property = value RETURN COUNT(*) AS TotalCompleted",
      "iterative": true,
      "iterationsize": 1000
    }
  ]
}
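
Before wiring a job file into a module, a quick structural check can catch the most common mistakes. The validator below is a minimal sketch that encodes the rules stated in this skill (a name, documented statements, and `$LIMIT_SIZE` plus `TotalCompleted` on iterative statements). `validate_job` and the `ExampleNode` label are hypothetical, and this is not a schema that Cartography itself enforces.

```python
import json
from typing import List


def validate_job(job: dict) -> List[str]:
    """Return a list of problems found in an analysis-job dict (empty if none)."""
    problems: List[str] = []
    if not job.get("name"):
        problems.append("missing name")
    statements = job.get("statements") or []
    if not statements:
        problems.append("no statements")
    for index, stmt in enumerate(statements):
        query = stmt.get("query", "")
        if not query:
            problems.append(f"statement {index}: missing query")
        if not stmt.get("__comment__"):
            problems.append(f"statement {index}: missing __comment__")
        if stmt.get("iterative"):
            if "iterationsize" not in stmt:
                problems.append(f"statement {index}: iterative without iterationsize")
            if "$LIMIT_SIZE" not in query:
                problems.append(f"statement {index}: iterative without $LIMIT_SIZE")
            if "AS TotalCompleted" not in query:
                problems.append(f"statement {index}: iterative without TotalCompleted")
    return problems


raw = """
{
  "name": "Example job",
  "statements": [
    {
      "__comment__": "Flag nodes in batches",
      "query": "MATCH (n:ExampleNode) WHERE n.flag IS NULL WITH n LIMIT $LIMIT_SIZE SET n.flag = true RETURN COUNT(*) AS TotalCompleted",
      "iterative": true,
      "iterationsize": 1000
    }
  ]
}
"""
problems = validate_job(json.loads(raw))
print(problems)  # []
```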

Step 3 — Write the queries

Non-iterative — single execution, OK for queries touching a manageable number of nodes:

{
  "query": "MATCH (instance:GCPInstance) WHERE ... SET instance.exposed_internet = true",
  "iterative": false
}

Iterative — required for large datasets. Must return TotalCompleted:

{
  "query": "MATCH (n:Node) WHERE n.stale = true WITH n LIMIT $LIMIT_SIZE DETACH DELETE n RETURN COUNT(*) AS TotalCompleted",
  "iterative": true,
  "iterationsize": 1000
}
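
To see why an iterative statement has to report a count and shrink its matched set, it helps to simulate the loop in plain Python. This is a simplified stand-in, not Cartography's actual runner: `run_iterative`, `delete_batch`, and the in-memory `stale_nodes` list are all hypothetical, and the loop assumes a batch result of zero means the work is done.

```python
from typing import Callable


def run_iterative(step: Callable[[int], int], limit_size: int, max_iterations: int = 10_000) -> int:
    """Call `step` with a batch size until it reports zero completed items."""
    total = 0
    for _ in range(max_iterations):
        completed = step(limit_size)
        if completed == 0:
            return total
        total += completed
    # Reached only if the matched set never shrinks to empty.
    raise RuntimeError("iterative statement did not converge")


# In-memory stand-in for the nodes matched by `WHERE n.stale = true`.
stale_nodes = list(range(2500))


def delete_batch(limit_size: int) -> int:
    """Remove up to `limit_size` nodes and return how many were removed."""
    batch = stale_nodes[:limit_size]
    del stale_nodes[:limit_size]
    return len(batch)


total_deleted = run_iterative(delete_batch, limit_size=1000)
print(total_deleted)  # 2500
```

With 2,500 matching nodes and a batch size of 1,000, the step runs four times: batches of 1,000, 1,000, and 500, then a final empty batch that ends the loop. A step that keeps matching the same nodes would never reach that empty batch.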

Step 4 — Available parameters

Every key in common_job_parameters is passed to the query as a Cypher parameter. Typical params:

$UPDATE_TAG — current sync timestamp.
$LIMIT_SIZE — set automatically by the iterative runner.
Module-specific ($AWS_ID, $PROJECT_ID, ...).
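
A query that references a parameter the caller never supplies fails at run time, so it can be worth checking the two against each other. The helper below is a rough sketch: `missing_parameters` and the `ExampleNode` label are hypothetical, and the regular expression is a simplification that would also match a `$name` inside a string literal.

```python
import re
from typing import Dict, Set

# Matches Cypher-style parameters such as $UPDATE_TAG.
PARAM_PATTERN = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def missing_parameters(query: str, params: Dict[str, object], iterative: bool = False) -> Set[str]:
    """Return parameter names used in `query` that `params` does not provide."""
    referenced = set(PARAM_PATTERN.findall(query))
    available = set(params)
    if iterative:
        # Assumed to be injected by the iterative runner, per this skill.
        available.add("LIMIT_SIZE")
    return referenced - available


query = (
    "MATCH (:AWSAccount {id: $AWS_ID})-[:RESOURCE]->(n:ExampleNode) "
    "WHERE n.lastupdated <> $UPDATE_TAG "
    "WITH n LIMIT $LIMIT_SIZE DETACH DELETE n "
    "RETURN COUNT(*) AS TotalCompleted"
)
missing = missing_parameters(query, {"UPDATE_TAG": 1700000000}, iterative=True)
print(sorted(missing))  # ['AWS_ID']
```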

Step 5 — Wire the call into your module

Pattern A — global analysis at end of ingestion

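import neo4j

from cartography.config import Config
from cartography.util import run_analysis_job
from cartography.util import timeit

@timeit
def start_your_module_ingestion(neo4j_session: neo4j.Session, config: Config) -> None:
    common_job_parameters = {"UPDATE_TAG": config.update_tag}

    # _get_accounts is a hypothetical helper: return the accounts your module syncs.
    accounts = _get_accounts(config)
    for account in accounts:
        _sync_one_account(neo4j_session, account, config.update_tag, common_job_parameters)

    # Runs once, after every account has finished syncing.
    run_analysis_job(
        "your_module_exposure_analysis.json",
        neo4j_session,
        common_job_parameters,
    )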

Pattern B — scoped per account/project

from cartography.util import run_scoped_analysis_job

def _sync_one_account(neo4j_session, account_id, update_tag, common_job_parameters):
    common_job_parameters["ACCOUNT_ID"] = account_id

    sync_resources(neo4j_session, account_id, update_tag, common_job_parameters)

    run_scoped_analysis_job(
        "your_module_account_analysis.json",
        neo4j_session,
        common_job_parameters,
    )

Pattern C — conditional with dependency checking

from cartography.util import run_analysis_and_ensure_deps

def _perform_analysis(requested_syncs, neo4j_session, common_job_parameters):
    run_analysis_and_ensure_deps(
        "your_module_combined_analysis.json",
        {"ec2:instance", "ec2:security_group"},  # required upstream syncs
        set(requested_syncs),
        common_job_parameters,
        neo4j_session,
    )
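
The idea behind dependency checking can be sketched in a few lines: run the job only when every required upstream sync was part of this run. The function below is an illustrative stand-in, not the implementation of `run_analysis_and_ensure_deps`. `run_if_deps_requested` is a hypothetical name, and the skip-when-missing behaviour is an assumption.

```python
from typing import Callable, List, Set


def run_if_deps_requested(required: Set[str], requested: Set[str], run_job: Callable[[], None]) -> bool:
    """Run `run_job` only if every required sync is in `requested`."""
    missing = required - requested
    if missing:
        print(f"Skipping analysis; syncs not requested: {sorted(missing)}")
        return False
    run_job()
    return True


calls: List[str] = []
required_syncs = {"ec2:instance", "ec2:security_group"}

# All dependencies were requested, so the job runs.
ran_full = run_if_deps_requested(
    required_syncs,
    {"ec2:instance", "ec2:security_group", "s3"},
    lambda: calls.append("combined_analysis"),
)

# One dependency is missing, so the job is skipped.
ran_partial = run_if_deps_requested(
    required_syncs,
    {"ec2:instance"},
    lambda: calls.append("combined_analysis"),
)
print(ran_full, ran_partial, calls)
```

Skipping rather than failing keeps a partial sync usable: the analysis does not run against data that was never refreshed in this run.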

Step 6 — Test it

Add an integration test that:

Calls sync() with mocked external boundaries.
Asserts the analysis-produced edges / properties using check_nodes / check_rels.

See the create-module skill for testing conventions.

Best practices

Right scope. Global runs after all accounts; scoped runs per-account.
Use dep-checking (run_analysis_and_ensure_deps) when a job requires upstream modules.
Document queries with __comment__.
Test analysis jobs with integration tests.
Use iterative queries for large datasets.
Clean up stale data the job creates.

Common issues

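Job runs before the upstream module: switch to run_analysis_and_ensure_deps with the right deps.
Iterative query never terminates: make sure it returns COUNT(*) AS TotalCompleted and that each iteration shrinks the matched set, so the WHERE clause stops matching nodes the statement has already processed.
Wrong scope: a global job that reads per-account state can match nothing if it runs before the accounts it depends on have synced. Call run_analysis_job after the account loop and run_scoped_analysis_job inside it.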

For broader troubleshooting, see the troubleshooting skill.

References (load on demand)

references/examples.md — GCP, AWS, Semgrep wiring examples plus the audit table of modules with proper analysis-job integration.
