r2000-analyze-blocks
Analyzes memory regions of a disassembled binary and converts them to the correct block types (code, bytes, words, text, tables, etc.) using MOS 6502 and the target system's expertise.
| name | r2000-analyze-blocks |
| description | Analyzes memory regions of a disassembled binary and converts them to the correct block types (code, bytes, words, text, tables, etc.) using MOS 6502 and the target system's expertise. |
Use this skill when the user asks to "analyze blocks", "convert blocks", "identify data regions", "classify the program", or wants the AI to scan a range (or the entire binary) and mark regions with their correct block types.
When a binary is loaded in Regenerator 2000, the auto-analyzer traces reachable code starting from the entry point and marks those regions as Code. Everything else remains Undefined — these are the blocks that have not yet been explored. The goal of this skill is to walk through the undefined/unexplored regions, identify what each one actually is, and convert it to the appropriate block type. This is a fundamental step in reverse engineering — separating code from data, text from tables, and pointers from raw bytes.
Use r2000_set_data_type with the enum value from the data_type value column (the middle column).
| Block Type | data_type value | When to Use |
|---|---|---|
| Code | code | Executable MOS 6502 instructions. Valid opcode sequences with coherent control flow. |
| Byte | byte | Raw 8-bit data: sprite data, bitmap data, charset data, lookup tables, variables, or unknown data. |
| Word | word | 16-bit little-endian values: 16-bit variables, math constants, SID frequency values. |
| Address | address | 16-bit little-endian pointers to memory locations. Creates cross-references. For jump tables & vectors. |
| PETSCII Text | petscii | PETSCII-encoded strings: game messages, prompts, high score names, print routine data. |
| Screencode Text | screencode | Screen code text: data written directly to Screen RAM ($0400–$07E7). |
| Lo/Hi Address | lo_hi_address | Split address table: first half = low bytes, second half = high bytes. Even byte count required. |
| Hi/Lo Address | hi_lo_address | Split address table: first half = high bytes, second half = low bytes. Even byte count required. |
| Lo/Hi Word | lo_hi_word | Split word table: first half = low bytes, second half = high bytes. E.g., SID frequency tables. |
| Hi/Lo Word | hi_lo_word | Split word table: first half = high bytes, second half = low bytes. |
| External File | external_file | Large binary blobs: SID music files, raw bitmaps, character sets that should be exported as-is. |
| Undefined | undefined | Reset a region to unknown state. Use to undo a wrong classification. |
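The Word and Address rows above both describe 16-bit little-endian values, so the raw bytes look identical; what differs is whether the values are pointers. The sketch below decodes such a table and applies one possible heuristic: treat it as an Address candidate when every value falls inside the loaded binary. The function names, the sample bytes, and the range heuristic itself are illustrative assumptions, not part of Regenerator 2000.

```python
def read_words_le(data: bytes) -> list[int]:
    """Decode consecutive 16-bit little-endian values (low byte first)."""
    if len(data) % 2 != 0:
        raise ValueError("a word table needs an even number of bytes")
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data), 2)]


def points_into_binary(words: list[int], origin: int, size: int) -> bool:
    """Assumed heuristic: every value lies inside [origin, origin + size)."""
    return all(origin <= w < origin + size for w in words)


# Hypothetical bytes: $00 $10 $34 $12 decode to $1000 and $1234.
words = read_words_le(bytes([0x00, 0x10, 0x34, 0x12]))
print([f"${w:04X}" for w in words])
```

Values that point at ROM or hardware registers fail this check even though they are real addresses, so the result is a hint to combine with how the code reads the table, not a verdict.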
- Call r2000_get_binary_info to get the origin address, size, system, description, and may_contain_undocumented_opcodes hint.
- The system field tells you the target computer (e.g., C64, VIC-20). You MUST become an expert in that specific target computer's memory map, hardware registers, and KERNAL routines for the duration of the analysis.
- The filename field (e.g., "burnin_rubber.prg", "turrican.d64") and description (if provided by the user) give you the specific software context. Use this to search for known memory maps, common drivers (music, compression), and game-specific variables.
- If may_contain_undocumented_opcodes is true, the binary may use illegal/undocumented MOS 6502 opcodes (e.g., LAX, SAX, SLO, DCP, ISC). Do NOT misclassify these instructions as data: they are valid code. This is a hint set by the user; it is not guaranteed, but you should be prepared to encounter them.
- Call r2000_get_blocks to see what has already been classified. Focus on the Undefined blocks, the unexplored regions that need classification.

Process the Undefined blocks in multiple passes, in this order:
For each chunk of the binary:
- Use r2000_read_region (with "view": "disasm") to see how it disassembles.
- Use r2000_read_region (with "view": "hexdump") to see raw byte patterns.
- Use r2000_batch_execute to apply multiple r2000_set_data_type calls at once for efficiency.
- Call r2000_get_blocks to verify the result.
- If a classification turns out to be wrong, use r2000_undo to revert.
- Use r2000_toggle_splitter when you need to separate two adjacent regions of the same type (e.g., two separate byte tables side by side).

After classifying blocks, optionally:

- Use r2000_set_label_name to name entry points, tables, and strings.
- Use r2000_set_comment (type "line" or "side") to add context (using conventions from the r2000-analyze-routine skill if documenting subroutines).

CRITICAL: Do NOT mark an Undefined region as Code just because it disassembles into valid-looking 6502 instructions. Random data frequently produces plausible instruction sequences. You MUST have at least one of the following concrete proofs before converting to Code:
A region should be marked as Code only when at least one of these conditions is met:
- JSR/JMP target: Existing analyzed code contains a JSR $addr or JMP $addr that lands in this region. Check cross-references with r2000_get_cross_references.
- Branch target: A branch instruction in existing analyzed code (BNE, BEQ, BCC, BCS, BPL, BMI, BVC, BVS) targets this region.
- Vector or table target: The region's address appears in a hardware vector ($FFFA–$FFFF), an Address or Lo/Hi Address block, or a jump table referenced by JMP ($addr).

If none of these conditions are met, leave the region as Undefined or classify it as data, even if the bytes happen to disassemble into valid instructions.
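The proof requirement above can be expressed as a small gate that runs before any region is converted to Code. This is a minimal sketch: the `XRef` record, its `kind` labels, and the idea that cross-references arrive as a flat list are all assumptions made for illustration; the actual return shape of r2000_get_cross_references is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XRef:
    source: int  # address of the referencing instruction or table entry
    target: int  # address being referenced
    kind: str    # hypothetical labels: "jsr", "jmp", "branch", "vector", "table", "read"


# Reference kinds that count as proof of execution. A plain data read
# (e.g. LDA addr,X) does not prove that the target is code.
CODE_PROOF_KINDS = {"jsr", "jmp", "branch", "vector", "table"}


def code_proofs(start: int, end: int, xrefs: list[XRef]) -> list[XRef]:
    """Return the references that land in [start, end] and prove execution."""
    return [x for x in xrefs
            if start <= x.target <= end and x.kind in CODE_PROOF_KINDS]


def may_mark_as_code(start: int, end: int, xrefs: list[XRef]) -> bool:
    """True only when at least one concrete proof targets the region."""
    return bool(code_proofs(start, end, xrefs))


xrefs = [XRef(0x0810, 0x0900, "jsr"), XRef(0x0820, 0x0A00, "read")]
print(may_mark_as_code(0x0900, 0x093F, xrefs))  # JSR lands here
print(may_mark_as_code(0x0A00, 0x0A3F, xrefs))  # only a data read
```

Returning the list of proofs, not just a boolean, makes it easy to cite the evidence in a label or comment when the region is converted.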
A region is likely Byte data if:
- Code accesses it with LDA addr,X / LDA addr,Y patterns (table lookups).

A region is likely Word data if:

- Code reads it as consecutive byte pairs (LDA addr / LDA addr+1).

A region is likely an Address table if:

- It is referenced by indirect jumps (JMP ($addr)) or indexed reads.

A region is likely a Lo/Hi (or Hi/Lo) Address Table if:

- Code reads LDA lo_table,X / LDA hi_table,X, then pushes to stack or stores to a pointer.
- A typical dispatch sequence is LDA lo,X / STA ptr / LDA hi,X / STA ptr+1 / JMP (ptr).

Important: When two split halves are in adjacent memory, use r2000_toggle_splitter at the boundary between the lo and hi halves to prevent the auto-merger from combining them into one block.
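To check whether a suspected split table holds plausible addresses, it helps to recombine the two halves and inspect the resulting values. The sketch below assumes the layout described in the block type table (first half and second half of equal length); the function name and the sample bytes are made up for illustration.

```python
def join_split_table(data: bytes, lo_first: bool = True) -> list[int]:
    """Recombine a split table into 16-bit values.

    lo_first=True  models lo_hi_address / lo_hi_word (low bytes first).
    lo_first=False models hi_lo_address / hi_lo_word (high bytes first).
    """
    if len(data) % 2 != 0:
        raise ValueError("split tables require an even byte count")
    half = len(data) // 2
    first, second = data[:half], data[half:]
    lo, hi = (first, second) if lo_first else (second, first)
    return [low | (high << 8) for low, high in zip(lo, hi)]


# Hypothetical table: low bytes $00 $40 $80, then high bytes $10 $10 $11.
table = bytes([0x00, 0x40, 0x80, 0x10, 0x10, 0x11])
print([f"${a:04X}" for a in join_split_table(table)])
```

If one ordering yields values that land on known routines and the other yields scattered addresses, that is a reasonable hint for choosing between the Lo/Hi and Hi/Lo block types.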
A region is likely PETSCII text if:
- It is printed through $FFD2 (CHROUT) or $AB1E (BASIC STROUT).

A region is likely Screencode text if:

- Patterns like LDA data,X / STA $0400,X strongly suggest screencode.

A region is likely an External File if:

- It is a large self-contained blob such as SID music (PSID/RSID), a bitmap, or a charset.

When converting large ranges, use r2000_batch_execute to group r2000_set_data_type calls. Example:
r2000_batch_execute with calls:
- r2000_set_data_type: start=2049, end=2303, data_type="code"
- r2000_set_data_type: start=2304, end=2367, data_type="byte"
- r2000_set_data_type: start=2368, end=2431, data_type="petscii"
- r2000_set_data_type: start=2432, end=2560, data_type="code"
This avoids making dozens of individual round-trip tool calls.
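Before sending a batch like the one above, it can be worth validating the planned ranges locally, since an overlap or a misspelled data_type would otherwise only surface after the round trip. The sketch below builds a list of call descriptions from (start, end, data_type) tuples. The dictionary layout with "tool" and "arguments" keys is an assumed payload shape, not the documented r2000_batch_execute format, and the ends are treated as inclusive based on the example's 2303/2304 boundary.

```python
# data_type values taken from the block type table.
VALID_TYPES = {
    "code", "byte", "word", "address", "petscii", "screencode",
    "lo_hi_address", "hi_lo_address", "lo_hi_word", "hi_lo_word",
    "external_file", "undefined",
}


def build_batch(regions: list[tuple[int, int, str]]) -> list[dict]:
    """Validate regions (inclusive ends, assumed) and build call descriptions."""
    calls = []
    prev_end = None
    for start, end, data_type in sorted(regions):
        if data_type not in VALID_TYPES:
            raise ValueError(f"unknown data_type: {data_type!r}")
        if end < start:
            raise ValueError(f"end {end} precedes start {start}")
        if prev_end is not None and start <= prev_end:
            raise ValueError(f"region starting at {start} overlaps the previous one")
        # Hypothetical payload shape for one batched call.
        calls.append({
            "tool": "r2000_set_data_type",
            "arguments": {"start": start, "end": end, "data_type": data_type},
        })
        prev_end = end
    return calls


plan = [(2304, 2367, "byte"), (2049, 2303, "code"), (2368, 2431, "petscii")]
for call in build_batch(plan):
    print(call["arguments"])
```

Sorting first means the regions can be collected in any order during analysis and still be checked for overlaps in a single pass.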
- Watch for BRK ($00) floods. These are data, not code.
- To keep two adjacent regions of the same type separate, use r2000_toggle_splitter at the boundary.
- Watch for undocumented opcodes (e.g., LAX, SAX, SLO). Check the may_contain_undocumented_opcodes hint from r2000_get_binary_info. If true, be extra cautious about classifying unfamiliar instruction sequences as data; they may be valid code using undocumented opcodes. Even if false, some programs still use them, so remain vigilant.

After completing the analysis, provide a summary:
r2000_save_project.