skillsbench
skillsbench contains 208 skills collected from benchflow-ai, with occupational coverage reported per repository and detail pages for each skill within the site.
Skills in this repository
SkillsBench contribution workflow. Use when: (1) Creating benchmark tasks, (2) Understanding repo structure, (3) Preparing PRs for task submission.
SkillsBench task authoring — walk a contributor from idea to submission-ready task following CONTRIBUTING.md and the task-implementation rubric. Use when the user wants to create a new SkillsBench task, scaffold a task from an existing workflow (notebook, Excel workbook, document, dataset), convert a prompt or a benchmark item into a SkillsBench task, write skills for a task, or prepare a SkillsBench PR. Pairs with `task-review` (run that as a self-check before submitting).
SkillsBench task PR review — classifies the task track (standard / research / multimodal), runs static policy checks against the track-specific rubric, benchmarks the task across oracle plus Claude and Codex (with and without skills), audits trajectories for cheating and skill invocation, and produces a `pr-N-task-timestamp-run.txt` review report alongside a `prN.zip` bundle of trajectories. Use when reviewing a SkillsBench task PR (by number, branch, or local task path), when the user asks to review a task, run benchmarks on a PR, audit a submission, classify a task as research or multimodal track, or prepare a comment to post on a SkillsBench PR.
Methodology for clause-by-clause review of a contract against a structured deviation policy ("playbook"). Covers how to walk a playbook, locate the matching provision in the contract, apply rule types (max-value, must-be-present, must-be-absent, acceptable-set, must-have-feature), classify the result (ok / risk / reject), choose the prescribed action, and ground each finding in a verbatim excerpt. Use whenever reviewing any contract — NDA, MSA, vendor DD questionnaire, lease, DPA — against a structured rules-based playbook.
Reference for the standard clauses found in commercial non-disclosure agreements (mutual and one-way) — what each clause does, the surface forms it appears in, and how to recognise it in unfamiliar drafting. Use when reviewing, comparing, or extracting provisions from any confidentiality / NDA / mutual NDA / standstill-and-confidentiality agreement.
Read Microsoft Excel (.xlsx) files robustly with `openpyxl` (or `pandas`). Covers multi-sheet workbooks, header rows, empty cells, merged cells, comma-separated list cells, and converting a sheet to a list-of-dicts the rest of your code can consume. Use when a task input or reference document is an `.xlsx` file rather than JSON/CSV.
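As a rough illustration of the sheet-to-list-of-dicts step described above, the sketch below works on plain row tuples, such as those yielded by openpyxl's `ws.iter_rows(values_only=True)`. The helper names `rows_to_dicts` and `split_list_cell`, the `column_N` fallback for blank headers, and the sample rows are assumptions made for this example, not part of the skill itself.

```python
def rows_to_dicts(rows, header_row=0):
    """Turn an iterable of row tuples into a list of dicts keyed by header."""
    rows = list(rows)
    # Blank header cells get a positional placeholder name (an assumption).
    headers = [
        str(h).strip() if h is not None else f"column_{i}"
        for i, h in enumerate(rows[header_row])
    ]
    records = []
    for row in rows[header_row + 1:]:
        if all(cell is None for cell in row):
            continue  # skip rows that are entirely empty
        records.append(dict(zip(headers, row)))
    return records


def split_list_cell(value):
    """Split a comma-separated cell into trimmed items; empty cell -> []."""
    if value is None:
        return []
    return [part.strip() for part in str(value).split(",") if part.strip()]


# Hypothetical rows standing in for a worksheet's contents.
sample_rows = [
    ("name", "tags", None),
    ("alpha", "a, b,c", 1),
    (None, None, None),
    ("beta", None, 2),
]
records = rows_to_dicts(sample_rows)
```

Keeping the conversion independent of the workbook object makes it easy to test without an `.xlsx` file on disk.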
Build a unified multi-level category taxonomy from the hierarchical product category paths of any e-commerce company, using embedding-based recursive clustering and naming each category via weighted word-frequency analysis.
Build deterministic, verifiable data visualizations with D3.js (v6). Generate standalone HTML/SVG (and optional PNG) from local data files without external network dependencies. Use when tasks require charts, plots, axes/scales, legends, tooltips, or data-driven SVG output.
A library for building, validating, visualizing, and serializing dialogue graphs. Use this when parsing scripts or creating branching narrative structures.
World-class Java and Spring Boot development skill for enterprise applications, microservices, and cloud-native systems. Expertise in Spring Framework, Spring Boot 3.x, Spring Cloud, JPA/Hibernate, and reactive programming with WebFlux. Includes project scaffolding, dependency management, security implementation, and performance optimization.
World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, real-time streaming, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, Flink, Kinesis, and modern data stack. Includes data modeling, pipeline orchestration, data quality, streaming quality monitoring, and DataOps. Use when designing data architectures, building batch or streaming data pipelines, optimizing data workflows, or implementing data governance.
Use this skill when you need to answer reflow machine maintenance questions or provide detailed guidance based on thermocouple data, MES data, or defect data together with reflow technical handbooks. It covers how to extract key concepts, calculations, definitions, and thresholds from the handbook and how to cross-validate the handbook against the datasets.
Comprehensive command-line tools for modifying and manipulating images, such as resize, blur, crop, flip, and many more.
Count occurrences of an object in an image using a computer vision algorithm.
Extract, normalize, mix, and process audio tracks - audio manipulation and analysis
Convert media files between formats - video containers, audio formats, and codec transcoding
Analyze media file properties - duration, resolution, bitrate, codecs, and stream information
Cut, trim, concatenate, and split video files - basic video editing operations
Apply video filters - scale, crop, watermark, speed, blur, and visual effects
Practical mastering steps for TTS audio: cleanup, loudness normalization, alignment, and delivery specs.
Recover missing spreadsheet values from row and column totals, percentage shares, year-over-year changes, CAGR relationships, and cross-sheet constraints.
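The relationships named above reduce to simple arithmetic. The sketch below is illustrative only: the function names and sample figures are hypothetical, and rates are assumed to be expressed as fractions (0.10 for 10%).

```python
def fill_from_total(values, total):
    """Fill the single missing entry (None) so the values sum to `total`."""
    missing = [i for i, v in enumerate(values) if v is None]
    if len(missing) != 1:
        raise ValueError("exactly one missing value can be recovered from a total")
    filled = list(values)
    filled[missing[0]] = total - sum(v for v in values if v is not None)
    return filled


def value_from_yoy(previous, yoy_change):
    """Current-year value from the prior-year value and a year-over-year change."""
    return previous * (1 + yoy_change)


def end_from_cagr(start, cagr, periods):
    """End value implied by a start value, a CAGR, and a number of periods."""
    return start * (1 + cagr) ** periods
```

For example, `fill_from_total([10, None, 30], 100)` fills the gap with 60, because the two known cells account for 40 of the row total.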
Reference for COBOL COMP-3 (packed decimal / BCD) numeric storage -- layout on disk, declaring it in the FD, and pitfalls when carrying the decoded value through working storage and into later arithmetic. Useful when an input record holds a balance, rate, or carry-forward as packed decimal rather than a printable digit string.
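For readers checking a COBOL program's output from outside the mainframe toolchain, here is a minimal Python sketch of the packed-decimal layout: two digits per byte, with the low nibble of the last byte holding the sign (conventionally `C` or `F` for positive or unsigned, `D` for negative). The function name `unpack_comp3` and the `scale` parameter, which stands in for the implied decimal places of the PIC clause, are assumptions for this example.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a COMP-3 field; `scale` is the number of implied decimal places."""
    if not raw:
        raise ValueError("empty field")
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)    # high nibble
        digits.append(byte & 0x0F)  # low nibble
    digits.append(raw[-1] >> 4)     # last byte: one digit, then the sign nibble
    sign_nibble = raw[-1] & 0x0F
    if any(d > 9 for d in digits) or sign_nibble < 0x0A:
        raise ValueError("not a valid packed-decimal field")
    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):  # B and D mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)
```

Returning a `Decimal` rather than a float avoids introducing binary rounding when the decoded balance is carried into later arithmetic.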
Reference for the EBCDIC "overpunch" / zoned-decimal sign convention where the units position of a numeric field is replaced with a letter that encodes both a digit and a sign. Useful when reading mainframe-style fixed-length tapes whose amount fields appear as digits followed by a letter (e.g. "0000000000123D" or "0000000000045M").
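A minimal sketch of the decoding, assuming the conventional overpunch table: `{` and `A` through `I` stand for +0 through +9, while `}` and `J` through `R` stand for -0 through -9. The function name `decode_overpunch` is hypothetical, and a particular tape may follow a variant table, so treat this as a starting point.

```python
DIGITS = "0123456789"
POSITIVE = "{ABCDEFGHI"  # index in the string = units digit, sign +
NEGATIVE = "}JKLMNOPQR"  # index in the string = units digit, sign -


def decode_overpunch(field: str) -> int:
    """Decode a zoned-decimal field whose last character carries the sign."""
    if not field:
        raise ValueError("empty field")
    body, last = field[:-1], field[-1]
    if any(ch not in DIGITS for ch in body):
        raise ValueError(f"non-digit in leading positions: {field!r}")
    if last in POSITIVE:
        sign, units = 1, POSITIVE.index(last)
    elif last in NEGATIVE:
        sign, units = -1, NEGATIVE.index(last)
    elif last in DIGITS:
        sign, units = 1, int(last)  # plain unsigned field
    else:
        raise ValueError(f"unrecognised overpunch character: {last!r}")
    return sign * int(body + str(units))
```

Under this table, `"0000000000123D"` decodes to 1234 and `"0000000000045M"` to -454, because the letter replaces the units digit rather than following it.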
Background on general-ledger batch posting codes (HD/DR/CR/RV plus signed "other" rows). Reference only -- follow the task instruction for the exact rules, which may differ per account type.
GnuCOBOL toolchain pointers for mainframe-style batch programs (LINE/RECORD SEQUENTIAL files, PIC clauses, cobc build flags).
Reference for paired-reversal handling on GL batch tapes: an RV row whose ref-trace points at an earlier row cancels BOTH legs (the RV and the row it references) — but ONLY when the two value-dates fall within the shop's reversal settlement window; an RV that references a too-old posting is NOT a cancellation and posts one-sided. Useful when the posting tape carries a reference-trace column alongside reversal codes and the running balance is off by twice the reversal magnitude, or by a whole leg on stale references.
Reference for multi-hop / triangulated currency conversion on batch rate tapes. When a rate row carries an explicit "via" currency, the row's rate is only one leg of the conversion, and the via currency may ITSELF be quoted through another via — so the effective rate is the product of every leg, resolved by walking the chain until a direct quote is reached. Useful when an FX feed lists some pairs directly to the reporting currency and others as triangulated (or doubly-triangulated) quotes through vehicle currencies.
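The chain walk can be sketched as follows. The `quotes` layout (currency mapped to a `(rate, via)` pair, with `via` set to `None` for a direct quote), the function name, and the sample rates are all hypothetical. The sketch also assumes each leg is a multiplier toward the next currency in the chain, so check the actual tape's quoting direction before relying on it.

```python
from decimal import Decimal


def effective_rate(currency, quotes, reporting):
    """Multiply every leg until a direct quote to `reporting` is reached."""
    rate = Decimal(1)
    seen = set()
    current = currency
    while current != reporting:
        if current in seen:
            raise ValueError(f"circular via chain at {current}")
        seen.add(current)
        if current not in quotes:
            raise KeyError(f"no rate row for {current}")
        leg_rate, via = quotes[current]
        rate *= leg_rate
        # A row without a via currency is a direct quote to the reporting currency.
        current = via if via is not None else reporting
    return rate


# Hypothetical rate rows: EUR is direct, CHF goes via EUR, XYZ via CHF.
quotes = {
    "EUR": (Decimal("1.10"), None),
    "CHF": (Decimal("0.95"), "EUR"),
    "XYZ": (Decimal("2"), "CHF"),
}
```

The `seen` set guards against a malformed feed in which two currencies name each other as their via currency.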
Choose placements that preserve useful residual capacity. Use for bin packing, GPU sharing, accelerator placement, and multi-resource scheduling where stranded capacity hurts future fit.
Validate and repair proposed resource allocations by replaying them against temporary capacity. Use when actions consume several resource dimensions such as CPU, memory, GPUs, or accelerators.
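A minimal sketch of the replay idea, covering validation only: each proposed placement is applied to a temporary copy of the capacity, and any action that no longer fits is set aside for repair. The data shapes and the name `replay_allocations` are assumptions for illustration.

```python
def replay_allocations(capacity, demands, allocations):
    """Replay (job, node) placements against a scratch copy of capacity."""
    # Work on a copy so the caller's capacity map is never mutated.
    remaining = {node: dict(res) for node, res in capacity.items()}
    accepted, rejected = [], []
    for job, node in allocations:
        need = demands.get(job)
        free = remaining.get(node)
        fits = (
            need is not None
            and free is not None
            and all(free.get(r, 0) >= amount for r, amount in need.items())
        )
        if not fits:
            rejected.append(job)  # candidate for re-placement elsewhere
            continue
        for r, amount in need.items():
            free[r] -= amount
        accepted.append((job, node))
    return accepted, rejected, remaining
```

Checking every resource dimension before subtracting any of them keeps a partially fitting job from consuming capacity it will not actually use.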
Design deterministic online scheduling policies from current observations. Use when assigning arriving work to limited resources without seeing future requests.
Three.js scene-graph parsing and export workflows: mesh baking, InstancedMesh expansion, part partitioning, per-link OBJ export, and URDF articulation.
Materials science toolkit. Crystal structures (CIF, POSCAR), phase diagrams, band structure, DOS, Materials Project integration, format conversion, for computational materials science.
Use this skill when implementing the inner control loop for a quadrotor — attitude (roll/pitch/yaw) PID control and attitude planning (converting desired acceleration to desired Euler angles). Covers gain layout, integral reset pattern, and the attitude planner inverse kinematics.
Use this skill when converting natural language flight commands into waypoints and timing for a drone simulator. Covers parsing commands like "Take off to X m height in Y seconds", "Hover at X m height for Y seconds", "Fly from (x,y,z) to (x',y',z') in T seconds", and "Land from X m height in Y seconds" into structured (4×n) waypoint arrays and segment mode lists.
Use this skill when simulating quadrotor physical dynamics — mapping desired thrust/moments to individual motor RPMs via a propeller allocation matrix, applying first-order motor lag, and integrating the nonlinear equations of motion (translational and rotational) using RK45.
Use this skill when visualising drone simulation results. Produces three matplotlib figures — desired vs actual trajectories, instantaneous error, and cumulative absolute error — for all 5 state groups (position, orientation, velocity, angular velocity, acceleration). Saves figures to a plots/ directory automatically.
Use this skill when implementing the outer control loop for a quadrotor — position PID control (position/velocity error → thrust and desired acceleration) and trajectory planning from flight-plan waypoints (takeoff, hover, fly, land segments → smooth 15-row state matrix).
Use this skill when computing 3D step-response performance metrics for point-to-point drone flight — rise time, settling time, percent overshoot, and steady-state error based on Euclidean distance to the final target. Use instead of 1D stepinfo for any flight where all three position axes move simultaneously.
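As a partial illustration, the sketch below computes two of the listed metrics from the Euclidean distance to the target. The 2% settling band (relative to the initial distance), the function name, and the choice to report the first sample after the last out-of-band sample are assumptions. Rise time and overshoot are omitted because their exact 3D definitions are not given here.

```python
import math


def distance_metrics(times, positions, target, band=0.02):
    """Settling time and steady-state error from distance-to-target samples."""
    errors = [math.dist(p, target) for p in positions]
    tolerance = band * errors[0]  # band is relative to the initial distance
    # Index of the last sample that is still outside the tolerance band.
    last_outside = max(
        (i for i, e in enumerate(errors) if e > tolerance), default=-1
    )
    if last_outside == len(errors) - 1:
        settling_time = None  # never settles within the recorded window
    else:
        settling_time = times[last_outside + 1]
    return {"settling_time": settling_time, "steady_state_error": errors[-1]}
```

Because the error is a single distance rather than three per-axis signals, a flight that converges on two axes but lags on the third is still reported as unsettled.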
Provides database transaction concurrency-control background. Use when reasoning about transactions, commits and aborts, serializability, isolation anomalies, conflicts, dependencies, and protocol families such as locking, timestamp ordering, OCC, MVCC, SSI, TicToc, hybrid protocols.
Teaches how to reason from a transaction protocol’s rules to a working model. Use when analyzing a concurrency-control paper, spec, or algorithm to identify state, metadata, invariants, operation rules, examples, counterexamples, guarantees, false aborts, unsafe commits, or tradeoffs before inspecting concrete traces.