phi-detection

Stars: 4

Forks: 1

Updated: 2026-02-09 04:15

Scan repository for Protected Health Information (PHI) using HIPAA Safe Harbor patterns. Ensures evaluation data remains synthetic-only.

Source: GOATnote-Inc/scribegoat2

PHI Detection Skill

Purpose

Ensure no Protected Health Information (PHI) enters the evaluation pipeline. Enforces ScribeGoat2's "synthetic only" data policy for HIPAA compliance.

When to Use

Before committing new scenario files
CI/CD pre-merge validation
Periodic repository audits
Before sharing evaluation data externally

Triggers

"scan for PHI"
"check for protected health information"
"validate data is synthetic"
"run PHI detection"

Tools

# Full repository scan (CI mode)
python scripts/detect_phi.py --strict

# Scan specific directory
python scripts/detect_phi.py --path bloom_medical_eval/scenarios/

# Show verbose matches
python scripts/detect_phi.py --verbose

Prerequisites

Python 3.11+
No external dependencies (uses stdlib only)

Input Schema

path:
  type: path
  default: "."
  description: Directory or file to scan
strict:
  type: boolean
  default: false
  description: Fail on warnings (provenance metadata)
verbose:
  type: boolean
  default: false
  description: Show all matched patterns
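
As a rough illustration, the three inputs above could map onto command-line flags as follows. This is a minimal sketch, not the actual contents of `scripts/detect_phi.py`; the `build_parser` helper is a hypothetical name, and only the flag names, types, and defaults come from the schema.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI that mirrors the input schema (assumption, not the real script)."""
    parser = argparse.ArgumentParser(description="Scan for PHI (illustrative sketch)")
    # path: type path, default "."
    parser.add_argument("--path", type=Path, default=Path("."),
                        help="Directory or file to scan")
    # strict: boolean, default false
    parser.add_argument("--strict", action="store_true",
                        help="Fail on warnings (provenance metadata)")
    # verbose: boolean, default false
    parser.add_argument("--verbose", action="store_true",
                        help="Show all matched patterns")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--path", "bloom_medical_eval/scenarios/", "--strict"]
)
print(args.path, args.strict, args.verbose)
```

Using `store_true` keeps both boolean inputs at their documented default of false unless the flag is passed.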

Output Schema

status: enum           # pass, fail, warning
phi_detected: boolean
matches:
  - file: string
    pattern: string
    severity: enum     # HIGH, MEDIUM, LOW
    examples: [string]
    count: integer
files_scanned: integer
excluded_directories: [string]
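
One way to picture this schema in Python is as a pair of dataclasses. The sketch below is an assumption about shape only: the class names `Match` and `ScanResult` and the helper `count_by_severity` are hypothetical, and the idea that a per-severity total is the sum of each match's `count` is a guess rather than documented behavior.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass
class Match:
    file: str
    pattern: str
    severity: Severity
    examples: list[str]
    count: int


@dataclass
class ScanResult:
    status: str  # "pass", "fail", or "warning"
    phi_detected: bool
    matches: list[Match] = field(default_factory=list)
    files_scanned: int = 0
    excluded_directories: list[str] = field(default_factory=list)


def count_by_severity(result: ScanResult, severity: Severity) -> int:
    """Sum the `count` of every match at one severity (assumed aggregation)."""
    return sum(m.count for m in result.matches if m.severity is severity)


# Hypothetical result for a scan that found two SSN-shaped strings in one file.
result = ScanResult(
    status="fail",
    phi_detected=True,
    matches=[Match("scenarios/a.json", "SSN", Severity.HIGH, ["123-45-6789"], 2)],
    files_scanned=10,
    excluded_directories=["docs"],
)
print(count_by_severity(result, Severity.HIGH))
```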

PHI Patterns Detected

Pattern	Severity	Example
SSN	HIGH	123-45-6789
Medical Record Number	HIGH	MRN: 12345678
Full Date of Birth	HIGH	DOB: 01/15/1985
Phone Number	MEDIUM	555-123-4567
Personal Email	MEDIUM	john.doe@gmail.com
Street Address	MEDIUM	123 Main Street
Patient Full Name	HIGH	Patient: John Smith
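
To make the table concrete, here is a minimal sketch of regex-based matching for five of the rows. The regular expressions are illustrative guesses written to fit the example column; the patterns actually used by `scripts/detect_phi.py` are not shown in this document and may differ. `PHI_PATTERNS` and `find_phi` are hypothetical names, and the list of personal email providers is an assumption.

```python
import re

# Illustrative patterns only (assumptions), keyed by the names in the table above.
PHI_PATTERNS = {
    "SSN": ("HIGH", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    "Medical Record Number": ("HIGH", re.compile(r"\bMRN:\s*\d{6,10}\b")),
    "Full Date of Birth": ("HIGH", re.compile(r"\bDOB:\s*\d{2}/\d{2}/\d{4}\b")),
    "Phone Number": ("MEDIUM", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    "Personal Email": (
        "MEDIUM",
        re.compile(r"\b[\w.+-]+@(?:gmail|yahoo|outlook)\.com\b"),
    ),
}


def find_phi(text: str) -> list[tuple[str, str, str]]:
    """Return (pattern name, severity, matched text) for every hit in `text`."""
    hits = []
    for name, (severity, regex) in PHI_PATTERNS.items():
        for match in regex.finditer(text):
            hits.append((name, severity, match.group(0)))
    return hits


sample = "DOB: 01/15/1985, MRN: 12345678, contact john.doe@gmail.com"
for hit in find_phi(sample):
    print(hit)
```

Street addresses and patient names are left out of the sketch because they are hard to capture with a short regex without many false positives.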

Whitelist Patterns

The following patterns are not flagged (legitimate use cases):

Example domains (example.com)
Fake phone numbers (555-xxxx)
Toll-free numbers (800-xxx-xxxx, 888-xxx-xxxx, etc.)
Crisis hotlines (988)
Medical abbreviations (PT, ST elevation)
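
A whitelist like this is typically applied as a filter on candidate matches before they are reported. The sketch below shows one possible shape under several assumptions: `is_whitelisted` is a hypothetical helper, "555-xxxx" is read as the 555 exchange, and the "etc." in the toll-free rule is expanded to the standard North American toll-free prefixes. Medical abbreviations are not modeled here.

```python
import re

# Assumed allowlist data; only example.com, 800, 888, 555, and 988 appear in the list above.
ALLOWED_EMAIL_DOMAINS = {"example.com"}
TOLL_FREE_PREFIXES = {"800", "888", "877", "866", "855", "844", "833"}


def is_whitelisted(pattern_name: str, value: str) -> bool:
    """Return True when a candidate match should not be flagged."""
    if pattern_name == "Personal Email":
        domain = value.rsplit("@", 1)[-1].lower()
        return domain in ALLOWED_EMAIL_DOMAINS
    if pattern_name == "Phone Number":
        digits = re.sub(r"\D", "", value)  # keep digits only
        if digits == "988":  # crisis hotline
            return True
        if len(digits) == 11 and digits.startswith("1"):
            digits = digits[1:]  # drop a leading country code
        if len(digits) == 10:
            area, exchange = digits[:3], digits[3:6]
            return area in TOLL_FREE_PREFIXES or exchange == "555"
        if len(digits) == 7:
            return digits[:3] == "555"
    return False


print(is_whitelisted("Phone Number", "212-555-0142"))
print(is_whitelisted("Personal Email", "john.doe@gmail.com"))
```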

Excluded Directories

The scan skips the following directories, which hold evaluation artifacts, private test data, source code, or documentation:

results* - Evaluation outputs
reports/ - Generated reports
experiments/ - Experimental data
.private/ - Private test data
scripts/ - Source code
docs/ - Documentation
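
The exclusion list can be sketched as a directory walk that prunes matching folders. This is an assumption about the mechanism, not the script's real code: `EXCLUDED_DIR_GLOBS`, `is_excluded`, and `iter_scannable_files` are hypothetical names, and the sketch assumes exclusions match on directory name at any depth, with `results*` treated as a glob.

```python
import fnmatch
import os
from pathlib import Path

# Directory-name globs taken from the list above.
EXCLUDED_DIR_GLOBS = ["results*", "reports", "experiments", ".private", "scripts", "docs"]


def is_excluded(dirname: str) -> bool:
    """True when a directory name matches any exclusion glob."""
    return any(fnmatch.fnmatch(dirname, glob) for glob in EXCLUDED_DIR_GLOBS)


def iter_scannable_files(root):
    """Yield every file under `root`, skipping excluded directories entirely."""
    for current, dirnames, filenames in os.walk(root):
        # Pruning in place stops os.walk from descending into excluded folders.
        dirnames[:] = [d for d in dirnames if not is_excluded(d)]
        for name in filenames:
            yield Path(current) / name


for path in iter_scannable_files("."):
    print(path)
```

Pruning `dirnames` in place avoids reading excluded trees at all, which matters when `results*` folders are large.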

Success Criteria

Metric	Threshold	Severity
`phi_detected`	false	🔴 Critical
`high_severity_matches`	0	🔴 Critical
`medium_severity_matches`	0	🟠 High

Safety Gates

- gate: no_phi_detected
  metric: phi_detected
  operator: "=="
  threshold: false
  action: block_merge
  severity: critical

- gate: no_high_severity
  metric: high_severity_matches
  operator: "=="
  threshold: 0
  action: block_merge
  severity: critical
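
The two gates above can be evaluated with a small comparison loop. The sketch below is hypothetical: `GATES` restates the gate definitions as Python dictionaries, `failed_gates` is an invented helper, and the way a CI system consumes the result (for example, to block a merge) is outside what this document specifies.

```python
import operator

# Only "==" appears in the gates above; other operators would be added the same way.
OPERATORS = {"==": operator.eq}

GATES = [
    {"gate": "no_phi_detected", "metric": "phi_detected", "operator": "==",
     "threshold": False, "action": "block_merge", "severity": "critical"},
    {"gate": "no_high_severity", "metric": "high_severity_matches", "operator": "==",
     "threshold": 0, "action": "block_merge", "severity": "critical"},
]


def failed_gates(metrics: dict) -> list[str]:
    """Return the names of gates whose metric does not satisfy the threshold."""
    failures = []
    for gate in GATES:
        compare = OPERATORS[gate["operator"]]
        if not compare(metrics[gate["metric"]], gate["threshold"]):
            failures.append(gate["gate"])
    return failures


clean = {"phi_detected": False, "high_severity_matches": 0}
dirty = {"phi_detected": True, "high_severity_matches": 2}
print(failed_gates(clean))
print(failed_gates(dirty))
```

A non-empty list would correspond to the `block_merge` action.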

HIPAA Safe Harbor Compliance

The list below gives HIPAA's 18 Safe Harbor identifiers. Items marked ✅ are the ones this skill detects; unmarked items are not currently covered:

Names ✅
Geographic data ✅
Dates (except year) ✅
Phone numbers ✅
Fax numbers ✅
Email addresses ✅
Social Security numbers ✅
Medical record numbers ✅
Health plan beneficiary numbers
Account numbers
Certificate/license numbers
Vehicle identifiers
Device identifiers
Web URLs
IP addresses
Biometric identifiers
Full-face photographs
Any other unique identifying number

Related Skills

bloom_integrity_verification - Verify data integrity after PHI check
crisis_persistence_eval - Requires PHI-clean scenarios

name	phi_detection
description	Scan repository for Protected Health Information (PHI) using HIPAA Safe Harbor patterns. Ensures evaluation data remains synthetic-only.
version	1.0.0
author	ScribeGoat2 Team
license	MIT
safety_level	critical
