hipaa-validate
// HIPAA validator: PHI exposure, audit logging, encryption, access control, BAA refs. Triggers: HIPAA, PHI, healthcare compliance, audit log, BAA.
| name | hipaa-validate |
| description | HIPAA validator: PHI exposure, audit logging, encryption, access control, BAA refs. Triggers: HIPAA, PHI, healthcare compliance, audit log, BAA. |
| user-invocable | true |
| effort | medium |
| disable-model-invocation | true |
| context | fork |
| agent | security-auditor |
| argument-hint | [path] [--mode developer|compliance] [--severity high|warn] [--keywords term1,term2] [--output json] |
| allowed-tools | Read, Grep, Glob, Bash |
$ARGUMENTS
Scan a codebase for HIPAA compliance issues using pattern-matching heuristics. Detects PHI exposure in logs, missing audit trails, unencrypted transmission/storage, hardcoded patient data, access control gaps, and missing Business Associate Agreement references. Read-only: never modifies files.
Regulation basis: 45 CFR Parts 160, 162, 164 (HIPAA Administrative Simplification, as amended through March 26, 2013). Covers Security Rule (§164.302-318), Privacy Rule (§164.500-534), Breach Notification Rule (§164.400-414), and enforcement penalties (§160.400-426).
/hipaa-validate # Scan full project (developer mode, definitives only)
/hipaa-validate src/ # Scan specific path
/hipaa-validate --mode compliance # Full audit sweep including heuristic categories
/hipaa-validate --severity high # Filter to HIGH findings only
/hipaa-validate --keywords member,enrollee # Extend healthcare keyword list
/hipaa-validate --output json # Structured JSON output for CI integration
Modes:
- developer (default): Categories 1, 3, 4, 7, 8. Definitive regex matches only, low false-positive rate, suited for daily use.
- compliance: All 8 categories. Includes heuristic checks (Cat 2, 5, 6) for audit sweep coverage, suited for pre-audit sweeps.

Severity filtering: `--severity high` shows only HIGH findings, `--severity warn` shows HIGH + WARN. Default shows all.
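The mode and severity rules above can be sketched in a few lines. This is a minimal illustration, not the scanner's actual code: `MODE_CATEGORIES`, `select_categories`, and `filter_by_severity` are hypothetical names, and findings are assumed to be dicts with a `severity` key.

```python
# Hypothetical helpers illustrating the documented mode/severity behavior.
MODE_CATEGORIES = {
    "developer": [1, 3, 4, 7, 8],             # definitive regex categories
    "compliance": [1, 2, 3, 4, 5, 6, 7, 8],   # adds heuristic categories 2, 5, 6
}


def select_categories(mode="developer"):
    """Return the category numbers that run in the given mode."""
    return MODE_CATEGORIES[mode]


def filter_by_severity(findings, severity=None):
    """Apply --severity: 'high' keeps HIGH, 'warn' keeps HIGH + WARN, None keeps all."""
    if severity == "high":
        allowed = {"HIGH"}
    elif severity == "warn":
        allowed = {"HIGH", "WARN"}
    else:
        return list(findings)
    return [f for f in findings if f["severity"] in allowed]
```

Keeping the mode-to-category mapping in one table makes it easy to see that compliance mode is a strict superset of developer mode.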
Run `scripts/hipaa_scan.py` with passed arguments

Execute the Python scanner with the user's arguments:
python3 "$(dirname "$0")/../app/skills/hipaa-validate/scripts/hipaa_scan.py" [path] [--mode developer|compliance] [--severity high|warn] [--keywords term1,term2] [--output json]
The script handles all scanning logic deterministically:
- `.hipaaignore` support: honors exclusion patterns from project root
- `.hipaa-config` support: reads `covered_vendors` for BAA checks

If the script reports "No healthcare context detected", relay the message and suggest the `--keywords` flag with alternative terminology.
If --output json is used, the script outputs structured JSON suitable for CI pipelines. The exit code is 1 if any HIGH findings exist, 0 otherwise.
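The exit-code rule can be expressed as a small function, which is useful when wrapping the scanner in a CI step. This is a sketch under assumptions: the `findings` list of dicts and the top-level `"findings"` key in the JSON payload are hypothetical and are not taken from the scanner's real output schema.

```python
import json


def exit_code(findings):
    """Return 1 if any finding is HIGH, otherwise 0 (the documented rule)."""
    return 1 if any(f["severity"] == "HIGH" for f in findings) else 0


def to_json(findings):
    """Serialize findings for CI consumption.

    The {"findings": [...]} wrapper is an assumed shape for illustration only.
    """
    return json.dumps({"findings": findings}, indent=2)
```

A CI job would typically fail the build on a non-zero exit code and archive the JSON as an artifact.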
For each finding from the script output:
.hipaaignore

The script implements the following scan categories. This reference is provided so you can explain findings to the user and verify edge cases.
Modes:
- developer (default): Categories 1, 3, 4, 7, 8. Definitive regex matches only, low false-positive rate.
- compliance: All 8 categories. Includes heuristic checks (Cat 2, 5, 6).

Default keywords: patient, diagnosis, medication, clinical, healthcare, medical, fhir, hl7, hipaa, phi, protected.health, health-record, health-plan, health-insurance

Note: Bare `health` is deliberately excluded because it matches infrastructure health checks in nearly every codebase.
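A minimal sketch of keyword-based context detection follows. It assumes the default keywords are treated as case-insensitive regex fragments matched anywhere in the text, and that `--keywords` terms are appended as literals; `build_keyword_regex` and `has_healthcare_context` are hypothetical names, and the real scanner may match differently.

```python
import re

# Default keyword list as documented; bare "health" is intentionally absent.
DEFAULT_KEYWORDS = [
    "patient", "diagnosis", "medication", "clinical", "healthcare", "medical",
    "fhir", "hl7", "hipaa", "phi", "protected.health", "health-record",
    "health-plan", "health-insurance",
]


def build_keyword_regex(extra_keywords=()):
    """Combine default keywords with user-supplied terms (escaped as literals)."""
    terms = DEFAULT_KEYWORDS + [re.escape(k) for k in extra_keywords]
    return re.compile("|".join(terms), re.IGNORECASE)


def has_healthcare_context(text, extra_keywords=()):
    """Return True if any healthcare keyword occurs in the text."""
    return build_keyword_regex(extra_keywords).search(text) is not None
```

With this list, an infrastructure route such as `/health` does not trigger detection, while a codebase that says `enrollee` instead of `patient` is only detected once that term is passed in.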
Built-in exclusions: Binary files, lock files, vendored directories (node_modules/, vendor/, .git/, dist/, build/, out/, .next/). Test directories (test/, tests/, __tests__/, spec/, fixtures/, mocks/) are excluded for Category 4 only.
Categories 1 and 2 scan the full project. Categories 3-8 scan only PHI-adjacent files.
Scan the full project for log/print statements that reference PHI keywords.
| Pattern | Severity | Language | Description |
|---|---|---|---|
| `console\.log\(.*patient` | HIGH | JS/TS | Patient data in console |
| `console\.\w+\(.*req\.body` | WARN | JS/TS | Raw request body may contain PHI |
| `JSON\.stringify\(.*patient` | WARN | JS/TS | Full patient object serialization |
| `print\(.*\b(patient\|ssn\|social.security)` | HIGH | Python | PHI in print statements |
| `(logging\|logger\|pprint)\.\w+\(.*\b(patient\|ssn\|mrn\|dob)` | HIGH | Python | PHI in logger/named logger/pprint output |
| `print\(.*request\.(data\|json\|form\|POST\|body)` | WARN | Python | Raw request body may contain PHI (Django/Flask/FastAPI) |
| `(logging\|logger)\.\w+\(.*request\.(data\|json\|form\|POST\|body)` | WARN | Python | Raw request body in logger |
| `\brepr\(.*\b(patient\|ssn\|mrn)` | WARN | Python | repr() may expose PHI fields |
| `\bvars\(.*\b(patient\|ssn\|mrn)` | WARN | Python | vars() dumps all PHI fields |
| `fmt\.Print.*\b(patient\|ssn\|mrn)` | HIGH | Go | PHI in fmt output |
| `log\.\w+\(.*\b(patient\|ssn\|mrn)` | HIGH | Go/Any | PHI in log calls |
| `System\.out\.print.*\b(patient\|ssn\|mrn)` | HIGH | Java | PHI in stdout |
| `logger\.\w+\(.*\b(patient\|ssn\|mrn\|dob)` | HIGH | Java/Any | PHI fields in logger |
| `puts.*\b(patient\|ssn\|mrn)` | HIGH | Ruby | PHI in puts |
| `Rails\.logger.*\b(patient\|ssn\|mrn)` | HIGH | Ruby | PHI in Rails logger |
| `Console\.Write.*\b(patient\|ssn\|mrn)` | HIGH | C# | PHI in Console output |
| `_logger\.\w+\(.*\b(patient\|ssn\|mrn\|dob)` | HIGH | C# | PHI in ILogger calls |
Language coverage note: JS/TS and Python patterns are the most comprehensive. Go, Ruby, and Java have baseline coverage for common log patterns. Contributions for additional language-specific patterns are welcome.
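To show how rows from the table above turn into findings, here is a small line-by-line matcher using three of the listed patterns. It is a sketch, not the scanner's implementation: `LOG_PATTERNS`, `scan_lines`, and the finding dict keys are hypothetical, and matching is assumed to be case-sensitive.

```python
import re

# Three patterns copied from the table; each entry is (regex, severity, description).
LOG_PATTERNS = [
    (re.compile(r"console\.log\(.*patient"), "HIGH", "Patient data in console"),
    (re.compile(r"console\.\w+\(.*req\.body"), "WARN", "Raw request body may contain PHI"),
    (re.compile(r"print\(.*\b(patient|ssn|social.security)"), "HIGH", "PHI in print statements"),
]


def scan_lines(path, text):
    """Return one finding per (line, pattern) match, with 1-based line numbers."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for regex, severity, description in LOG_PATTERNS:
            if regex.search(line):
                findings.append({
                    "file": path,
                    "line": lineno,
                    "severity": severity,
                    "description": description,
                })
    return findings
```

Because each pattern is tested against a single line, a log call split across several lines would be missed by this sketch.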
Minimum Necessary violations (§164.502(b)):
| Pattern | Severity | Language | Description |
|---|---|---|---|
| `res\.(json\|send)\(.*patient` without field projection | WARN | JS/TS | Full patient object in API response |
| `return.*patient` in route handler without field selection | WARN | Any | May expose unnecessary PHI fields |
| `SELECT\s+\*.*FROM.*(patient\|member\|enrollee)` | WARN | SQL | SELECT * on PHI tables violates minimum necessary |
| `JSON\.stringify\(.*patient` | WARN | JS/TS | Full patient object serialization |
| `json\.dumps\(.*patient` | WARN | Python | Full patient object serialization |
| `JsonConvert\.Serialize.*patient` | WARN | C# | Full patient object serialization |
Compliance mode only. Heuristic: flags potential gaps, not definitive findings.
Developer mode: This category is skipped. Run with `--mode compliance` to include audit gap checks.
Scan the full project for files that handle PHI data operations but lack audit-related keywords.
PHI route file definition: A file qualifies if it contains BOTH:
- A healthcare keyword (`patient`, `diagnosis`, `medication`, etc.)
- A data operation pattern: `router`, `app.get`, `app.post`, `app.put`, `app.delete`, `@RequestMapping`, `@GetMapping`, `@PostMapping`, `@PutMapping`, `@DeleteMapping`, `Model.find`, `Model.save`, `Model.update`, `db.query`, `db.execute`, `cursor.execute`, `repository.`, `findBy`, `save(`, `delete(`, `@app.route`, `@blueprint.route`, `@api_view`, `ViewSet`, `APIView`, `\bsession.(query|add|execute|delete|merge)\b` (word-anchored: SQLAlchemy only, avoids matching Express `req.session.save`)

Files with a healthcare keyword but no data operation pattern are excluded.
Audit keywords (co-occurrence check): audit, AuditEvent, auditLog, logAccess, logEvent, createAuditEntry, recordAccess, ActivityLog, trail, writeAudit
| Pattern | Severity | Description |
|---|---|---|
| PHI route file without any audit keywords in same file | HIGH | §164.312(b): POTENTIAL audit gap, verify audit controls exist in call chain |
| CRUD operations on patient resources without audit keywords in same file | HIGH | POTENTIAL gap: all PHI access must be logged |
| Admin operations without audit trail reference | WARN | POTENTIAL gap: administrative actions need recording |
| Bulk data operations (export, download, bulk, batch) on PHI resources without audit keywords in same file | HIGH | POTENTIAL gap: mass PHI access must be tracked |

Note: This category uses co-occurrence heuristics, checking whether PHI route keywords and audit keywords appear in the same file. False positives are expected when audit logging is handled by middleware or a separate call chain. Use `.hipaaignore` to suppress confirmed false positives.
See: reference/hipaa-rules.md §164.312(b) for audit control requirements.
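The co-occurrence heuristic for this category can be sketched as three regex checks on the whole file. The keyword and pattern lists below are abbreviated subsets of the documented ones, and `is_phi_route_file` and `has_audit_gap` are hypothetical names chosen for illustration.

```python
import re

# Abbreviated subsets of the documented lists, for illustration only.
HEALTHCARE = re.compile(r"patient|diagnosis|medication", re.IGNORECASE)
DATA_OPS = re.compile(
    r"router|app\.(get|post|put|delete)|db\.query|cursor\.execute|@app\.route"
)
AUDIT = re.compile(
    r"audit|AuditEvent|auditLog|logAccess|logEvent|createAuditEntry"
    r"|recordAccess|ActivityLog|trail|writeAudit"
)


def is_phi_route_file(text):
    """A file qualifies only if it has BOTH a healthcare keyword and a data operation."""
    return bool(HEALTHCARE.search(text) and DATA_OPS.search(text))


def has_audit_gap(text):
    """Flag PHI route files in which no audit keyword appears anywhere."""
    return is_phi_route_file(text) and AUDIT.search(text) is None
```

Since the check only looks inside one file, a route whose auditing lives in shared middleware is reported as a potential gap, which is why these findings carry heuristic confidence.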
Context-gated: scans PHI-adjacent files only.
| Pattern | Severity | Language | Description |
|---|---|---|---|
| `http://` in API calls (not localhost/127.0.0.1) | HIGH | Any | §164.312(e)(1) requires encryption in transit |
| Missing TLS/SSL config in database connections | HIGH | Any | Database connections must be encrypted |
| `rejectUnauthorized:\s*false` | HIGH | Any | TLS verification disabled |
| `ws://` (WebSocket without TLS) | WARN | Any | Unencrypted WebSocket may carry PHI |
| `verify\s*=\s*False` | HIGH | Python | TLS verification disabled (requests/httpx) |
| `InsecureRequestWarning` | WARN | Python | TLS warning suppressed |
| `[,(]\s*ssl\s*=\s*False\b` | HIGH | Python | SSL disabled in connector call (anchored to arg position to avoid matching `is_ssl_enabled = False`) |
| `ssl\.CERT_NONE` | HIGH | Python | TLS certificate verification disabled (anchored to `ssl.` module) |
| `check_hostname\s*=\s*False` | HIGH | Python | TLS hostname verification disabled |
| `urllib3\.disable_warnings` | WARN | Python | TLS warnings suppressed (urllib3) |
| `SECURE_SSL_REDIRECT\s*=\s*False` | WARN | Python | Django HTTPS redirect disabled (commonly False in dev settings; verify production config) |
| `NODE_TLS_REJECT_UNAUTHORIZED.*0` | HIGH | JS/TS | TLS rejection disabled globally |

See: reference/hipaa-rules.md §164.312(e)(1) for transmission security requirements.
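Two of the transmission checks depend on anchoring, which is easier to see in code. The `ssl=False` regex below is the documented one; the `http://` regex with a negative lookahead is an assumed way to express the "not localhost/127.0.0.1" exemption and may differ from what the scanner does.

```python
import re

# Documented pattern: only matches ssl=False in an argument position.
SSL_DISABLED = re.compile(r"[,(]\s*ssl\s*=\s*False\b")

# Assumed formulation of the localhost exemption (not confirmed by the source).
PLAIN_HTTP = re.compile(r"http://(?!localhost|127\.0\.0\.1)")


def disables_ssl(line):
    """True when a call passes ssl=False as an argument."""
    return SSL_DISABLED.search(line) is not None


def uses_plain_http(line):
    """True when a non-local http:// URL appears on the line."""
    return PLAIN_HTTP.search(line) is not None
```

Requiring a preceding `(` or `,` is what keeps an unrelated assignment such as `is_ssl_enabled = False` from being reported.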
Context-gated: scans PHI-adjacent files only.
Built-in test directory exclusions: Skip files in test/, tests/, __tests__/, spec/, fixtures/, mocks/, __mocks__/, testdata/, test-data/. Test fixtures legitimately contain synthetic PHI.
| Pattern | Severity | Description |
|---|---|---|
| `\d{3}-\d{2}-\d{4}` in PHI-adjacent source files | HIGH | Hardcoded SSNs |
| MRN patterns near healthcare keywords | HIGH | Medical record numbers in code |
| `\b\d{5}(-\d{4})?\b` near `zip\|postal\|address` keywords | WARN | ZIP codes in healthcare context (§164.514(b)(2)(i)(B)) |
| Real-looking patient names in seed/fixture data | WARN | Use synthetic data generators |
| Date of birth + name co-occurrence in same file | WARN | Combined identifiers = PHI |
| Phone/email/IP regex matches in PHI-adjacent files | WARN | HIPAA identifiers in healthcare context |
| `\d{3}[\s.-]?\d{3}[\s.-]?\d{4}` near phone keyword | WARN | Phone numbers in healthcare context |
See: reference/phi-identifiers.md for the full list of 18 HIPAA identifiers and detection patterns.
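The SSN check combined with the test-directory exclusion might look like the following. This is a sketch under assumptions: `in_test_dir` and `find_hardcoded_ssns` are hypothetical names, paths are assumed to use forward slashes, and the PHI-adjacency gate is left out for brevity.

```python
import re
from pathlib import PurePosixPath

SSN = re.compile(r"\d{3}-\d{2}-\d{4}")

# Directory names documented as excluded for this category.
TEST_DIRS = {
    "test", "tests", "__tests__", "spec", "fixtures",
    "mocks", "__mocks__", "testdata", "test-data",
}


def in_test_dir(path):
    """True if any directory component of the path is a known test directory."""
    return any(part in TEST_DIRS for part in PurePosixPath(path).parts[:-1])


def find_hardcoded_ssns(path, text):
    """Return (line number, match) pairs, skipping files under test directories."""
    if in_test_dir(path):
        return []
    return [
        (lineno, match.group())
        for lineno, line in enumerate(text.splitlines(), start=1)
        for match in SSN.finditer(line)
    ]
```

Checking directory components rather than substrings avoids excluding a file such as `src/contest/entry.py` by accident.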
Context-gated: scans PHI-adjacent files only. Heuristic: flags potential gaps.
Auth keywords (co-occurrence check): auth, authenticate, requireAuth, isAuthenticated, protect, guard, Authorize, login_required, Permission, permission_required, LoginRequiredMixin, PermissionRequiredMixin, IsAuthenticated, Depends, Security
Data operation patterns (Python frameworks): @app.route, @blueprint.route, @api_view, ViewSet, APIView, cursor.execute, `\bsession.(query|execute)\b` (word-anchored: SQLAlchemy only)
| Pattern | Severity | Language | Description |
|---|---|---|---|
| PHI route file without any auth keywords in same file | WARN | Any | POTENTIAL access control gap: verify auth middleware covers these routes (§164.312(d)) |
| `Access-Control-Allow-Origin:\s*\*` or `origin:\s*(true\|\*)` in PHI-adjacent files | HIGH | Any | Unrestricted cross-origin access to PHI endpoints |
| Routes marked `public`, `noAuth`, `anonymous` exposing PHI keywords | HIGH | Any | PHI must require authentication |
| `permission_classes\s*=.*AllowAny` | WARN | Python | DRF AllowAny: verify no PHI exposed |

Note: Auth middleware is commonly applied at router-level or app-level. The co-occurrence heuristic checks the same file only. False positives expected when auth is configured globally. Use `.hipaaignore` to suppress.
See: reference/hipaa-rules.md §164.312(a)(1) and §164.312(d) for access control and authentication requirements.
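The two regex-based access control checks can be exercised directly. The patterns are the documented ones; the function names `has_cors_wildcard` and `uses_allow_any` are hypothetical.

```python
import re

# Documented patterns for unrestricted CORS and DRF AllowAny.
CORS_WILDCARD = re.compile(r"Access-Control-Allow-Origin:\s*\*|origin:\s*(true|\*)")
ALLOW_ANY = re.compile(r"permission_classes\s*=.*AllowAny")


def has_cors_wildcard(line):
    """True for a raw wildcard header or a permissive `origin:` option."""
    return CORS_WILDCARD.search(line) is not None


def uses_allow_any(line):
    """True when a DRF view sets permission_classes to include AllowAny."""
    return ALLOW_ANY.search(line) is not None
```

As written, the header alternative requires a colon right after the header name, so a call like `res.setHeader("Access-Control-Allow-Origin", "*")` is not matched by this pattern alone.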
Compliance mode only. Context-gated: scans PHI-adjacent files only.
Developer mode: This category is skipped. Run with `--mode compliance` to include BAA checks.
Instead of per-finding rows, emit a single BAA Verification Checklist in the compliance report:
1. HTTP client calls (`fetch(`, `axios.`, `requests.`, `http.Get`, `HttpClient`, `RestTemplate`, `urllib`). Extract external domains.
2. Cloud storage (`S3`, `GCS`, `BlobStorage`, `putObject`, `upload`). Note each service.
3. Databases (`mongodb+srv://`, `postgres://`, `mysql://`, `firestore`, `dynamodb`, `CosmosClient`, `MongoClient`, connection strings with cloud hostnames). Note each service.
4. Messaging and queue services (`SQS`, `SNS`, `RabbitMQ`, `redis://`, `kafka`, `EventBridge`, `PubSub`). Note each service.
5. CDNs (`CloudFront`, `Cloudflare`, `Akamai`, `Fastly`, `cdn.`) serving PHI-adjacent paths. Note each service.
6. Logging and monitoring SDKs (`datadog`, `splunk`, `newrelic`, `sentry`, `logstash`, `elasticsearch`, `bugsnag`, `rollbar`). Note each SDK.
7. Analytics SDKs (`analytics.`, `gtag`, `mixpanel`, `segment`, `amplitude`, `posthog`). Note each SDK.
8. Read `.hipaa-config` at project root if it exists. Suppress vendors listed under `covered_vendors`.

`.hipaa-config` format (suppress known-covered vendors):
{
"covered_vendors": ["aws", "twilio", "sendgrid", "stripe"]
}
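Reading the config and applying it to detected services could be done as below. This is a sketch under assumptions: the file is assumed to be JSON as in the sample, vendor matching is assumed to be a case-insensitive substring test, and `load_covered_vendors` and `baa_status` are hypothetical names.

```python
import json
from pathlib import Path


def load_covered_vendors(project_root):
    """Return the lowercased covered_vendors set, or an empty set if no config exists."""
    config_path = Path(project_root) / ".hipaa-config"
    if not config_path.is_file():
        return set()
    data = json.loads(config_path.read_text(encoding="utf-8"))
    return {vendor.lower() for vendor in data.get("covered_vendors", [])}


def baa_status(service, covered):
    """Classify a detected service; substring matching is an assumption of this sketch."""
    name = service.lower()
    if any(vendor in name for vendor in covered):
        return "covered (covered_vendors)"
    return "verify BAA exists"
```

Treating a missing config as an empty set means every detected service is listed for verification by default.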
Output format for compliance mode:
### BAA Verification Checklist
| Service/Domain | Pattern Detected | BAA Status |
|----------------|-----------------|------------|
| AWS S3 | `putObject` in src/storage/patient-files.ts | ✅ covered (covered_vendors) |
| sendgrid.com | `axios.post` in src/notifications/email.ts | ⚠️ verify BAA exists |
| analytics.google.com | `gtag` in src/components/Dashboard.tsx | ❓ verify no PHI flows here |
Note: This is a documentation checklist, not a legal review. Items marked ⚠️ or ❓ require human verification, not code changes.
Context-gated: scans PHI-adjacent files only. Developer mode.
Detects PHI storage patterns without encryption references. Per §164.312(a)(2)(iv), ePHI must be encrypted when stored.
| Pattern | Severity | Language | Description |
|---|---|---|---|
| `encrypt\s*[:=]\s*false` in database config files | HIGH | Any | §164.312(a)(2)(iv): Encryption explicitly disabled |
| File write operations (`writeFile`, `fs.write`, `open(.*w`, `File.Create`) in PHI-adjacent code without encryption references | WARN | Any | PHI written to disk may lack encryption at rest |
| Database connection without `ssl`, `encrypt`, or `tls` keywords in PHI-adjacent config | WARN | Any | Database storing PHI should enforce encrypted connections |
| `localStorage.setItem` or `sessionStorage.setItem` with PHI keywords | HIGH | JS/TS | Browser storage is unencrypted. PHI must not be stored client-side without encryption |
| `SharedPreferences` or `UserDefaults` with PHI keywords | HIGH | Java/Any | Mobile local storage is unencrypted by default |
| `pickle\.(dump\|dumps)\(` with PHI keywords | HIGH | Python | pickle serialization is unencrypted. PHI must be encrypted at rest |
| `shelve\.open\(` with PHI keywords | HIGH | Python | shelve storage is unencrypted. PHI must be encrypted at rest |

See: reference/hipaa-rules.md §164.312(a)(2)(iv) for encryption requirements.
Context-gated: scans PHI-adjacent files only. Developer mode.
Detects temporary file creation in PHI-adjacent code without secure deletion. Per §164.310(d)(2)(iii), media containing PHI must be sanitized before reuse or disposal.
| Pattern | Severity | Description |
|---|---|---|
| `/tmp/` or `tempfile\.` or `os\.tmpdir\(\)` or `Path\.GetTempPath` in PHI-adjacent code | WARN | §164.310(d)(2)(iii): Temp files with PHI must be securely deleted |
| `mktemp` or `NamedTemporaryFile` or `createTempFile` near PHI keywords | WARN | Verify temp files are cleaned up after use |
| Cache directory writes (`cache/`, `.cache`, `Cache.set`) with PHI keywords | WARN | Cached PHI must be encrypted or purged on schedule |

See: reference/hipaa-rules.md §164.310(d)(2)(iii) for disposal requirements.
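The first temp-file row combines four alternatives into one check, sketched here as a single regex. `TEMP_USAGE` and `temp_usage_lines` are hypothetical names, and the caller is assumed to have already decided that the file is PHI-adjacent.

```python
import re

# The four documented alternatives joined into one pattern.
TEMP_USAGE = re.compile(r"/tmp/|tempfile\.|os\.tmpdir\(\)|Path\.GetTempPath")


def temp_usage_lines(text):
    """Return 1-based line numbers that reference a temp location or temp-file API."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if TEMP_USAGE.search(line)
    ]
```

Each returned line is a prompt to confirm that the temp file is deleted after use, not proof of a violation.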
Present the scanner output to the user. Sort by severity (HIGH first), then by file path.
## HIPAA Validation Report
### Summary
| Metric | Value |
|--------|-------|
| Mode | developer / compliance |
| PHI-adjacent files | N |
| Files scanned | N |
| Categories run | 1,3,4,7,8 (developer) / 1,2,3,4,5,6,7,8 (compliance) |
| Severity HIGH | N |
| Severity WARN | N |
### Findings
#### [HIGH] src/api/patients.ts:42
Category: PHI in Logs
Confidence: definitive (regex match)
Pattern: `console.log(patient.name)`
HIPAA Rule: §164.502(b) - Minimum Necessary Standard
Fix: Replace with `safeLog()` or remove PHI from log output
#### [HIGH] src/routes/patient-api.ts:15
Category: Missing Audit Logging
Confidence: heuristic (co-occurrence check, may be false positive)
Pattern: PHI route file without audit keywords
HIPAA Rule: §164.312(b) - Audit Controls
Fix: Verify audit logging exists in call chain; add AuditEvent creation if missing
#### [WARN] src/services/patient-sync.ts:88
Category: Unencrypted PHI Transmission
Confidence: definitive (regex match)
Pattern: `http://external-api.example.com/patients`
HIPAA Rule: §164.312(e)(1) - Transmission Security
Fix: Use HTTPS for all PHI transmission
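The ordering rule for the report (HIGH first, then file path) maps onto a tuple sort key. In this sketch the `file` and `line` keys and the name `sort_findings` are hypothetical; unknown severities are assumed to sort last.

```python
SEVERITY_RANK = {"HIGH": 0, "WARN": 1}


def sort_findings(findings):
    """Sort by severity (HIGH first), then file path, then line number."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_RANK.get(f["severity"], len(SEVERITY_RANK)),
            f["file"],
            f.get("line", 0),
        ),
    )
```

Applied to the three sample findings, this yields the order in which they appear in the report.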
Confidence values:
- definitive: Categories 1, 3, 4, 7, 8. Regex matched actual code.
- heuristic: Categories 2, 5, 6. Co-occurrence/absence check, may be false positive.

This distinction helps compliance officers prioritize immediate remediation (definitive) vs. investigation (heuristic).
- POTENTIAL and confidence: heuristic
- (*.lock, package-lock.json, yarn.lock, pnpm-lock.yaml), or vendored dirs (node_modules/, vendor/, .git/, dist/, build/, out/, .next/): noise and zero signal
- .hipaaignore exclusion patterns: teams use it to mark known-safe data fixtures
- .env or secret-manager references as a WARN category, even when no PHI pattern matches

/security-patterns
/cve-scan