# PHI (Protected Health Information) regex patterns for use with qsv searchset
# Canonical source: docs/cookbook/searchset/phi-regexes.txt
# These patterns target HIPAA-defined identifiers commonly found in health data.
# For PII patterns (SSN, credit cards, email, phone), see pii-regexes.txt.
(?x)\b[A-Z]{2,4}\d{5,10}\b #MRN (Medical Record Number) - letter prefix + digits
(?x)\b[ABFGMabfgm][A-Za-z]\d{7}\b #DEA (Drug Enforcement Administration number) - registrant-type letter (A, B, F, G, or M) + one letter + 7 digits
(?x)\b\d{10}\b #NPI (National Provider Identifier) - broad, overlaps unformatted phone numbers; cross-check against PII matches
(?x)\b[A-Z]\d{2}(?:\.[A-Za-z0-9]{1,4}|[A-Za-z0-9]{1,4})\b #ICD-10-CM (diagnosis code, 4-7 chars not counting the dot, e.g. J45.20, U07.1, T36.0X5A, S72.001A; bare 3-char category codes like A09 are intentionally excluded to avoid false positives, so cross-validate those against an ICD-10 code list separately; categories with a letter in the third position, e.g. C4A or Z3A, are not matched)
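(?x)\b(?:\d{4}-?\d{4}-?\d{2}|\d{5}-?\d{3}-?\d{2}|\d{5}-?\d{4}-?\d{1,2})\b #NDC (National Drug Code) - covers 4-4-2, 5-3-2, 5-4-1, 5-4-2 formats; hyphens are optional, so this also matches any bare 10- or 11-digit number (overlaps NPI and unformatted phone numbers)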
