Use Case Examples - Rules
You can quickly set up similar policy rules based on these examples.
Data governance stewards often rely on column naming patterns to automatically tag or classify sensitive data. This helps ensure consistency in data catalogs, lineage, and compliance reporting.
Below are common examples of regex-based patterns used to detect and tag sensitive columns automatically.
Notes:
Regular expressions in Elasticsearch must follow standard regex syntax (e.g.
.*UNID.*
, not*UNID*
).Use the
"flags": "CASE_INSENSITIVE"
parameter to match column names regardless of case.Combine multiple terms using parentheses and
|
, e.g..*(UNID|SERNO).*
.
1. Personal Identifiers
Detect columns that contain unique identifiers for individuals.
customer_unid
, user_serno
, employee_id
.*(UNID|SERNO|EMP(_)?ID|CUSTOMER(_)?ID).*
uuid
, guid
, unique_ref
.*(UUID|GUID|UNIQUE).*
2. Contact Information
Identify fields that hold email addresses, phone numbers, or physical addresses.
email
, user_email
, contact_email_address
.*EMAIL.*
phone_number
, mobile_no
, contact_tel
.*(PHONE|MOBILE|TEL).*
address_line1
, home_address
, postal_address
.*ADDRESS.*
3. Financial Data
Detect columns that may contain sensitive financial details.
credit_card_number
, ccn
, card_no
.*(CREDIT.?CARD|CCN).*
bank_account_number
, iban
, acct_no
.*(ACCOUNT|ACCT|IBAN).*
salary_amount
, income
, wage
.*(SALARY|INCOME|WAGE).*
4. National Identifiers
Detect government-issued identifiers.
ssn
, social_security_number
, nin
, tax_id
.*(SSN|SOCIAL.?SECURITY|NIN|TAX.?ID).*
passport_no
, passport_number
.*PASSPORT.*
5. Authentication Data
Identify columns that may contain credentials or tokens.
password
, user_pwd
, hash
, token
.*(PASSWORD|PWD|HASH|TOKEN).*
api_key
, access_key
, secret_key
.*(API.?KEY|ACCESS.?KEY|SECRET).*
6. Health and Demographic Data
Classify columns related to personal health or demographic details.
medical_record_number
, patient_id
.*(MEDICAL|PATIENT).*
age
, birth_date
, dob
.*(AGE|BIRTH.?DATE|DOB).*
gender
, sex
.*(GENDER|SEX).*
How It Works
When column name classification is enabled:
The system evaluates each column name against the configured regex patterns.
Columns matching any pattern are automatically tagged with the associated classification.
Stewards can review, approve, or override tags directly in the UI or via API.
This ensures sensitive data is consistently identified and categorized across all datasets, even as new data sources are added.
Last updated