Use Case Examples - Rules

You can quickly set up similar policy rules based on these examples.

Data governance stewards often rely on column naming patterns to automatically tag or classify sensitive data. This helps ensure consistency in data catalogs, lineage, and compliance reporting.

Below are common examples of regex-based patterns used to detect and tag sensitive columns automatically.

Notes:

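  • Regular expressions in Elasticsearch use Lucene regex syntax, not wildcard (glob) syntax, and a pattern must match the entire column name. Wrap a term in .* to match it anywhere in the name (e.g. .*UNID.*, not *UNID*).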

  • Use the "flags": "CASE_INSENSITIVE" parameter to match column names regardless of case.

  • Combine multiple terms using parentheses and |, e.g. .*(UNID|SERNO).*.
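The notes above can be checked locally with a short sketch. It uses Python's re module as a stand-in for the Elasticsearch regex engine, which is an assumption: the two engines differ in their advanced operators, but the simple constructs used on this page (.*, ?, | and parentheses) behave the same way. re.fullmatch mimics whole-name matching, and re.IGNORECASE plays the role of the CASE_INSENSITIVE flag. The helper name matches_column is hypothetical.

```python
import re


def matches_column(pattern: str, column_name: str) -> bool:
    """Return True if the whole column name matches the pattern, ignoring case."""
    try:
        return re.fullmatch(pattern, column_name, flags=re.IGNORECASE) is not None
    except re.error:
        # Glob-style input such as "*UNID*" is not a valid regular expression.
        return False


print(matches_column(".*UNID.*", "customer_unid"))          # True
print(matches_column("UNID", "customer_unid"))              # False: must match the whole name
print(matches_column("*UNID*", "customer_unid"))            # False: glob syntax, invalid regex
print(matches_column(".*(UNID|SERNO).*", "USER_SERNO"))     # True: alternation, any case
```

The second call shows why every pattern on this page starts and ends with .*: without them, the pattern only matches a column named exactly UNID.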


1. Personal Identifiers

Detect columns that contain unique identifiers for individuals.

Example Column Names                          Regex Pattern (Elasticsearch syntax)
customer_unid, user_serno, employee_id        .*(UNID|SERNO|EMP(LOYEE)?_?ID|CUSTOMER_?ID).*
uuid, guid, unique_ref                        .*(UUID|GUID|UNIQUE).*


2. Contact Information

Identify fields that hold email addresses, phone numbers, or physical addresses.

Example Column Names                          Regex Pattern
email, user_email, contact_email_address      .*EMAIL.*
phone_number, mobile_no, contact_tel          .*(PHONE|MOBILE|TEL).*
address_line1, home_address, postal_address   .*ADDRESS.*


3. Financial Data

Detect columns that may contain sensitive financial details.

Example Column Names                          Regex Pattern
credit_card_number, ccn, card_no              .*(CREDIT.?CARD|CCN|CARD.?NO).*
bank_account_number, iban, acct_no            .*(ACCOUNT|ACCT|IBAN).*
salary_amount, income, wage                   .*(SALARY|INCOME|WAGE).*


4. National Identifiers

Detect government-issued identifiers.

Example Column Names                          Regex Pattern
ssn, social_security_number, nin, tax_id      .*(SSN|SOCIAL.?SECURITY|NIN|TAX.?ID).*
passport_no, passport_number                  .*PASSPORT.*


5. Authentication Data

Identify columns that may contain credentials or tokens.

Example Column Names                          Regex Pattern
password, user_pwd, hash, token               .*(PASSWORD|PWD|HASH|TOKEN).*
api_key, access_key, secret_key               .*(API.?KEY|ACCESS.?KEY|SECRET).*


6. Health and Demographic Data

Classify columns related to personal health or demographic details.

Example Column Names                          Regex Pattern
medical_record_number, patient_id             .*(MEDICAL|PATIENT).*
age, birth_date, dob                          .*(AGE|BIRTH.?DATE|DOB).*
gender, sex                                   .*(GENDER|SEX).*
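Because these patterns match a term anywhere in the name, a short term such as AGE also matches unrelated columns such as page_count or message. The sketch below shows the effect and one possible refinement that is not part of the rules above: requiring the term to be a whole underscore-delimited word. Python's re module stands in for the Elasticsearch engine (an assumption), and the names BROAD, STRICT and is_match are hypothetical.

```python
import re

# Pattern from the table above: matches the term anywhere in the name.
BROAD = r".*(AGE|BIRTH.?DATE|DOB).*"

# Hypothetical stricter variant: the term must be a whole word, bounded by
# the start or end of the name or by an underscore.
STRICT = r"(.*_)?(AGE|BIRTH_?DATE|DOB)(_.*)?"


def is_match(pattern: str, column_name: str) -> bool:
    """Whole-name, case-insensitive match."""
    return re.fullmatch(pattern, column_name, flags=re.IGNORECASE) is not None


for name in ["age", "customer_age", "birth_date", "page_count", "message"]:
    print(f"{name:15} broad={is_match(BROAD, name)!s:6} strict={is_match(STRICT, name)}")
```

Both patterns match age, customer_age and birth_date, but only the broad pattern matches page_count and message. The stricter form assumes column names use underscores as word separators, so it would miss names such as customerage.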


How It Works

When column name classification is enabled:

  1. The system evaluates each column name against the configured regex patterns.

  2. Columns matching any pattern are automatically tagged with the associated classification.

  3. Stewards can review, approve, or override tags directly in the UI or via API.

This helps identify and categorize sensitive data consistently across all datasets, even as new data sources are added.
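The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: the RULES mapping, the classify and apply_overrides functions, and the shape of the override data are all hypothetical, and Python's re module stands in for the Elasticsearch regex engine.

```python
import re
from typing import Dict, List

# Hypothetical rule set: classification tag -> regex patterns from the examples above.
RULES: Dict[str, List[str]] = {
    "Contact Information": [r".*EMAIL.*", r".*(PHONE|MOBILE|TEL).*", r".*ADDRESS.*"],
    "National Identifiers": [r".*(SSN|SOCIAL.?SECURITY|NIN|TAX.?ID).*", r".*PASSPORT.*"],
    "Authentication Data": [r".*(PASSWORD|PWD|HASH|TOKEN).*", r".*(API.?KEY|ACCESS.?KEY|SECRET).*"],
}


def classify(column_name: str) -> List[str]:
    """Steps 1 and 2: return every tag that has at least one matching pattern."""
    return [
        tag
        for tag, patterns in RULES.items()
        if any(re.fullmatch(p, column_name, re.IGNORECASE) for p in patterns)
    ]


def apply_overrides(
    tags_by_column: Dict[str, List[str]], overrides: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Step 3: a steward's decision replaces the automatic tags for that column."""
    return {col: overrides.get(col, tags) for col, tags in tags_by_column.items()}


columns = ["user_email", "passport_no", "api_key", "order_total", "hotel_name"]
auto = {col: classify(col) for col in columns}

# hotel_name is tagged automatically because it contains "tel";
# a steward removes the tag during review.
reviewed = apply_overrides(auto, {"hotel_name": []})

for col in columns:
    print(f"{col:12} auto={auto[col]} reviewed={reviewed[col]}")
```

The hotel_name column shows why the review step matters: name-based matching produces occasional false positives, and the override leaves the automatic result for every other column untouched.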

