Every property manager running a portfolio of any meaningful size has the same problem: when a new rental application comes in, you need to check it against your historical records. Prior evictions. Lease violations. Do-not-rent flags.

The challenge is that the person applying today may not look like the same person in your records from four years ago — even if they are.

The two datasets

New rental applications contain the applicant’s current legal name, prior addresses, phone number, email, and usually the last four digits of their SSN.

Historical tenant records are an accumulation of past leases, eviction proceedings, and internal notes — often pulled from multiple software systems if the management company has gone through a Yardi-to-AppFolio migration or similar transition.

These two datasets rarely share a clean unique identifier. The join has to happen on name and address, which is exactly where the data gets messy.

Why the matching is hard

People change names. A tenant from three years ago who was Maria Rodriguez may now apply as Maria Kim after marriage. A person who went by Mike on a prior lease shows up as Michael on their current ID. Someone with a hyphenated name has it rendered two different ways across two systems.

Phone numbers change. Emails change. Prior addresses are formatted inconsistently — Apt 4B versus Unit 4B versus #4B versus nothing at all.
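
The unit-formatting problem is the most mechanical of these, so it is a reasonable place to normalize before any matching happens. The following is a minimal sketch, not a complete address parser: the function name, the list of unit keywords, and the sample addresses are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern covering the designators mentioned above (Apt, Unit, #).
# The unit itself must look like "4B", "12", "B" or "B2" so that street words
# such as "Rd" are not mistaken for a unit.
_UNIT = re.compile(
    r"(?:\b(?:apt|apartment|unit|suite|ste)\b\.?\s*|#\s*)"
    r"([0-9]+[a-z]?|[a-z][0-9]*)\s*$",
    re.IGNORECASE,
)

def normalize_address(raw: str) -> Tuple[str, Optional[str]]:
    """Return (street, unit) with lowercase street text and an uppercase unit."""
    text = re.sub(r"\s+", " ", raw.strip().lower())
    match = _UNIT.search(text)
    if match is None:
        # No recognizable unit: keep the street and report the unit as missing.
        return text.rstrip(" ,"), None
    street = text[: match.start()].rstrip(" ,")
    return street, match.group(1).upper()

for sample in ["123 Main St Apt 4B", "123 Main St, Unit 4B", "123 Main St #4B", "123 Main St"]:
    print(sample, "->", normalize_address(sample))
```

Keeping the unit as a separate, possibly missing value matters: a record with no unit at all should still be comparable to one that has "4B", rather than being treated as a different address.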

And the risk runs both ways:

  • False negatives: You miss a prior eviction because the name didn’t match. The tenant moves in. Three months later you’re filing again.
  • False positives: You incorrectly flag a clean applicant because they share a common name and a ZIP code with someone in your records. You deny the application. That’s fair housing liability.

Both errors are expensive. A missed eviction typically costs $3,000–$10,000 per incident in legal fees, vacancy time, and unpaid rent. A wrongful denial can cost more.

Cost of matching errors in tenant screening

| Error type | Scenario | Typical cost |
| --- | --- | --- |
| False negative | Missed prior eviction, tenant moves in | $3,000–$10,000 (legal + vacancy) |
| False negative | Missed lease violation flag | $500–$3,000 (damages, lost rent) |
| False positive | Clean applicant wrongly flagged | Fair housing complaint, legal fees |
| False positive | Common name misidentified | Denied application, liability exposure |

Costs are estimates based on average landlord eviction expenses in US metros.

Where simple fuzzy matching breaks down

Standard fuzzy matching on name fields catches typos and minor variations reasonably well. It does not handle:

  • Legal name changes (maiden name to married name, name normalization)
  • Nickname-to-formal-name equivalences (Mike vs Michael, Liz vs Elizabeth)
  • Common names in dense ZIP codes where multiple real people match on name and general location
  • Records fragmented across two property management software exports where field formats differ

The core problem is that single-field character similarity doesn’t capture the full picture of whether two records represent the same person.
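
To make that concrete, here is a small sketch using Python's standard-library `difflib` as a stand-in for a typical fuzzy string matcher. The nickname table is a hypothetical two-entry example built from the names above; a real one would be far larger.

```python
from difflib import SequenceMatcher

# Hypothetical nickname table, limited to the examples mentioned above.
NICKNAMES = {"mike": "michael", "liz": "elizabeth"}

def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def canonical_first_name(name: str) -> str:
    """Map a known nickname to its formal form; leave other names unchanged."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)

pairs = [
    ("Maria Rodriguez", "Maria Rodrigues"),  # typo: character similarity does well
    ("Mike", "Michael"),                     # nickname: needs a lookup, not a score
    ("Maria Rodriguez", "Maria Kim"),        # name change: the surname shares almost nothing
]
for a, b in pairs:
    print(f"{a!r} vs {b!r}: {char_similarity(a, b):.2f}")
```

The typo pair scores far above the name-change pair, even though both pairs may describe the same person. No threshold on a single name score separates those cases cleanly, which is why the other fields have to carry part of the evidence.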

Correct match rate by scenario: fuzzy string vs AI pipeline

| Scenario | AI pipeline | Fuzzy string only |
| --- | --- | --- |
| Same name, changed phone | 91% | 14% |
| Nickname variant (Mike → Michael) | 88% | 37% |
| Name change (marriage) | 81% | 9% |

How AI improves accuracy

An AI pipeline matches on the combination of fields, not just the name.

Embeddings over the full record (name, prior address, ZIP, partial phone) create a vector representation that reflects the whole record rather than any single field. Two records with different names but a matching ZIP, phone pattern, and prior address will typically score higher than two records that share only a common name.
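
The idea can be illustrated without a trained model. The sketch below substitutes character-trigram counts for a learned embedding, which is a deliberate simplification and not how a production pipeline would build its vectors. The field names and the three sample records are made up for the example.

```python
import math
from collections import Counter

# Hypothetical field names; use whatever your exports actually contain.
FIELDS = ("name", "prior_address", "zip", "phone_last4")

def record_vector(record: dict) -> Counter:
    """Serialize the whole record and count its character trigrams.

    This is a crude stand-in for an embedding model, not a learned embedding.
    """
    text = " ".join(str(record.get(field, "")).lower() for field in FIELDS)
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

historical = {"name": "Maria Rodriguez", "prior_address": "123 Main St", "zip": "78701", "phone_last4": "4821"}
renamed = {"name": "Maria Kim", "prior_address": "123 Main St", "zip": "78701", "phone_last4": "4821"}
namesake = {"name": "Maria Rodriguez", "prior_address": "900 Oak Ave", "zip": "78745", "phone_last4": "1177"}

print("name changed, context matches:", round(cosine(record_vector(historical), record_vector(renamed)), 2))
print("same name, context differs:   ", round(cosine(record_vector(historical), record_vector(namesake)), 2))
```

Even with this toy vector, the applicant whose surname changed but whose address, ZIP, and phone digits line up scores above the namesake who shares only the name.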

Configurable thresholds let you tune for your risk profile. If you want to minimize false negatives (never miss a flagged tenant), you set a lower threshold and review borderline matches manually. If false positives are leading to wrongful denials, you raise the threshold.
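
One way to picture the tuning is a two-threshold triage, sketched below. The cutoff values 0.60 and 0.85 and the label names are placeholders chosen for illustration, not recommended settings.

```python
def triage(score: float, review_at: float = 0.60, match_at: float = 0.85) -> str:
    """Bucket a similarity score; every bucket is still reviewed by a person.

    The default cutoffs are hypothetical. Lowering review_at surfaces more
    borderline pairs (fewer false negatives, more manual review); raising it
    does the opposite.
    """
    if not 0.0 <= review_at <= match_at <= 1.0:
        raise ValueError("expected 0 <= review_at <= match_at <= 1")
    if score >= match_at:
        return "likely_match"
    if score >= review_at:
        return "needs_review"
    return "no_match"

for score in (0.92, 0.71, 0.40):
    print(score, "->", triage(score))
```

A portfolio that cannot afford to miss a flagged tenant might call `triage(score, review_at=0.40)`, accepting a longer review queue in exchange.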

LLM confirmation on borderline pairs applies the reasoning a human reviewer would apply: “The name changed from Rodriguez to Kim, the prior address ZIP code matches, the last four digits of the phone number are the same. This warrants human review.” The LLM doesn’t make the final call — it flags the case for your team with a rationale.
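
As a rough sketch of what that confirmation step might be given, the function below assembles a prompt for a borderline pair. The wording, field names, and function name are assumptions for illustration; no particular LLM provider or API is implied, and the call to the model itself is left out.

```python
def build_review_prompt(applicant: dict, historical: dict, score: float) -> str:
    """Build a prompt asking for a rationale, not an approve/deny decision."""
    lines = [
        "You are assisting a human reviewer in tenant screening.",
        "Compare the two records and list the evidence for and against",
        "them describing the same person.",
        "Do not approve or deny the application. End with either",
        "'warrants human review' or 'unlikely match'.",
        f"Similarity score: {score:.2f}",
        "Applicant record:",
        *(f"  {key}: {value}" for key, value in sorted(applicant.items())),
        "Historical record:",
        *(f"  {key}: {value}" for key, value in sorted(historical.items())),
    ]
    return "\n".join(lines)

prompt = build_review_prompt(
    {"name": "Maria Kim", "zip": "78701", "phone_last4": "4821"},
    {"name": "Maria Rodriguez", "zip": "78701", "phone_last4": "4821"},
    score=0.72,
)
print(prompt)
```

The prompt constrains the model to evidence and a recommendation for review, which keeps the final decision with your team.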

The workflow

  1. Export your new applications batch as a CSV
  2. Export your historical tenant records from your property management system
  3. Run them through Match Data Studio with the AI assistant configured for tenant matching
  4. Review the matched pairs — especially the borderline ones flagged for human review
  5. Proceed with applications where no match is found; escalate those where a match is found

This takes the binary pass/fail of a name-only match and replaces it with a scored, explainable similarity ranking that reflects the actual evidence.

A note on compliance

AI-assisted matching doesn’t replace your tenant screening process — it informs it. The output is a list of candidate matches with confidence scores, not automated decisions. Your team makes the call on every application. The AI’s role is ensuring that the right historical records are surfaced for review, not buried under a failed name match.


Match Data Studio handles the matching step: upload two CSVs, configure, review results. Try it with your own data →

