HealthAtoms
Data & Analytics · concept · 6 min · updated Jun 30, 2026

WhiteRabbit & Rabbit-in-a-Hat (OMOP ETL design)

By Rajendra Sharma, RN, CPC, CPB · Reviewed by Rajendra Sharma, RN, CPC, CPB · Jun 29, 2026

OHDSI tools that profile your source database and help you design the ETL that maps it to the OMOP Common Data Model — the planning stage of data harmonization.

OMOP CDM

In one line

Before you can harmonize data into OMOP, you have to understand what you actually have: WhiteRabbit scans and profiles a source database, and Rabbit-in-a-Hat turns that scan into a visual ETL design mapping each source field to the CDM. Licence: Apache 2.0 (fully open).

[Diagram: source DB → scan (WhiteRabbit) → profile → Rabbit-in-a-Hat → map to OMOP]
WhiteRabbit profiles a source database; Rabbit-in-a-Hat turns that scan into a documented table-and-field mapping to OMOP.

The problem it solves

The hardest part of an OMOP conversion isn't writing the ETL code; it's understanding the messy source first: what tables exist, what the fields really contain, what codes appear and how often. Guess, and the mapping is likely to be wrong. WhiteRabbit and Rabbit-in-a-Hat provide the disciplined planning stage that grounds the mapping in what the data actually contains.
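Those questions (which fields exist, what they contain, how often each value appears) are what a profiling pass answers. The sketch below illustrates the idea in Python on a tiny CSV extract. It is not WhiteRabbit's implementation or report format: the function `profile_csv`, the sample columns, and the `min_cell_count` threshold (loosely modelled on WhiteRabbit's option to suppress rare values) are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def profile_csv(text, min_cell_count=2):
    """Return {field: [(value, count), ...]} with rare values suppressed.

    Hypothetical helper: it imitates the idea of a scan report
    (aggregate counts per field), not WhiteRabbit's actual output.
    """
    reader = csv.DictReader(io.StringIO(text))
    counts = {name: Counter() for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            counts[name][value] += 1

    report = {}
    for name, counter in counts.items():
        # Keep only values seen at least min_cell_count times,
        # most frequent first; rarer values are left out of the report.
        report[name] = [
            (value, n) for value, n in counter.most_common() if n >= min_cell_count
        ]
    return report


# Illustrative data only; column names and codes are made up for the example.
SAMPLE = """patient_id,diagnosis_code
1,E11.9
2,E11.9
3,I10
4,E11.9
"""

report = profile_csv(SAMPLE)
print(report["diagnosis_code"])  # [('E11.9', 3)]
print(report["patient_id"])      # []
```

With this threshold, the single occurrence of I10 and every unique patient_id drop out, so the report carries only aggregate counts and no row of the source survives in it.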

How the two tools work together

  • WhiteRabbit connects to a source database (or CSVs) and produces a scan report: every table, every field, with value distributions and frequencies, without exposing raw records (so it's safe to share with mappers).
  • Rabbit-in-a-Hat reads that report and gives you a drag-and-connect canvas to document how source.diagnosis_code becomes condition_occurrence.condition_concept_id, field by field — including the vocabulary mapping (often refined with Usagi for source-code → standard-concept matching).

The output is the ETL specification your engineers then implement — the design, not the code.
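One way to picture such a specification is as a list of field-level mapping records that can be checked against the scanned fields. The sketch below is a hypothetical in-memory representation, not the file format Rabbit-in-a-Hat saves: the `FieldMapping` class, the `unmapped_fields` helper, and the source fields other than `diagnosis_code` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    """One documented source-to-CDM field mapping (hypothetical structure)."""
    source_table: str
    source_field: str
    target_table: str
    target_field: str
    logic: str  # free-text note explaining the transformation


SPEC = [
    FieldMapping(
        "source", "patient_id",
        "condition_occurrence", "person_id",
        "Look up the person_id assigned when the person table was loaded.",
    ),
    FieldMapping(
        "source", "diagnosis_code",
        "condition_occurrence", "condition_concept_id",
        "Map the source code to a standard concept via the vocabulary.",
    ),
]


def unmapped_fields(scanned_fields, spec):
    """Return scanned (table, field) pairs that no mapping in the spec covers."""
    mapped = {(m.source_table, m.source_field) for m in spec}
    return [field for field in scanned_fields if field not in mapped]


# Fields as they might appear in a scan of the source (illustrative names).
scanned = [
    ("source", "patient_id"),
    ("source", "diagnosis_code"),
    ("source", "diagnosis_date"),
]

print(unmapped_fields(scanned, SPEC))  # [('source', 'diagnosis_date')]
```

Treating the spec as data makes gaps visible: any scanned field without a mapping has to be either mapped or explicitly documented as dropped.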

Where it shows up in digital health

OMOP conversions of a hospital's EHR, a claims extract, or a registry typically start with a WhiteRabbit scan and a Rabbit-in-a-Hat design. It is the methodical counterpart to the "source → standard" scenario in the OMOP Data Harmonization lab: the lab shows the mapping in code; these tools are how teams plan it at scale before writing a line.

Common pitfalls

  • Skipping the scan: mapping from memory or documentation (which is often out of date) instead of from the data's reality.
  • Undocumented decisions: the ETL spec is also the audit record of how data was transformed; vague mappings haunt later analysis.
  • Ignoring frequencies: a code that appears twice and one appearing two million times deserve different attention.
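The frequency point can be made concrete by weighting mapping progress by record count rather than by number of distinct codes. The following is a minimal sketch, assuming the value frequencies have already been pulled from a scan report into a dictionary; `mapping_coverage`, the codes, and the counts are illustrative and not part of either tool.

```python
def mapping_coverage(code_counts, mapped_codes):
    """Return (share of records whose code is mapped, unmapped codes by frequency)."""
    total = sum(code_counts.values())
    covered = sum(n for code, n in code_counts.items() if code in mapped_codes)
    unmapped = sorted(
        ((code, n) for code, n in code_counts.items() if code not in mapped_codes),
        key=lambda item: item[1],
        reverse=True,
    )
    share = covered / total if total else 0.0
    return share, unmapped


# Illustrative frequencies: one very common code, one common, one rare.
code_counts = {"E11.9": 2_000_000, "I10": 500_000, "Z99.9": 2}

coverage, todo = mapping_coverage(code_counts, mapped_codes={"E11.9"})
print(f"{coverage:.1%} of records covered")
print(todo)  # [('I10', 500000), ('Z99.9', 2)]
```

Only one of three codes is mapped here, yet most records are already covered, and the sorted list shows that I10 deserves attention long before the code that appears twice.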

Key takeaways

  • WhiteRabbit profiles the source (tables, fields, value frequencies) — safely, no raw data.
  • Rabbit-in-a-Hat turns that into a documented field-by-field ETL design to OMOP.
  • They're the planning front half of harmonization; engineers implement the spec.
  • Skipping this stage is a common cause of a broken OMOP dataset.

Check your recall

Active recall beats re-reading — try to answer, then reveal.

  1. What do WhiteRabbit and Rabbit-in-a-Hat do?

  2. Why is the scan-first step important?

