Opal vs Armadillo

DataSHIELD in the wild

What They Are

Opal:

  • Mature platform by OBiBa
  • Full data management: storage, harmonisation, metadata, cataloguing
  • Widely used across DataSHIELD consortia and featured in DataSHIELD documentation

Armadillo:

  • Lightweight server developed and supported by the MOLGENIS team
  • Built specifically for DataSHIELD federated analysis
  • Easier deployment, fewer dependencies

Architecture & Storage

Opal

  • Uses relational DBs (MySQL, PostgreSQL, MariaDB) or MongoDB
  • Stores data and metadata in databases
  • Supports complex workflows

Armadillo

  • Stores data on the filesystem (e.g., Parquet files)
  • Uploads via UI or R package
  • Simpler, less infrastructure-heavy
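
To make the file-based approach concrete, here is a minimal sketch of how a table could map to a Parquet file on disk. The project/folder/table layout, the `table_path` helper, and all names in it are assumptions for illustration only, not Armadillo's documented storage scheme.

```python
from pathlib import PurePosixPath


def table_path(root: str, project: str, folder: str, table: str) -> PurePosixPath:
    """Map a (project, folder, table) triple to a Parquet file path.

    Hypothetical layout for illustration, not Armadillo's actual scheme.
    """
    for part in (project, folder, table):
        # Reject empty names and anything that could escape the data root.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid name: {part!r}")
    return PurePosixPath(root) / project / folder / f"{table}.parquet"


# Illustrative names only.
print(table_path("/data", "cohort1", "core", "baseline"))
```

The point of the sketch: with file-based storage, a table is just a file under a directory tree, so backup and inspection need no database tooling.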

Features

Opal

  • Rich functionality: harmonisation, metadata, cataloguing
  • Multiple R servers, horizontal scaling
  • Advanced admin and access control

Armadillo

  • Focused on core federated analysis features
  • Profiles: named, versioned collections of DataSHIELD packages
  • Permissions and function whitelisting
  • Lightweight and user-friendly
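
The idea behind profiles and function whitelisting can be sketched as a small data model. This is a conceptual illustration only: the `Profile` class, its fields, and the `permits` method are hypothetical and do not reflect Armadillo's actual implementation or configuration format. The package and function names are real DataSHIELD identifiers, but the version number is just an example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical model of a profile: a named set of pinned packages
    plus the functions that clients are allowed to call."""
    name: str
    packages: dict[str, str]            # package name -> pinned version
    allowed_functions: frozenset[str]   # entries of the form "package::function"

    def permits(self, package: str, function: str) -> bool:
        # A call passes only if its package belongs to the profile
        # and the function is explicitly on the allow-list.
        return (
            package in self.packages
            and f"{package}::{function}" in self.allowed_functions
        )


# Illustrative values only.
default = Profile(
    name="default",
    packages={"dsBase": "6.3.0"},
    allowed_functions=frozenset({"dsBase::meanDS", "dsBase::dimDS"}),
)

print(default.permits("dsBase", "meanDS"))   # allowed: listed explicitly
print(default.permits("dsBase", "system"))   # denied: not on the allow-list
```

The design choice shown here is deny-by-default: anything not listed is rejected, which is what keeps arbitrary R code away from individual-level data.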

Pros & Cons

Opal

✅ Mature and feature-rich
✅ Good for large, complex infrastructures
❌ Higher complexity and overhead

Armadillo

✅ Lightweight, quick to deploy
✅ Easier for cohorts with modest resources
❌ Fewer features (e.g., less advanced harmonisation)
❌ Newer, still catching up in some areas

When to Use

Choose Opal if:

  • You need full data management, metadata, and harmonisation
  • You operate in a large consortium with complex workflows

Choose Armadillo if:

  • You want lightweight, easy deployment
  • You mainly need secure federated analysis
  • Your team has limited IT resources and wants support from the MOLGENIS team