Get up to speed
This workshop is a hands-on primer on deploying secure, federated DataSHIELD backends with Docker Compose. We will focus on Opal and Armadillo deployments with Rock servers, covering local development, production deployment with TLS, and profile management.
What you’ll learn
- Deploy Opal with Rock servers using Docker Compose (local → production)
- Deploy Armadillo as an alternative DataSHIELD backend
- Configure Nginx reverse proxy: local self‑signed → Let’s Encrypt
- Manage multiple DataSHIELD profiles for different research contexts
- Build custom Rock server images with specific package versions
- Practical Linux/SSH and terminal workflows (useful when working on cloud servers)
- Production deployment strategies: IT-managed vs self-managed
Course structure
- Armadillo local deployment: Quick start with Armadillo + RServer
- Opal local deployment: The core focus, running Opal + Rock + MongoDB locally
- Going live: Production deployment with Nginx, TLS certificates, and DNS
- Managing profiles: Multiple DataSHIELD environments
- Custom Rock images: Building reproducible, version-controlled server images
Audience and setup
- Skill level: beginner-friendly. We’ll explain each step as we go.
- Follow‑along encouraged on your laptop (Mac/Windows/Linux). If you prefer to watch, you’ll still get all commands and files.
Prerequisites checklist
- Laptop: macOS (Apple Silicon is fine), Windows 10/11, or Linux
- Containers:
- Terminal + SSH: basic comfort with the shell and commands such as ssh user@host
- Ports:
  - Local: allow Docker to bind 80/443 (stop other services using these ports)
  - Cloud (optional): if testing live TLS, ensure 22/80/443 are open
- Domain (optional, for live demo): have a test DNS name ready if you want to issue real certificates with Let’s Encrypt; otherwise we’ll stick to self‑signed locally
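To check the local port requirement from the checklist before the session, a small script like the one below may help. It is a minimal sketch, not part of the workshop materials: the function name port_in_use is our own, and it only tests whether something on your machine already accepts TCP connections on ports 80 and 443. It does not check firewall rules or cloud security groups.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for this workshop; it probes by connecting rather
    than binding, so it does not need administrator rights for ports < 1024.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (80, 443):
        if port_in_use(port):
            print(f"port {port}: in use, stop the service holding it first")
        else:
            print(f"port {port}: free")
```

If either port is reported as in use, stop the service that holds it (often a local web server) before starting the Compose stack.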
Target platforms we’ll mention
- On‑prem demo first; cloud notes included
- Cloud suggestion: Ubuntu 24.04 LTS on x86_64 for quick starts
What we’ll deploy (at a glance)
Opal stack
- Local development: Opal/Armadillo + Rock + MongoDB with HTTP access
- Production deployment: Nginx reverse proxy with Let’s Encrypt TLS
- Multiple profiles: Default, survival analysis, genomics environments
- Custom images: Version-controlled Rock servers with specific packages
Armadillo stack
- Armadillo server: Alternative DataSHIELD backend
- RServer integration: DataSHIELD package execution environment
Deployment progression
- Stage 1 (local): HTTP access for development and testing
- Stage 2 (production): HTTPS with proper certificates and DNS
- Stage 3 (advanced): Multiple profiles and custom package management
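Once you reach Stage 2, one way to confirm that HTTPS is served with a trusted certificate is to open a verified TLS connection and read the certificate's expiry date. The sketch below uses only the Python standard library and is an illustration under our own assumptions: the helper names are hypothetical, and the hostname you pass on the command line is whatever DNS name you configured for your deployment. A self-signed local certificate will fail this check by design, because the default context verifies the chain and the hostname.

```python
import socket
import ssl
import sys
import time


def seconds_until_expiry(not_after: str, now: float) -> float:
    """Seconds between `now` (epoch seconds) and a certificate's notAfter date."""
    return ssl.cert_time_to_seconds(not_after) - now


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a verified TLS connection and return the certificate's notAfter string."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: python check_tls.py <hostname>")
    else:
        host = sys.argv[1]
        try:
            remaining = seconds_until_expiry(fetch_not_after(host), time.time())
            print(f"{host}: certificate valid for about {remaining / 86400:.0f} more days")
        except OSError as exc:  # covers DNS, connection, and TLS verification errors
            print(f"{host}: TLS check failed: {exc}")
```

Run it as python check_tls.py followed by your own domain name. Let's Encrypt certificates are short-lived, so a low number of remaining days is a cue to check that renewal is working.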
Tip
No prior sysadmin experience required. We’ll keep commands copy‑pasteable and explain the “why” briefly as we go.
Why start with local deployment?
- Local keeps friction low: no DNS, firewalls, or cloud costs. You can validate the stack in minutes on any laptop.
- Safe sandbox: use HTTP and avoid exposing services to the internet while you explore features.
- Developer‑friendly: iterate on DataSHIELD code and packages against a predictable, reproducible environment.
- Portable: the same Compose foundation scales up later (add DNS, TLS, and hardening without changing core services).
- Teaching aid: a contained lab to learn Opal and Rock.
Pre‑reading (short and optional)
- DataSHIELD overview paper: International Journal of Epidemiology
- Opal (DataSHIELD server) overview: OBiBa Opal
- Armadillo documentation: MOLGENIS Armadillo
- DataSHIELD packages documentation: cran.datashield.org
- Docker basics for this workshop:
- Let’s Encrypt on Ubuntu + Nginx (for production deployment):
  - Certbot + Nginx (Ubuntu): DigitalOcean guide
What we will not cover
- Kubernetes deployments (we focus on Docker Compose)
- Advanced identity management (OIDC/Keycloak)
- Firewall hardening beyond TLS basics
- Advanced monitoring and logging (though we will mention Telegraf/Prometheus + Grafana if time permits)
Outcomes
By the end, you’ll have:
- A working local Opal/Armadillo deployment for development and testing
- Understanding of production deployment options (IT-managed vs self-managed)
- Knowledge of DataSHIELD profile management for different research contexts
- Skills to build and maintain custom Rock server images
- A clear path from development to production deployment