3. Going live: cloud deployment

Goal

Take your local DataShield deployment public! We’ll deploy Opal + Rock + MongoDB to a live server with automatic SSL certificates, proper security, and a production-ready reverse proxy. By the end, you’ll have a fully functional DataShield server accessible from anywhere on the internet.

Two paths to going public

When you’re ready to make your DataShield deployment publicly accessible, you have two main options:

Path 2: Full Cloud Deployment (What we’ll demonstrate)

  • What you do: Everything - from server setup to SSL certificates to security configuration
  • Best for: Cloud deployments (AWS, Azure, GCP), personal projects, or when you need full control
  • Pros: Complete control, faster iteration, great for learning
  • Cons: You’re responsible for security, updates, and maintenance

In this workshop, we’ll demonstrate Path 2 using AWS because it shows the complete process and gives you full understanding of all components. However, in practice, Path 1 is often the most appropriate choice for research environments.

What we’ll build today

We’re going to transform your local setup into a production-ready deployment with:

  • Automatic SSL certificates from Let’s Encrypt
  • A reverse proxy that serves your application over HTTPS
  • Proper domain access: your colleagues can reach it at https://your-domain.com (not a weird IP address)
  • All the security bells and whistles that make IT departments happy (hopefully)

Architecture (public)

graph LR
  U["User<br/>https://your-domain.com"] -->|DNS resolves| N["Nginx (80/443)"]
  N -->|proxy https->http| O["Opal (8080)"]
  subgraph Profiles["docker network: opalnet"]
    O -->|R/DataSHIELD| R["Rock (8085)"]
    O -->|Data storage| MONGO["MongoDB (27017)"]
  end
  ACME["Certbot webroot<br/>/.well-known/acme-challenge/"] -.->|HTTP-01| N

  %% Define a lighter background for the subgraph
  classDef light fill:#f9f9f9,stroke:#aaa,stroke-width:1px;
  class Profiles light;

Prerequisites

  • Registered domain name, e.g. opal.example.org
  • DNS A/AAAA record pointing to your server’s public IP
  • Ports 80 and 443 open to the internet (cloud security group, host firewall, or ufw)
  • Linux host recommended (Ubuntu 22.04/24.04), Docker + Compose v2
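
Before going further, it can help to confirm the tooling is in place. The following is a minimal preflight sketch, not part of the provided deployment scripts; the helper names `have_cmd` and `missing_cmds` are made up for this example.

```shell
#!/bin/sh
# Preflight sketch: report which prerequisite tools are missing on this host.

# Succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print the names of the missing commands from the argument list, one per line.
missing_cmds() {
  for cmd in "$@"; do
    have_cmd "$cmd" || echo "$cmd"
  done
}

missing=$(missing_cmds docker curl)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are joined on one line.
  echo "Missing tools:" $missing
else
  echo "docker and curl found"
  # 'docker compose version' only succeeds when the Compose v2 plugin is installed.
  if docker compose version >/dev/null 2>&1; then
    echo "Compose v2 found"
  else
    echo "Compose v2 plugin not found"
  fi
fi
```

If the script reports that Compose v2 is missing, the `docker compose ...` commands used later in this chapter will not work until the plugin is installed.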

Architecture transformation

We’re taking your simple local setup and adding production-grade components:

  • Nginx (NEW!): Acts as a security guard and traffic director
    • Handles SSL certificates
    • Adds security headers to protect against attacks
    • Rate limits to prevent abuse (optional)
  • Opal: Your existing DataShield administration server (now behind the proxy)
  • Rock: Your existing R computation server (unchanged)
  • MongoDB: Your existing database backend (unchanged)
  • Certbot (NEW!): Automatically gets and renews SSL certificates from Let’s Encrypt

Files you’ll be working with

Don’t worry - it’s not as complex as it looks! All of these files are provided in our live deployment scripts:

datashield-live/
├── .env                    # 🔧 YOU EDIT: Just your domain name and password
├── docker-compose.yml      # 📋 PROVIDED: Orchestrates all services
├── nginx-template.conf     # 📋 PROVIDED: Nginx configuration with SSL
├── nginx-http-only.conf    # 📋 PROVIDED: Temporary config for getting certificates
├── get-certs.sh            # 🚀 PROVIDED: One-click SSL certificate setup
└── renew-certs.sh          # 🔄 PROVIDED: Automatic certificate renewal

You literally only need to edit ONE file (.env) - everything else is ready to go!

1) Environment Configuration

This is the only file you need to edit! Create your .env file:

# DNS Configuration - CHANGE THIS to your actual domain!
DNS_DOMAIN=datashield.myresearch.org
# Opal Configuration - CHANGE THIS to a strong password!
OPAL_ADMINISTRATOR_PASSWORD=SuperSecurePassword123!

Important

Domain setup is crucial! Before proceeding, make sure your domain’s DNS A record points to your server’s public IP address. You can check this with:

nslookup datashield.myresearch.org
# Should return your server's IP address
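
If you would rather have the comparison done for you, the sketch below checks whether the domain's A records include the server's own public IP. It assumes `dig` and `curl` are installed and uses checkip.amazonaws.com to discover the public IP; the script name `check-dns.sh` and the function names are hypothetical.

```shell
#!/bin/sh
# Succeeds if EXPECTED_IP is one of the addresses in RESOLVED (one address per line).
ip_in_list() {
  expected_ip=$1
  resolved=$2
  [ -n "$expected_ip" ] && printf '%s\n' "$resolved" | grep -Fqx "$expected_ip"
}

# Compare the domain's A records with this machine's public IP.
# checkip.amazonaws.com replies with the caller's public IPv4 address.
check_dns() {
  domain=$1
  resolved=$(dig +short A "$domain")
  public_ip=$(curl -s https://checkip.amazonaws.com)
  if ip_in_list "$public_ip" "$resolved"; then
    echo "OK: $domain resolves to $public_ip"
  else
    echo "Not yet: $domain resolves to '${resolved}', this host is $public_ip"
    return 1
  fi
}

# Run on the server as: sh check-dns.sh datashield.myresearch.org
if [ $# -gt 0 ]; then
  check_dns "$1"
fi
```

The match is exact and line-based, so 203.0.113.1 is not mistaken for 203.0.113.10.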

Tip

Password security tip: Use a password manager to generate a strong password. This will be the admin password for your DataShield server, so make it count!
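
If no password manager is at hand, a random password can be produced on the command line. This is a sketch that assumes `head -c` and `base64` are available (they are on typical Linux hosts); `gen_password` is a name invented for this example.

```shell
#!/bin/sh
# Generate a random password from the kernel's random source.
# 24 random bytes encode to exactly 32 base64 characters.
gen_password() {
  head -c 24 /dev/urandom | base64 | tr -d '\n'
}

gen_password
echo

# After pasting the value into .env, restrict the file to your own user:
#   chmod 600 .env
```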

Note

As mentioned before, in production, you should use Docker secrets, Kubernetes secrets, or your platform’s native secrets management system. We use a static file here for simplicity.

2) Docker Compose Configuration

The docker-compose.yml includes nginx reverse proxy, automatic SSL certificates, and all DataShield services:

services:
  nginx:
    image: nginx:alpine
    depends_on:
      - opal
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx-template.conf:/etc/nginx/nginx-template.conf:ro
      - ./ssl:/etc/ssl:ro
      # ACME webroot shared with certbot (host directory name is an assumption)
      - ./certbot-www:/var/www/certbot:ro
    environment:
      - DNS_DOMAIN=${DNS_DOMAIN}
    command: /bin/sh -c "sed \"s/DNS_DOMAIN_PLACEHOLDER/$${DNS_DOMAIN}/g\" /etc/nginx/nginx-template.conf > /etc/nginx/nginx.conf && nginx -g 'daemon off;'"
    restart: unless-stopped

  certbot:
    image: certbot/certbot:latest
    volumes:
      - ./ssl:/etc/letsencrypt
      # Same webroot, writable so certbot can place the HTTP-01 challenge files
      - ./certbot-www:/var/www/certbot
    environment:
      - DNS_DOMAIN=${DNS_DOMAIN}
    command: certonly --webroot --webroot-path=/var/www/certbot --register-unsafely-without-email --agree-tos --no-eff-email --expand -d ${DNS_DOMAIN}

  opal:
    image: obiba/opal:latest
    depends_on:
      - rock
      - mongo
    environment:
      - OPAL_ADMINISTRATOR_PASSWORD=${OPAL_ADMINISTRATOR_PASSWORD}
      - MONGO_HOST=mongo
      - MONGO_PORT=27017
      - ROCK_HOSTS=rock:8085
    volumes:
      - ./data/opal:/srv
      - ./logs:/var/log/opal

  mongo:
    image: mongo:6.0
    volumes:
      - ./data/mongo:/data/db

  rock:
    image: datashield/rock-base:latest
    environment:
      - ROCK_ID=new-stack-rock

networks:
  default:
    name: opalnet
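
The nginx service's `command` is the least obvious line in this file: at container start it runs `sed` over the template and writes the result to /etc/nginx/nginx.conf. The sketch below reproduces that substitution on a one-line template. The template line is made up for illustration, since the real nginx-template.conf is not shown here, and `render_template` is a hypothetical helper.

```shell
#!/bin/sh
# Replace DNS_DOMAIN_PLACEHOLDER with the given domain on stdin,
# using the same sed expression as the nginx service's command.
render_template() {
  domain=$1
  sed "s/DNS_DOMAIN_PLACEHOLDER/${domain}/g"
}

# Hypothetical template line, for illustration only.
echo 'server_name DNS_DOMAIN_PLACEHOLDER;' | render_template datashield.myresearch.org
# prints: server_name datashield.myresearch.org;
```

To see the whole file with the values from .env filled in, `docker compose config` prints the interpolated configuration without starting anything.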

3) AWS Setup

EC2 Instance Setup

  1. Launch an EC2 instance (recommended: t3.medium or larger)
  2. Configure Security Group rules:
    • Port 80 (HTTP) - open to 0.0.0.0/0
    • Port 443 (HTTPS) - open to 0.0.0.0/0
    • Port 22 (SSH) - open to your IP
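
The same three rules can also be created with the AWS CLI instead of the console. The sketch below only prints the commands so you can review them first; the security group ID and the SSH source address are placeholders, and `print_sg_rules` is a name invented for this example.

```shell
#!/bin/sh
# Print (not run) the AWS CLI calls for the three inbound rules.
print_sg_rules() {
  group_id=$1   # placeholder, e.g. sg-0123456789abcdef0
  ssh_cidr=$2   # your own address as a /32, e.g. 203.0.113.10/32
  for rule in "80 0.0.0.0/0" "443 0.0.0.0/0" "22 $ssh_cidr"; do
    port=${rule%% *}   # text before the space
    cidr=${rule##* }   # text after the space
    echo "aws ec2 authorize-security-group-ingress --group-id $group_id --protocol tcp --port $port --cidr $cidr"
  done
}

print_sg_rules sg-0123456789abcdef0 203.0.113.10/32
```

Once the output looks right, the printed lines can be pasted into a shell that has AWS credentials configured.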

Domain Configuration

Point your domain DNS to the EC2 instance public IP:

A record: your-domain.com -> YOUR_EC2_PUBLIC_IP

4) The Magic Deployment Process ✨

Ready for the exciting part? Let’s go live!

Step 1: Configure your environment

# Edit the .env file with your domain and password
nano .env
# (Or use your favorite editor: vim, code, etc.)

Step 2: The three-step dance to production! 🕺

# 🚀 Step 1: Start the core DataShield services
echo "Starting DataShield services..."
docker compose up -d mongo rock opal

# ⏳ Wait a moment for services to initialize
echo "Services starting... (this takes about 30 seconds)"
sleep 30

# 🔒 Step 2: Get your shiny SSL certificates!
echo "Getting SSL certificates from Let's Encrypt..."
./get-certs.sh

# 🔄 Step 3: Restart nginx with the new certificates
echo "Configuring nginx with SSL certificates..."
docker compose stop nginx
docker compose rm -f nginx
docker compose up -d nginx

echo "🎉 Deployment complete!"
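
The fixed 30-second pause is a guess: on a slow instance Opal may need longer, and on a fast one the wait is wasted. One alternative is to poll until the service answers. The sketch below is a generic retry helper; `wait_for` is a hypothetical name, and the commented usage assumes Opal responds on port 8080 of the opalnet network and that pulling the public curlimages/curl image is acceptable.

```shell
#!/bin/sh
# Run a command until it succeeds, up to MAX attempts,
# pausing DELAY seconds between attempts.
wait_for() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example: poll Opal for up to about a minute from a throwaway container
# attached to the same Docker network (assumption: Opal answers on 8080).
#   wait_for 30 2 docker run --rm --network opalnet curlimages/curl \
#     -fs -o /dev/null http://opal:8080 && echo "Opal is up"
```

The helper returns a non-zero status when it gives up, so a deployment script can stop instead of requesting certificates for a service that never started.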

Step 3: The moment of truth!

Open your browser and visit https://your-domain.com

You should see:

  • 🔒 A padlock icon in the address bar (SSL is working!)
  • 🏠 The familiar Opal login page
  • 🎯 No browser security warnings

Log in with:

  • Username: administrator
  • Password: The password you set in .env

If you see the Opal dashboard, congratulations! 🎉 You just deployed DataShield to production!

5) Automatic renewal

Let’s Encrypt certificates are valid for 90 days. Renew them periodically and reload Nginx so it picks up the new files.

# Try a dry run first
docker compose run --rm certbot renew --dry-run -w /var/www/certbot

# Example cron (host): attempt renewal daily at 03:00, then reload nginx
# crontab -e
# -T disables TTY allocation, which cron does not provide
0 3 * * * cd /path/to/datashield-live && (docker compose run --rm -T certbot renew -w /var/www/certbot && docker compose exec -T nginx nginx -s reload) >> certbot-renew.log 2>&1

If port 80 cannot be opened, consider DNS-01 challenges (requires DNS provider API integration) instead of HTTP-01.
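
To confirm that renewal is actually happening, you can check how many days the served certificate has left. This is a sketch assuming the `openssl` CLI and GNU `date` (for `date -d`) are installed; the function names and the script name `cert-days.sh` are hypothetical.

```shell
#!/bin/sh
# Whole days between two Unix timestamps (in seconds).
days_between() {
  echo $(( ($2 - $1) / 86400 ))
}

# Days until the certificate served by HOST on port 443 expires.
cert_days_left() {
  host=$1
  # 'openssl x509 -enddate' prints a line such as: notAfter=Jan  1 00:00:00 2030 GMT
  end=$(echo | openssl s_client -connect "$host:443" -servername "$host" 2>/dev/null \
        | openssl x509 -noout -enddate | cut -d= -f2)
  days_between "$(date +%s)" "$(date -d "$end" +%s)"
}

# Run as: sh cert-days.sh datashield.myresearch.org
if [ $# -gt 0 ]; then
  cert_days_left "$1"
fi
```

With a working renewal job, the number should stay well above zero; a value that keeps shrinking toward zero suggests the cron entry is failing.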

6) Testing your live deployment 🧪

Web Access

  • 🏠 Main DataShield Interface: https://your-domain.com
  • ❤️ Health Check: https://your-domain.com/health (should return “healthy”)

Your new credentials

  • Username: administrator
  • Password: Whatever you set in .env as OPAL_ADMINISTRATOR_PASSWORD

The R test that proves it’s working

Now for the real test - connecting from R with proper SSL security (no more scary certificate warnings!):

# Install packages if you haven't already
# install.packages(c("DSI", "DSOpal", "dsBaseClient"))

library(DSI)
library(DSOpal)
library(dsBaseClient)

# 🎉 Notice: No more SSL verification overrides needed!
# Your server now has a proper certificate!

# Set up your connection
b <- DSI::newDSLoginBuilder()
b$append(
  server   = "production",
  url      = "https://your-domain.com",  # 🔒 HTTPS with real certificate!
  user     = "administrator",
  password = "yourpassword!"  # From .env
)

# Connect and test
logins <- b$build()
conns <- DSI::datashield.login(logins)

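# List the objects in the remote R session, then close the connection
ds.ls(datasources = conns)
DSI::datashield.logout(conns)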

🎯 Success indicators:

  • No SSL certificate warnings in R
  • Connection establishes without errors
  • Your browser shows a padlock icon and no certificate warning

Share with colleagues!

Your DataShield server is now accessible to collaborators anywhere in the world:

  • Just share the URL: https://your-domain.com
  • They can use the same R connection code (with their own credentials)
  • No VPN or special network setup required!

Troubleshooting & hardening

  • DNS propagation can take time. Check that dig +short opal.example.org returns your server’s IP.
  • Ensure 80/443 reach your host (cloud security group, ufw, on-prem firewall). Test: curl -I http://opal.example.org/.well-known/acme-challenge/test (the request should reach Nginx, even if the response is a 404).
  • Set strong TLS only (we used TLS1.2/1.3 and HIGH ciphers) and enable HSTS.
  • Consider limiting Nginx request sizes, enabling access logs, and fail2ban/WAF if exposed to the internet.
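
A quick way to verify the hardening from outside is to inspect the response headers. The sketch below checks for a header by name on stdin; `has_header` is a hypothetical helper, and the commented usage assumes your domain is already serving HTTPS.

```shell
#!/bin/sh
# Succeeds if the HTTP response headers on stdin include the named header
# (header names are matched case-insensitively).
has_header() {
  grep -qi "^$1:"
}

# Example against a live server:
#   curl -sI https://your-domain.com | has_header strict-transport-security \
#     && echo "HSTS enabled" || echo "HSTS header missing"
```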
