3. Going live: cloud deployment
Goal
Take your local DataShield deployment public! We’ll deploy Opal + Rock + MongoDB to a live server with automatic SSL certificates, proper security, and a production-ready reverse proxy. By the end, you’ll have a fully functional DataShield server accessible from anywhere on the internet.
Two paths to going public
When you’re ready to make your DataShield deployment publicly accessible, you have two main options:
Path 1: Research Center IT (Recommended for most users)
- What you do: Deploy DataShield locally on a server provided by your research center (using the setup from section 2)
- What IT does: Configure firewall rules, domain names, SSL certificates, and network access
- Best for: Most research scenarios where your institution has dedicated IT support
- Pros: IT handles security, compliance, and infrastructure concerns
- Cons: Requires coordination with IT department, may have institutional restrictions
Path 2: Full Cloud Deployment (What we’ll demonstrate)
- What you do: Everything - from server setup to SSL certificates to security configuration
- Best for: Cloud deployments (AWS, Azure, GCP), personal projects, or when you need full control
- Pros: Complete control, faster iteration, great for learning
- Cons: You’re responsible for security, updates, and maintenance
In this workshop, we’ll demonstrate Path 2 using AWS because it shows the complete process and gives you a full understanding of all the components. In practice, however, Path 1 is often the most appropriate choice for research environments.
What we’ll build today
We’re going to transform your local setup into a production-ready deployment with:
- Automatic SSL certificates from Let’s Encrypt
- Reverse proxy that serves your application via https
- Proper domain access: your colleagues can reach it via https://your-domain.com (not a weird IP address)
- All the security bells and whistles that make IT departments happy (hopefully)
Architecture (public)
graph LR
  U["User<br/>https://your-domain.com"] -->|DNS resolves| N["Nginx (80/443)"]
  N -->|proxy https->http| O["Opal (8080)"]
  subgraph Profiles["docker network: opalnet"]
    O -->|R/DataSHIELD| R["Rock (8085)"]
    O -->|Data storage| MONGO["MongoDB (27017)"]
  end
  ACME["Certbot webroot<br/>/.well-known/acme-challenge/"] -.->|HTTP-01| N
  %% Define a lighter background for the subgraph
  classDef light fill:#f9f9f9,stroke:#aaa,stroke-width:1px;
  class Profiles light;
Prerequisites
- Registered domain name, e.g. opal.example.org
- DNS A/AAAA record pointing to your server’s public IP
- Ports 80 and 443 open to the internet (cloud SG/firewall/ufw)
- Linux host recommended (Ubuntu 22.04/24.04), Docker + Compose v2
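Before going further, you can confirm that the command-line tools from the list above are actually installed on the server. The helper below is a minimal sketch: the function name missing_tools is illustrative, and it only checks that commands exist on PATH, not that ports 80 and 443 are reachable from the internet.

```shell
# Print the name of every listed command that is not on PATH.
# Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage on the server (uncomment to run):
# missing_tools docker curl dig || echo "Install the tools listed above first"
# docker compose version   # prints the Compose v2 version if the plugin is installed
```

If the first command prints nothing, all three tools were found.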
Architecture transformation
We’re taking your simple local setup and adding production-grade components:
- Nginx (NEW!): Acts as a security guard and traffic director
  - Handles SSL certificates
  - Adds security headers to protect against attacks
  - Rate limits to prevent abuse (optional)
- Opal: Your existing DataShield administration server (now behind the proxy)
- Rock: Your existing R computation server (unchanged)
- MongoDB: Your existing database backend (unchanged)
- Certbot (NEW!): Automatically gets and renews SSL certificates from Let’s Encrypt
Files you’ll be working with
Don’t worry - it’s not as complex as it looks! All of these files are provided in our live deployment scripts:
datashield-live/
├── .env # 🔧 YOU EDIT: Just your domain name and password
├── docker-compose.yml # 📋 PROVIDED: Orchestrates all services
├── nginx-template.conf # 📋 PROVIDED: Nginx configuration with SSL
├── nginx-http-only.conf # 📋 PROVIDED: Temporary config for getting certificates
├── get-certs.sh # 🚀 PROVIDED: One-click SSL certificate setup
└── renew-certs.sh # 🔄 PROVIDED: Automatic certificate renewal
You literally only need to edit ONE file (.env) - everything else is ready to go!
1) Environment Configuration
This is the only file you need to edit! Create your .env file:
# DNS Configuration - CHANGE THIS to your actual domain!
DNS_DOMAIN=datashield.myresearch.org
# Opal Configuration - CHANGE THIS to a strong password!
OPAL_ADMINISTRATOR_PASSWORD=SuperSecurePassword123!
Domain setup is crucial! Before proceeding, make sure your domain’s DNS A record points to your server’s public IP address. You can check this with:
nslookup datashield.myresearch.org
# Should return your server's IP address
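If you would rather have the comparison done for you than read the nslookup output by eye, something like the following could work. It is a sketch that assumes dig is installed; the function names and the address 203.0.113.10 (a documentation-only example) are placeholders, not part of the workshop scripts.

```shell
# Succeeds if the expected IP (first argument) appears among the
# remaining arguments.
ip_in_list() {
  expected="$1"
  shift
  for ip in "$@"; do
    [ "$ip" = "$expected" ] && return 0
  done
  return 1
}

# Resolve the A records of a domain and compare them with the expected IP.
check_dns() {
  domain="$1"
  expected_ip="$2"
  resolved="$(dig +short A "$domain")"
  # $resolved is intentionally unquoted so each address becomes one argument.
  if ip_in_list "$expected_ip" $resolved; then
    echo "OK: $domain resolves to $expected_ip"
  else
    echo "MISMATCH: $domain resolves to: ${resolved:-nothing}" >&2
    return 1
  fi
}

# Usage (replace both values with your own):
# check_dns datashield.myresearch.org 203.0.113.10
```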
Password security tip: Use a password manager to generate a strong password. This will be the admin password for your DataShield server, so make it count!
As mentioned before, in production, you should use Docker secrets, Kubernetes secrets, or your platform’s native secrets management system. We use a static file here for simplicity.
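A common slip is deploying with the example values still in place. The sketch below reads the two variables from a .env file and refuses the tutorial defaults. The function names env_value and check_env are illustrative, and the parsing assumes simple KEY=value lines without quotes.

```shell
# Read one KEY=value entry from an env file (the last match wins).
env_value() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Fail if the domain or password is missing or still set to the example value.
check_env() {
  file="$1"
  domain="$(env_value DNS_DOMAIN "$file")"
  password="$(env_value OPAL_ADMINISTRATOR_PASSWORD "$file")"
  rc=0
  if [ -z "$domain" ] || [ "$domain" = 'datashield.myresearch.org' ]; then
    echo "DNS_DOMAIN is empty or still the example value" >&2
    rc=1
  fi
  if [ -z "$password" ] || [ "$password" = 'SuperSecurePassword123!' ]; then
    echo "OPAL_ADMINISTRATOR_PASSWORD is empty or still the example value" >&2
    rc=1
  fi
  return "$rc"
}

# Usage:
# check_env .env && echo ".env looks ready"
# openssl rand -base64 24   # one way to generate a random 32-character password
```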
2) Docker Compose Configuration
The docker-compose.yml includes the nginx reverse proxy, automatic SSL certificates, and all DataShield services:
services:
  nginx:
    image: nginx:alpine
    depends_on:
      - opal
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx-template.conf:/etc/nginx/nginx-template.conf:ro
      - ./ssl:/etc/ssl:ro
    environment:
      - DNS_DOMAIN=${DNS_DOMAIN}
    command: /bin/sh -c "sed \"s/DNS_DOMAIN_PLACEHOLDER/$${DNS_DOMAIN}/g\" /etc/nginx/nginx-template.conf > /etc/nginx/nginx.conf && nginx -g 'daemon off;'"
    restart: unless-stopped
  certbot:
    image: certbot/certbot:latest
    volumes:
      - ./ssl:/etc/letsencrypt
    environment:
      - DNS_DOMAIN=${DNS_DOMAIN}
    command: certonly --webroot --webroot-path=/var/www/certbot --register-unsafely-without-email --agree-tos --no-eff-email --expand -d ${DNS_DOMAIN}
  opal:
    image: obiba/opal:latest
    depends_on:
      - rock
      - mongo
    environment:
      - OPAL_ADMINISTRATOR_PASSWORD=${OPAL_ADMINISTRATOR_PASSWORD}
      - MONGO_HOST=mongo
      - MONGO_PORT=27017
      - ROCK_HOSTS=rock:8085
    volumes:
      - ./data/opal:/srv
      - ./logs:/var/log/opal
  mongo:
    image: mongo:6.0
    volumes:
      - ./data/mongo:/data/db
  rock:
    image: datashield/rock-base:latest
    environment:
      - ROCK_ID=new-stack-rock
networks:
  default:
    name: opalnet
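The nginx service's command renders its configuration at container start by replacing DNS_DOMAIN_PLACEHOLDER with your domain. To see what that sed call does, here is the same substitution applied to a two-line fragment. The fragment is hypothetical (the real nginx-template.conf is not reproduced in this section), and render_template is just an illustrative name.

```shell
# Replace every DNS_DOMAIN_PLACEHOLDER on stdin with the given domain,
# mirroring the sed call in the nginx service's command.
render_template() {
  sed "s/DNS_DOMAIN_PLACEHOLDER/$1/g"
}

# Hypothetical template fragment (not the workshop's actual file).
printf '%s\n' \
  'server_name DNS_DOMAIN_PLACEHOLDER;' \
  'ssl_certificate /etc/ssl/live/DNS_DOMAIN_PLACEHOLDER/fullchain.pem;' \
  | render_template "datashield.myresearch.org"

# To preview the compose file with the .env values filled in:
# docker compose config
```

Running the block prints both lines with datashield.myresearch.org in place of the placeholder.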
3) AWS Setup
EC2 Instance Setup
- Launch an EC2 instance (recommended: t3.medium or larger)
- Configure Security Group rules:
  - Port 80 (HTTP) - open to 0.0.0.0/0
  - Port 443 (HTTPS) - open to 0.0.0.0/0
  - Port 22 (SSH) - open to your IP
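If you prefer the AWS CLI to the console, the three rules above map onto aws ec2 authorize-security-group-ingress calls. The sketch below only prints the commands so you can review them before running anything. The security group ID and the SSH address are made-up placeholders, and sg_rule_commands is an illustrative name.

```shell
# Print the AWS CLI commands that would open ports 80/443 to the world
# and port 22 to a single CIDR. Nothing is executed here.
sg_rule_commands() {
  sg_id="$1"
  ssh_cidr="$2"
  for port in 80 443; do
    echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port $port --cidr 0.0.0.0/0"
  done
  echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port 22 --cidr $ssh_cidr"
}

# Example with placeholder values (replace with your group ID and your own IP):
sg_rule_commands sg-0123456789abcdef0 198.51.100.7/32
```

Using a /32 for SSH keeps port 22 limited to one address, which matches the "open to your IP" rule.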
Domain Configuration
Point your domain DNS to the EC2 instance public IP:
A record: your-domain.com -> YOUR_EC2_PUBLIC_IP
4) The Magic Deployment Process ✨
Ready for the exciting part? Let’s go live!
Step 1: Configure your environment
# Edit the .env file with your domain and password
nano .env
# (Or use your favorite editor: vim, code, etc.)
Step 2: The three-step dance to production! 🕺
# 🚀 Step 1: Start the core DataShield services
echo "Starting DataShield services..."
docker compose up -d mongo rock opal
# ⏳ Wait a moment for services to initialize
echo "Services starting... (this takes about 30 seconds)"
sleep 30
# 🔒 Step 2: Get your shiny SSL certificates!
echo "Getting SSL certificates from Let's Encrypt..."
./get-certs.sh
# 🔄 Step 3: Restart nginx with the new certificates
echo "Configuring nginx with SSL certificates..."
docker compose stop nginx
docker compose rm -f nginx
docker compose up -d nginx
echo "🎉 Deployment complete!"
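The fixed sleep 30 is a guess: on a slow instance Opal may need longer, and on a fast one you wait for nothing. One alternative is to poll until a check succeeds. The helper below is a sketch, and the readiness probe in the usage comment rests on assumptions: that Opal answers plain HTTP on port 8080 inside the opalnet network, and that pulling the public curlimages/curl image is acceptable on your server.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Arguments: max attempts, seconds between attempts, then the command.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage in place of the fixed sleep (up to 30 tries, 2 seconds apart):
# wait_until 30 2 docker run --rm --network opalnet curlimages/curl -fsS -o /dev/null http://opal:8080 \
#   || echo "Opal did not come up in time; check: docker compose logs opal"
```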
Step 3: The moment of truth!
Open your browser and visit https://your-domain.com
You should see:
- 🔒 A beautiful green lock icon (SSL is working!)
- 🏠 The familiar Opal login page
- 🎯 No browser security warnings
Log in with:
- Username: administrator
- Password: the password you set in .env
If you see the Opal dashboard, congratulations! 🎉 You just deployed DataShield to production!
5) Automatic renewal
Let’s Encrypt certificates are valid for 90 days. Renew them periodically and reload Nginx so it picks up the new files.
# Try a dry run first
docker compose run --rm certbot renew --dry-run -w /var/www/certbot
# Example cron (host): renew daily at 03:00, reload nginx, and log both steps
# (-T disables TTY allocation, since cron has no terminal)
# crontab -e
0 3 * * * cd /path/to/datashield-live && { docker compose run --rm certbot renew -w /var/www/certbot && docker compose exec -T nginx nginx -s reload; } >> certbot-renew.log 2>&1
If port 80 cannot be opened, consider DNS-01 challenges (requires DNS provider API integration) instead of HTTP-01.
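To confirm that renewal is really happening, it helps to check how many days the served certificate has left. The sketch below assumes openssl and GNU date (the default on Ubuntu) are available. The function names are illustrative.

```shell
# Whole days between two Unix timestamps (first: now, second: expiry).
days_between() {
  echo $(( ($2 - $1) / 86400 ))
}

# Fetch the served certificate's notAfter date and convert it to a Unix timestamp.
cert_expiry_epoch() {
  domain="$1"
  end_date="$(echo | openssl s_client -connect "$domain:443" -servername "$domain" 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2)"
  date -d "$end_date" +%s
}

# Usage:
# echo "Days left: $(days_between "$(date +%s)" "$(cert_expiry_epoch datashield.myresearch.org)")"
```

With a daily renewal job in place, the number should not keep falling toward zero. If it does, look at certbot-renew.log.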
6) Testing your live deployment 🧪
Web Access
- 🏠 Main DataShield Interface: https://your-domain.com
- ❤️ Health Check: https://your-domain.com/health (should return “healthy”)
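The same health check can be scripted from any machine with curl. This is a minimal sketch with illustrative function names. Because curl verifies certificates by default, a passing check also tells you the certificate is trusted.

```shell
# Compare an observed value with the expected one and report the result.
expect() {
  label="$1"
  expected="$2"
  actual="$3"
  if [ "$expected" = "$actual" ]; then
    echo "PASS $label"
  else
    echo "FAIL $label (expected '$expected', got '$actual')"
    return 1
  fi
}

# Request /health over HTTPS and compare the body with "healthy".
smoke_test() {
  domain="$1"
  body="$(curl -fsS "https://$domain/health" | tr -d '[:space:]')"
  expect "health endpoint" healthy "$body"
}

# Usage:
# smoke_test datashield.myresearch.org
```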
Your new credentials
- Username: administrator
- Password: whatever you set in .env as OPAL_ADMINISTRATOR_PASSWORD
The R test that proves it’s working
Now for the real test - connecting from R with proper SSL security (no more scary certificate warnings!):
# Install packages if you haven't already
# install.packages(c("DSI", "DSOpal", "dsBaseClient"))
library(DSI)
library(DSOpal)
library(dsBaseClient)
# 🎉 Notice: No more SSL verification overrides needed!
# Your server now has a proper certificate!
# Set up your connection
b <- DSI::newDSLoginBuilder()
b$append(
  server = "production",
  url = "https://your-domain.com", # 🔒 HTTPS with real certificate!
  user = "administrator",
  password = "yourpassword!" # From .env
)
# Connect and test
logins <- b$build()
conns <- DSI::datashield.login(logins)
ds.ls()
🎯 Success indicators:
- No SSL certificate warnings in R
- Connection establishes without errors
- Your browser shows a green lock icon
Troubleshooting & hardening
- DNS propagation can take time. Check that this resolves to your IP:
  dig +short opal.example.org
- Ensure ports 80/443 reach your host (cloud SG, ufw, on-prem firewall). This request should hit Nginx:
  curl -I http://opal.example.org/.well-known/acme-challenge/test
- Set strong TLS only (we used TLS1.2/1.3 and HIGH ciphers) and enable HSTS.
- Consider limiting Nginx request sizes, enabling access logs, and fail2ban/WAF if exposed to the internet.
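To verify that a hardening header such as HSTS is actually being sent, you can inspect the response headers. The helper below is a sketch with an illustrative name. It only checks that the header is present, not that its value is sensible.

```shell
# Succeeds if the response headers on stdin include the named header
# (case-insensitive match at the start of a line).
has_header() {
  grep -qi "^$1:"
}

# Usage against the live server:
# curl -sI https://datashield.myresearch.org | has_header strict-transport-security \
#   && echo "HSTS enabled" || echo "HSTS missing"
```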
References:
- Opal Docker image and configuration: Installation — Docker Image, Configuration
- Nginx/Certbot flow: DigitalOcean — Secure Nginx with Let’s Encrypt (Ubuntu)