Getting Started
Get your data science environment up and running in under 5 minutes, plus the time needed to download the image.
Prerequisites
Before you begin, ensure you have:
- Docker Desktop (Mac/Windows) or Docker Engine (Linux)
- Docker Compose (included with Docker Desktop)
- At least 8GB RAM available for Docker
- 20GB free disk space for images and volumes
Check Docker Installation
Both commands should return version numbers. If not, install Docker first.
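The two checks might look like the sketch below. It assumes the standalone docker-compose binary that the rest of this guide uses; on installations that ship Compose as a Docker plugin, docker compose version is the equivalent. The check_docker_tools helper is a name introduced here for illustration only.

```shell
# Report the installed version of each required tool.
# check_docker_tools is an illustrative helper, not part of the project.
check_docker_tools() {
  for tool in docker docker-compose; do
    if command -v "$tool" >/dev/null 2>&1; then
      "$tool" --version
    else
      echo "$tool: not found, install Docker first"
    fi
  done
}

check_docker_tools
```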
Step 1: Clone the Repository
Step 2: Run the Setup Script
The setup script creates necessary directories and configuration files:
This script:
- Creates the volumes/ directory structure for persistence
- Copies .env.example to .env for configuration
- Initializes the R package library volume
- Sets correct permissions
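In shell terms, the documented steps amount to roughly the following. This is a hypothetical sketch, not the project's actual script: setup_workspace is an invented name, the subdirectories under volumes/ are not shown because this guide does not list them, and the R package library initialization is left out. The 755 mode is borrowed from the Troubleshooting section.

```shell
# Hypothetical equivalent of the setup steps; the real script may differ.
# Usage: setup_workspace PATH_TO_REPOSITORY
setup_workspace() {
  project_dir="$1"

  # Create the top-level directory used for persistent data
  mkdir -p "$project_dir/volumes"

  # Copy the example configuration only if no .env exists yet
  if [ ! -f "$project_dir/.env" ] && [ -f "$project_dir/.env.example" ]; then
    cp "$project_dir/.env.example" "$project_dir/.env"
  fi

  # Make the volume tree readable and traversable
  chmod -R 755 "$project_dir/volumes"
}
```

Called as setup_workspace . from the repository root, it leaves an existing .env untouched, so rerunning it should not overwrite your credentials.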
Step 3: Configure Your Environment
Edit the .env file to set your credentials:
Required settings:
# Authentication
RSTUDIO_PASSWORD=your-secure-password
JUPYTER_TOKEN=your-secure-token
# Optional: Custom username (default: rstudio)
RSTUDIO_USER=your-username
Security Note
The .env file contains secrets and is excluded from git. Never commit it to version control.
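If you need values for the two secrets, one option is to generate them from the system's random source. The sketch below assumes a Unix-like host with /dev/urandom, od and tr; random_hex is an illustrative helper, and the lengths (16 and 32 bytes) are arbitrary choices, not project requirements.

```shell
# Print N random bytes from /dev/urandom as a hex string.
# random_hex is an illustrative helper, not part of the project.
random_hex() {
  od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n'
}

RSTUDIO_PASSWORD="$(random_hex 16)"   # 32 hex characters
JUPYTER_TOKEN="$(random_hex 32)"      # 64 hex characters

# Print the lines in .env format so they can be pasted into the file
printf 'RSTUDIO_PASSWORD=%s\nJUPYTER_TOKEN=%s\n' "$RSTUDIO_PASSWORD" "$JUPYTER_TOKEN"
```

Paste the printed lines into .env in place of the placeholder values.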
Step 4: Pull the Docker Image
The pre-built image is available from GitHub Container Registry (GHCR) and Docker Hub:
Image Size
The image is approximately 8GB. The initial pull can take several minutes or more, depending on your connection.
Step 5: Start the Services
The -d flag runs containers in detached mode (background).
Verify the container is running:
You should see:
Step 6: Access Your Environment
Open your browser and navigate to:
| Service | URL | Credentials |
|---|---|---|
| RStudio Server | http://localhost:8787 | RSTUDIO_USER and RSTUDIO_PASSWORD from .env |
| JupyterLab | http://localhost:8888 | JUPYTER_TOKEN from .env |
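The services can take a moment to start listening after the container comes up. A small polling helper such as the one sketched below can tell you when a URL starts answering. It assumes curl is installed on the host; wait_for_url and its default of 30 attempts with a 2 second pause are illustrative choices, not part of the project.

```shell
# Poll a URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, --max-time: bound each try.
    # Any HTTP response, including a login redirect, counts as "answering".
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out: $url" >&2
  return 1
}
```

For example, wait_for_url http://localhost:8787 followed by wait_for_url http://localhost:8888 checks both services from the table above.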
Verify Everything Works
Test RStudio
- Open http://localhost:8787
- Log in with your credentials
- In the console, run:
You should see a scatter plot in the Plots pane.
Test JupyterLab
- Open http://localhost:8888
- Enter your token
- Create a new Python notebook and run:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'x': range(10), 'y': range(10)})
df.plot(x='x', y='y')
plt.show()
You should see a line plot.
Test R Kernel in Jupyter
- In JupyterLab, create a new notebook with the "R" kernel
- Run:
Common Commands
# Start services
docker-compose up -d
# Stop services (data persists)
docker-compose down
# View logs
docker-compose logs -f
# Restart services
docker-compose restart
# Check status
docker-compose ps
Next Steps
Troubleshooting
Container won't start
# Check logs for errors
docker-compose logs
# Common fixes:
# - Port already in use: Change ports in .env
# - Permission denied: Run chmod -R 755 volumes/
# - Out of memory: Allocate more RAM to Docker
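To confirm a port conflict before editing .env, you can look for an existing listener on the two ports this guide uses. The sketch below assumes lsof is available on the host; port_in_use is an illustrative helper. On Linux, listeners owned by other users may only be visible when the check is run with elevated privileges, so "free" here means "no listener visible to the current user".

```shell
# Show which process, if any, is listening on a TCP port.
# port_in_use is an illustrative helper; it relies on lsof being installed.
port_in_use() {
  if ! command -v lsof >/dev/null 2>&1; then
    echo "lsof not found: cannot check port $1"
    return 2
  fi
  # -nP: skip name lookups, -iTCP:PORT -sTCP:LISTEN: listening sockets only
  if lsof -nP -iTCP:"$1" -sTCP:LISTEN; then
    return 0
  fi
  echo "port $1 is free"
  return 1
}

for port in 8787 8888; do
  port_in_use "$port" || true
done
```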
Can't access the web interface
- Verify the container is running: docker-compose ps
- Check if ports are exposed: docker port datasci-homelab
- Try accessing via 127.0.0.1 instead of localhost
Authentication not working
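# Reset RStudio password
# (RSTUDIO_USER is expanded by your host shell; falls back to the default user, rstudio)
docker-compose exec homelab passwd "${RSTUDIO_USER:-rstudio}"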
# Get Jupyter token from logs
docker-compose logs homelab | grep token
For more issues, see the Troubleshooting Guide.