Compute Providers
Biom supports five compute providers for running scientific models. The system automatically selects the cheapest GPU that meets the model’s VRAM requirements, or you can choose a provider manually.

Modal Labs (default cloud GPU)
Modal is the default compute provider for all GPU-requiring models.

- Serverless GPU — no instances to manage, pay only for compute time
- Auto-scaling — scales to zero when idle
- GPU selection — system auto-selects cheapest GPU meeting VRAM requirements
- Cost estimation — estimated cost displayed before execution
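The auto-selection logic described above can be sketched as picking the lowest-cost entry from a GPU catalog that satisfies the VRAM requirement. This is an illustrative sketch only: the catalog, prices, and function names are hypothetical, not Biom's actual API.

```python
# Hypothetical per-hour prices and VRAM (GB) for common cloud GPU types.
# These numbers are illustrative, not real Modal pricing.
GPU_CATALOG = [
    {"name": "T4",        "vram_gb": 16, "usd_per_hour": 0.59},
    {"name": "A10G",      "vram_gb": 24, "usd_per_hour": 1.10},
    {"name": "A100",      "vram_gb": 40, "usd_per_hour": 2.10},
    {"name": "A100-80GB", "vram_gb": 80, "usd_per_hour": 3.40},
]

def cheapest_gpu(required_vram_gb: int) -> dict:
    """Return the cheapest catalog entry whose VRAM meets the requirement."""
    candidates = [g for g in GPU_CATALOG if g["vram_gb"] >= required_vram_gb]
    if not candidates:
        raise ValueError(f"No GPU with >= {required_vram_gb} GB VRAM available")
    return min(candidates, key=lambda g: g["usd_per_hour"])

def estimate_cost(required_vram_gb: int, est_runtime_hours: float) -> float:
    """Rough pre-execution cost estimate for the auto-selected GPU."""
    gpu = cheapest_gpu(required_vram_gb)
    return round(gpu["usd_per_hour"] * est_runtime_hours, 2)
```

For example, a model needing 20 GB of VRAM would skip the T4 and land on the A10G, the cheapest option with enough memory.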
Local Docker
Run models on your own machine using Docker:

- GPU passthrough — uses nvidia-docker for GPU access
- GPU detection — auto-detects GPU via nvidia-smi
- Log streaming — real-time progress updates
- Configurable resources:
| Setting | Default |
|---|---|
| Memory limit | 16 GB |
| CPU limit | 8 cores |
| Timeout | 3600 seconds |
| Auto-pull images | On |
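A minimal sketch of how local execution might detect a GPU and apply the default limits from the table. The function names and container image are illustrative assumptions, not Biom's actual implementation; only `nvidia-smi` and the standard `docker run` flags are real.

```python
import shutil
import subprocess

def gpu_available() -> bool:
    """Detect an NVIDIA GPU by checking for a working nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return False
    return subprocess.run(["nvidia-smi"], capture_output=True).returncode == 0

def build_docker_cmd(image: str, memory_gb: int = 16, cpus: int = 8,
                     use_gpu: bool = False) -> list[str]:
    """Assemble a `docker run` command with the default resource limits."""
    cmd = ["docker", "run", "--rm",
           f"--memory={memory_gb}g",   # default 16 GB memory limit
           f"--cpus={cpus}"]           # default 8-core CPU limit
    if use_gpu:
        cmd.append("--gpus=all")       # requires the NVIDIA container toolkit
    cmd.append(image)
    return cmd
```

The 3600-second timeout would typically be enforced by the caller (e.g. `subprocess.run(cmd, timeout=3600)`) rather than by Docker itself.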
HPC / SLURM
Submit jobs to your institution’s HPC cluster:

- SSH connection — connect to cluster head node via SSH
- Singularity containers — or bare-metal execution (module/conda)
- SLURM resource specs:
  - Partition selection
  - GPU type (gres)
  - Time limit
  - Memory per node
  - CPUs per task
| Setting | Default |
|---|---|
| Poll interval | 5 seconds |
| Max wait time | 4 hours |
| SSH connection pool | Max 5 connections |
- SFTP file transfer — for clusters without shared filesystems
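The resource specs above map directly onto `#SBATCH` directives. A sketch of generating such a batch script, assuming Singularity execution; the function name and example values are illustrative, while the directives themselves are standard SLURM.

```python
def build_sbatch_script(partition: str, gpu_type: str, time_limit: str,
                        mem: str, cpus_per_task: int, image: str,
                        command: str) -> str:
    """Render an sbatch script that runs the model inside a Singularity container."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",        # partition selection
        f"#SBATCH --gres=gpu:{gpu_type}:1",        # GPU type via gres
        f"#SBATCH --time={time_limit}",            # time limit
        f"#SBATCH --mem={mem}",                    # memory per node
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        f"singularity exec --nv {image} {command}",
    ])
```

After submission with `sbatch`, job state would be polled (e.g. via `squeue`/`sacct`) every 5 seconds, up to the 4-hour maximum wait.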
User GPU Server (REST API)
Connect any GPU server that exposes a REST API:

- Custom endpoint — point to your own server
- API key auth — Bearer token authentication
- SSL verification — configurable SSL settings
- Standard contract — `/execute` and `/health` endpoints
- Concurrency — max 10 concurrent jobs (configurable)
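A sketch of how a client might assemble a request against this contract. The `/execute` and `/health` endpoints come from the contract above; the payload shape and function name are assumptions for illustration.

```python
def build_request(base_url: str, api_key: str, endpoint: str = "/execute",
                  verify_ssl: bool = True) -> dict:
    """Assemble URL, headers, and TLS settings for a job-submission request."""
    return {
        "url": base_url.rstrip("/") + endpoint,
        "headers": {
            "Authorization": f"Bearer {api_key}",   # API-key (Bearer) auth
            "Content-Type": "application/json",
        },
        "verify": verify_ssl,   # configurable SSL verification
    }
```

The returned dict maps onto a typical HTTP client call, e.g. `requests.post(req["url"], headers=req["headers"], verify=req["verify"], json=payload)`.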
HuggingFace Spaces
Run models hosted on HuggingFace Spaces:

- Gradio client — connects via the Gradio API
- Public and private — supports token-gated private spaces
- No GPU management — compute handled by HuggingFace