Quick Start

Get up and running with Cumulus in minutes.

Installation

pip install cumulus-sdk

Initialize Client

from cumulus import CumulusClient

client = CumulusClient(api_key="YOUR_API_KEY")

Upload a Model

Register your local model with Cumulus:

model = client.upload_model(
    name="detector_model",
    path="./models/detector.pth"
)

Deploy a Model

endpoint = client.deploy(
    model="detector_model",
    workload_type="inference"
)

# Get your endpoint ID
print(endpoint.id)  # e.g. "ep_1a2b3c4d5e6f7g8h"
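Endpoint IDs are stable handles, so a common pattern is to persist the ID after deploying and reuse it in later runs instead of redeploying. A sketch using only the standard library (the registry file name and layout are arbitrary conventions of this example, not part of the SDK):

```python
import json, os, tempfile
from pathlib import Path

def save_endpoint_id(endpoint_id, path, name="detector_model"):
    """Record an endpoint ID under a model name so later runs can reuse it."""
    p = Path(path)
    data = json.loads(p.read_text()) if p.exists() else {}
    data[name] = endpoint_id
    p.write_text(json.dumps(data, indent=2))

def load_endpoint_id(path, name="detector_model"):
    """Look up a previously saved endpoint ID, or None if absent."""
    p = Path(path)
    if not p.exists():
        return None
    return json.loads(p.read_text()).get(name)

registry = os.path.join(tempfile.gettempdir(), "cumulus_endpoints.json")
if os.path.exists(registry):
    os.remove(registry)  # start clean for this example

save_endpoint_id("ep_1a2b3c4d5e6f7g8h", registry)
loaded = load_endpoint_id(registry)
```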

Run Inference

result = endpoint("input_data_here")
print(result.output)
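Network calls to a remote endpoint can fail transiently, so production callers often wrap inference in a small retry loop. A generic sketch (the flaky stub below stands in for a real deployed endpoint; the retry helper is ours, not Cumulus API):

```python
import time

def call_with_retry(endpoint_fn, payload, retries=3, backoff=0.1):
    """Call endpoint_fn(payload), retrying on exceptions with linear backoff."""
    last_err = None
    for attempt in range(retries):
        try:
            return endpoint_fn(payload)
        except Exception as err:
            last_err = err
            time.sleep(backoff * (attempt + 1))
    raise last_err

# Stub standing in for a deployed endpoint: fails twice, then succeeds.
calls = {"n": 0}
def flaky_endpoint(payload):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return {"output": f"processed:{payload}"}

result = call_with_retry(flaky_endpoint, "input_data_here")
```

With a real endpoint you would pass the endpoint object itself as `endpoint_fn`, since it is callable.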

That's it. Cumulus automatically:

  • Uploads and registers your model
  • Analyzes your model size
  • Finds the best GPU in the closest region
  • Handles fractional allocation if needed
  • Routes requests optimally

No configuration required. Just upload, deploy, and run.

Container Configuration & Dependencies

Cumulus handles container setup automatically, but you can specify custom dependencies:

endpoint = client.deploy(
    model="detector_model",
    workload_type="inference",
    dependencies=[
        "torch==2.4.0",
        "transformers==4.40.0",
        "accelerate==0.30.0"
    ],
    env_vars={
        "HF_HUB_ENABLE_HF_TRANSFER": "1"  # faster model downloads
    }
)
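If you already maintain a requirements.txt, you can build the `dependencies` list from it rather than hard-coding pins in the deploy call. A sketch (the `parse_requirements` helper is ours, not SDK API, and handles only the simple one-spec-per-line case):

```python
def parse_requirements(text):
    """Turn requirements.txt contents into a list of pinned specs,
    skipping blank lines and comments."""
    deps = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if line:
            deps.append(line)
    return deps

example = """
# inference stack
torch==2.4.0
transformers==4.40.0   # pinned for reproducibility
accelerate==0.30.0
"""
deps = parse_requirements(example)
# deps can then be passed as client.deploy(..., dependencies=deps)
```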

Cumulus automatically:

  • Installs dependencies in the container
  • Configures environment variables
  • Optimizes the base image for your model type

Common Dependencies

Package        Description
torch          PyTorch for deep learning
transformers   Hugging Face Transformers library
vllm           High-performance LLM serving
accelerate     Distributed inference