DGX Spark Homelab Setup Guide

Network configuration, power, cooling, and everything you need to integrate the Spark into your homelab without blowing your budget.

Why This Matters

The NVIDIA DGX Spark is a 1U rackmount workstation running the GB10 Grace Blackwell Superchip — 20 ARM v9.2-A cores, integrated Blackwell GPU, 128 GB LPDDR5x coherent unified memory. At $7,999 it's expensive for a homelab toy, and it demands serious infrastructure planning to run reliably 24/7.

This guide covers what matters: power draw, acoustics, network configuration, and integration with existing homelab tools. Skip the NVIDIA marketing and focus on what you actually need to worry about when this thing is running in your basement at 2 AM.

Power Requirements

The GB10 superchip has a TDP of ~350W under full load. Whole-system draw for the DGX Spark GX10 workstation ranges from roughly 120W at idle to ~550W under sustained load:

| State | Power Draw | Notes |
| --- | --- | --- |
| Idle / web browsing | ~120W | Nearly silent, all fans off or below 20% |
| Light ML inference | ~250-400W | Single model, batch size 1-4 |
| Full GPU compute | 400-550W | All Blackwell SMs active, ARM cores at 100% |
| Sustained max load | ~550W | Full system stress-test territory |
⚠️ Important: Peak draw of ~550W on a 120V line is ~4.6A. A standard 15A circuit is perfectly adequate. You don't need a dedicated 20A circuit for the Spark alone — though sharing a circuit with high-draw appliances (EV chargers, HVAC) is still a bad idea.

Cost to run at sustained max load (~550W) 24/7 on average US electricity ($0.12/kWh) is roughly $48/month. A light-inference average of 200W drops that to about $17.50/month.
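
The math is worth keeping handy as your rates or duty cycle change. A small helper for estimating monthly cost at any average draw (assumes ~730 hours per month):

```shell
# Estimate monthly electricity cost for a given average draw.
# usage: monthly_cost WATTS RATE_PER_KWH
monthly_cost() {
  awk -v w="$1" -v r="$2" 'BEGIN {
    kwh = w / 1000 * 730        # ~730 hours in a month
    printf "%.2f\n", kwh * r
  }'
}

monthly_cost 550 0.12   # sustained max load  -> 48.18
monthly_cost 200 0.12   # light inference avg -> 17.52
```

Swap in your local rate; at California prices ($0.30+/kWh) sustained load more than doubles the bill.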

Cooling and Acoustics

The Spark is a 1U rackmount — tiny footprint, dense hardware, and it gets loud when it works hard. Here's what to expect:

| Load | Acoustic Output | Placement Guidance |
| --- | --- | --- |
| Idle to light | ~25-35 dB(A) | Office-friendly if properly mounted |
| Moderate (inference) | ~35-40 dB(A) | Garage or basement acceptable |
| Heavy compute | ~50-60 dB(A) | Sound-dampened enclosure required |
| Full stress | ~70+ dB(A) | Not human-occupied space |
💡 Pro tip: The GB10's unified architecture means thermal throttling is shared between CPU and GPU cores. If you're running inference workloads, the Spark rarely exceeds 60-70°C on the superchip. It only screams under sustained full-SM workloads or during fine-tuning.

Cooling recommendations:

  1. Rack mount in a 42U cabinet with proper airflow front-to-back. At 1U, you need 2-3 120mm intake fans minimum in your rack.
  2. Free-standing setup: Elevate on 15mm feet, ensure 6 inches clearance on all sides. Don't shove it in a closed cabinet without active cooling.
  3. Environment temperature: Keep room at 20-25°C. For every degree above, GPU junction temps climb ~2°C.
  4. Consider a rack PDU with per-outlet monitoring so you can track actual power draw without plugging a Kill-A-Watt into the wall.
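
To catch airflow problems before they become throttling, a simple temperature check can run from cron or a systemd timer. This is a sketch: it assumes `nvidia-smi` exposes `temperature.gpu` on the GB10 the same way it does on discrete NVIDIA GPUs, and the 80°C threshold is an illustrative margin, not a documented throttle point:

```shell
# Warn when GPU temperature approaches throttle territory.
THRESHOLD=80   # °C; assumed safety margin, tune for your environment

check_temp() {
  # Takes a temperature in °C, prints a status line.
  local t=$1
  if [ "$t" -ge "$THRESHOLD" ]; then
    echo "WARN: GPU at ${t}C - check rack airflow"
  else
    echo "OK: GPU at ${t}C"
  fi
}

# On the Spark itself (requires the NVIDIA driver stack):
# check_temp "$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits)"
```

Wrap the commented line in `watch -n 30` for a live view, or pipe the WARN lines to your notification system of choice.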

Network Configuration

The DGX Spark ships with Mellanox 200GbE networking ports — massive bandwidth that's overkill for most homelabs and requires some planning to use effectively.

Option A: Direct to Switch (Recommended)

# On the DGX Spark (Ubuntu Linux)
# Check which interfaces are up
ip link show

# You should see:
# eth0 and eth1 — both 200GbE

# Assign static IP for homelab use
sudo ip addr add 192.168.0.100/24 dev eth0
sudo ip link set eth0 up

# Make it persistent (Netplan)
sudo tee /etc/netplan/01-spark-config.yaml > /dev/null <<'EOF'
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: no
      addresses:
        - 192.168.0.100/24
      nameservers:
        addresses: [192.168.0.1, 8.8.8.8]
    eth1:
      dhcp4: no
      addresses:
        - 192.168.0.101/24
      # eth1 is available for GPU direct/InfiniBand or future expansion
EOF

sudo chmod 600 /etc/netplan/01-spark-config.yaml   # netplan warns on world-readable configs
sudo netplan apply

Connect eth0 to your homelab switch. eth1 can be connected to a 200GbE switch for inter-node GPU direct communication if you plan to scale up.
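
If your switch supports jumbo frames end-to-end, raising the MTU cuts per-packet overhead on large model transfers over the 200GbE link. A hedged netplan fragment (`mtu` is a standard netplan key; verify every hop on the path accepts 9000 before committing to it):

```yaml
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: no
      addresses:
        - 192.168.0.100/24
      mtu: 9000   # jumbo frames; every device on the path must support this
```

Verify with `ping -M do -s 8972 192.168.0.1` — if fragmentation-disabled pings of that size get through, jumbo frames are working.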

Option B: USB-to-Ethernet Adapter (Budget)

If you don't have a 200GbE switch (good luck finding one at consumer prices), use a high-quality USB 3.2 Gen 2 to 2.5GbE or 10GbE adapter for homelab connectivity:

# USB 10GbE adapter typically shows up as eth2
ip link show eth2
sudo ip addr add 192.168.0.100/24 dev eth2
sudo ip link set eth2 up
⚠️ Caveat: USB adapters bottleneck the Spark's networking. For local inference you won't notice — the GB10's internal memory bandwidth (273 GB/s) dwarfs any external NIC. But if you're doing distributed inference or model serving, the native 200GbE is the way to go.
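
To put that bottleneck in numbers, here's a rough helper estimating how long a checkpoint takes to move at a given line rate (ignores protocol overhead, so real transfers run somewhat slower):

```shell
# Seconds to transfer SIZE_GB at LINK_GBPS line rate (ideal, no overhead).
transfer_secs() {
  awk -v gb="$1" -v gbps="$2" 'BEGIN { printf "%.1f\n", gb * 8 / gbps }'
}

transfer_secs 70 10    # 70 GB of model weights over 10GbE  -> 56.0
transfer_secs 70 200   # same weights over native 200GbE    -> 2.8
```

For occasional model pulls, a minute over USB 10GbE is tolerable; for a shared model store hit on every cold start, it isn't.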

Firewall and Security

Ubuntu on the Spark ships with UFW disabled by default. Set up a basic firewall before exposing any services:

# Basic firewall setup
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw allow 11434/tcp  # Ollama API (default port)
sudo ufw allow 443/tcp    # HTTPS (if configured)
sudo ufw enable

# Verify
sudo ufw status verbose

Also set up SSH key authentication:

ssh-keygen -t ed25519 -C "spark-admin@gx10"
ssh-copy-id -i ~/.ssh/id_ed25519.pub spark@192.168.0.100

# Disable password auth (after confirming key works)
sudo nano /etc/ssh/sshd_config
# Set: PasswordAuthentication no
sudo sshd -t   # validate the config before restarting
sudo systemctl restart ssh

Homelab Integration

Getting the Spark to play nicely with your existing infrastructure:

Docker / Podman

The Spark supports both Docker and Podman. Docker requires the NVIDIA Container Toolkit for GPU access:

# Add NVIDIA repo
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
    sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Test GPU passthrough
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
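
Once GPU passthrough works, a Compose file keeps services restartable across reboots. A minimal sketch for Ollama — the `ollama/ollama` image and port 11434 are Ollama's published defaults, and the `deploy.resources` GPU stanza assumes the NVIDIA runtime configured above:

```yaml
# docker-compose.yml — illustrative Ollama service with GPU access
services:
  ollama:
    image: ollama/ollama
    restart: unless-stopped
    ports:
      - "11434:11434"          # Ollama's default API port
    volumes:
      - ollama_models:/root/.ollama   # persist downloaded models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
volumes:
  ollama_models:
```

Bring it up with `docker compose up -d`, then confirm the GPU is visible with `docker compose exec ollama nvidia-smi`.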

Monitoring

Set up Grafana + Prometheus for observability:

# Install node_exporter for system metrics
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-arm64.tar.gz
tar xzf node_exporter-*.tar.gz
sudo mv node_exporter-*/node_exporter /usr/local/bin/
/usr/local/bin/node_exporter &   # quick test; use a systemd unit for 24/7 operation

# Install NVIDIA DCGM exporter for GPU metrics
docker run -d --name dcgm-exporter --gpus all -p 9400:9400 nvcr.io/nvidia/k8s/dcgm-exporter:3.3.6-3.4.0-ubuntu22.04

# Check GPU metrics
curl http://localhost:9400/metrics | head -20
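
On the Prometheus side, both exporters just need scrape jobs. A sketch assuming node_exporter on its default port 9100 and the DCGM exporter on 9400 as started above:

```yaml
# prometheus.yml (scrape section)
scrape_configs:
  - job_name: spark-node          # system metrics from node_exporter
    static_configs:
      - targets: ["192.168.0.100:9100"]
  - job_name: spark-gpu           # GPU metrics from dcgm-exporter
    static_configs:
      - targets: ["192.168.0.100:9400"]
```

Point a Grafana datasource at Prometheus and the DCGM metrics (utilization, memory, temperature) show up under the `DCGM_` prefix.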

Ansible for Automation

Manage the Spark from your main server via Ansible playbooks:

# inventory.ini
[dgx_spark]
spark ansible_host=192.168.0.100 ansible_user=spark ansible_python_interpreter=/usr/bin/python3
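
With the inventory in place, a minimal playbook keeps the box patched and sanity-checks the GPU. The playbook name and tasks here are illustrative, using standard `ansible.builtin` modules:

```yaml
# update-spark.yml — run with: ansible-playbook -i inventory.ini update-spark.yml
- hosts: dgx_spark
  become: true
  tasks:
    - name: Apply pending package updates
      ansible.builtin.apt:
        update_cache: true
        upgrade: dist

    - name: Check GPU visibility
      ansible.builtin.command: nvidia-smi -L
      register: gpus
      changed_when: false

    - name: Show detected GPUs
      ansible.builtin.debug:
        var: gpus.stdout
```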

What Doesn't Work (And Why)

| Common Idea | Why It Fails | Workaround |
| --- | --- | --- |
| Run in a bedroom | ~70 dB(A) under load. Your partner will divorce you. | Garage, basement, shed |
| 100W PoE power | TDP is ~350W, peak system ~550W. Needs real power. | Standard 15A circuit is fine |
| Direct to 1GbE router | The 200GbE ports have no native RJ45 | USB adapter or 200GbE switch |
| VM passthrough from Proxmox host | SBC-style architecture limits VM nesting | Use Docker containers instead |
| Dual-socket PCIe expansion | GB10 exposes limited PCIe lanes (PCIe 5.0 x8) | Plan carefully what goes in the one slot |

Summary Checklist

  1. Power: a standard 15A circuit is enough; don't share it with high-draw appliances, and budget for the monthly electricity cost.
  2. Cooling: basement or garage placement, front-to-back airflow, room temperature held at 20-25°C.
  3. Network: static IP on eth0 for the homelab; a USB 10GbE adapter works if you have no 200GbE switch.
  4. Security: UFW enabled with a default-deny policy, SSH keys only, password auth disabled.
  5. Monitoring: node_exporter and the DCGM exporter feeding Prometheus and Grafana.

💡 Bottom line: The DGX Spark is a serious piece of hardware that works great once you've got the infrastructure sorted. The hard part isn't the AI — it's the power and acoustics. Plan those first, and everything else is just software configuration.