Free Nemotron-3 120B API on Cloudflare

Free Nemotron-3 120B-A12B API via Cloudflare Workers AI — NVIDIA's large MoE reasoning model within 10,000 free Neurons/day.

About the Model

Why use Nemotron-3 120B-A12B?

Nemotron-3 120B-A12B is NVIDIA's large mixture-of-experts (MoE) model: of its 120B total parameters, roughly 12B are active for any given token. It is tuned for reasoning, tool use, and agentic pipelines.

Capabilities

The model is strong at multi-step reasoning, coding, and structured outputs, which makes it a capable open alternative for demanding workloads.

Why it's free

The model runs on Cloudflare Workers AI, so calls draw on the platform's free allowance of 10,000 Neurons per day.

> Note: this is an NVIDIA partner model, so accept its model license once in your Cloudflare dashboard before making the first call.

How to Access for Free (via Cloudflare Workers AI)

Free access on Cloudflare Workers AI

Call Nemotron-3 120B-A12B through Cloudflare's global edge using the unified /ai/run REST endpoint. Every Cloudflare account includes 10,000 Neurons per day for free, so you can explore NVIDIA's large MoE reasoning model at no cost.

Authentication (BYOK)

Bring your own key (BYOK): put your Cloudflare Account ID in the request URL and send an API token that has Workers AI access in the Authorization: Bearer <token> header.
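
The two credentials end up in different places, which is easy to mix up: the Account ID is part of the URL path, while the token travels in a header. A minimal sketch of that assembly follows. The helper name build_request and the constants API_BASE and MODEL are illustrative and not part of any Cloudflare SDK; the URL shape and model ID are taken from the examples on this page.

```python
API_BASE = "https://api.cloudflare.com/client/v4"
# Model ID as it appears in this page's examples.
MODEL = "@cf/nvidia/nemotron-3-120b-a12b"


def build_request(account_id: str, token: str, model: str = MODEL) -> tuple[str, dict[str, str]]:
    """Return the /ai/run URL and the auth headers for one Workers AI call.

    Hypothetical helper for illustration; it performs no network I/O.
    """
    if not account_id or not token:
        raise ValueError("Both an account ID and an API token are required")
    # The Account ID goes into the path, the model ID follows /ai/run/.
    url = f"{API_BASE}/accounts/{account_id}/ai/run/{model}"
    # The token is sent as a standard Bearer credential.
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers
```

The returned pair can be passed straight to an HTTP client, for instance requests.post(url, headers=headers, json=...). Failing early on an empty value tends to be easier to debug than an authentication error returned by the API.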

Code Examples

curl
curl "https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/ai/run/@cf/nvidia/nemotron-3-120b-a12b" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Outline a plan to learn Rust in 30 days."}]}'
python
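import os

import requests

# Read credentials from the environment so they stay out of source control.
account = os.environ["CF_ACCOUNT_ID"]
token = os.environ["CF_API_TOKEN"]

resp = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{account}/ai/run/@cf/nvidia/nemotron-3-120b-a12b",
    headers={"Authorization": f"Bearer {token}"},
    json={"messages": [{"role": "user", "content": "Outline a plan to learn Rust in 30 days."}]},
    timeout=120,  # large models can take a while; avoid hanging forever
)
resp.raise_for_status()  # surface HTTP errors (such as a rejected token) before parsing
print(resp.json()["result"]["response"])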
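
Indexing straight into ["result"]["response"] raises an unhelpful KeyError or TypeError when the call fails. The sketch below unpacks the reply more defensively. It assumes the usual Cloudflare v4 JSON envelope (a "success" flag, an "errors" list of code/message objects, and a "result" object) and the result.response field used in the example above; extract_text is a hypothetical helper name, and the exact result shape for this model is worth confirming against the model's own documentation.

```python
def extract_text(payload: dict) -> str:
    """Return the generated text from a Workers AI JSON reply, or raise on failure.

    Assumes the Cloudflare v4 envelope: {"success": ..., "errors": [...], "result": {...}}.
    """
    if not payload.get("success", False):
        # Failures are reported as a list of {"code": ..., "message": ...} objects.
        errors = payload.get("errors") or []
        detail = "; ".join(f'{e.get("code")}: {e.get("message")}' for e in errors)
        raise RuntimeError(f"Workers AI call failed: {detail or 'unknown error'}")
    result = payload.get("result") or {}
    text = result.get("response")
    if not isinstance(text, str):
        # The envelope was fine but the expected text field is absent.
        raise ValueError("Reply has no result.response text field")
    return text
```

With this in place, the last line of the Python example could become print(extract_text(resp.json())), so an expired token or an unaccepted license shows up as a readable error message.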