Free Gemma 3 12B API on Cloudflare

Free Gemma 3 12B API via Cloudflare Workers AI: a multilingual, vision-enabled open model you can call within the 10,000 free Neurons per day.

About the Model

Why use Gemma 3 12B?

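Gemma 3 12B is Google's well-rounded open model. It offers a long context window (up to 128K tokens in the model's own specification; a hosted deployment may cap this lower), support for over 140 languages, and image understanding (vision).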

Capabilities

Great for multilingual chat, summarization, and vision-grounded Q&A, with a strong quality-to-size ratio.

Why it's free

Covered by Cloudflare Workers AI's 10,000 free Neurons per day via the /ai/run endpoint.

How to Access for Free (via Cloudflare Workers AI)

Free access on Cloudflare Workers AI

Call Gemma 3 12B through Cloudflare's global edge using the unified /ai/run REST endpoint. Every Cloudflare account includes 10,000 Neurons per day for free, so multilingual chat and vision-grounded tasks cost nothing as long as usage stays within that daily allowance.
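To get a rough feel for how far the daily allowance goes, the sketch below estimates how many identical requests fit into 10,000 Neurons. It assumes Neurons are charged per input and output token; the function name and the per-million-token rates are hypothetical placeholders, so the real rates for this model need to be looked up on Cloudflare's pricing page.

```python
FREE_NEURONS_PER_DAY = 10_000  # daily free allowance stated above


def requests_within_free_tier(
    input_tokens: int,
    output_tokens: int,
    neurons_per_m_input: float,
    neurons_per_m_output: float,
) -> int:
    """Estimate how many identical requests fit in one day's free Neurons.

    Hypothetical helper: the per-million-token rates are inputs supplied
    by the caller, not values taken from Cloudflare.
    """
    per_request = (
        input_tokens * neurons_per_m_input + output_tokens * neurons_per_m_output
    ) / 1_000_000
    if per_request <= 0:
        raise ValueError("per-request cost must be positive")
    # Round down: a partially covered request does not fit in the allowance.
    return int(FREE_NEURONS_PER_DAY // per_request)
```

With made-up rates of 10,000 and 40,000 Neurons per million input and output tokens, a request with 1,000 tokens each way would cost 50 Neurons, so about 200 such requests would fit in a day.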

Authentication (BYOK)

Bring your own credentials: put your Cloudflare Account ID in the request URL and send an API token that has Workers AI access in the Authorization: Bearer <token> header.
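As a minimal sketch of where the two credentials go, the helper below assembles the endpoint URL and the Authorization header. The name build_request is hypothetical and not part of any Cloudflare SDK; the URL pattern and model ID are the ones used in the code examples on this page.

```python
API_BASE = "https://api.cloudflare.com/client/v4"
MODEL = "@cf/google/gemma-3-12b-it"


def build_request(account_id: str, token: str, model: str = MODEL) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a Workers AI /ai/run call.

    Hypothetical helper for illustration only.
    """
    if not account_id or not token:
        raise ValueError("both an account ID and an API token are required")
    # The account ID is part of the path; the token travels only in the header.
    url = f"{API_BASE}/accounts/{account_id}/ai/run/{model}"
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers
```

Keeping the token out of the URL means it is less likely to end up in logs or browser history.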

Code Examples

curl
curl https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/ai/run/@cf/google/gemma-3-12b-it \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"List three uses for a paperclip."}]}'
python
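import os

import requests

# Credentials are read from the environment so they stay out of the source.
account = os.environ["CF_ACCOUNT_ID"]
token = os.environ["CF_API_TOKEN"]

resp = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{account}/ai/run/@cf/google/gemma-3-12b-it",
    headers={"Authorization": f"Bearer {token}"},
    json={"messages": [{"role": "user", "content": "List three uses for a paperclip."}]},
    timeout=60,  # avoid hanging indefinitely if the request stalls
)
resp.raise_for_status()  # surface HTTP errors such as 401 or 429
# The generated text is returned under result.response.
print(resp.json()["result"]["response"])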
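The example above reads result.response directly, which raises an unhelpful KeyError if the call fails. A slightly more defensive sketch is shown below. It assumes the usual Cloudflare API envelope with a boolean "success" field and an "errors" list of objects carrying a "message"; extract_reply is a hypothetical helper name.

```python
def extract_reply(payload: dict) -> str:
    """Return the generated text from a Workers AI JSON response.

    Hypothetical helper. Assumes the envelope has a boolean "success",
    an "errors" list, and the text under result.response.
    """
    if not payload.get("success", False):
        errors = payload.get("errors") or []
        # Collect human-readable messages, tolerating unexpected entry shapes.
        messages = [
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        ]
        details = "; ".join(messages) or "unknown error"
        raise RuntimeError(f"Workers AI request failed: {details}")
    return payload["result"]["response"]
```

In the Python example it would be called as extract_reply(resp.json()) in place of the direct dictionary lookup.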