/ inference · private alpha

Open models, one API call away.

Turion Inference runs open ML models — speech-to-text, image generation, and your own Cog-packaged models — on dedicated GPUs behind a simple key-authed API, with a Python SDK and CLI. Now in private alpha.

Status: private alpha

Hardware: RTX PRO 6000

Interface: API · SDK · CLI

Alpha pricing: free

Join the waitlist · no card · small-batch onboarding

/ 01 · the api

A model run is one call.

~/demo — zsh

$ export TURION_API_TOKEN=tk_...

$ turion run whisper -i audio=@meeting.wav
[06a3b2] status=starting
[06a3b2] status=processing
[06a3b2] status=succeeded
{
  "text": "And so my fellow Americans, ask not what your
           country can do for you...",
  "language": "en"
}

import turion

client = turion.Client()  # reads TURION_API_TOKEN

out = client.run("whisper", input={"audio": "meeting.wav"})
print(out["text"])

art = client.run("illustrious-xl", input={
    "prompt": "(masterpiece)1.2, watercolor lighthouse at dawn",
    "width": 832, "height": 1216,
})
print(art["images"][0])  # hosted result URL
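The CLI session above moves through three statuses: starting, processing, and succeeded. As a rough sketch of how a caller might wait on that lifecycle, the snippet below polls a hypothetical `fetch_status` callable until a terminal state is reached. The helper name, the `failed` and `canceled` states, and the shape of the returned dict are all assumptions for illustration, not part of the Turion SDK; `client.run` may well handle this for you.

```python
import time
from typing import Callable, Dict

# Assumed terminal states; only "succeeded" appears in the demo session.
TERMINAL = {"succeeded", "failed", "canceled"}


def wait_for_prediction(
    fetch_status: Callable[[], Dict],
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_status() until the prediction reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        prediction = fetch_status()
        if prediction["status"] in TERMINAL:
            return prediction
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not finish in time")
        sleep(interval)


# Stand-in for a real status lookup: replays the statuses from the demo.
_states = iter([
    {"status": "starting"},
    {"status": "processing"},
    {"status": "succeeded", "output": {"language": "en"}},
])
result = wait_for_prediction(lambda: next(_states), interval=0.0)
print(result["status"])
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular transport, so the same function can be exercised against a fake, as here, or against a real status lookup.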

/ 02 · how it works

Waitlist to first prediction.

Join the waitlist

Tell us what you want to run. We onboard in small batches so every alpha user gets real capacity, not a queue.

Get your API key

You receive a key and a base URL. Auth is a bearer token — no OAuth dance, no console required.

Run models

Call the REST API directly, or use the Python SDK and CLI. File inputs upload automatically; results come back as JSON and hosted URLs.
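For readers calling the REST API directly, the sketch below shows what a bearer-authenticated request could look like using only the Python standard library. It builds the request without sending it. The base URL, the `/v1/predictions` path, the `TURION_BASE_URL` variable, and the JSON payload shape are placeholders assumed for illustration; use the base URL and routes you receive with your key.

```python
import json
import os
import urllib.request

# Hypothetical values: the real base URL arrives with your API key.
BASE_URL = os.environ.get("TURION_BASE_URL", "https://api.example.invalid")
TOKEN = os.environ.get("TURION_API_TOKEN", "tk_placeholder")


def build_run_request(model: str, inputs: dict) -> urllib.request.Request:
    """Build (but do not send) a bearer-authenticated JSON POST."""
    # Assumed payload shape: model name plus an input mapping.
    body = json.dumps({"model": model, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/predictions",  # assumed path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )


req = build_run_request("whisper", {"audio": "https://example.invalid/meeting.wav"})
print(req.get_method(), req.full_url)
```

Passing the finished request to `urllib.request.urlopen` would send it. The only authentication step is the `Authorization: Bearer` header, which matches the key-based auth described above.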

/ 03 · models

A small catalog, curated on purpose.

Whisper

speech-to-text

OpenAI's Whisper, rebuilt on current PyTorch for our GPUs. Send audio, get a transcript with language detection — validated end-to-end on the same hardware that serves the alpha.

audio → text · json

Illustrious XL

image generation

SDXL-class illustration checkpoint with the full parameter set — prompt weighting, PAG, CLIP skip, scheduler choice. Returns hosted image URLs.

text → image · png

Your Cog image

on request

The platform runs Cog-packaged models — the same container format Replicate uses. During the alpha we onboard additional models per request; tell us what you need on the waitlist form.

cog · custom models

/ 04 · the alpha, plainly

What you're signing up for.

Access: Private alpha. We onboard from the waitlist in small batches and work directly with each user.
Hardware: Dedicated NVIDIA RTX PRO 6000 (Blackwell) GPUs that we operate ourselves; your jobs never run on resold spot capacity.
Pricing: Free during the alpha, within fair-use capacity limits. Pricing lands with the beta; alpha users get first access.
Interfaces: REST API, Python SDK, and CLI today. Web dashboard and self-serve keys arrive later in the program.

/ 05 · waitlist