Welcome to OctoAI! Our mission is to enable users to harness value from the latest AI innovations by delievering efficient, reliable, and customizable AI systems for your apps. Run your models or checkpoints on our cost-effective API endpoints, or run our optimized GenAI stack in your environment.

Get started with inference

  1. Sign up for an account - new users get $10 of free credits
  2. Run your first inference:
  • Navigate to a model page and click Get Token:
  • Copy the code sample to run an inference:
cURL
curl -X POST "https://text.octoai.run/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OCTOAI_TOKEN" \
    --data-raw '{
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello world"
            }
        ],
        "model": "mixtral-8x7b-instruct",
        "max_tokens": 512,
        "presence_penalty": 0,
        "temperature": 0.1,
        "top_p": 0.9
    }'

Next steps

Additional Resources