Bring your provider

ViscribeAI focuses on the image workflow: source handling, prompts, strict structured output, parsing, and typed results. You bring the model provider and credentials that fit your stack. The built-in clients use OpenAI-compatible Chat Completions. That means you can use the default OpenAI SDK configuration or pass compatible client options for providers that expose an OpenAI-style API.

OpenAI Compatible

Use any vision-capable model provider that exposes an OpenAI-style Chat Completions API with image inputs and structured output.

Environment variables

For the default OpenAI client, store credentials in your environment.
export OPENAI_API_KEY="sk-your-provider-key"
export OPENAI_MODEL="gpt-5-mini"
OPENAI_MODEL is only used by the examples in this repository. In application code, pass the model explicitly through model_config or modelConfig.
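As a minimal sketch of passing the model explicitly, the helper below assembles a `model_config` dictionary from an explicit model name plus whatever credentials are set in the environment. The helper name `build_model_config` and the `OPENAI_BASE_URL` variable are assumptions made for illustration and are not part of ViscribeAI.

```python
import os
from typing import Mapping, Optional


def build_model_config(model: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: build a model_config dict for application code."""
    env = os.environ if env is None else env

    # The model is always passed explicitly rather than read from OPENAI_MODEL.
    config = {"model": model}

    # Only include values that are actually set, so anything omitted is
    # left to the underlying client's own defaults.
    if env.get("OPENAI_API_KEY"):
        config["api_key"] = env["OPENAI_API_KEY"]

    # OPENAI_BASE_URL is an assumed variable name for this sketch.
    if env.get("OPENAI_BASE_URL"):
        config["base_url"] = env["OPENAI_BASE_URL"]

    return config
```

The result could then be supplied as `model_config=build_model_config("gpt-5-mini")` in a call such as `describe`.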

Python configuration

from viscribe.images import describe

result = describe(
    image_path="examples/venice.png",
    model_config={
        "model": "gpt-5-mini",
        "api_key": "sk-your-provider-key",
        "base_url": "https://example-compatible-provider.com/v1",
        "temperature": 1,
        "max_retries": 2,
    },
)
Python client options such as api_key, base_url, timeout, and max_retries are passed to the underlying OpenAI client. Other keys, such as temperature, are sent with the model request.
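To make the split concrete, here is a rough sketch of how a configuration dictionary might be partitioned into client options and request options. It is not ViscribeAI's actual implementation: the function name `split_model_config` is hypothetical, and the set of client keys is limited to the four named above, which may not be exhaustive.

```python
from typing import Any, Dict, Tuple

# Assumption: only the four client options named in the text are treated
# as client-level keys; the real library may recognize more.
CLIENT_OPTION_KEYS = {"api_key", "base_url", "timeout", "max_retries"}


def split_model_config(
    model_config: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate client options from request options."""
    client_options = {
        key: value for key, value in model_config.items() if key in CLIENT_OPTION_KEYS
    }
    request_options = {
        key: value for key, value in model_config.items() if key not in CLIENT_OPTION_KEYS
    }
    return client_options, request_options


client_options, request_options = split_model_config(
    {
        "model": "gpt-5-mini",
        "api_key": "sk-your-provider-key",
        "base_url": "https://example-compatible-provider.com/v1",
        "temperature": 1,
        "max_retries": 2,
    }
)
```

Under these assumptions, `api_key`, `base_url`, and `max_retries` end up in `client_options`, while `model` and `temperature` end up in `request_options`.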

TypeScript configuration

import { images } from "viscribe";

const result = await images.describe({
  imagePath: "examples/venice.png",
  modelConfig: {
    model: "gpt-5-mini",
    apiKey: "sk-your-provider-key",
    baseURL: "https://example-compatible-provider.com/v1",
    temperature: 1,
    maxRetries: 2,
  },
});
TypeScript client options such as apiKey, baseURL, timeout, and maxRetries are passed to the underlying OpenAI client. Other keys, such as temperature, are sent with the model request.
Use a vision-capable model that supports image inputs and structured output through an OpenAI-compatible Chat Completions interface.