OpenAI Integration

Route your OpenAI API traffic through Rivaro for runtime enforcement. This page covers SDK configuration, supported endpoints, streaming, and function calling.

SDK Configuration

Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-your-openai-key",
    base_url="https://your-org.rivaro.ai/v1",
    default_headers={
        "X-Detection-Key": "detect_live_your_key_here"
    }
)

Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-openai-key',
  baseURL: 'https://your-org.rivaro.ai/v1',
  defaultHeaders: {
    'X-Detection-Key': 'detect_live_your_key_here'
  }
});

curl

curl https://your-org.rivaro.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-openai-key" \
  -H "X-Detection-Key: detect_live_your_key_here" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Supported Endpoints

Endpoint                   Method  Description
/v1/chat/completions       POST    Chat completions (GPT-4, GPT-3.5, o1, o3, o4)
/v1/completions            POST    Text completions (legacy)
/v1/embeddings             POST    Text embeddings
/v1/moderations            POST    Content moderation
/v1/audio/transcriptions   POST    Audio transcription (Whisper)
/v1/audio/translations     POST    Audio translation
/v1/models                 GET     List available models
/v1/batches                POST    Batch API

All request and response formats match the OpenAI API exactly. Rivaro is a transparent proxy — your existing code works unchanged.
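A quick way to confirm the proxy is passing traffic through is to list models with the Python client configured above (a minimal sanity check; assumes the `client` from SDK Configuration):

# Uses the `client` configured in SDK Configuration above.
# A successful response confirms requests are flowing through the proxy.
models = client.models.list()
for model in models.data:
    print(model.id)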

Streaming

Streaming works out of the box. Rivaro forwards content chunks to your application in real time.

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

How enforcement interacts with streaming

  • Content chunks (delta.content) are forwarded to your application immediately as they arrive from OpenAI.
  • Egress detection runs on the accumulated full response after the stream completes.
  • If a policy violation is detected in the response, enforcement is applied after accumulation (e.g. the violation is logged or the actor's trust score is adjusted).
  • Ingress detection on your prompt messages runs before the request is forwarded to OpenAI. If a policy blocks the input, OpenAI is never called and you receive a block response immediately.
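In other words, the text your loop prints incrementally is the same text egress detection later evaluates as a whole. A minimal client-side sketch of that accumulation (illustrative only; Rivaro does this server-side):

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,
)

chunks = []
for chunk in stream:
    piece = chunk.choices[0].delta.content or ""
    chunks.append(piece)   # accumulate: the full text egress detection sees
    print(piece, end="")   # forwarded to the user in real time, before detection

full_response = "".join(chunks)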

Function / Tool Calling

OpenAI function calling and tool use work through the proxy. Rivaro inspects tool definitions in your request and tool calls in the response.

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"]
            }
        }
    }]
)

What Rivaro does with tool calls

  • Request side: Rivaro reads the tools[] array in your request. If a policy blocks specific tools (e.g. shell execution, database writes), those tools are filtered from the request before it reaches OpenAI.
  • Response side: Rivaro extracts tool_calls[] from the response (choices[0].message.tool_calls). Each tool call's function.name and function.arguments are inspected against detection rules.
  • Streaming: During streaming, tool call chunks (delta.tool_calls) are accumulated. Detection runs on the complete tool call after the stream finishes.
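On the client side, you read tool calls from the proxied response exactly as you would from OpenAI directly. A short sketch (continuing from the request above) that walks the same fields Rivaro inspects:

import json

message = response.choices[0].message
for call in message.tool_calls or []:
    # function.name and function.arguments are the fields Rivaro
    # checks against detection rules on the response side.
    name = call.function.name
    args = json.loads(call.function.arguments)
    print(f"model requested {name} with {args}")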

Embeddings

Embeddings requests are proxied through the same enforcement pipeline. Ingress detection runs on the input text.

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quarterly revenue report shows..."
)

Streaming is not supported for embeddings (this matches OpenAI's behavior).

Allowed Models

If your AppContext is configured with an allowed models list, only those models can be used through the proxy. Requesting a model not in the list returns a 403 error:

{"error": "The requested model is not permitted by your API key's policy."}

If no allowed models list is configured, all models are permitted.
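With the Python SDK, a 403 status surfaces as openai.PermissionDeniedError, so a disallowed model can be handled explicitly. A minimal sketch, assuming the client configured above:

import openai

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.PermissionDeniedError as e:
    # Raised when the proxy returns 403, e.g. a model outside the allowed list.
    print(f"Model not permitted: {e}")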

Blocked Requests

When Rivaro blocks a request (ingress policy violation), the response format matches OpenAI's structure but with enforcement content:

Non-streaming:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Content blocked due to policy violations"
    },
    "finish_reason": "content_filter"
  }]
}

Streaming:

data: {"blocked":true,"message":"Content blocked due to policy violations"}

Your application can check for finish_reason: "content_filter" to detect enforcement blocks programmatically.
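For non-streaming calls, that check looks like this (a sketch using the Python client from above; handle_blocked is a hypothetical handler in your application):

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)

choice = response.choices[0]
if choice.finish_reason == "content_filter":
    # Rivaro blocked the request at ingress; OpenAI was never called.
    handle_blocked(choice.message.content)  # hypothetical handler
else:
    print(choice.message.content)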

Headers

Header           Required  Description
X-Detection-Key  Yes       Your Rivaro detection key
Authorization    Yes       Bearer sk-... (your OpenAI API key)
Content-Type     Yes       application/json

All other headers are passed through to OpenAI unchanged.
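Because unrecognized headers pass through, you can attach per-request headers (for example, OpenAI's OpenAI-Organization header) via the SDK's extra_headers option. A sketch, assuming the client configured above:

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    # Passed through to OpenAI unchanged by the proxy.
    extra_headers={"OpenAI-Organization": "org-your-org-id"},
)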
