OpenAI Integration

Route your OpenAI API traffic through Rivaro for runtime enforcement. This page covers SDK configuration, supported endpoints, streaming, and function calling.

SDK Configuration

Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-your-openai-key",
    base_url="https://your-org.rivaro.ai/v1",
    default_headers={
        "X-Detection-Key": "detect_live_your_key_here"
    }
)

Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-openai-key',
  baseURL: 'https://your-org.rivaro.ai/v1',
  defaultHeaders: {
    'X-Detection-Key': 'detect_live_your_key_here'
  }
});

curl

curl https://your-org.rivaro.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-openai-key" \
  -H "X-Detection-Key: detect_live_your_key_here" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Supported Endpoints

Endpoint                   Method  Description
/v1/chat/completions       POST    Chat completions (GPT-4, GPT-3.5, o1, o3, o4)
/v1/completions            POST    Text completions (legacy)
/v1/embeddings             POST    Text embeddings
/v1/moderations            POST    Content moderation
/v1/audio/transcriptions   POST    Audio transcription (Whisper)
/v1/audio/translations     POST    Audio translation
/v1/models                 GET     List available models
/v1/batches                POST    Batch API

All request and response formats match the OpenAI API exactly. Rivaro is a transparent proxy — your existing code works unchanged.
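A quick way to confirm the proxy is passing traffic through is to list models with the Python client configured above (a minimal sanity check; assumes the `client` from SDK Configuration):

# Uses the `client` configured in SDK Configuration above.
# A successful response confirms requests are flowing through the proxy.
models = client.models.list()
for model in models.data:
    print(model.id)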

Streaming

Streaming works out of the box. Rivaro forwards content chunks to your application in real time.

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

How enforcement interacts with streaming

  • Content chunks (delta.content) are forwarded to your application immediately as they arrive from OpenAI.
  • Egress detection runs on the accumulated full response after the stream completes.
  • If a policy violation is detected in the response, enforcement is applied after accumulation (e.g. the violation is logged or the actor's trust score is adjusted).
  • Ingress detection on your prompt messages runs before the request is forwarded to OpenAI. If a policy blocks the input, OpenAI is never called and you receive a block response immediately.
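In other words, the text your loop prints incrementally is the same text egress detection later evaluates as a whole. A minimal client-side sketch of that accumulation (illustrative only; Rivaro does this server-side):

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,
)

chunks = []
for chunk in stream:
    piece = chunk.choices[0].delta.content or ""
    chunks.append(piece)   # accumulate: the full text egress detection sees
    print(piece, end="")   # forwarded to the user in real time, before detection

full_response = "".join(chunks)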

Function / Tool Calling

OpenAI function calling and tool use work through the proxy. Rivaro inspects tool definitions in your request and tool calls in the response.

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"]
            }
        }
    }]
)

What Rivaro does with tool calls

  • Request side: Rivaro reads the tools[] array in your request. If a policy blocks specific tools (e.g. shell execution, database writes), those tools are filtered from the request before it reaches OpenAI.
  • Response side: Rivaro extracts tool_calls[] from the response (choices[0].message.tool_calls). Each tool call's function.name and function.arguments are inspected against detection rules.
  • Streaming: During streaming, tool call chunks (delta.tool_calls) are accumulated. Detection runs on the complete tool call after the stream finishes.
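On the client side, you read tool calls from the proxied response exactly as you would from OpenAI directly. A short sketch (continuing from the request above) that walks the same fields Rivaro inspects:

import json

message = response.choices[0].message
for call in message.tool_calls or []:
    # function.name and function.arguments are the fields Rivaro
    # checks against detection rules on the response side.
    name = call.function.name
    args = json.loads(call.function.arguments)
    print(f"model requested {name} with {args}")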

Embeddings

Embeddings requests are proxied through the same enforcement pipeline. Ingress detection runs on the input text.

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quarterly revenue report shows..."
)

Streaming is not supported for embeddings (this matches OpenAI's behavior).

Allowed Models

If your AppContext is configured with an allowed models list, only those models can be used through the proxy. Requesting a model not in the list returns a 403 error:

{"error": "The requested model is not permitted by your API key's policy."}

If no allowed models list is configured, all models are permitted.
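With the Python SDK, a 403 status surfaces as openai.PermissionDeniedError, so a disallowed model can be handled explicitly. A minimal sketch, assuming the client configured above:

import openai

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.PermissionDeniedError as e:
    # Raised when the proxy returns 403, e.g. a model outside the allowed list.
    print(f"Model not permitted: {e}")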

Blocked Requests

When Rivaro blocks a request (ingress policy violation), the response format matches OpenAI's structure but with enforcement content:

Non-streaming:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Content blocked due to policy violations"
    },
    "finish_reason": "content_filter"
  }]
}

Streaming:

data: {"blocked":true,"message":"Content blocked due to policy violations"}

Your application can check for finish_reason: "content_filter" to detect enforcement blocks programmatically.
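For non-streaming calls, that check looks like this (a sketch using the Python client from above; handle_blocked is a hypothetical handler in your application):

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)

choice = response.choices[0]
if choice.finish_reason == "content_filter":
    # Rivaro blocked the request at ingress; OpenAI was never called.
    handle_blocked(choice.message.content)  # hypothetical handler
else:
    print(choice.message.content)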

Headers

Header           Required  Description
X-Detection-Key  Yes       Your Rivaro detection key
Authorization    Yes       Bearer sk-... (your OpenAI API key)
Content-Type     Yes       application/json

All other headers are passed through to OpenAI unchanged.
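Because unrecognized headers pass through, you can attach per-request headers (for example, OpenAI's OpenAI-Organization header) via the SDK's extra_headers option. A sketch, assuming the client configured above:

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    # Passed through to OpenAI unchanged by the proxy.
    extra_headers={"OpenAI-Organization": "org-your-org-id"},
)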
