AI Configuration

DataReporter’s AI query feature allows users to ask questions about their data in plain English. As an administrator, you need to configure at least one AI provider before users can use this feature.

Supported Providers

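Provider         | API Key Required | Self-Hosted | Best For
-----------------|------------------|-------------|------------------------------------------------------
OpenAI           | Yes              | No          | General-purpose, widely used
Google Gemini    | Yes              | No          | Fast responses, large context window
Anthropic Claude | Yes              | No          | Precise SQL generation, strong reasoning
Ollama           | No               | Yes         | Privacy-sensitive environments, no external API calls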

Configuration Methods

AI providers can be configured in two ways: through the Admin UI or through environment variables. Settings configured via the Admin UI take precedence over environment variables.

Method 1: Admin UI

  1. Log in as an administrator
  2. Go to Settings > General
  3. Scroll down to the AI Query Settings section
  4. Set your preferred Default Provider and Default Model
  5. Enter API keys for the providers you want to enable
  6. Click Save

API keys are stored encrypted in the database and are masked in the UI after saving (only the last 4 characters are visible).
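The masking behaviour described above can be illustrated with a minimal sketch. The function name mask_api_key and the choice of "*" as the mask character are assumptions made for illustration, not DataReporter's actual implementation.

```python
def mask_api_key(key: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a key.

    Hypothetical helper: the mask character and the handling of very
    short keys are assumptions, not documented behaviour.
    """
    if len(key) <= visible:
        # Too short to reveal anything safely, so mask it entirely.
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Under these assumptions, a key ending in "abcd" is displayed as a run of asterisks followed by "abcd".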

Tip: Leave a field blank to fall back to the environment variable value. This lets you set a base configuration via environment variables and override per-organization via the UI.
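The precedence rule (Admin UI value first, environment variable as fallback when the field is blank) might look roughly like the sketch below. The function resolve_setting and its parameters are hypothetical names introduced for illustration; DataReporter's internal lookup may differ.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    ui_value: Optional[str],
    env_name: str,
    env: Mapping[str, str] = os.environ,
    default: str = "",
) -> str:
    """Return the Admin UI value if set, otherwise the environment value.

    Hypothetical helper: a blank or missing UI value falls back to the
    environment variable, then to `default`.
    """
    if ui_value is not None and ui_value.strip():
        return ui_value.strip()
    return env.get(env_name, default)
```

For example, resolve_setting("", "OPENAI_API_KEY", {"OPENAI_API_KEY": "env-key"}) falls back to the environment value, while a non-blank UI value wins.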

Method 2: Environment Variables

Add these variables to your .env file or container environment:

Variable           | Description                                              | Default
-------------------|----------------------------------------------------------|---------------------
OPENAI_API_KEY     | OpenAI API key                                           | (empty)
GEMINI_API_KEY     | Google Gemini API key                                    | (empty)
ANTHROPIC_API_KEY  | Anthropic Claude API key                                 | (empty)
OLLAMA_API_URL     | Ollama server URL                                        | http://ollama:11434
AI_PROVIDER        | Default AI provider (openai, gemini, anthropic, ollama)  | (auto-detect)
AI_MODEL           | Default model override                                   | (provider default)
AI_QUERY_TIMEOUT   | Maximum time in seconds for AI API calls                 | 60
AI_MAX_RESULT_ROWS | Maximum rows returned from AI-generated queries          | 10000
AI_SHOW_FAILED_SQL | Show the generated SQL even when validation fails        | false
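To show how these variables and their defaults fit together, here is a small sketch that reads them into a typed structure. The class AIEnvConfig, the function load_ai_env, and the accepted boolean spellings are assumptions for illustration only; the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AIEnvConfig:
    openai_api_key: str
    gemini_api_key: str
    anthropic_api_key: str
    ollama_api_url: str
    provider: str          # empty string means auto-detect
    model: str             # empty string means provider default
    query_timeout: int     # seconds
    max_result_rows: int
    show_failed_sql: bool


def _as_bool(raw: str) -> bool:
    # Assumption: which spellings count as "true" is not documented.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_ai_env(env: Mapping[str, str] = os.environ) -> AIEnvConfig:
    """Read the documented variables, applying the documented defaults."""
    return AIEnvConfig(
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        gemini_api_key=env.get("GEMINI_API_KEY", ""),
        anthropic_api_key=env.get("ANTHROPIC_API_KEY", ""),
        ollama_api_url=env.get("OLLAMA_API_URL", "http://ollama:11434"),
        provider=env.get("AI_PROVIDER", "").strip().lower(),
        model=env.get("AI_MODEL", "").strip(),
        query_timeout=int(env.get("AI_QUERY_TIMEOUT", "60")),
        max_result_rows=int(env.get("AI_MAX_RESULT_ROWS", "10000")),
        show_failed_sql=_as_bool(env.get("AI_SHOW_FAILED_SQL", "false")),
    )
```

Calling load_ai_env({}) with an empty mapping yields the defaults listed in the table, which can be a quick way to sanity-check a .env file before deploying.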

Provider Setup

OpenAI

  1. Create an account at platform.openai.com
  2. Go to API Keys and create a new key
  3. Set OPENAI_API_KEY in your environment or enter it in the Admin UI

Available models: gpt-4o, gpt-4o-mini, gpt-4.1-mini, gpt-4.1

Default model: gpt-4o-mini

Google Gemini

  1. Go to aistudio.google.com
  2. Create an API key
  3. Set GEMINI_API_KEY in your environment or enter it in the Admin UI

Available models: gemini-2.5-flash, gemini-2.5-pro

Default model: gemini-2.5-flash

Note: The google-genai Python package must be installed. It is included in the default DataReporter Docker image.

Anthropic Claude

  1. Create an account at console.anthropic.com
  2. Go to API Keys and create a new key
  3. Set ANTHROPIC_API_KEY in your environment or enter it in the Admin UI

Available models: claude-sonnet-4-20250514, claude-haiku-4-5-20251001

Default model: claude-sonnet-4-20250514

Ollama (Self-Hosted)

Ollama lets you run open-source AI models locally without sending data to external APIs.

  1. Install Ollama on a server accessible from your DataReporter instance (ollama.com)
  2. Pull a model: ollama pull deepseek-r1:7b
  3. Set OLLAMA_API_URL to your Ollama server address (e.g., http://ollama-host:11434)

Recommended models: deepseek-r1:7b, llama3:8b, codellama:13b

Default model: deepseek-r1:7b

Note: Ollama is always shown as available if a URL is configured, since it does not require an API key. Make sure the Ollama server is running and accessible.
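Because a configured URL does not prove the server is reachable, it can help to check connectivity from the DataReporter host. The sketch below queries Ollama's /api/tags endpoint, which lists locally pulled models. The function name list_ollama_models is hypothetical, and this is a standalone diagnostic, not part of DataReporter.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional


def list_ollama_models(base_url: str, timeout: float = 5.0) -> Optional[List[str]]:
    """Return the names of models the Ollama server reports, or None.

    None means the server could not be reached or did not return the
    expected JSON object.
    """
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, or invalid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return [m.get("name", "") for m in payload.get("models", [])]
```

Run it with the same address you set in OLLAMA_API_URL, for example list_ollama_models("http://ollama-host:11434"). A result of None suggests a network or firewall problem; an empty list suggests the server is up but no model has been pulled yet.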

Choosing a Default Provider

If no default provider is set, DataReporter auto-detects the first available provider (based on which API keys are configured). To set a specific default:

  • Admin UI: Select from the “Default Provider” dropdown in Settings > General
  • Environment: Set AI_PROVIDER=gemini (or openai, anthropic, ollama)
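The selection logic can be sketched as follows. The detection order shown here is an assumption (the documentation only says "first available"), and pick_provider and available_providers are hypothetical names, not DataReporter functions.

```python
from typing import List, Mapping, Optional

# Assumption: the real detection order is not documented.
DETECTION_ORDER = ("openai", "gemini", "anthropic", "ollama")

# Setting that makes each provider count as "available".
AVAILABILITY_VARS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": "OLLAMA_API_URL",  # Ollama needs a URL rather than a key
}


def available_providers(settings: Mapping[str, str]) -> List[str]:
    """List providers whose key (or URL, for Ollama) is non-blank."""
    return [
        name
        for name in DETECTION_ORDER
        if settings.get(AVAILABILITY_VARS[name], "").strip()
    ]


def pick_provider(settings: Mapping[str, str]) -> Optional[str]:
    """Use the explicit default if set, else the first available provider."""
    explicit = settings.get("AI_PROVIDER", "").strip().lower()
    if explicit:
        return explicit
    found = available_providers(settings)
    return found[0] if found else None
```

With only GEMINI_API_KEY and OLLAMA_API_URL present, this sketch selects gemini; setting AI_PROVIDER explicitly overrides the detection.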

Choosing a Default Model

Each provider has a sensible default model. To override:

  • Admin UI: Enter the model name in the “Default Model” field
  • Environment: Set AI_MODEL=gpt-4o

Leave blank to use the provider’s default model.
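As a compact summary, the defaults listed under Provider Setup and the override rule can be expressed like this. The mapping values are taken from this page; the function resolve_model is a hypothetical illustration.

```python
# Default models as listed in the Provider Setup section.
PROVIDER_DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "gemini": "gemini-2.5-flash",
    "anthropic": "claude-sonnet-4-20250514",
    "ollama": "deepseek-r1:7b",
}


def resolve_model(provider: str, override: str = "") -> str:
    """Return the override if one is given, else the provider's default."""
    override = override.strip()
    if override:
        return override
    return PROVIDER_DEFAULT_MODELS[provider]
```

Note that a single override such as AI_MODEL=gpt-4o only makes sense for the provider that actually offers that model, so it is worth setting the default provider and default model together.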

Security Considerations

  • SQL validation: All AI-generated queries are validated to ensure they are read-only SELECT statements. INSERT, UPDATE, DELETE, DROP, and other dangerous statements are automatically rejected.
  • Injection protection: The validator blocks SQL injection patterns including multi-statement execution, file operations (INTO OUTFILE), and timing attacks (SLEEP, BENCHMARK).
  • Schema-only access: The AI receives only table and column names from the database schema. It does not have access to actual data rows.
  • Permission enforcement: Users can only query data sources they have been granted access to. The AI feature respects DataReporter’s existing permission model.
  • API key storage: Keys configured via the Admin UI are stored encrypted in the database and masked when displayed (only the last 4 characters are visible). They are never exposed to non-admin users or included in API responses.
  • Result limits: Query results are capped at AI_MAX_RESULT_ROWS (default 10,000) to prevent accidentally loading massive datasets.
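To make the validation rules above more concrete, here is a deliberately simplified sketch of a read-only check. It is not DataReporter's validator: the function is_read_only_select and the keyword list are assumptions, and plain keyword matching will reject some legitimate queries (for example, a string literal containing the word DROP). A production validator would more likely rely on a SQL parser.

```python
import re

# Assumed block list: statements and functions named in the text above,
# plus a few other common write/DDL keywords.
_FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|CREATE|GRANT|REVOKE"
    r"|SLEEP|BENCHMARK)\b"
    r"|\bINTO\s+(OUTFILE|DUMPFILE)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Return True only for a single SELECT with no blocked keywords."""
    stmt = sql.strip().rstrip(";").strip()
    if not stmt:
        return False
    if ";" in stmt:
        # A remaining semicolon implies more than one statement.
        return False
    if not re.match(r"SELECT\b", stmt, re.IGNORECASE):
        return False
    return _FORBIDDEN.search(stmt) is None
```

Under these assumptions, a query such as SELECT name FROM customers passes, while SELECT 1; DELETE FROM customers and SELECT SLEEP(5) are rejected.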

Troubleshooting

Problem                              | Solution
-------------------------------------|----------------------------------------------------------------------------------------------------------
“AI provider not configured” message | Set at least one API key via the Admin UI or environment variables
“No schema available” error          | Go to Settings > Data Sources, click your data source, and click “Refresh Schema”
Timeout errors                       | Increase AI_QUERY_TIMEOUT or use a faster model (e.g., gpt-4o-mini instead of gpt-4o)
Poor query quality                   | Try a more capable model, or be more specific in your questions. Include table names if you know them.
Ollama connection refused            | Verify the Ollama server is running and the URL is correct. Check network/firewall rules between containers.