# dotAI
dotAI integrates powerful AI tools into your dotCMS instance, opening new horizons of automation — content and image generation, semantic search, and more. Through workflows, dotAI can perform batch operations, such as adding images to any content that's missing one, generating SEO metadata for large swaths of content, adding content tags, and numerous other tasks.
dotAI supports multiple AI service providers, configured via the providerConfig JSON in the App settings; OpenAI is a commonly used provider. dotAI will eventually be enabled by default, but at present it requires one of two activation methods:
- Enable the dotAI tool through the admin panel (dotCMS 24.04.05 or later), or
- Deploy the dotAI plugin (earlier versions that meet the requirements)
## Requirements
This feature requires the following:
- An API key for your chosen AI provider (e.g., OpenAI), obtained separately;
- Postgres 18 with the pgvector extension installed.
  - If you're on dotCMS Cloud, we'll handle it!
  - For self-hosted customers, see below.
### Self-Hosted
For embeddings to function, a vector extension must be added to the Postgres database. The dotAI plugin adds this extension automatically, but the process requires that the dotCMS database user have superuser privileges, so that extensions can be installed.
If the database user lacks sufficient rights, IT or an administrator may need to add the extension manually. The simplest route is the pgvector/pgvector Docker image, available via the command docker pull pgvector/pgvector. The image can be applied in docker-compose.yml by adding it to the database service:
```yaml
db:
  image: pgvector/pgvector
```
Note also that these privileges are only required for the extension's installation, and not for its subsequent use.
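Alternatively, an administrator connected to the dotCMS database as a superuser can install the extension manually. A minimal sketch, using pgvector's standard installation statement:

```sql
-- Run once against the dotCMS database, as a user with superuser privileges
CREATE EXTENSION IF NOT EXISTS vector;
```

After the extension exists, the dotCMS database user needs no special privileges to use it.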
## App Configuration
dotAI is configured via a single Provider Config (JSON) field in Settings > Apps > dotAI. All configuration — provider credentials, model selection, prompts, and behavioral settings — is expressed as a JSON object in this field.
The Provider Config JSON approach is relatively new; for the prior config, see Legacy Configuration. For transitioning, see our migration guide.
The JSON has up to four top-level properties: chat, embeddings, image, and settings.
The chat, embeddings, and image properties accept the following fields:
| Field | Description |
|---|---|
| provider | The AI provider to use (e.g., "openai"). |
| apiKey | API key for this provider. Masked as ***** in the UI after saving. |
| model | Model name(s) to use. Supply a comma-separated list to enable fallback behavior — when the first model is unavailable, the next is tried. Example: "gpt-4o,gpt-4o-mini" |
| endpoint | Custom API endpoint URL. Omit for standard provider endpoints. |
| maxTokens | Maximum tokens per response. |
| maxRetries | Number of retry attempts on failure. |
| temperature | (chat section only) Controls response randomness (0–2). |
The settings property carries behavioral and prompt configuration:
| Setting | Default | Description |
|---|---|---|
| rolePrompt | "You are dotCMSbot..." | Prompt describing the role the AI plays. |
| textPrompt | "Use Descriptive writing style." | Prompt describing the overall writing style of generated text. |
| imagePrompt | "Use 16:9 aspect ratio." | Aspect ratio or visual style guidance for image generation. |
| imageSize | "1024x1024" | Default dimensions of generated images. |
| listenerIndexer | {} | JSON object mapping index names to Content Types for auto-indexing. Most useful on the System Host to propagate indexes across sites. Example: { "default": "blog,news,webPageContent" } |
| temperature | 1 | Default temperature for chat completions. |
| embeddingsSplitAtTokens | 512 | Token chunk size for splitting content during embedding. |
| embeddingsMinimumTextLength | 64 | Minimum character length for a text chunk to be embedded. |
| embeddingsMinimumFileSize | 1024 | Minimum file size (bytes) for binary files to be embedded. |
| embeddingsFileExtensions | pdf,doc,docx,txt,html | File extensions eligible for embedding. |
| embeddingsSearchThreshold | 0.25 | Default similarity threshold for embedding searches. |
| embeddingsThreads | 3 | Number of concurrent embedding threads. |
| embeddingsThreadsMax | 6 | Maximum number of concurrent embedding threads. |
| embeddingsThreadsQueue | 10000 | Embedding thread queue depth. |
| embeddingsCacheTtlSeconds | 600 | Embeddings cache TTL in seconds. |
| embeddingsCacheSize | 1000 | Maximum number of entries in the embeddings cache. |
| embeddingsDeleteOldOnUpdate | true | Whether to delete old embeddings when content is updated. |
| debugLogging | false | Enable verbose debug logging. |
Only include settings that differ from the defaults shown above — omitted keys fall back to their default values.
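For instance, a configuration that only lowers the chat temperature and enables debug logging would include just those two keys; every other setting keeps its default (the API key is a placeholder):

```json
{
  "chat": { "provider": "openai", "apiKey": "sk-...", "model": "gpt-4o" },
  "settings": {
    "temperature": 0.2,
    "debugLogging": true
  }
}
```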
Each site can have its own providerConfig, or inherit the configuration from SYSTEM_HOST. To configure a specific site, select it from the site picker in Settings > Apps > dotAI before saving.
Once installed and configured, dotCMS surfaces a variety of dotAI utilities. The Using dotAI section below provides a brief overview of each, with links to further documentation.
## Configuration Examples
### Minimal (standard OpenAI, default behavior)
Sufficient for most deployments. Omit any section you don't use.
```json
{
  "chat": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "gpt-4o",
    "maxTokens": 16384,
    "maxRetries": 3
  },
  "embeddings": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "text-embedding-ada-002"
  },
  "image": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "dall-e-3"
  }
}
```
### With Custom Settings
Use the settings block only for values that differ from the defaults.
```json
{
  "chat": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "gpt-4o,gpt-4o-mini",
    "maxTokens": 16384,
    "temperature": 0.7,
    "maxRetries": 3,
    "endpoint": "https://your-proxy.example.com/v1/chat/completions"
  },
  "embeddings": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "text-embedding-ada-002"
  },
  "image": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "dall-e-3"
  },
  "settings": {
    "rolePrompt": "You are a helpful assistant for Acme Corp.",
    "textPrompt": "Be concise and professional.",
    "imagePrompt": "Use a clean, corporate visual style.",
    "imageSize": "1792x1024",
    "listenerIndexer": { "default": "blog,news,webPageContent" },
    "embeddingsSplitAtTokens": 256,
    "embeddingsSearchThreshold": 0.3,
    "debugLogging": false
  }
}
```
This configuration is more advanced, including model fallbacks and a custom endpoint.
### Per-Site Config
To configure a specific site, go to Settings > Apps > dotAI, select the target site from the site picker, and save a separate providerConfig JSON.
To verify which host's config is being applied, check the configHost field in the GET response:
```
GET /api/v1/ai/completions/config?siteId=your-site-id
```

```json
{
  "providerConfig": "{ ... }",
  "configHost": "SYSTEM_HOST"
}
```
A configHost of SYSTEM_HOST means the site is inheriting the global configuration; if configHost returns the target site's hostname, the per-site config is active.
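For example, after saving a dedicated providerConfig for a site, the same call might return something like the following (the hostname is illustrative):

```json
{
  "providerConfig": "{ ... }",
  "configHost": "demo.yoursite.com"
}
```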
## Legacy Configuration
To transition from the legacy fields to the new JSON providerConfig approach, see our migration guide.
Before the introduction of the providerConfig approach described above, dotAI App configuration followed a different, multiple-field pattern. A full list of the legacy fields follows:
| Field | Description |
|---|---|
| API Key | Your account's API key; must be present to utilize OpenAI services. |
| Model Names | A comma-separated list of the models used to generate OpenAI responses. Including multiple models also enables fallback behavior; when a specified model is not found, the next one is used. Example: gpt-4o-mini,gpt-3.5-turbo-16k,gpt-4o |
| Role Prompt | A prompt describing the role (if any) the text generator will play for the dotCMS user. |
| Text Prompt | A prompt describing the overall writing style of generated text. |
| Tokens per Minute | Permits configurable rate limiting for text responses based on token use. |
| API per Minute | Permits configurable rate limiting for text responses based on API call volume. |
| Max Tokens | Permits configurable rate limiting for token consumption per API response. |
| Completion model enabled | If checked, causes text responses to incorporate completions. Completions are useful for interactive chat modes and other dynamic uses, capable of incorporating response histories into future responses. |
| Image Model Names | A comma-separated list of the image models used to generate graphical responses. Including multiple models also enables fallback behavior; when a specified model is not found, the next one is used. |
| Image Prompt | A specification of output aspect ratio. If the ratio specified differs significantly from the Image Size (below), the image will "letterbox" accordingly. |
| Image Size | Selects the default dimensions of generated images. |
| Image Tokens per Minute | Permits configurable rate limiting for image responses based on token use. |
| Image API per Minute | Permits configurable rate limiting for image responses based on API call volume. |
| Image Max Tokens | Permits configurable rate limiting for token consumption per image generation API response. |
| Image Completion model enabled | If checked, causes image responses to incorporate completions. Completions are useful for interactive chat modes and other dynamic uses, capable of incorporating response histories into future responses. |
| Embeddings Model Names | A comma-separated list of the models used to generate embeddings. Including multiple models also enables fallback behavior; when a specified model is not found, the next one is used. |
| Embeddings Tokens per Minute | Permits configurable rate limiting for embeddings responses based on token use. |
| Embeddings API per Minute | Permits configurable rate limiting for embeddings responses based on API call volume. |
| Embeddings Max Tokens | Permits configurable rate limiting for token consumption per embeddings API response. |
| Embeddings Completion model enabled | If checked, causes embedding responses to incorporate completions. Completions are useful for interactive chat modes and other dynamic uses, capable of incorporating response histories into future responses. |
| Auto Index Content Config | Allows App-level configuration of content indexes used as the basis for text generation. Takes a JSON mapping; each property name becomes an index, and each value is the Content Type it will take as its target content. Optional; indexes are also fully configurable under the dotAI Tool. Most useful when configured in the System Host, as this will instantiate the indexes across multiple sites. |
| Custom Properties | Additional key-value pairs for dotAI configuration. |
## Using dotAI
The dotAI feature includes several components, detailed separately:
| Component | Description |
|---|---|
| dotAI Tool | The dotAI admin-panel interface can be found via Tools -> dotAI, allowing direct usage, index definition, and general configuration of the feature. |
| AI Blocks | dotAI's integration with the Block Editor field provides the most straightforward way to get started generating content. |
| AI Workflows | AI Workflow Sub-Actions permit a range of asynchronous automations utilizing AI — such as generating entire contentlets on demand. |
| AI Viewtool | The AI Viewtool, accessible through $ai, allows AI operations via Velocity script. |
| API Resources | REST API endpoints allow AI operations to be performed headlessly. |