AWS Rekognition Integration for dotCMS - Setup Guide

Native integration that provides automated image classification and tagging using Amazon's AI-powered Rekognition service.

What It Does#


The AWS Rekognition integration in dotCMS provides AI-powered automatic image tagging through workflow automation. When an image passes through a workflow action configured with the Rekognition actionlet, AWS analyzes the image and automatically generates relevant tags based on detected objects, scenes, and activities.

Key capabilities:

  • Automatic object and scene detection in images
  • AI-generated tags added directly to content
  • Workflow-based image analysis automation
  • Configurable confidence thresholds and label limits
  • Per-site AWS credential configuration
  • Duplicate-tagging prevention through the TAGGED_BY_AWS marker tag

Prerequisites#


Before setting up this integration, you need:

  1. dotCMS instance with admin access
  2. AWS account with billing enabled
  3. Content types with both:
    • Binary field (for images)
    • Tag field (for storing generated tags)
  4. Custom workflow to trigger the Rekognition actionlet

AWS Setup#


Step 1: Create or Select an AWS Account#

  1. Go to AWS Console
  2. Sign in or create a new AWS account
  3. Make sure billing is enabled

Step 2: Enable Amazon Rekognition Service#

  1. In AWS Console, go to Services → Rekognition
  2. Click Get Started with Amazon Rekognition if this is your first time
  3. The service is enabled automatically in your account

Step 3: Create IAM User for dotCMS#

Best practice: Create a dedicated IAM user with limited permissions for dotCMS.

  1. Go to IAM → Users
  2. Click Create user
  3. Enter username: dotcms-rekognition
  4. Click Next

Step 4: Assign Rekognition Permissions#

  1. Select Attach policies directly (third option)
  2. Search for Rekognition
  3. Check the box for AmazonRekognitionReadOnlyAccess (recommended for security)
    • Or AmazonRekognitionFullAccess if you need full access
  4. Click Next
  5. Review the user details
  6. Click Create user
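
If both managed policies grant more than you want, a custom policy limited to DetectLabels (the only API call this guide says the integration uses) is one option. The sketch below builds such a policy document. The helper name is hypothetical, and whether your dotCMS version needs any further actions is an assumption to verify before relying on it.

```python
import json


def build_detect_labels_policy() -> dict:
    """Return an IAM policy document that allows only rekognition:DetectLabels.

    Hypothetical helper for illustration; not part of dotCMS or the AWS SDK.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "rekognition:DetectLabels",
                # No specific resource is targeted by this call, so "*" is used.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM policy editor.
    print(json.dumps(build_detect_labels_policy(), indent=2))
```

The printed JSON can be pasted into the IAM console's JSON policy editor and attached to the dotcms-rekognition user in place of a managed policy.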

Step 5: Create Access Keys#

After the user is created, you need to generate access keys separately:

  1. Go to IAM → Users
  2. Click on the dotcms-rekognition user you just created
  3. Go to the Security credentials tab
  4. Scroll down to the Access keys section
  5. Click Create access key
  6. Select use case:
    • Choose Third-party service or Application running outside AWS
    • Check the confirmation checkbox
  7. Click Next
  8. (Optional) Add a description tag like "dotCMS Rekognition Integration"
  9. Click Create access key
  10. IMPORTANT: Copy both values immediately:
    • Access key ID - Looks like AKIAIOSFODNN7EXAMPLE
    • Secret access key - Looks like wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
  11. Store these securely - the secret key is only shown once
  12. Click Done

Security Note: Never commit AWS credentials to source control. Store them only in dotCMS Apps configuration.

dotCMS Configuration#


Step 1: Configure AWS Rekognition App#

  1. Go to Settings → Apps
  2. Find Amazon Rekognition in the list
  3. Click to configure
  4. Select the site you want to configure (or System Host for global)
  5. Fill in the configuration:
    • AWS Access Key: Paste your AWS Access Key ID
    • AWS Secret Access Key: Paste your AWS Secret Access Key
    • Max Labels: Maximum number of tags to generate (default: 15)
      • Range: 1-100
      • Higher values = more tags generated
    • Min Confidence: Minimum confidence threshold (default: 75)
      • Range: 0-100
      • Higher values = only high-confidence tags
      • Lower values = more tags but potentially less accurate
  6. Click Save

Configuration is per-site: Each site can have its own AWS credentials and settings.
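
The documented ranges and defaults for the two tuning fields can be expressed as a small validation sketch. The class name and error messages are hypothetical; only the ranges (1-100 and 0-100) and the defaults (15 and 75) come from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RekognitionSettings:
    """Hypothetical container mirroring the two tuning fields in the Apps form."""

    max_labels: int = 15          # documented default
    min_confidence: float = 75.0  # documented default

    def __post_init__(self) -> None:
        # Reject values outside the ranges listed in the configuration step.
        if not 1 <= self.max_labels <= 100:
            raise ValueError("Max Labels must be between 1 and 100")
        if not 0 <= self.min_confidence <= 100:
            raise ValueError("Min Confidence must be between 0 and 100")
```

For example, `RekognitionSettings(max_labels=10, min_confidence=85)` describes a stricter, product-image style configuration, while `RekognitionSettings(max_labels=0)` raises `ValueError`.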

Step 2: Verify Content Type Structure#

Your content type must have:

  1. A Binary field - For storing the image

    • Field type: Binary
    • Example variable name: image, photo, asset
  2. A Tag field - For storing generated tags

    • Field type: Tag
    • Example variable name: tags, imageTags

If your content type is missing either field, add them before proceeding.

Step 3: Create or Edit Workflow#

  1. Go to Settings → Workflow Schemes
  2. Click Add Workflow or edit an existing workflow
  3. Create or edit a workflow action (e.g., "Auto Tag Image")
  4. Under Sub-Actions, click Add Sub-Action
  5. Select Auto Tag Images - AWS from the dropdown
  6. The actionlet has no additional parameters (configuration comes from Apps)
  7. Click Save

Important: The Rekognition actionlet should be placed after the Save actionlet in the workflow to ensure the image is saved before analysis.

Step 4: Assign Workflow to Content Type#

  1. Go to Content Types
  2. Select the content type that contains images
  3. Go to the Workflow tab
  4. Assign the workflow scheme that contains your Rekognition action
  5. Click Save

Usage#


Auto-Tagging Images#

  1. Create or edit content with an image
  2. Upload or select an image in the Binary field
  3. Save the content (this saves the image)
  4. Click the workflow action button (e.g., "Auto Tag Image")
  5. AWS analyzes the image and generates tags automatically
  6. Tags appear in the Tag field on the content
  7. Special tag added: TAGGED_BY_AWS is automatically added to prevent re-tagging

Viewing Generated Tags#

After triggering the workflow action:

  1. Refresh the content editor or navigate away and back
  2. Check the Tag field - you should see AI-generated tags
  3. Tags represent objects, scenes, and activities detected in the image
  4. The special tag TAGGED_BY_AWS indicates this image has been processed

How It Works Under the Hood#

When you trigger the Rekognition workflow action:

  1. Validation checks:

    • Verifies content type has a Tag field
    • Checks if image has already been tagged (TAGGED_BY_AWS tag exists)
    • Confirms a Binary field with an image exists
    • Validates AWS Rekognition app is configured
  2. Image processing:

    • If the image is larger than 5MB, it's automatically resized to 1000px width
    • The image is sent to the AWS Rekognition API in the us-west-2 region
  3. Tag generation:

    • AWS analyzes the image using DetectLabels API
    • Returns labels matching the configured Max Labels and Min Confidence
    • Tags are added to the content's Tag field
    • Special TAGGED_BY_AWS tag is added to prevent duplicate processing
  4. Content refresh:

    • Content is refreshed to show new tags
    • Cache is cleared for the contentlet
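
The validation checks listed above can be pictured as a simple ordered guard. This is only a sketch: the function and parameter names are hypothetical, the messages are taken from the Troubleshooting section of this guide, and the real actionlet may run its checks in a different order.

```python
from typing import Optional

TAGGED_MARKER = "TAGGED_BY_AWS"


def precheck(has_tag_field: bool, existing_tags: set,
             has_image: bool, app_configured: bool) -> Optional[str]:
    """Return the reason tagging is skipped, or None if analysis can proceed."""
    if not has_tag_field:
        return "There is no Tag Field in the Content Type"
    if TAGGED_MARKER in existing_tags:
        return "Tags already generated"
    if not has_image:
        return "There is no Binary Field or an Image is not set in it"
    if not app_configured:
        return "There is no config set, please set it via Apps Tool"
    return None
```

Only when every check passes (the function returns `None`) would the image be sent to AWS, which is why a misconfigured content type never results in a billable API call.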

Understanding Confidence Scores#

Min Confidence determines the quality threshold for tags:

  • 90-100: Only extremely confident matches (fewer tags, higher precision)
  • 75-89 (default): Good balance of accuracy and coverage
  • 50-74: More tags but potentially less accurate
  • Below 50: Not recommended - many false positives

Example:

  • Image of a golden retriever playing in a park
  • Min Confidence 95: "Dog", "Golden Retriever", "Animal"
  • Min Confidence 75: the above plus "Park", "Grass", "Outdoors", "Playing"
  • Min Confidence 50: the above plus "Mammal", "Canine", "Pet", "Nature", "Daylight"
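
To make the effect of the two settings concrete, the sketch below filters a list shaped like the `Labels` array of a DetectLabels response (each entry has a `Name` and a `Confidence`). In practice the service applies Max Labels and Min Confidence itself; this local version only illustrates the behavior. The function name and the sample confidence values are made up for illustration.

```python
def select_labels(labels, min_confidence=75.0, max_labels=15):
    """Keep labels at or above min_confidence, highest first, capped at max_labels."""
    kept = [label for label in labels if label["Confidence"] >= min_confidence]
    kept.sort(key=lambda label: label["Confidence"], reverse=True)
    return [label["Name"] for label in kept[:max_labels]]


# Illustrative values only; real scores depend on the image.
sample = [
    {"Name": "Dog", "Confidence": 98.0},
    {"Name": "Park", "Confidence": 81.0},
    {"Name": "Pet", "Confidence": 62.0},
]

print(select_labels(sample, min_confidence=95))  # ['Dog']
print(select_labels(sample, min_confidence=75))  # ['Dog', 'Park']
print(select_labels(sample, min_confidence=50))  # ['Dog', 'Park', 'Pet']
```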

Preventing Duplicate Tagging#

The integration automatically prevents re-tagging images:

  1. First time workflow runs: Tags are generated and TAGGED_BY_AWS is added
  2. Subsequent runs: Actionlet detects TAGGED_BY_AWS tag and skips processing
  3. To re-tag an image: Manually remove the TAGGED_BY_AWS tag, then run the workflow again
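
The marker-tag mechanism can be sketched as follows. This is an illustration of the behavior described above, not the actionlet's actual code; the function name and return shape are assumptions.

```python
TAGGED_MARKER = "TAGGED_BY_AWS"


def apply_tags(existing_tags, new_labels):
    """Return (tags, processed); nothing changes when the marker is present."""
    if TAGGED_MARKER in existing_tags:
        return list(existing_tags), False
    merged = list(existing_tags)
    # Append the generated labels, then the marker, skipping duplicates.
    for tag in list(new_labels) + [TAGGED_MARKER]:
        if tag not in merged:
            merged.append(tag)
    return merged, True


tags, processed = apply_tags(["summer"], ["Dog", "Park"])
# tags is now ['summer', 'Dog', 'Park', 'TAGGED_BY_AWS'] and processed is True
tags, processed = apply_tags(tags, ["Cat"])
# Second run is skipped: tags is unchanged and processed is False
```

Removing `TAGGED_BY_AWS` from the list and calling the function again corresponds to the manual re-tag procedure in step 3.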

Best Practices#


Confidence Threshold Selection#

High-value content (75-90 confidence):

  • Marketing assets
  • Product images
  • Hero images
  • Use higher threshold for precision

Large image libraries (60-75 confidence):

  • DAM organization
  • Internal assets
  • Stock photos
  • Lower threshold for better coverage

Max Labels Selection#

Product images (5-10 labels):

  • Focused tagging
  • Specific objects
  • Cleaner tag sets

General DAM (15-25 labels):

  • Comprehensive tagging
  • Better discoverability
  • Multiple search angles

Workflow Design#

Recommended workflow order:

  1. Save Content actionlet (save the image first)
  2. Auto Tag Images - AWS actionlet (analyze and tag)
  3. (Optional) Notification actionlet (alert editors)
  4. (Optional) Publish actionlet (auto-publish once tagging completes)

Image Format Requirements#

Supported formats:

  • JPEG
  • PNG

NOT supported:

  • WebP (will fail with "Invalid image format" error)
  • GIF
  • BMP
  • TIFF
  • SVG

If using WebP images: Convert to JPEG or PNG before uploading, or configure dotCMS to auto-convert WebP to JPEG/PNG.

Image Size Optimization#

  • Images over 5MB are automatically resized to 1000px width
  • For faster processing, resize images before upload
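  • Maximum file size accepted by AWS: 15MB for images referenced from Amazon S3, but 5MB for images sent as raw bytes (which matches the 5MB resize threshold above)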
  • Recommended: Use JPEG for photos, PNG for graphics with transparency
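
The resize rule can be sketched as a small calculation. The guide only states that images over 5MB are resized to 1000px width; preserving the aspect ratio and reading "5MB" as 5 MiB are assumptions made here, and the function name is hypothetical.

```python
RESIZE_THRESHOLD_BYTES = 5 * 1024 * 1024  # "5MB" read as 5 MiB (assumption)
TARGET_WIDTH = 1000


def plan_resize(size_bytes: int, width: int, height: int):
    """Return the (width, height) to send, assuming aspect ratio is preserved."""
    if size_bytes <= RESIZE_THRESHOLD_BYTES:
        return width, height  # small enough, sent unchanged
    scale = TARGET_WIDTH / width
    return TARGET_WIDTH, max(1, round(height * scale))


print(plan_resize(6 * 1024 * 1024, 4000, 3000))  # (1000, 750)
print(plan_resize(2 * 1024 * 1024, 4000, 3000))  # (4000, 3000)
```

Under this reading, a 4000x3000 photo over the threshold is analyzed at 1000x750, so fine details that matter for tagging may be lost. Resizing before upload gives you control over that trade-off.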

Security#

  • Use AmazonRekognitionReadOnlyAccess IAM policy (minimum required)
  • Create dedicated IAM user for dotCMS (don't use root account)
  • Configure per-site credentials for multi-tenant environments
  • Rotate AWS access keys periodically

Troubleshooting#


"There is no config set, please set it via Apps Tool"#

Cause: AWS Rekognition app is not configured for the current site.

Solution:

  1. Go to Settings → Apps → Amazon Rekognition
  2. Select the correct site from dropdown
  3. Configure AWS credentials
  4. Save configuration

"There is no Tag Field in the Content Type"#

Cause: The content type doesn't have a Tag field.

Solution:

  1. Go to Content Types
  2. Edit the content type
  3. Add a new field with type "Tag"
  4. Save the content type

"There is no Binary Field or an Image is not set in it"#

Cause: Either the content type has no Binary field, or no image is uploaded.

Solution:

  1. Verify content type has a Binary field
  2. Upload an image to the Binary field
  3. Save the content before running the workflow

"Tags already generated"#

Cause: The image has already been tagged (has TAGGED_BY_AWS tag).

Solution:

  • This is expected behavior to prevent duplicate API calls
  • To re-tag: Remove the TAGGED_BY_AWS tag manually, then run workflow again

"Invalid image format" error#

Error in logs:

Unable to autogenerate the rekognition tags: Request has invalid image format
(Service: AmazonRekognition; Status Code: 400; Error Code: InvalidImageFormatException)

Cause: The image format is not supported by AWS Rekognition.

Solution:

  • AWS Rekognition only supports JPEG and PNG formats
  • Common unsupported formats: WebP, GIF, BMP, TIFF, SVG
  • Convert the image to JPEG or PNG before uploading
  • Check file extension matches actual format (sometimes .jpg files are actually WebP)
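
Because a file extension can disagree with the actual format, one way to check a file before uploading is to inspect its leading bytes. The sketch below recognizes the standard signatures for JPEG, PNG, WebP, GIF, and BMP; the function names are hypothetical and this check is not part of dotCMS.

```python
def sniff_image_format(data: bytes) -> str:
    """Guess the image format from its leading bytes (file signature)."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:2] == b"BM":
        return "bmp"
    return "unknown"


def is_supported(data: bytes) -> bool:
    """True only for the two formats this guide lists as supported."""
    return sniff_image_format(data) in {"jpeg", "png"}
```

A typical use is `is_supported(open("photo.jpg", "rb").read(16))`, which returns `False` for a WebP file that was merely renamed to `.jpg`.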

No tags generated / Empty tag field#

Possible causes:

  1. Unsupported image format (see above)

    • Convert to JPEG or PNG
  2. Min Confidence too high

    • Lower the Min Confidence threshold in Apps configuration
    • Try 60-70 for broader tag coverage
  3. Image quality is poor

    • AWS may not detect objects in low-quality/blurry images
    • Try with a clearer, higher-resolution image
  4. Image content is abstract

    • Rekognition works best with concrete objects and scenes
    • Abstract art or patterns may generate few tags
  5. AWS credentials are invalid

    • Verify Access Key ID and Secret Access Key are correct
    • Check IAM user has Rekognition permissions
    • Test credentials in AWS CLI or Console

AWS API Errors#

"The security token included in the request is invalid"

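  • The AWS Access Key ID is incorrect, inactive, or deleted (a wrong Secret Access Key usually produces a signature mismatch error instead)
  • Re-enter the Access Key ID in the Apps configuration; if the secret key is lost, create a new access key for the IAM user, because the secret is shown only once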

"Access Denied"

  • IAM user lacks Rekognition permissions
  • Add AmazonRekognitionReadOnlyAccess policy to IAM user

"Region not supported"

  • The integration uses the us-west-2 region (hardcoded)
  • Ensure Rekognition is available in us-west-2
  • Contact dotCMS support if you need a different region

Performance Issues#

Slow tagging:

  • Large images (>5MB) are automatically resized, which takes time
  • Resize images before upload for faster processing
  • AWS API response time varies (typically 1-3 seconds)

Workflow timeout:

  • Check dotCMS logs for detailed error messages
  • Verify network connectivity to AWS services
  • Check AWS service status page for outages

Technical Details#


API Implementation#

The integration uses:

  • AWS SDK for Java - com.amazonaws.services.rekognition
  • AWS Region: us-west-2 (Oregon)
  • API Method: DetectLabels - Object and scene detection
  • Authentication: Static AWS credentials (Access Key + Secret)
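
For orientation, a DetectLabels request carries three parameters relevant here: `Image`, `MaxLabels`, and `MinConfidence`. The sketch below assembles them in Python purely for illustration (the integration itself uses the AWS SDK for Java). The helper name is hypothetical, and sending the image as raw bytes rather than as an S3 reference is an assumption based on the resize behavior described in this guide.

```python
def build_detect_labels_request(image_bytes: bytes,
                                max_labels: int = 15,
                                min_confidence: float = 75.0) -> dict:
    """Assemble DetectLabels parameters using the guide's default settings."""
    if not image_bytes:
        raise ValueError("image_bytes must not be empty")
    return {
        "Image": {"Bytes": image_bytes},  # assumption: image sent inline
        "MaxLabels": max_labels,
        "MinConfidence": min_confidence,
    }
```

With the boto3 library installed and credentials configured, a dictionary of this shape could be passed as `client.detect_labels(**request)` on a Rekognition client created for `us-west-2`.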

Code Location#

In dotCMS core repository:

  • App Config: dotCMS/src/main/resources/apps/dotAmazonRekognition-config.yml
  • API Client: dotCMS/src/main/java/com/dotcms/rekognition/api/RekognitionAPI.java
  • Workflow Actionlet: dotCMS/src/main/java/com/dotcms/rekognition/actionlet/RekognitionActionlet.java

Supported Rekognition Features#

Currently implemented:

  • Object detection (DetectLabels)
  • Scene detection (DetectLabels)
  • Activity detection (DetectLabels)

Not currently implemented:

  • Face detection (DetectFaces)
  • Celebrity recognition (RecognizeCelebrities)
  • Text extraction (DetectText)
  • Unsafe content detection (DetectModerationLabels)
  • Custom label models

Note: The integration focuses on general-purpose image tagging. For advanced features, custom development would be required.

AWS Costs#


Amazon Rekognition pricing (as of 2026):

  • First 1 million images/month: $1.00 per 1,000 images analyzed
  • Over 1 million images/month: Volume discounts apply

Example costs:

  • 1,000 images = $1.00
  • 10,000 images = $10.00
  • 100,000 images = $100.00

See AWS Rekognition Pricing for current rates.
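
The example costs above follow from simple arithmetic, sketched here for the first pricing tier only. The rate is the one quoted in this guide and may be out of date; the function name is hypothetical, and volumes above one million images are deliberately rejected because the discounted rates are not stated here.

```python
PRICE_PER_1000_IMAGES = 1.00  # USD, first-tier rate quoted in this guide
FIRST_TIER_LIMIT = 1_000_000  # images per month


def estimate_monthly_cost(images: int) -> float:
    """Estimate the monthly cost in USD for up to one million images."""
    if images < 0:
        raise ValueError("images must be non-negative")
    if images > FIRST_TIER_LIMIT:
        raise ValueError("volume discounts apply; see the AWS pricing page")
    return images / 1000 * PRICE_PER_1000_IMAGES


print(estimate_monthly_cost(10_000))  # 10.0
```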

Cost optimization:

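  • Use the TAGGED_BY_AWS mechanism to prevent re-analyzing images (each analysis is billed per image)
  • Set an appropriate Min Confidence to reduce noise; this affects tag quality, not the per-image price
  • Limit Max Labels to keep tag sets manageable; this also does not change the per-image price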

Resources#


Personal Test Credentials#


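For testing/demo purposes only:

Access key: [REDACTED]
Secret access key: [REDACTED]

⚠️ WARNING: Live credentials must not be published in documentation. Any key pair that has appeared in a shared document should be treated as compromised and rotated immediately; generate your own test keys as described in Step 5.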

Next Steps#


This guide will be used as the basis for:

  1. AWS Rekognition integration page on dotcms.com
  2. Updated documentation on dev.dotcms.com/docs
  3. Sales enablement materials for AI-powered DAM capabilities