Amazon S3 annotations: attach rich, queryable context directly to your objects

AI summary · key takeaways

• S3 annotations support up to 1,000 named annotations per object, each up to 1 MB in size, totaling 1 GB per object, significantly more than the 2 KB limit of user-defined metadata or the 10-tag limit of object tags
• Annotations are mutable and can be modified or deleted independently without rewriting the underlying object, enabling context to evolve alongside data
• When S3 Metadata is enabled, annotations automatically flow into managed tables queryable via Amazon Athena and can be accessed for objects in any storage class without retrieval charges
• Annotations automatically move with objects during copy, replication, and cross-region transfers, eliminating the need for complex metadata synchronization workflows
• The feature supports AI-driven workflows through integration with tools like the S3 Tables MCP server, enabling natural language queries for data discovery

Today, we’re announcing a new metadata capability for Amazon Simple Storage Service (Amazon S3) called annotations, enabling you to attach rich, large-scale business context directly to your objects. You can store up to 1,000 named annotations per object, each up to 1 MB in size, totaling up to 1 GB per object, in flexible formats like JSON, XML, YAML, or plain text.

You can modify or delete an annotation at any time, without re-writing your objects, making it easy to keep your object context current. Organizations are building AI agents and autonomous workflows that need to find, understand, and act on data without human intervention. To support these agentic workflows, you need metadata that can evolve alongside the data, scale to petabytes of objects, and remain queryable without expensive retrieval.

With S3 annotations, you can store context such as AI-generated transcripts, content ratings, or technical specifications directly alongside your objects. Your context moves automatically with the object during copy, replication, and cross-region transfers, and S3 removes it when you delete the object. When you enable S3 Metadata, annotations automatically flow into fully managed annotation tables that you can query with Amazon Athena and other analytics engines.

Common use cases

Annotations solve complex metadata challenges across industries:

• Media & Entertainment: Track transcripts, content moderation results, subtitle files, and licensing metadata as separate annotations on video assets, eliminating the need to synchronize metadata across multiple media asset management systems.
• Financial Services: Attach AI-generated investment summaries and sentiment analysis to research documents, enabling autonomous research agents to discover relevant datasets through natural-language queries without maintaining separate metadata databases.
• Life Sciences: Annotate clinical trial data with regulatory status, patient cohort details, and approval chains, making compliance audits faster while keeping full context accessible for archived data in Amazon S3 Glacier storage classes without retrieval charges.

How annotations address metadata challenges

Amazon S3 already supports several ways to describe your objects.

System-defined metadata captures properties like size and storage class. Object tags support operational tasks like access control and lifecycle management. User-defined metadata lets you add small amounts of custom information at upload time.

While these capabilities work well for their intended purposes, they have limitations when you need to attach much richer context without building and maintaining separate metadata systems. Annotations address these needs by providing metadata capabilities at a fundamentally different scale and flexibility, offering up to 1 GB of mutable, queryable context per object, compared to 10 object tags or 2 KB of user-defined metadata that is fixed at upload time.

| Capability | Max size | Mutable? | Best for |
|---|---|---|---|
| System-defined metadata | Fixed | No | Object properties (size, storage class, creation time) |
| User-defined metadata | 2 KB | No (set at upload) | Small custom key-value pairs |
| Object tags | 10 tags, 128/256 characters per key/value | Yes | Access control, lifecycle rules, cost allocation |
| Annotations | 1 GB (1,000 × 1 MB) | Yes | Rich business context (JSON, XML, YAML, plain text) |

Today, metadata describing S3 objects often lives in separate databases or sidecar files, requiring complex synchronization workflows that can exceed data storage costs.

When you enable S3 Metadata annotation tables, this context becomes queryable at scale through Amazon Athena. AI agents can discover your data through natural language with the S3 Tables MCP server, which provides a standardized interface for AI models to query your annotations. You can query annotations for objects in any storage class, without restoring the objects or paying retrieval charges.

Getting started with annotations

To start using annotations, make sure your AWS Identity and Access Management (IAM) policy or bucket policy grants permissions for the s3:PutObjectAnnotation and s3:GetObjectAnnotation actions. You can then add annotations to any existing or new S3 object using the PutObjectAnnotation API. For example, a media company can attach technical specifications and AI-produced summaries to a video asset using the AWS Command Line Interface (AWS CLI):

    # Create a JSON file with technical metadata
    cat > mediainfo.json << 'EOF'
    {"codec":"H.265","resolution":"3840x2160","audio_tracks":8,"frame_rate":29.97}
    EOF

    # Attach it as an annotation
    aws s3api put-object-annotation \
      --bucket my-media-bucket \
      --key videos/documentary-2026.mp4 \
      --annotation-name mediainfo \
      --annotation-payload ./mediainfo.json

    # Attach a plain-text AI-generated summary as a separate annotation
    echo "A 90-minute nature documentary covering wildlife migration patterns across three continents, featuring aerial footage and underwater sequences. Languages: English, Spanish, Portuguese." > ai_summary.txt

    aws s3api put-object-annotation \
      --bucket my-media-bucket \
      --key videos/documentary-2026.mp4 \
      --annotation-name ai_summary \
      --annotation-payload ./ai_summary.txt

These commands attach two separate annotations to the same video object.
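Running these commands requires the two permissions named at the start of this section. As a minimal sketch, an identity-based policy granting them might look like the following. The action names come from the announcement; the policy file name, the Sid, and the choice of an object-level resource ARN are assumptions made for illustration, since the announcement does not say which resource type the actions apply to.

```shell
# Write a hypothetical IAM policy document that allows reading and writing
# annotations on objects in the example bucket.
# Assumption: the annotation actions are authorized against object ARNs.
cat > annotation-policy.json << 'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAnnotationReadWrite",
      "Effect": "Allow",
      "Action": [
        "s3:PutObjectAnnotation",
        "s3:GetObjectAnnotation"
      ],
      "Resource": "arn:aws:s3:::my-media-bucket/*"
    }
  ]
}
EOF
```

The resulting file can then be attached to a role or user with the usual IAM tooling. Scoping the Resource to a key prefix such as videos/ would narrow the grant further.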

The mediainfo annotation stores structured technical specifications as JSON, while the ai_summary annotation stores a text description. Each annotation is identified by a unique name, and you can read and modify each one independently. Because each annotation has its own name, multiple enrichment workflows can run concurrently without interfering with each other: for example, one team can add technical metadata while another adds content classifications.
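When several annotations are prepared as files, one way to keep names distinct is to derive each annotation name from its file name. The sketch below is a dry run: it only prints the put-object-annotation commands it would run and sends nothing to AWS. The command and flags are copied from the example in this post; the function name plan_annotations and the directory layout are hypothetical, and the size check assumes 1 MB means 1,048,576 bytes, which the announcement does not specify.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: the 1 MB per-annotation limit is binary (1,048,576 bytes).
MAX_BYTES=1048576

# plan_annotations BUCKET KEY DIR
# Prints one put-object-annotation command per regular file in DIR, using the
# file name without its extension as the annotation name. Files larger than
# MAX_BYTES are skipped with a message on stderr.
plan_annotations() {
  local bucket="$1" key="$2" dir="$3"
  local file name size
  for file in "$dir"/*; do
    [ -f "$file" ] || continue
    name="$(basename "$file")"
    name="${name%.*}"
    size="$(wc -c < "$file" | tr -d ' ')"
    if [ "$size" -gt "$MAX_BYTES" ]; then
      echo "skip $name: $size bytes exceeds $MAX_BYTES" >&2
      continue
    fi
    echo "aws s3api put-object-annotation --bucket $bucket --key $key --annotation-name $name --annotation-payload $file"
  done
}
```

For example, calling plan_annotations my-media-bucket videos/documentary-2026.mp4 ./annotations on a directory holding mediainfo.json and ai_summary.txt would print one command for each file. The printed lines are not shell-quoted, so paths containing spaces would need quoting before being executed.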

Retrieve a specific annotation using the GetObjectAnnotation API:

    aws s3api get-object-annotation \
      --bucket my-media-bucket \
      --key videos/documentary-2026.mp4 \
      --annotation-name mediainfo \
      ./mediainfo-output.

Originally published at aws.amazon.com
