What's Happening On Screen? Audio Description for Videos

I'll be honest: before researching this post, audio description was the accessibility standard I knew the least about. I understood captions for deaf users - that's straightforward. But audio description? I knew it existed, but had never actually implemented it or really understood when it was necessary.

As we plan to create more video content - particularly interview-style videos and live demos - I realized I needed to understand this standard properly. And as a heavy podcast listener myself, I really appreciate being able to consume content while doing other things - driving, exercising, cooking. I don't always want to be glued to a screen. That preference helped me understand why audio description matters: it lets blind users experience video content the same way I experience podcasts - through audio alone.

Here's what I learned.

The Standard

WCAG 1.2.5 Audio Description (Prerecorded), Level AA - Audio description is provided for all prerecorded video content in synchronized media.

Audio description is narration that describes the visual elements of a video - actions, scene changes, on-screen text, facial expressions - anything important that's happening visually but isn't obvious from the audio alone. Think of it as the mirror image of captions: captions turn sound into text for people who can't hear it, and audio description turns visuals into speech for people who can't see them.

When Is Audio Description Actually Necessary?

This was my first big question, and the answer is that not every video needs audio description. It depends on how much of the meaning is carried visually:

Videos That Don't Need Audio Description

  • Talking heads with minimal visual content - If your video is primarily someone speaking to camera with no important visual information, audio description may not be needed
  • Podcasts with static imagery - Audio-only content with a still image doesn't require description
  • Videos where speakers describe what they're doing - If the presenter says "Now I'm clicking the Save button," you've already described the action

Videos That Do Need Audio Description

  • Screen recordings or demos - Where mouse movements, menu selections, and visual changes are critical
  • Videos with on-screen text - Titles, captions, bullet points, or other text that is never read aloud
  • Videos showing physical actions - Cooking demonstrations, craft tutorials, repair instructions
  • Videos with important facial expressions or body language - Where non-verbal communication conveys meaning
  • Videos with scene changes - Where location or context shifts matter to understanding

The key test: If someone listened to your video with their eyes closed, would they understand everything that's happening? If not, you need audio description.

How Audio Description Works

Audio description is an additional audio track that plays alongside your video. A narrator describes the visual elements during natural pauses in the dialogue. The description is timed to fit between the existing audio, providing context without talking over the main content.

For example, in a cooking video:

  • Main audio: "Next, we'll add the butter to the pan."
  • Audio description: "She places a tablespoon of butter in the center of the hot pan. It immediately begins to melt and bubble."
  • Main audio: "Once it's melted, we can add the garlic."

Two Types of Audio Description

Standard Audio Description

Fits descriptions into the natural pauses in your video's audio. This works well for content with frequent breaks in dialogue or narration.
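
One way to check whether a video has enough natural pauses is to look at when people are speaking. Here's a minimal sketch, assuming you already have the start and end time of each spoken passage in seconds (for example, pulled from a captions file). The function name find_description_gaps, the sample timings, and the 2-second minimum are my own illustrative choices, not part of WCAG or any tool mentioned here.

```python
from typing import List, Tuple

Cue = Tuple[float, float]  # (start_seconds, end_seconds) of one spoken passage


def find_description_gaps(dialogue: List[Cue], video_length: float,
                          min_gap: float = 2.0) -> List[Cue]:
    """Return the silent stretches that are at least `min_gap` seconds long."""
    gaps: List[Cue] = []
    cursor = 0.0  # end of the latest speech seen so far
    for start, end in sorted(dialogue):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # tolerate overlapping cues
    if video_length - cursor >= min_gap:
        gaps.append((cursor, video_length))
    return gaps


# Hypothetical timings for a 25-second clip.
dialogue = [(0.0, 4.0), (9.5, 13.0), (13.5, 20.0)]
print(find_description_gaps(dialogue, video_length=25.0))
# prints [(4.0, 9.5), (20.0, 25.0)]
```

If the list comes back short or empty for a video with a lot of visual detail, that's a hint that descriptions won't fit between the existing audio.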

Extended Audio Description

Pauses the video when necessary to provide complete descriptions. This is useful for videos with dense visual content and little downtime in the audio. WCAG treats it as a separate success criterion: 1.2.7 Extended Audio Description (Prerecorded), Level AAA. While not suitable for live broadcasts, it's ideal for educational content, detailed tutorials, or any pre-recorded video where comprehensive description is important.
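
A rough way to decide between the two types is to estimate how long a description takes to speak and compare that with the pause available. This is a sketch under stated assumptions: the 160 words-per-minute speaking rate is a placeholder I picked for illustration (measure your own narrator or synthetic voice), and both function names are hypothetical.

```python
def estimated_speech_seconds(text: str, words_per_minute: float = 160.0) -> float:
    """Estimate narration time from a word count (assumed speaking rate)."""
    return len(text.split()) * 60.0 / words_per_minute


def choose_description_type(text: str, gap_seconds: float,
                            words_per_minute: float = 160.0) -> str:
    """Return "standard" if the narration fits in the pause, else "extended"."""
    needed = estimated_speech_seconds(text, words_per_minute)
    return "standard" if needed <= gap_seconds else "extended"


description = ("She places a tablespoon of butter in the center of the hot pan. "
               "It immediately begins to melt and bubble.")

print(estimated_speech_seconds(description))       # 20 words at 160 wpm -> 7.5
print(choose_description_type(description, 5.5))   # extended
print(choose_description_type(description, 8.0))   # standard
```

A word count is only a proxy for speaking time, so treat the result as a prompt to listen to the actual recording rather than as a verdict.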

Tools and Services for Audio Description

Here's where I really had to do my homework. Audio description is more specialized than captioning, and the options vary widely:

Professional Services

Several companies specialize in audio description:

  • 3Play Media - Offers audio description alongside captioning services, uses AI voices with human review
  • Descriptive Video Works - Specializes exclusively in audio description with professional narrators, supports 22+ languages
  • Verbit - Provides both standard and extended audio description in English, Spanish, and French
  • Video Caption Corporation - In-house audio description from script writing to audio mixing

Professional services typically charge per minute of video and have turnaround times of 3-5 business days. Expect to pay more for human narrators versus AI voices, and more for extended description versus standard.

DIY and Community Options

  • YouDescribe - Free tool that lets anyone add audio descriptions to YouTube videos without modifying the original video. Great for community-contributed descriptions.
  • ViddyScribe - AI-powered tool for generating audio descriptions, designed for teams managing large video libraries

All of those are ways to offload the technical work to other services - but it turns out the mechanics of creating an audio description track using a VTT file are not difficult. A VTT (WebVTT) file is a timed text file that many video players use to show captions synced to video. Each entry contains a timestamp range, followed by the text to show. I've created these before. An audio description track is pretty much the same - you just mark it as a "descriptions" track (the kind value HTML uses) in a player like Able Player that supports audio descriptions, which can then have the text read aloud by a screen reader or browser speech synthesis. Here's a great blog post describing more.
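
To make the file format concrete, here's a small sketch that writes description cues out as WebVTT. The cue text is the cooking example from earlier in this post; the timings and the helper names format_timestamp and build_vtt are made up for illustration.

```python
from typing import List, Tuple


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the WebVTT hh:mm:ss.ttt form."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(cues: List[Tuple[float, float, str]]) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) cues."""
    blocks = ["WEBVTT"]  # required first line of every WebVTT file
    for start, end, text in cues:
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"  # a blank line separates cues


# Hypothetical timings for the cooking example.
cues = [
    (4.0, 7.5, "She places a tablespoon of butter in the center of the hot pan."),
    (7.5, 9.5, "It immediately begins to melt and bubble."),
]
vtt = build_vtt(cues)
print(vtt)
```

Saved as something like descriptions.vtt (a filename I'm assuming), the result can be referenced from an HTML track element with kind="descriptions".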

Planning Ahead: The Best Approach

The Harvard Digital Accessibility Services guide makes an excellent point: the best time to add audio description is during production, not post-production.

When planning video content:

  • Have speakers identify themselves on camera rather than relying on lower-third graphics
  • Have presenters describe what they're doing: "I'm clicking the File menu, then selecting Save As"
  • Read on-screen text aloud as part of your script
  • Describe visual elements as part of your narration

This approach - sometimes called "integrated described video" - reduces or eliminates the need for separate audio description tracks.

Audio Description and Drupal

This is where things get interesting - and where my research revealed both capabilities and gaps.

The Able Player Module

Drupal has the Able Player module, which wraps videos in an accessible HTML5 player. Able Player supports audio description tracks and can handle multiple caption languages. It's designed to work with WebVTT files for captions and can reference separate audio description tracks.

The player includes a button that lets users toggle between the standard version and the audio-described version of the video - a seamless user experience.

The Reality: Manual Work Required

While Drupal's media system makes it easy to upload videos and attach caption files, audio description is more manual. You'll need to:

  1. Create the audio description (via service or DIY)
  2. Mix it with your video or provide it as a separate track
  3. Upload both versions to your media library
  4. Configure the player to reference both

There's no "generate audio description" button in Drupal (unlike captions, where automated transcription services are common). Audio description requires human creativity and judgment to decide what's important to describe and how to describe it.
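
For step 4, this is roughly what the player markup needs to end up looking like. Treat it as a sketch based on my reading of Able Player's documentation, not on the Drupal module's settings: the data-able-player and data-desc-src attributes and the track kinds are the ones Able Player documents, but the helper function, its parameters, and the file names are hypothetical, and the module may generate this markup for you from its own fields.

```python
from html import escape
from typing import Optional


def able_player_markup(video_src: str,
                       captions_vtt: Optional[str] = None,
                       descriptions_vtt: Optional[str] = None,
                       described_src: Optional[str] = None,
                       element_id: str = "video1") -> str:
    """Render a <video> element of the kind Able Player enhances."""
    source = f'<source type="video/mp4" src="{escape(video_src)}"'
    if described_src:
        # Separately mixed, audio-described version of the same video.
        source += f' data-desc-src="{escape(described_src)}"'
    source += ">"

    lines = [
        f'<video id="{escape(element_id)}" data-able-player preload="auto">',
        "  " + source,
    ]
    if captions_vtt:
        lines.append(f'  <track kind="captions" src="{escape(captions_vtt)}" srclang="en">')
    if descriptions_vtt:
        # Text-based descriptions in a WebVTT file.
        lines.append(f'  <track kind="descriptions" src="{escape(descriptions_vtt)}" srclang="en">')
    lines.append("</video>")
    return "\n".join(lines)


markup = able_player_markup(
    "demo.mp4",
    captions_vtt="demo-captions.vtt",
    descriptions_vtt="demo-descriptions.vtt",
    described_src="demo-described.mp4",
)
print(markup)
```

You'd normally pick one route per video: a described video file if a service delivered a mixed version, or a descriptions track if you wrote the text yourself. Check the output against the current Able Player documentation before relying on it.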

What About Live Content?

This standard applies to pre-recorded content. WCAG has no success criterion that requires audio description for live streams, webinars, or live demos at any level. (The nearby WCAG 1.2.9 Audio-only (Live), Level AAA, is about text alternatives for live audio-only content, not description of live video.) However, if you're doing live events, you should at least consider:

  • Live captions (which are required at Level AA)
  • Describing your actions as you perform them
  • Recording the session and adding audio description to the recording afterward

Practical Next Steps

As we plan our video strategy, here's what I'm taking away:

  1. Script with description in mind - Build visual descriptions into our scripts from the start
  2. Evaluate each video - Not every video needs separate audio description if we describe as we go
  3. Budget for professional services - When we do need separate audio description, professional services provide the best quality
  4. Test the Able Player module - For videos that need audio description, ensure our video player supports it properly
  5. Provide transcripts - As an alternative, full text transcripts benefit everyone and can supplement audio description

The Bottom Line

Audio description is more complex than I initially realized - but that complexity comes from the creative work of deciding what to describe and how to describe it effectively. It's not something you can fully automate, because it requires judgment about what visual information matters for understanding.

The good news? If you're thoughtful about how you produce video content, you can dramatically reduce the need for post-production audio description. Build accessibility into your production process, and you'll create content that works better for everyone.

For those of us just starting to think seriously about video accessibility, the key takeaway is this: audio description exists to ensure blind and low-vision users get the same information from your videos that sighted users do. Whether you achieve that through integrated description during production or separate audio tracks afterward, the goal is the same - make sure everyone can access your content.

Advent 2025 - 24 days of accessibility
