Introduction

All videos should have closed captions, meaning captions that viewers can turn on or off. Captions are the text alternative to the audio component of a video. Captions differ from subtitles: subtitles display only the spoken dialogue and leave out audio elements such as sound effects. Captions convey the full experience of the video content, whereas providing only subtitles can omit important content from the video. This guide covers best practices for writing captions.

Basic Guidelines

  1. Captions need to match the timing of the audio while remaining on the screen long enough to be read. Between 1.5 and 6 seconds is best practice.
  2. There should be a maximum of two lines of captions on the screen at a time.
  3. Lines should not exceed about 30 characters.
  4. While attention should be paid to grammar conventions, it is important that the captions match what the speakers are saying. See Other Caption Considerations for some specific examples.
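
The numeric limits above lend themselves to an automated check. The following is a minimal Python sketch, not part of any captioning tool: the `Cue` record and the `check_cue` function are hypothetical names introduced here for illustration, and the 30-character limit is treated as a hard cutoff even though the guideline says "about 30."

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the Basic Guidelines list.
MIN_SECONDS = 1.5
MAX_SECONDS = 6.0
MAX_LINES = 2
MAX_CHARS_PER_LINE = 30  # the guideline says "about 30"; a hard cutoff is an assumption


@dataclass
class Cue:
    """Hypothetical caption cue: times in seconds, lines separated by newlines."""
    start: float
    end: float
    text: str


def check_cue(cue: Cue) -> List[str]:
    """Return a list of warnings for one cue; an empty list means no issues found."""
    warnings = []

    # Guideline 1: on screen between 1.5 and 6 seconds.
    duration = cue.end - cue.start
    if duration < MIN_SECONDS:
        warnings.append(f"on screen {duration:.2f}s, under {MIN_SECONDS}s")
    elif duration > MAX_SECONDS:
        warnings.append(f"on screen {duration:.2f}s, over {MAX_SECONDS}s")

    # Guideline 2: at most two lines at a time.
    lines = cue.text.split("\n")
    if len(lines) > MAX_LINES:
        warnings.append(f"{len(lines)} lines, more than {MAX_LINES}")

    # Guideline 3: each line no longer than about 30 characters.
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            warnings.append(f"line {number} has {len(line)} characters")

    return warnings
```

Calling `check_cue(Cue(0.0, 1.0, "a\nb\nc"))` would flag both the short duration and the third line. Guideline 4 (matching what the speakers say) still needs a human reviewer.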

When to Caption

The best rule of thumb is to caption everything that holds value for someone with full access to the soundtrack; this sometimes includes a lack of sound. The following list identifies moments when you will need to add captions.

  1. Important Sounds: scenes with multiple sounds (e.g. a conversation in a public place) can be chaotic.
    1. Caption the most prominent sound or the sound that offers the most value.
  2. Dialogue: speakers must always be captioned.
    1. Identify speakers and tones when they cannot be inferred.
    2. Caption verbal/oral bridges (e.g. “um” and “uh”) unless they hold no value (e.g. during a live speech).
  3. Sound Effects: sound effects almost always need to be captioned at every iteration.
    1. An exception would be a sound repeated at regular intervals (e.g. footsteps), especially when the sound can be inferred (e.g. the feet can be seen walking) and other sounds take precedence.
  4. Music/Background Noise: music and other noise are usually added to build a mood or tone and always need to be captioned.
  5. Lack of Sound: silence can be just as valuable as sound.
    1. Identify moments when the sound cuts or fades out.
    2. Identify moments when speakers are not heard (e.g. a character is moving their lips without speaking).
  6. Muffled/Distorted Sound: much like lack of sound, muffled and distorted sounds need to be identified.

What to Write

The hardest part of captioning is deciding what to write. Except for dialogue, the sounds of a video can be hard to describe effectively. The following list gives tips on what to write when creating captions.

  1. General Rule: Sound effects should be enclosed in square brackets [  ] to separate them from dialogue.
  2. Dialogue: Write exactly what the speakers are saying (see the note on verbal/oral bridges under Dialogue in When to Caption for an exception).
    1. When the speaker cannot be seen, use italics or write “off-screen” or “VO” (for Voice Over).
    2. Identify speakers by name or role (e.g. Man #1) when it is not visually clear who is speaking.
    3. Identify hard-to-hear dialogue as “unintelligible.”
  3. Sound Effects: Always name the object making the sound (e.g. “[engine revving]” or “[clock ticking]”), because for those with access to the soundtrack, the meaning comes from knowing what is making the sound.
    1. Avoid using descriptive onomatopoeias as they can be subjective or valueless.
  4. Music/Background Noise:
    1. Music: name the song and artist or the instrument, and identify it with music notes (the keyboard shortcut varies by program) (e.g. “♪ This is the greatest show ♪”). If the song or instrument is unclear, do your best to describe the tone or mood of the music (e.g. “somber music” or “eerie music”).
    2. Background Noise: Identify the noise (e.g. “[crowd cheering]” or “[birds chirping]”).
  5. Lack of Sound:
    1. Depending on the context, it may be appropriate to establish that a sound stopped (e.g. “[clapping stops]”) or to simply identify the sudden silence (e.g. “[silence]”). A combination of both might be appropriate.
    2. Always identify when a sound fades away slowly.
    3. Write “[mouths words]” or “[inaudible]” if a character is moving their lips without speaking audibly.
  6. Muffled/Distorted Sound: Identify the type of distortion (e.g. “[muffled]” or “[echoing]”). Identify when it fades or ends; you can also identify when the sound returns to “normal.”

Remember, if the audio element holds value for someone with access to the full audible soundtrack, it needs to be captioned.
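
The bracket and music-note conventions in this section can be applied consistently with a few small helpers. This is a minimal Python sketch; the function names are hypothetical, and the `Name: text` and `(off-screen)` layouts for speaker labels are assumptions made for illustration, since this guide does not prescribe an exact label format.

```python
def sound_effect(description: str) -> str:
    """Wrap a sound description in square brackets, e.g. [engine revving]."""
    return f"[{description.strip()}]"


def music(text: str) -> str:
    """Surround a lyric or song title with music notes."""
    return f"♪ {text.strip()} ♪"


def speaker_line(speaker: str, text: str, off_screen: bool = False) -> str:
    """Prefix dialogue with a speaker label when it is not visually clear who is speaking.

    The "Name: text" layout and the "(off-screen)" tag placement are
    assumptions for this sketch, not a format defined by the guide.
    """
    label = f"{speaker} (off-screen)" if off_screen else speaker
    return f"{label}: {text}"


# Example captions built from the guide's own examples.
examples = [
    sound_effect("clapping stops"),
    music("This is the greatest show"),
    speaker_line("Man #1", "Hello.", off_screen=True),
]
```

Helpers like these only keep the notation uniform; deciding which sounds hold value still rests with the captioner.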

Other Caption Considerations

  • While attention should be paid to grammar conventions, the captions should nearly always match what the speaker is saying verbatim.
    • For example, some speakers whose primary language is not American English might not be used to using plurals. Their captions should match their speech (e.g. “Thirty-five bird were affected by this event.”)
    • Some American English dialects differ from conventional grammar, and their captions should match their speech (e.g. “They was very invested in change.”)
    • One exception to matching exactly what the speaker is saying is for folks who pronounce the word “ask” as “ax.” Using “ax” could be contextually confusing, so we recommend writing it as “ask.”
  • For non-English languages, caption the actual words spoken. If it is not possible to caption the words, use a description (e.g. “[speaking French]”). Use accent marks, diacritical marks, and other indicators.
  • Indicate regional accents at the beginning of the first caption.
  • Keep the flavor of dialect and the speaker’s language.

Connect with the Instructional Accessibility Group

Improve your instructional accessibility through IAG live trainings, access checks for individual materials, or course reviews.

Have more questions or need additional assistance? Email the Instructional Accessibility Group.