Text-to-speech mismatch in SaaS videos

September 13, 2024
3-min read

Written by Victoria Rudi

Educating on SaaS messaging & team comms. Helping SaaS people with messaging across all touchpoints.
This doc explains what text-to-speech mismatch is and how it harms SaaS videos.

Text-to-speech mismatch happens when you use text written for reading as the basis for a video script or voiceover, without adapting it for spoken delivery.

It’s a common error in SaaS videos, such as product descriptions, updates, and demos. Teams take text meant for reading and use it as their narration scripts and voiceovers.

That’s a problem because text meant for reading is very different from text meant for listening.

Here’s a real-life video narration script:

We have done some new updates to improve bulk action in Data Shares, which include, number one, the option to ‘select all’ and ‘clear’ on all possible selections when choosing which Data Shares to copy from another workspace. Number two, the Data Warehouse and Excel list pages now have bulk action to delete, restore, and enable, disable.

Read it first. You may be able to understand it. However, chances are you won’t be able to make sense of it when hearing it.

Written language doesn’t translate well when spoken. That’s the effect of text-to-speech mismatch. Let’s analyze why this happens.

There’s a set of conditions at play when people read a text. Just think about it.

When reading a text, people:

  • May take their time. Usually, there’s nothing rushing them to understand the text.
  • Don’t have to move to the next sentence until they’ve understood the previous one.
  • Can always go back to previous sentences for repeated reading.
  • Have the space to process complicated ideas or terms.
  • Decide the pace at which they want to read the text.

That’s why text meant for reading may include:

  • Complex sentences that require some attention
  • Technical terms that need time to process
  • Extensive explanations that allow the reader to pause and re-read
  • Longer paragraphs that include detailed information
  • In-text references or citations

However, none of these elements translate well into spoken narration. Why is that the case?

When listening to a video, people:

  • Have to exert mental effort to keep up with the narration and understand quickly.
  • Have to move along with the speaker even if they didn’t understand the previous idea.
  • Can’t go back to the previous sentence to review it.
  • Have no time to sit with new ideas or terms to process them.
  • Are at the mercy of the speaker’s pace. They have to focus on the sentence that comes next.

People won’t enjoy your videos if you use text meant for reading as your narration scripts or voiceovers. Sentence complexity alone may hinder understanding, and viewers who can’t follow along are likely to disconnect from your videos.

To keep this from happening, never use text meant for reading as your video narration script or voiceover. Always adapt the text for listening. Even better, create your scripts and voiceovers from scratch, following these guidelines:

  • Break down overly long sentences.
  • Use simple sentence structure.
  • Replace technical jargon with simple words.
  • Reduce the information you deliver per sentence.
  • Mix short and long sentences to create a dynamic rhythm.
  • Add hooks, such as questions, to engage your viewers.
  • Use active voice.
  • Include intentional pauses to let key ideas sink in.
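
If you want a quick first check before recording, the first guideline can be partly automated. Here’s a minimal Python sketch that flags sentences that may be too long for listening. It rests on assumptions: the 20-word limit is a hypothetical threshold (this doc doesn’t prescribe a number), the function names are made up for illustration, and the sentence splitting is naive. Treat it as a rough helper, not a replacement for reading the script out loud.

```python
import re

# Hypothetical threshold: the 20-word limit is an assumption, not a rule.
MAX_WORDS_PER_SENTENCE = 20


def split_sentences(script: str) -> list[str]:
    """Naively split a script wherever ., ! or ? is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [part for part in parts if part]


def flag_long_sentences(
    script: str, max_words: int = MAX_WORDS_PER_SENTENCE
) -> list[tuple[int, str]]:
    """Return (word_count, sentence) for every sentence over the limit."""
    flagged = []
    for sentence in split_sentences(script):
        word_count = len(sentence.split())
        if word_count > max_words:
            flagged.append((word_count, sentence))
    return flagged


if __name__ == "__main__":
    draft = (
        "We improved bulk actions. "
        "You can now select or clear every Data Share at once."
    )
    # Print each sentence that exceeds the limit, with its word count.
    for count, sentence in flag_long_sentences(draft):
        print(f"{count} words: {sentence}")
```

Any sentence it flags is a candidate for breaking down. A script that passes the check may still be hard to follow, so keep applying the other guidelines by ear.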

It may take you some practice. But with time, you’ll learn to create easy-to-follow narration scripts and voiceovers for your SaaS videos.
