Google Cloud Speech-to-Text vs. IBM Watson Text to Speech

Overview
Google Cloud Speech-to-Text
Rating: Score 8.4 out of 10
Most Used By: N/A
Product Summary: Speech-to-Text on Google Cloud is a tool used to convert speech into text using an API powered by Google’s AI technologies. The vendor states users can transcribe content in real time or from stored files, deliver a better user experience in products through voice commands, and gain insights from customer interactions to improve service.
Starting Price: $0.02 per min

IBM Watson Text to Speech
Rating: Score 9.0 out of 10
Most Used By: N/A
Product Summary: IBM Watson Text to Speech is an API cloud service that enables users to convert written text into natural-sounding audio in a variety of languages and voices within an existing application or within Watson Assistant. It can be used to give a brand a voice and interact with users in their native language. It can also increase accessibility for users with different abilities, provide audio options to avoid distracted driving, or automate customer service interactions to eliminate hold times.
Starting Price: N/A
Pricing
Editions & Modules
Google Cloud Speech-to-Text
Speech-to-Text V2 API: $0.016 per min
Speech-to-Text V1 API: $0.024 per min
IBM Watson Text to Speech
No answers on this topic
Offerings
Pricing Offerings
Free Trial
Google Cloud Speech-to-Text: Yes
IBM Watson Text to Speech: No

Free/Freemium Version
Google Cloud Speech-to-Text: Yes
IBM Watson Text to Speech: No

Premium Consulting/Integration Services
Google Cloud Speech-to-Text: No
IBM Watson Text to Speech: No

Entry-level Setup Fee
Google Cloud Speech-to-Text: No setup fee
IBM Watson Text to Speech: No setup fee

Additional Details
Speech-to-Text V1 API: V1 offers data residency for multi region only. Models include short, long, phone call, and video. V1 does not include audit logging. New customers get $300 in free credits and 60 minutes for transcribing and analyzing audio free per month, not charged against their credits.
Speech-to-Text V2 API: V2 offers data residency for multi and single region. Models include short, long, telephony, video, and Chirp. V2 includes audit logging and support for customer-managed encryption keys.
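To make the per-minute rates above concrete, here is a minimal sketch of a monthly cost estimate. It rests on several assumptions: the function name `monthly_cost` and its parameters are hypothetical, the 60 free minutes are simply subtracted before billing, and the same allowance is applied to both API versions even though the listing mentions it only under V1. It ignores the $300 credit, billing increments, and any volume discounts, so treat the result as a rough figure and not a quote.

```python
# Per-minute list prices taken from the pricing section above (USD).
RATE_PER_MIN = {"v1": 0.024, "v2": 0.016}

# Assumption: 60 free minutes per month are deducted before billing.
# The listing mentions this allowance under V1; applying it to V2 is a guess.
FREE_MINUTES_PER_MONTH = 60


def monthly_cost(minutes, api, free_minutes=FREE_MINUTES_PER_MONTH):
    """Roughly estimate one month's transcription cost in USD."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Only minutes beyond the free allowance are billed.
    billable = max(0.0, minutes - free_minutes)
    return round(billable * RATE_PER_MIN[api], 2)


if __name__ == "__main__":
    # Compare both API versions for 1,000 minutes of audio in a month.
    for api in ("v1", "v2"):
        print(api, monthly_cost(1000, api))
```

Passing `free_minutes=0` gives the cost without any free allowance, which is the safer estimate if the allowance does not apply to a given account.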
Community Pulse
Google Cloud Speech-to-Text vs. IBM Watson Text to Speech
Top Pros
No answers on this topic

Top Cons
No answers on this topic

Best Alternatives
Small Businesses
Google Cloud Speech-to-Text: RingCentral Contact Center (Score 7.9 out of 10)
IBM Watson Text to Speech: No answers on this topic

Medium-sized Companies
Google Cloud Speech-to-Text: Zoom Contact Center (Score 9.2 out of 10)
IBM Watson Text to Speech: No answers on this topic

Enterprises
Google Cloud Speech-to-Text: Verint Speech Analytics (Score 8.9 out of 10)
IBM Watson Text to Speech: No answers on this topic

User Ratings
Likelihood to Recommend
Google Cloud Speech-to-Text: 8.0 (20 ratings)
IBM Watson Text to Speech: 9.1 (7 ratings)

Usability
Google Cloud Speech-to-Text: 7.3 (1 rating)
IBM Watson Text to Speech: - (0 ratings)
User Testimonials
Google Cloud Speech-to-Text vs. IBM Watson Text to Speech
Likelihood to Recommend
Google
Google Cloud Speech-to-Text is best suited to working on live calls and transcribing interviews, meetings, customer service calls, and other audio or video recordings into text. This helps create searchable archives, generate meeting minutes, and improve accessibility for individuals with hearing impairments. The service can provide real-time captioning for live events, webinars, broadcasts, and presentations. This enhances accessibility for individuals who are deaf or hard of hearing and for those viewing content in noisy environments or without sound. It does not work well where internet bandwidth is poor; it requires a very strong internet connection. It also struggles with strong accents, especially in Mandarin.
IBM
I recommend Watson for all scenarios where you need AI to speak. I don't use it for this purpose, but I suppose it could be a good solution for tourism and travel: companies in the hospitality industry can make it easier for people to get around and offer tours in numerous languages, all at the same time. In telecommunications, it could be used to create customized messaging that the caller can use with customers, and it can generate words from a customer’s records that are read to them in a professional and friendly voice. It's very good for English; for Italian it is a little "robotic," but the pronunciation is right.
Pros
Google
  • An amazing tool that helps a lot in meetings.
  • It's an efficient tool that saves a lot of time typing. It saves at least 40-50% of our time, thus increasing efficiency.
  • Incredible accuracy with multiple accents and multiple languages.
  • It takes punctuation into consideration.
IBM
  • Improve customer experience and engagement.
  • Offers both on-premise and cloud deployments.
  • Automate requests and transactions in our agency.
  • Voice recognition.
  • Allows me to choose a dialect, which helps select the accent of the selected speaker.
Cons
Google
  • The software does occasionally get confused by confusing terminology.
  • Its web-based interface can also feel a tad hard to use compared to more appealing desktop apps.
  • I've experienced the occasional technical issue, though the provider's support team is quick to troubleshoot.
IBM
  • Add sentiment variation.
  • Add a parameter to change the characteristics of the voice.
  • Use a custom voice to tune the speaker.
Usability
Google
I can share insights with stakeholders in record time. And robust API connections let me pipe text into my CRM, marketing automation, and other mission-critical systems.
IBM
No answers on this topic
Alternatives Considered
Google
The accuracy of Google Cloud Speech-to-Text is much better than that of any other tool. It has better API integration with third-party tools. Transcription runs in real time with the best efficiency. It has good support for languages from across the globe. It provides better noise robustness compared to other tools.
IBM
I've never been a fan of the robotic voice of text-to-speech tools, and I've always thought they sounded flat and bland. Until I tried IBM Watson Text to Speech, that is. This service uses advanced deep learning techniques to synthesize speech output in natural-sounding voices. I was so blown away by how good the voices were that I've been using it for personal stuff like journaling and brainstorming sessions. And I'm definitely going to be using this for podcasting and my YouTube channel.
Return on Investment
Google
  • Automating the transcription process saved time and resources compared to manual transcription.
  • Speech-to-text enabled us to make audio content accessible to a wider audience, including individuals with disabilities.
  • We gained valuable insights into customer preferences, behaviors, and sentiment by analyzing voice data.
IBM
  • Enhance rapid and convenient customer interaction.
  • Client issues are easy to solve by allocating critical info in their native language.
  • Automating requests and transactions reduces hold time, which improves customer satisfaction.
Screenshots

Google Cloud Speech-to-Text Screenshots

Screenshot of audio transcription creation - Creating an audio transcription with the Speech-to-Text API from within the Cloud Console takes just a few steps. It can transcribe short, long, and streaming audio.
Screenshot of creating subtitles for videos using AI - Transcriptions with captions and subtitles can be added to existing content or in real time to streaming content. Google's video transcription model can be used for indexing or subtitling video and/or multispeaker content and uses machine learning technology similar to what YouTube uses for video captioning.
Screenshot of adding Speech-to-Text to apps - The video pictured covers how to add AI to an application without extensive machine learning model experience. The pretrained Speech-to-Text API lets users enable AI for applications.
Screenshot of Language, speech, text, and translation with Google Cloud API - The picture displays a section of a Google training course, where learners use the Speech-to-Text API to transcribe an audio file into a text file, translate with the Google Cloud Translation API, and create synthetic speech with Natural Language AI.