Speech-to-Text on Google Cloud is a tool used to convert speech into text using an API powered by Google’s AI technologies. The vendor states users can transcribe content in real time or from stored files, deliver a better user experience in products through voice commands, and gain insights from customer interactions to improve service.
$0.02
per min
Vertex AI
Score 8.6 out of 10
N/A
Vertex AI on Google Cloud is an MLOps solution, used to build, deploy, and scale machine learning (ML) models with fully managed ML tools for any use case.
Starting at $0
Pricing
Editions & Modules

Google Cloud Speech-to-Text
Speech-to-Text V2 API: $0.016 per min
Speech-to-Text V1 API: $0.024 per min

Vertex AI
Imagen model for image generation: starting at $0.0001
Text, chat, and code generation: $0.0001 per 1,000 characters
Text data upload, training, deployment, prediction: $0.05 per hour
Video data training and prediction: $0.462 per node hour
Image data training, deployment, and prediction: $1.375 per node hour
Offerings
Pricing Offerings (Google Cloud Speech-to-Text / Vertex AI)
Free Trial: Yes / Yes
Free/Freemium Version: Yes / Yes
Premium Consulting/Integration Services: No / No
Entry-level Setup Fee: No setup fee / Optional
Additional Details
Speech-to-Text V1 API
V1 offers data residency for multi-region only. Models include short, long, phone call, and video. V1 does not include audit logging. New customers get $300 in free credits, plus 60 free minutes per month for transcribing and analyzing audio, which are not charged against your credits.
Speech-to-Text V2 API
V2 offers data residency for multi-region and single-region. Models include short, long, telephony, video, and Chirp. V2 includes audit logging and support for customer-managed encryption keys.
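To make the per-minute rates above concrete, here is a minimal sketch of a monthly cost estimate. It assumes a flat per-minute rate with no volume tiers or billing-increment rounding, and it applies the 60 free minutes only to V1, since that is the only free allowance stated here. The function name `estimate_monthly_cost` and the 1,000-minute workload are hypothetical, chosen only for illustration.

```python
# Listed rates from the pricing section (USD per minute).
V1_RATE = 0.024
V2_RATE = 0.016
# Free monthly allowance stated for V1; none is stated for V2 here.
V1_FREE_MINUTES = 60


def estimate_monthly_cost(minutes: float, rate_per_min: float,
                          free_minutes: float = 0.0) -> float:
    """Estimate a monthly bill, assuming a flat rate and no volume tiers."""
    billable = max(0.0, minutes - free_minutes)
    return round(billable * rate_per_min, 2)


# Hypothetical workload: 1,000 minutes of audio in one month.
v1_cost = estimate_monthly_cost(1000, V1_RATE, V1_FREE_MINUTES)
v2_cost = estimate_monthly_cost(1000, V2_RATE)
print(f"V1: ${v1_cost:.2f}  V2: ${v2_cost:.2f}")
```

Actual invoices may differ, because real billing can depend on the model, the region, and usage tiers not covered by this listing.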
Pricing is based on the Vertex AI tools and services, storage, compute, and Google Cloud resources used.
Features
Google Cloud Speech-to-Text
Vertex AI
AI Development
Real-time meeting notes for smaller group audiences. Strong language coverage of more than 125 languages. Handles mobile phone recordings and environmental noise effectively. Fast transcription turnaround, plus support for phrases, which improves recognition of industry-specific terminology. Generates QA/compliance audit logs. Also builds sentences with accurate punctuation and sentence boundaries. It has extensive global support centers whose primary focus is resolving customer issues and helping multinational engineering teams build great products.
We used Vertex AI in our automation process. The model is very useful and works as expected. We implemented it in our monitoring phase, where it is very helpful for our analysis. The real-time response is very effective and provides a detailed overview of our products, so it is well suited to our organization. It is not a good fit for small projects, which do not need a model like this, and it is not useful without ML-related resources. It is also not suitable for organizations that are strictly non-cloud (on-prem).
Vertex AI comes with support for lots of LLMs out of the box
MLOps tools are available that help to standardize operational aspects
Document AI is an out-of-the-box feature that works perfectly for our use cases of automating lots of tedious data extraction tasks from images as well as papers
Integration outside of the Google ecosystem is challenging.
Google Cloud Speech-to-Text works only with an active internet connection. If internet bandwidth is low, it affects the transcription process and can lead to inaccurate data.
The pricing is also at the higher end, which not all companies can afford. Small-scale organisations that would like to use the tool will weigh the price when making the decision. Reducing the price could increase product usage.
The reasoning behind my 10 is that the UI is very intuitive; I didn't require any formal training to use it. Google's speech-to-text is not just a conversion tool; it helps automate mundane tasks, saves time, and has an almost human-like understanding.
Google is always top notch with their security and user interface performance. We use Google's entire suite in our business anyways, so using Vertex became second nature very quickly. I will say, though, that Google does need to come down on the price somewhat with their token allocation. Also, their UI is very robust, so it does require some time for training to really master it.
Google Cloud Speech-to-Text outperformed its competitors significantly in terms of accuracy, surpassing any other product available. Additionally, its support for multiple languages was unrivaled in the market. Moreover, for clients with robust bandwidth, Google Cloud Speech-to-Text offered real-time transcription capabilities, enabling users to transcribe live audio streams with minimal delay.
We tend to adapt and use the platform that best suits the customer's needs. We return to Vertex AI because it is the most in-depth option out there, so we can configure it any way they want. However, it is not quick to market, and its feature set is constantly changing or being updated. This makes it suitable for bigger customers that have the capital and time to spend on a larger, well-researched project that is not quick to market, unlike some of the other options that feel like a light version of this.