Google BigQuery for analyzing large ML datasets using SQL
Updated December 20, 2019
Score 8 out of 10
Vetted Review
Verified User
Overall Satisfaction with Google BigQuery
Google BigQuery is used company-wide as a data warehouse and ML analytics engine, specifically for consumer behavioral analytics, with data streams coming from the website as well as from internal data sets.
Pros
- It is easy to create and then execute machine learning models in BigQuery with plain SQL queries through BigQuery ML. Everyone knows SQL.
- Google BigQuery is fully serverless and cloud-based, and can be up and running in a few hours without any specific coding or integration if your data is already in Google Cloud Storage.
- Google BigQuery executes SQL statements very fast and can be used for real-time analytics, especially if you use Google infrastructure (GCP).
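To illustrate the first point above, here is a minimal Python sketch that builds the two SQL statements a BigQuery ML workflow typically needs: a `CREATE MODEL` statement for training and an `ML.PREDICT` query for scoring. The dataset, table, model, and column names (`analytics.sessions`, `analytics.purchase_model`, `purchased`) are hypothetical and not taken from this review. The helper functions are illustrative only and are not part of any Google library. The sketch only produces SQL text, which you would then run in the BigQuery console or through a client of your choice.

```python
def build_create_model_sql(model_name, label_column, source_table,
                           model_type="logistic_reg"):
    """Return a BigQuery ML CREATE MODEL statement as a string.

    All names passed in are placeholders chosen by the caller.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def build_predict_sql(model_name, source_table):
    """Return an ML.PREDICT query that scores rows with a trained model."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model_name}`, "
        f"(SELECT * FROM `{source_table}`))"
    )


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(build_create_model_sql(
        "analytics.purchase_model", "purchased", "analytics.sessions"))
    print(build_predict_sql(
        "analytics.purchase_model", "analytics.new_sessions"))
```

The point of the sketch is that both training and prediction are ordinary SQL statements, which is why anyone comfortable with SQL can get started without a separate ML toolchain.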
Cons
- Google BigQuery is great for large data sets where you need a familiar SQL interface, but it is still slower than running the same SQL query on an RDBMS, assuming your data is mostly structured.
- It is expensive if a lot of data has to be queried every time the query runs, because of the usage-based pricing metrics used in Google BigQuery.
- Some SQL operations, such as table joins, are not optimized and can be slow compared to a full database.
Return on Investment
- Being a serverless, fully managed cloud service, Google BigQuery has a positive impact on the business in terms of the setup time and deployment resources needed to analyze a data set.
- Positive impact on ROI due to the reduction in CapEx and OpEx needed to provision a data warehouse upfront.
- Positive impact on ROI due to the improvement in the speed of analyzing consumer data in real time with Google BigQuery, which makes it possible to proactively take action when needed based on the results.
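The cost concern listed under Cons can be made concrete with a rough back-of-the-envelope calculation. The Python sketch below assumes on-demand pricing that scales with the bytes a query scans. The price per TiB is a parameter you must supply from the current price list, and the figures in the example (2 TiB scanned per run, 5.0 USD per TiB, 30 runs per month) are hypothetical and not taken from this review.

```python
BYTES_PER_TIB = 2 ** 40


def estimate_query_cost(bytes_scanned, price_per_tib_usd):
    """Estimate the cost of one query from the bytes it scans.

    price_per_tib_usd is an assumption supplied by the caller;
    check the current price list for the real figure.
    """
    return (bytes_scanned / BYTES_PER_TIB) * price_per_tib_usd


def estimate_monthly_cost(bytes_scanned, price_per_tib_usd, runs_per_month):
    """Estimate the monthly cost of a query that rescans the same data each run."""
    return estimate_query_cost(bytes_scanned, price_per_tib_usd) * runs_per_month


if __name__ == "__main__":
    # Hypothetical workload: 2 TiB scanned per run, 30 runs per month.
    per_run = estimate_query_cost(2 * BYTES_PER_TIB, 5.0)
    per_month = estimate_monthly_cost(2 * BYTES_PER_TIB, 5.0, 30)
    print(f"per run: ${per_run:.2f}, per month: ${per_month:.2f}")
```

The number that matters is bytes scanned per run, not table size, so selecting only the columns you need or partitioning the table should lower the estimate. If you use the `google-cloud-bigquery` Python client, a dry run (`QueryJobConfig(dry_run=True)`) reports `total_bytes_processed` without executing the query, which gives a realistic input for this calculation.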
Google BigQuery needs minimal setup to get up and running, while Amazon Redshift and Oracle Analytics Cloud need moderate expertise and time to load a data set and run a query. Hadoop (open source) and its commercial version, Cloudera, do not provide a full out-of-the-box solution for data warehousing and need additional components and installs. Databricks is a smaller vendor and does not come into the picture if you are already an Oracle or a Google shop (that is, using their cloud, database, and so on).
Do you think Google BigQuery delivers good value for the price?
Yes
Are you happy with Google BigQuery's feature set?
Yes
Did Google BigQuery live up to sales and marketing promises?
Yes
Did implementation of Google BigQuery go as expected?
Yes
Would you buy Google BigQuery again?
Yes