Cloudera Data Platform (CDP), launched September 2019, is designed to combine the best of Hortonworks and Cloudera technologies to deliver an enterprise data cloud. CDP includes the Cloudera Data Warehouse and machine learning services as well as a Data Hub service for building custom business applications.
$0.04 per CCU (hourly rate)
HashiCorp Terraform
Score 8.8 out of 10
N/A
Terraform from HashiCorp is a cloud infrastructure automation tool that enables users to create, change, and improve production infrastructure, and it allows infrastructure to be expressed as code. It codifies APIs into declarative configuration files that can be shared among team members, treated as code, edited, reviewed, and versioned. It is available as open source and in Cloud and Self-Hosted editions.
$0
Pricing
Editions & Modules

Cloudera Data Platform
CDP Public Cloud - Data Hub: $0.04 per CCU (hourly rate)
CDP Public Cloud - Data Warehouse: $0.054 per CCU (hourly rate)
CDP Public Cloud - Data Engineering: $0.07 per CCU (hourly rate)
CDP Public Cloud - Operational Database: $0.08 per CCU (hourly rate)
CDP Public Cloud - Flow Management: $0.15 per CCU (hourly rate)
CDP Public Cloud - Machine Learning: $0.17 per CCU (hourly rate)
CDP Private Cloud - Plus Edition: $400 CCU (annual subscription)
CDP Private Cloud - Base Edition: $10,000.00 node + variable (annual subscription)

HashiCorp Terraform
Open Source: $0
Team & Governance: $20 per user per month
Enterprise: Contact sales team
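
As a rough illustration of how the per-CCU hourly rates listed above turn into a spend figure, the sketch below multiplies a rate by a number of CCUs and a number of hours. The rates are the list prices shown on this page; the `estimate_cost` helper, the CCU counts, and the running hours are hypothetical, and the simple rate x CCUs x hours model is an assumption that may not cover everything that appears on a real bill.

```python
# List prices per CCU-hour, copied from the CDP Public Cloud pricing above.
CDP_PUBLIC_CLOUD_RATES = {
    "Data Hub": 0.04,
    "Data Warehouse": 0.054,
    "Data Engineering": 0.07,
    "Operational Database": 0.08,
    "Flow Management": 0.15,
    "Machine Learning": 0.17,
}


def estimate_cost(service: str, ccus: float, hours: float) -> float:
    """Estimate spend as rate x CCUs x hours (assumed, simplified model)."""
    if ccus < 0 or hours < 0:
        raise ValueError("ccus and hours must be non-negative")
    rate = CDP_PUBLIC_CLOUD_RATES[service]  # KeyError for unknown services
    return round(rate * ccus * hours, 2)


if __name__ == "__main__":
    # Hypothetical workload: 100 CCUs of Data Hub running for 10 hours.
    print(estimate_cost("Data Hub", ccus=100, hours=10))
```

The same function can be used to compare services for a fixed workload, which makes the spread between the cheapest and most expensive rate easy to see.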
Offerings
Pricing Offerings (values listed as Cloudera Data Platform / HashiCorp Terraform)
Free Trial: No / No
Free/Freemium Version: No / Yes
Premium Consulting/Integration Services: No / No
Entry-level Setup Fee: No setup fee / No setup fee
Additional Details: none / none
Features
Configuration Management
[Chart: comparison of Configuration Management features of Cloudera Data Platform and HashiCorp Terraform]
I have seen that Cloudera Data Platform is well suited for large batch processes. It works really well for our indication analyses that are performed by the actuaries. I feel that rapid streaming operations may be a situation where additional technology would be needed to provide for a robust solution.
Anything that needs to be repeated en masse. Terraform is great at taking a template and repeating it across your estate. You can dynamically change the assets being generated depending on certain variables. This means that although templated assets will all be similar, each can have its own unique properties. For example, flattening JSON into tabular data while ensuring the flattening code is unique to each file's schema.
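
The idea of one template repeated many times, with each copy allowed its own properties, can be sketched in a few lines. This is a conceptual illustration in Python, not Terraform configuration: `BASE_TEMPLATE`, `render_assets`, and the asset names are all hypothetical.

```python
from copy import deepcopy

# Hypothetical base template shared by every generated asset.
BASE_TEMPLATE = {
    "type": "storage_bucket",
    "versioning": True,
    "tags": {"managed_by": "template"},
}


def render_assets(template: dict, instances: dict) -> dict:
    """Stamp out one asset per instance, applying per-instance overrides."""
    assets = {}
    for name, overrides in instances.items():
        asset = deepcopy(template)  # start from the shared template
        # Merge tags so an override adds to the template tags, not replaces them.
        tags = {**asset.get("tags", {}), **overrides.get("tags", {})}
        asset.update(overrides)     # instance-specific properties win
        asset["tags"] = tags
        asset["name"] = name
        assets[name] = asset
    return assets


if __name__ == "__main__":
    assets = render_assets(
        BASE_TEMPLATE,
        {
            "logs": {"versioning": False},
            "data": {"tags": {"team": "analytics"}},
        },
    )
    for asset in assets.values():
        print(asset)
```

Every asset keeps the template's shape, while the overrides dictionary plays the role of the variables that make each one unique.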
The language itself is a bit unusual and this makes it hard for new users to get onboarded into the codebase. While it's improving with later releases, basic concepts like "map an array of options into a set of configurations" or "apply this logic if a variable is specified" are possible but unnecessarily cumbersome.
The 'terraform plan' operation could be substantially more sophisticated. There are many situations where a Terraform file could never work but successfully passes the 'plan' phase only to fail during the 'apply' phase.
Environment migrations could be smoother. Renaming/refactoring files is a challenge because of the need to use 'terraform state mv' commands, etc.
I love Terraform and I think it has done some great things for people who are working to automate their provisioning processes and also for those who are in the process of moving to the cloud or managing cloud resources. There are some quirks to HCL that take a little bit of getting used to and give Terraform a bit of a learning curve, thus the rating.
Terraform's performance is quite amazing when it comes to deploying resources in AWS. Of course, deployment times depend on parameters such as the number of resources and the regions being deployed to, which Terraform cannot control. The only minor drawback shows up when a Terraform job is terminated midway; in many cases, time-consuming manual cleanup is then required.
We have utilized Cloudera support quite frequently and are very satisfied with the capability and responsiveness of that team. Often, the new features delivered with the platform give us an opportunity to mature the way we're doing things, and the support team have been valuable in developing those new patterns.
I have yet to have an opportunity to reach out directly to HashiCorp for support on Terraform. However, I have spent a great deal of time considering their documentation as I use the tool. This opinion is based solely on that. I find the Terraform documentation to have great breadth but lacking in depth in many areas. I appreciate that all of the tool's resources have an entry in the docs but often the examples are lacking. Often, the examples provided are very basic and prompt additional exploration. Also, the links in the documentation often link back to the same page where one might expect to be linked to a different source with additional information.
IBM's Cloud Pak for Data offering has been a moving target and difficult to compare to Cloudera Data Platform. We have implemented our solution on Amazon Web Services, which appears to be supported by IBM at this point, but the migration would be very expensive for us to undertake.
Terraform is the solid leader in the space. It allows you to do more than just provisioning within pre-existing servers. It is more extensible and has more providers available than its competitors. It is also open source and more widely adopted by the community than some of the other solutions available in the marketplace.
We are able to deploy our infrastructure in a couple of hours in an automated and repeatable way; before, this could take weeks if the work was done manually, and it was very error-prone.
With the state file, you can see a diff of what has been changed manually outside of Terraform, which is a huge plus.
If the state file gets corrupted, it is very hard to debug or restore it without an impact or without spending hours.
Writing large-scale code can be very challenging, and it is hard to do efficiently in a way that is usable by the whole team.
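
The state-file benefit mentioned above, seeing what was changed manually outside the tool, comes down to comparing a recorded view of a resource with its actual attributes. The sketch below illustrates that comparison conceptually; it is not Terraform's implementation or its state file format, and `diff_state` and the sample attributes are hypothetical.

```python
def diff_state(recorded: dict, actual: dict) -> dict:
    """Return the attributes whose recorded and actual values differ."""
    drift = {}
    for key in sorted(set(recorded) | set(actual)):
        before = recorded.get(key)  # None if the attribute was never recorded
        after = actual.get(key)     # None if the attribute no longer exists
        if before != after:
            drift[key] = {"recorded": before, "actual": after}
    return drift


if __name__ == "__main__":
    recorded = {"instance_size": "small", "port": 80}
    actual = {"instance_size": "large", "port": 80, "public": True}
    for attribute, change in diff_state(recorded, actual).items():
        print(attribute, change)
```

An empty result means the recorded state and the real resource agree; any entry points at a change made outside the managed workflow.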