Red Hat Ceph Storage is the most cost-effective and resilient storage solution when operating at petabyte scale!
September 09, 2016

Colby Shores | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Overall Satisfaction with Red Hat Ceph Storage

We were planning to use Ceph storage at one point as a replacement for our NetApp. We had the equipment on hand to make it work, but in our experimentation it wasn't quite the fit we were looking for. We were looking for a highly resilient storage medium to hold our production data and eventually the VMs themselves.
  • Highly resilient: almost every time we attempted to destroy the cluster, it was able to recover from the failure. It struggled to recover when the nodes were down to about 30% (3 replicas on 10 nodes).
  • The cache tiering feature of Ceph is especially nice. We attached solid state disks and assigned them as the cache tier. Our sio benchmarks beat those of our NetApp, which we benchmarked years ago (no traffic, clean disks), by a very wide margin.
  • Ceph effectively allows the admin to control the entire stack from top to bottom instead of being tied to any one storage vendor. The cluster can be decentralized and replicated across data centers if necessary. We didn't try that feature ourselves, but it gave us some ideas for a disaster recovery solution. We really liked the idea that, since we control both the hardware and the software, we have effectively unlimited upgradability with off-the-shelf parts, which is exactly what Ceph was built for.
  • Ceph was very difficult to set up when we used it. One had to be very careful in assigning CRUSH maps and cache tiering to get it to work right; otherwise performance would suffer and data would not be distributed evenly. Judging from the .96 version I ran, it really is intended for massive data centers in the petabyte range. Beyond that, the command line arguments for ceph-deploy and ceph are very involved. I would strongly recommend this as a back end for OpenStack, run by a dedicated Linux-savvy storage engineer. Red Hat also said they are working to turn Calamari into a full-featured front end for managing OSD nodes, which should make this much easier to manage in the future.
  • It should not be run on VMs, since it is not optimized for a VM kernel. This advice comes directly from Red Hat. Unfortunately, this means that smaller use cases are out of the question, since it literally requires 10 physical machines, each with its own OS, to serve as individual OSD nodes.
  • I believe this is an issue with the OSDs and not the monitors, which ran fine for us in a virtual machine environment.
  • We were looking at using this as an NFS work-alike and encountered a couple of issues in our experiments. The MDS server struggled to mount the CephFS file system on more than a few systems without seizing up. This isn't a huge concern when Ceph is used as a back end for OpenStack; however, using it as shared storage for production data on a web cluster proved problematic for us. We also would have liked NFS access to the Ceph monitors so we could attach this to VMware to store our VMDKs, since VMware does not support mounting CephFS. When we spoke with VMware about 7 months ago, they said NFS support is in the pipeline, which will address all of these concerns.
  • CephFS was unable to handle several mounts at the same time. We will revisit NFS capabilities once available.
  • We gained quite a bit of experience with Ceph, and we have a cluster on hand if our storage vendor doesn't pan out at any time in the future.
  • It had a negative impact in the time it took us to set up and test the cluster. As I explained earlier, it was quite difficult to set up for experimentation. That said, we now have a very broad understanding of Ceph for our future products.
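
The "3 replicas on 10 nodes" figure above can be turned into rough capacity arithmetic. The following is a minimal sketch, not something from our actual cluster: the function names and the 12 TB per-node figure are hypothetical, and it assumes equal-sized nodes, one replica per node, and no allowance for Ceph's own full-ratio safety margins. It also assumes the "30%" figure means that only about 3 of the 10 nodes were still up.

```python
def usable_capacity_tb(nodes, raw_tb_per_node, replicas):
    """Usable capacity when every object is stored `replicas` times."""
    return nodes * raw_tb_per_node / replicas


def max_fill_fraction_after_failures(total_nodes, surviving_nodes, replicas):
    """Largest fraction of the original usable capacity that can still be
    kept fully replicated on the surviving nodes.

    Assumes (hypothetically) equal-sized nodes and at most one replica
    per node, so fewer surviving nodes than replicas means full
    replication is impossible.
    """
    if surviving_nodes < replicas:
        return 0.0
    # Surviving raw capacity / replicas, divided by original raw / replicas.
    return surviving_nodes / total_nodes


# Hypothetical example: 10 nodes with 12 TB each and 3 replicas.
print(usable_capacity_tb(10, 12, 3))
print(max_fill_fraction_after_failures(10, 3, 3))
```

Under these assumptions, a 10-node cluster with 3 replicas that is down to 3 nodes can only hold a full set of replicas if it was no more than about 30% full beforehand, which may be one reason recovery becomes hard at that point.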
Red Hat Ceph Storage is most comparable to VMware Virtual SAN, which we currently use in production. It had about the same default resiliency, although we had far more customization options with Ceph, albeit ones that were more difficult to configure. VMware Virtual SAN is so expensive that it was worth it for us to explore Ceph as an alternative. Both had the similar drawbacks of being best paired with their preferred platforms (OpenStack as opposed to VMware ESXi) and lacking NFS access.

NetApp was less performant than our Ceph cluster with cache tiering and far more proprietary; however, it does have NFS support, which is crucial when it is used as a storage back end for VMware. We also found our NetApp to be far less resilient, and it requires regular maintenance.

We found that, for our medium-sized business, Nutanix had the upgradability, NFS support, and performance, along with much of the resiliency of Ceph, so we decided to go in that direction. If our company were operating at petabyte scale, however, Ceph would be hands down the best solution available!
It is absolutely, hands down, the best storage solution for OpenStack. I would even argue it is the only solution if a company is operating at petabyte scale and needs resiliency. The storage solution allows any organization to scale its environment using commodity hardware from top to bottom. It has a battle-tested track record: it is even being used as the data storage back end for the Large Hadron Collider at CERN.