Kubernetes, a good cluster management system to place bets on
February 04, 2019

Nitin Pasumarthy | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Overall Satisfaction with Kubernetes

Kubernetes is currently used at LinkedIn as an experimental product for building and managing Machine Learning (ML) pipelines. A small number of teams currently use it to access GPU clusters. With its robust CLI, Kubernetes makes it simple to deploy training and monitoring workloads on clusters. The learning curve is very small because it is mainly driven by config files.
  • Complex cluster management can be done with simple commands, backed by strong authentication and authorization schemes
  • Exhaustive documentation and an open community smooth the learning process
  • As a user, a few concepts such as pod, deployment and service are enough to go a long way
  • We had several problems with its NFS, which is responsible for syncing the code across the cluster
  • On several occasions pods went into the UNKNOWN state, in which case restarting the entire node was the only solution
  • As a user of the existing setup given to me, I wasn't able to allocate only some of the CPU cores on a single host. It was either all or zero, making cluster utilization sub-optimal
  • It enabled us to move faster with our experimental ML pipeline
  • Because it was an experimental setup, we faced several hiccups during deployments, such as pods going into the UNKNOWN state, that demanded immediate attention
  • Though it had rough edges, NFS was quite useful for distributed Machine Learning training. It made development very simple
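For context on the CPU allocation point above: a pod spec in stock Kubernetes can carry per-container resource requests and limits, so a partial allocation of a host is at least expressible. Whether the reviewer's cluster permitted this is not stated. The sketch below builds such a manifest as a plain Python dict; the pod name, image, and resource figures are hypothetical, and the `nvidia.com/gpu` entry assumes a cluster with the NVIDIA device plugin installed.

```python
import json


def training_pod_manifest(name, image, cpu="2", memory="4Gi", gpus=1):
    """Build a Pod manifest (plain dict) that asks for part of a host.

    All argument values are illustrative placeholders, not the
    reviewer's actual configuration.
    """
    resources = {
        # Requests are what the scheduler reserves on the node.
        "requests": {"cpu": cpu, "memory": memory},
        # Limits cap what the container may use at runtime.
        # The GPU key assumes the NVIDIA device plugin is present.
        "limits": {"cpu": cpu, "memory": memory, "nvidia.com/gpu": str(gpus)},
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {"name": "trainer", "image": image, "resources": resources}
            ],
        },
    }


# Hypothetical name and image, for illustration only.
manifest = training_pod_manifest("ml-train-demo", "example.com/ml/trainer:latest")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and submitted with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.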
  1. Kubernetes is very easy to set up and get started with
  2. It has various deployment options, file systems and service types making it suitable for several use cases besides Machine Learning
  3. It extends Docker's already rich functionality, making the two a deadly combination
  4. The rough edges in the file system, utilization and resource management should be fixed before it can be adopted as a standard in a company
  5. Its very extensive Python library makes it easy to build services on top of Kubernetes. However, the API is quite complex and its documentation is quite poor
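
As a rough illustration of building on the Python library mentioned in the last point, the sketch below uses the official `kubernetes` client package to list pods and flag those whose phase is reported as `Unknown`, the situation described earlier in this review. The function names are hypothetical, and the code assumes the package is installed and a kubeconfig file is available; it is not the reviewer's actual tooling.

```python
def unknown_pods(pod_phases):
    """Return the names of pods whose phase is 'Unknown'.

    pod_phases is an iterable of (pod name, phase) pairs.
    """
    return [name for name, phase in pod_phases if phase == "Unknown"]


def list_pod_phases(namespace="default"):
    """Query the cluster for (pod name, phase) pairs in one namespace.

    Assumes the `kubernetes` package is installed and a local
    kubeconfig grants access to the cluster.
    """
    # Imported here so unknown_pods() works without the client installed.
    from kubernetes import client, config

    config.load_kube_config()  # reads the same kubeconfig that kubectl uses
    pods = client.CoreV1Api().list_namespaced_pod(namespace)
    return [(pod.metadata.name, pod.status.phase) for pod in pods.items]
```

With cluster access, `unknown_pods(list_pod_phases("default"))` would return the pods that need attention; the filtering helper can also be exercised on its own with hand-written pairs.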