Overall Satisfaction with MemSQL
SingleStore DB (formerly MemSQL) is used as a persistent storage solution for Spark. We use the SingleStore DB (formerly MemSQL) Spark connector (Scala code) to bridge the two technologies. I lead projects that use Spark and SingleStore DB (formerly MemSQL) to process life science data. It solved the Spark storage issue.
- Faster queries than a traditional SQL database.
- It can serve in a pipeline that handles streaming data with Kafka, Spark Streaming, and SingleStore DB (formerly MemSQL).
- It is very scalable.
- Better tuning of SingleStore DB (formerly MemSQL) performance on scale-up servers.
- The connection between Spark and SingleStore DB (formerly MemSQL) failed when processing data in more than about 48 partitions.
- Provide a faster Python API for invoking SingleStore DB (formerly MemSQL).
- It offers me a solution to the Spark storage problem.
- It adds complexity to my application, since multiple technologies are involved.
- More kinds of bugs come up when using Streamliner, including hardware connection issues.
I have tried using CSV files as back-end storage, but the I/O is very heavy. Transferring data directly in memory from Spark to SingleStore DB (formerly MemSQL) is clearly better.
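
For readers who want a picture of the direct Spark-to-database write described above, here is a minimal sketch in Python (PySpark), not the Scala code we actually run. It assumes the SingleStore Spark connector is on the Spark classpath and registered under the format name "singlestore" (older MemSQL releases of the connector used "memsql"). The option names `ddlEndpoint`, `user`, and `password` are taken from my understanding of the connector's documentation and should be checked against the version you install. The helper names `singlestore_options` and `write_direct` are hypothetical.

```python
def singlestore_options(host, user, password, port=3306):
    """Build connector options.

    The keys are assumed from the SingleStore Spark connector docs;
    verify them against the connector version in use.
    """
    return {
        "ddlEndpoint": f"{host}:{port}",
        "user": user,
        "password": password,
    }


def write_direct(df, table, options, mode="append"):
    """Write a Spark DataFrame straight to a table, with no CSV files in between.

    `df` is expected to be a pyspark.sql.DataFrame; `table` is a
    "database.table" string (hypothetical naming, adjust as needed).
    """
    (
        df.write
        .format("singlestore")   # assumed format name; older connectors used "memsql"
        .options(**options)      # DataFrameWriter.options accepts keyword arguments
        .mode(mode)              # "append", "overwrite", "ignore", or "error"
        .save(table)
    )
```

With a live SparkSession, a call might look like `write_direct(df, "lifesci.samples", singlestore_options("db-host", "app_user", "secret"))`, where the database, table, host, and credentials are all placeholders.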