Seamless environment setup
March 05, 2018


Anonymous | TrustRadius Reviewer
Score 7 out of 10
Vetted Review
Verified User

Overall Satisfaction with IBM Data Science Experience (DSx)

IBM Data Science Experience is used in my lab to get insights from data. We have a grant through my company to use IBM services, and I am very happy with it. My lab works on human-computer interaction, and we try to extract users' behaviors from data. We only have small amounts of data; nonetheless, IBM DSx is a great tool for investigating it. It removes the need to set up Python and Spark: all the cumbersome configuration is done in the cloud, so data scientists can focus on the analysis. I believe the ready-made setup is a major "pro" of the platform.
  • Setting up the Python environment and Spark, and letting developers choose the language version
  • Getting the credentials automatically to import data.
  • Importing CSV data (my experience with JSON data was not nearly as smooth)
  • Nice integration of Python notebooks
  • Data visualization - not all data is visualized seamlessly (DSX tried to complement Matplotlib, but its tool is not as effective)
  • Make it easier for developers to integrate DSX output into their own websites
  • Saving the state of a notebook would help (I understand that a Python notebook must be re-run after the kernel is interrupted, but not having to re-run everything would help, especially in long notebooks)
  • Positive impact - Online notebooks used in presentations are very effective
  • Positive impact - Avoids cumbersome Excel calculations; Python is much faster
  • Negative impact - Not flexible enough to be easily integrated into your website
  • Elasticsearch
Elasticsearch works only with the JSON format, whereas IBM DSX imposes no such restriction. One main limitation of DSX, however, is that importing some types of datasets into a notebook can fail. In particular, the JSON import somehow fails on nested structures. I expect this will be fixed easily, but since JSON is a very popular format, poor support for it is a real disadvantage.
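One possible workaround for the nested JSON problem, sketched here with the Python standard library only, is to flatten the nested records into CSV-style rows before importing them, since CSV import worked well for me. This is a minimal sketch and not a DSX feature: the sample data and the helper names `flatten` and `to_csv` are hypothetical, and lists inside records are left untouched.

```python
import csv
import io
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars (and lists) are stored unchanged under the joined key.
            flat[full_key] = value
    return flat


def to_csv(rows):
    """Serialize a list of flat dicts to CSV text with sorted column names."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data standing in for a nested JSON export.
raw = """
[
  {"user": {"id": 1, "device": {"os": "linux"}}, "clicks": 3},
  {"user": {"id": 2, "device": {"os": "macos"}}, "clicks": 5}
]
"""

rows = [flatten(record) for record in json.loads(raw)]
csv_text = to_csv(rows)
print(csv_text)
```

The resulting text can be saved as a .csv file and imported like any other CSV dataset; whether that fully sidesteps the issue in DSX is an assumption I have not verified on every kind of nested input.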
Best suited: Analyzing large amounts of data on a distributed cloud platform. Manipulating data is easy thanks to all the setup done by DSX.
Less appropriate: Integrating graphs. Even though it is possible to use Matplotlib in Python, the data visualization part of IBM DSx has a lot of shortcomings, perhaps because there is no dedicated visualization tool associated with it yet. Elasticsearch, for example, provides Kibana on top of it for data visualization. I hope this example can serve as inspiration to make DSx an even greater tool.