TrustRadius: an HG Insights company

What is OCI Data Science?

Oracle Cloud Infrastructure (OCI) Data Science is a fully managed platform for teams of data scientists to build, train, deploy, and manage machine learning (ML) models using Python and open source tools.

It features a JupyterLab-based environment so users can experiment and develop models, and scale up model training with NVIDIA GPUs and distributed training. Users can also take models into production and keep them healthy with ML operations (MLOps) capabilities, such as automated pipelines, model deployments, and model monitoring.

Some features include:

  • Data preparation - Data scientists can access and use any data source, in any cloud or on-premises. A wider range of sources provides more candidate features, which can lead to better models.
  • Data labeling - Oracle Cloud Infrastructure (OCI) Data Labeling is a service for building labeled datasets to more accurately train AI and machine learning models. With OCI Data Labeling, developers and data scientists assemble data, create and browse datasets, and apply labels to data records.
  • JupyterLab interface - Built-in, cloud-hosted JupyterLab notebook environments enable data science teams to build and train models using a familiar user interface.
  • Oracle Accelerated Data Science (ADS) library - A Python toolkit that supports the data scientist through their entire end-to-end data science workflow.
  • NVIDIA GPUs - GPU support lets data scientists build and train deep learning models in less time. Compared with CPUs, training can run 5 to 10 times faster.
  • Jobs - Use Jobs to run repeatable data science tasks in batch mode. Scale up your model training with support for bare metal NVIDIA GPUs and distributed training.
  • Model catalog - Data scientists use the model catalog to preserve and share completed machine learning models. The catalog stores the artifacts and captures metadata around the taxonomy and context of the model, hyperparameters, definitions of the model input and output data schemas, and detailed provenance information about the model origin, including the source code and the training environment.
  • Model evaluation and comparison - Automatically generate a comprehensive suite of metrics and visualizations to measure model performance against new data and compare model candidates.
  • Managed model deployment - Deploy machine learning models as HTTP endpoints for serving model predictions on new data in real time. Simply click to deploy from the model catalog, and OCI Data Science handles all infrastructure operations, including compute provisioning and load balancing.
  • ML pipelines - Operationalize and automate model development, training, and deployment workflows with a fully managed service to author, debug, track, manage, and execute ML pipelines.
  • ML monitoring - Continuously monitor models in production for data and concept drift. Data scientists, site reliability engineers, and DevOps engineers can receive alerts and quickly assess whether a model needs retraining.
  • ML applications - Originally designed for Oracle’s own SaaS applications to embed AI features, ML applications are now available to automate the entire MLOps lifecycle, including development, provisioning, and ongoing maintenance and fleet management, for ISVs with hundreds of models for each of their thousands of customers.
  • AI Quick Actions (no-code access) - Use LLMs from Mistral, Meta, and others without writing a single line of code, through a user interface in OCI Data Science notebooks.
  • Fine-tuning - Fine-tune LLMs with distributed training using PyTorch, Hugging Face Accelerate, or DeepSpeed. Store fine-tuned weights in object storage. Service-provided conda environments remove the need for custom Docker environments and enable sharing with less slowdown.
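
To make the "data drift" idea in the ML monitoring item more concrete, here is a minimal, generic sketch of one widely used drift score, the Population Stability Index (PSI), computed for a single numeric feature. It is not taken from OCI Data Science and does not use any Oracle API; the function name, the bin count, and the sample data are all hypothetical and chosen for illustration only.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,      # assumption: equal-width bins over the baseline range
    eps: float = 1e-6,   # assumption: floor for empty bins to avoid log(0)
) -> float:
    """Return the PSI between a baseline sample and a current sample."""
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline must contain at least two distinct values")
    width = (hi - lo) / bins

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = max(0, min(bins - 1, int((v - lo) / width)))
            counts[idx] += 1
        total = len(values)
        return [max(c / total, eps) for c in counts]

    expected = bin_fractions(baseline)
    actual = bin_fractions(current)
    # PSI = sum over bins of (actual - expected) * ln(actual / expected)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Hypothetical data: training-time values vs. shifted production values.
baseline_sample = [i / 100 for i in range(100)]
shifted_sample = [0.5 + i / 200 for i in range(100)]

print(population_stability_index(baseline_sample, baseline_sample))
print(population_stability_index(baseline_sample, shifted_sample))
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, though suitable thresholds depend on the feature and the model. A managed monitoring service may use different statistics entirely.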

Technical Details

Mobile Application: No
