Your guide to getting started with IBM Cloud Pak for Data

Are you new to IBM Cloud Pak for Data? Are you looking for more information and guides to get you started? Well, IBM has you covered. There is a wealth of support documents, community articles and online videos to assist you. We've pulled together links to some of the most useful resources to get you started.


IBM Cloud Pak for Data overview – watching the IBM Cloud Pak for Data overview is a great place to start. This 5-minute video gives you a brief introduction to this set of platform services. Discover how IBM Cloud Pak for Data as a Service empowers users at all technical levels to easily tap into the power of trustworthy AI, no matter where the data resides.

Data Fabric use cases

Take a deeper dive into the data fabric use cases with the following videos:

Data governance and privacy use case

Multicloud data integration use case

Customer 360 use case


Planning – this IBM Documentation topic can help you plan your IBM Cloud Pak for Data installation. It covers architecture, roles, licenses and entitlements, cloud deployment environments, storage considerations and system requirements.

Installing IBM Cloud Pak for Data – provides step-by-step instructions for installation.

Services and integrations – You can extend the functionality of IBM Cloud Pak for Data by installing services and by integrating Cloud Pak for Data with other applications. This document outlines some of the possibilities.

Getting Started

Setting your preferences – customize the IBM Cloud Pak for Data web client by choosing the home page contents, changing your profile photo, and setting your notification preferences.

Find the Cloud Pak for Data services you can use – use the services catalog to see which services are installed. (This determines the tools that you have access to and the tasks that you can complete.)

Check your Cloud Pak for Data permissions – you can see what permissions you have from your profile. Your permissions are determined by the roles that are assigned to you.

Submit a request for data – if you can’t find the data that you need, or if you don’t have access to a data source where the data is stored, you can submit a data request.

Identify tasks that need to be completed – you might be assigned tasks in IBM Cloud Pak for Data. These could be data requests, publishing requests or other tasks. Find out more about these different types of tasks.

Generate API keys for authentication – find out how you can generate platform and instance API keys to automatically authenticate to the IBM Cloud Pak for Data platform.
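
For readers who want a feel for what API-key authentication can look like from a script, here is a minimal sketch in Python. It assumes the `ZenApiKey` authorization scheme, in which the username and API key are joined with a colon and Base64-encoded. The host name, endpoint path, username and key below are placeholders made up for illustration, so check the IBM Documentation for the exact details that apply to your release.

```python
import base64
import urllib.request


def zen_api_key_header(username: str, api_key: str) -> dict:
    """Build an Authorization header from a platform API key.

    Assumption: the platform accepts "ZenApiKey <base64(username:api_key)>".
    """
    raw = f"{username}:{api_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"ZenApiKey {token}"}


def build_request(host: str, path: str, username: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated GET request."""
    url = f"https://{host}{path}"
    headers = zen_api_key_header(username, api_key)
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder values for illustration only; the path is hypothetical.
request = build_request("cpd.example.com", "/hypothetical/endpoint", "jane", "my-api-key")
```

The request is only constructed here, not sent. In practice you would pass it to `urllib.request.urlopen` against a real endpoint on your own cluster, and keep the API key out of source code (for example, in an environment variable).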

Getting started with preparing data

Overview – take a look at the basic steps of the data preparation workflow.

Refine and visualize data with Data Refinery – this tutorial provides a description of the tool, a video and step-by-step tutorial tasks to help you become familiar with the Data Refinery tool. The tutorial takes approximately 30 minutes to complete.

Resources – additional resources including documents on adding data to your project and choosing a tool, as well as links to training in Watson Studio Methodology and Watson Studio.

Getting started with analyzing and visualizing data

Overview – outlines what is involved in a workflow for analyzing and visualizing data.

Tell a story with a dashboard – a beginner tutorial on how you can use the dashboard editor to create and share an interactive dashboard without any need for coding. Includes a video and a 30-minute tutorial.

Analyze data in a Jupyter notebook – this is a 15-minute tutorial aimed at those at an intermediate level with some knowledge of Python code. A Jupyter notebook is a web-based environment for interactive computing. You can create a notebook in which you run code to prepare, visualize, and analyze data, or build and train a model. Again, an introduction to notebooks and a video are provided.

Resources – additional resources with documents on analyzing data and building models, as well as links to additional videos and training.
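
To give a flavour of the kind of cell you might run in the Jupyter notebook tutorial mentioned above, here is a small sketch in plain Python that prepares and then analyzes a tiny dataset. The sample records, column names and function names are made up for illustration and are not taken from the IBM tutorial.

```python
import csv
import io
import statistics

# Hypothetical sample data; in a real notebook this would come from a project asset.
RAW_CSV = """region,sales
North,120
South,
East,95
North,80
"""


def load_clean_rows(text: str) -> list:
    """Prepare: parse the CSV and drop rows with a missing sales value."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"region": r["region"], "sales": float(r["sales"])}
        for r in reader
        if r["sales"]
    ]


def mean_sales_by_region(rows: list) -> dict:
    """Analyze: compute the average sales figure for each region."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["region"], []).append(row["sales"])
    return {region: statistics.mean(values) for region, values in grouped.items()}


rows = load_clean_rows(RAW_CSV)
summary = mean_sales_by_region(rows)
print(summary)  # {'North': 100.0, 'East': 95.0}
```

The row for South is dropped during preparation because its sales value is empty, which is why it does not appear in the summary.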


Getting started with building, deploying, and trusting models

Overview – covers the three main steps: building a model asset, deploying the model and building trust in the model.

Build and deploy a machine learning model with AutoAI – this tutorial is suitable for beginners.  The 30-minute tutorial explains how to automate the process of building a machine learning model with the AutoAI tool.  Includes information about the tool, a video and a tutorial.

Build and deploy a machine learning model in a notebook – a tutorial on building a model by updating and running a notebook that uses Python code and the Watson Machine Learning APIs. This 30-minute tutorial is suitable for intermediate users with knowledge of building, deploying and testing a scikit-learn model using Python code.

Build and deploy a machine learning model with SPSS Modeler – a 30-minute tutorial on using SPSS Modeler to create, train and deploy models. The tutorial provides a description of the tool, a video and instructions to build a C5.0 model using the SPSS Modeler tool. Suitable for beginners.

Build and deploy a Decision Optimization model – a tutorial for intermediate users, though no coding is required. Automatically build scenarios with the Modeling Assistant. Use the Decision Optimization tool to build Decision Optimization models to decide on the best approach for solving business problems based on sets of data.

Resources – additional resources including documents on deploying and managing models and links to training on building models using Jupyter Notebooks in IBM Watson Studio.

Getting started with curating and governing data

Overview – note that you must have specific roles and permissions for curating data and creating governance artifacts like data classes and business terms.

Protect data with data protection rules – a tutorial on creating a rule to mask sensitive data. This tutorial is aimed at intermediate users and takes approximately 20 minutes to complete. Learn how you can protect data with Watson Knowledge Catalog by creating data protection rules that specify the type of data to protect and the protection method.

Resources – includes documents on catalogs, governance artifacts, categories, and user roles and permissions. There are links to videos for data stewards.
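
The idea behind the data protection rules tutorial above, pairing a type of data with a protection method, can be illustrated with a few lines of Python. This is a conceptual sketch only and not how Watson Knowledge Catalog implements masking. The rule structure, the column names and the `redact` and `substitute` functions are all assumptions made up for illustration.

```python
import hashlib


def redact(value: str) -> str:
    """Replace every character with X so the original value is hidden."""
    return "X" * len(value)


def substitute(value: str) -> str:
    """Replace the value with a stable token (the same input gives the same token)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:10]


# Hypothetical rules: which column to protect and which method to apply.
RULES = {"email": redact, "customer_id": substitute}


def apply_rules(record: dict, rules: dict) -> dict:
    """Return a copy of the record with the protected columns masked."""
    return {
        key: rules[key](value) if key in rules else value
        for key, value in record.items()
    }


record = {"name": "Ana", "email": "ana@example.com", "customer_id": "C-1001"}
masked = apply_rules(record, RULES)
```

Here the `name` column passes through unchanged, the `email` column is fully redacted, and the `customer_id` column is replaced with a token that stays consistent across records, so masked rows can still be grouped or joined on it.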

As you can see, IBM provides a wide range of support materials to assist you with your IBM Cloud Pak for Data project.  At DataSkill, we always encourage our customers to take advantage of the full range of documents, videos and community groups available to them and are happy to provide assistance in locating the most relevant resources.