IBM InfoSphere DataStage is an ETL tool that is part of the IBM Information Platforms Solutions suite and the IBM InfoSphere family. It uses a graphical notation to build data integration solutions and is available in several editions, including the Server Edition, the Enterprise Edition, and the MVS Edition.


DataStage Training Introduction

DataStage integrates data across multiple systems using a high-performance parallel framework, and it supports extended metadata management and enterprise connectivity. The scalable platform allows flexible integration of all types of data, including big data at rest (Hadoop-based) or in motion (stream-based), on distributed and mainframe platforms.

DataStage Manager provides the user interface for viewing the contents of the data repository. This DataStage training covers how data sources, transformations, and destination databases are specified.

DataStage Training Curriculum

Data Warehouse Basics

• An introduction to Data Warehousing
• Purpose of Data Warehouse
• Data Warehouse Architecture
• Operational Data Store
• OLTP Vs Warehouse Applications

Data Modeling

• Introduction to Data Modeling
• Entity Relationship model (E-R model)
• Data Modeling for Data Warehouse

ETL Design Process

• Introduction to Extraction
• Transformation & Loading
• Types of ETL Tools
• Key tools in the market

DataStage Administrator

• DataStage project administration
• Editing projects and adding projects
• Deleting projects and cleaning up project files
• Environmental Variables
• Environment management

DataStage Director

• Introduction to DataStage Director
• Validating DataStage jobs
• Executing DataStage jobs
• Job execution status
• Monitoring a job, Job log view
• Job scheduling
• Creating Batches
• Scheduling batches

DataStage Designer

• Introduction to DataStage Designer
• Importance of Parallelism
• Pipeline Parallelism
• Partition Parallelism
• Partitioning and collecting (in-depth coverage of partitioning and collecting techniques)
• Symmetric Multiprocessing (SMP) and Massively Parallel Processing (MPP)
• Introduction to Configuration file
• Editing a Configuration file
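
To make the partitioning and collecting topics above more concrete, here is a minimal plain-Python sketch of the idea. It is not how the DataStage parallel engine is implemented; the function names and the `cust_id` column are assumptions made for illustration. Hash partitioning keeps all rows with the same key in the same partition, round-robin partitioning spreads rows evenly, and an ordered collector reads the partitions back one after another.

```python
import zlib


def hash_partition(rows, key, num_partitions):
    """Send each row to a partition chosen by a hash of its key column.

    Rows that share a key value always land in the same partition.
    """
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 gives a stable hash across runs (unlike Python's str hash).
        bucket = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[bucket].append(row)
    return partitions


def round_robin_partition(rows, num_partitions):
    """Deal rows out to partitions in turn, giving near-equal sizes."""
    partitions = [[] for _ in range(num_partitions)]
    for index, row in enumerate(rows):
        partitions[index % num_partitions].append(row)
    return partitions


def collect_ordered(partitions):
    """Ordered collection: all rows of partition 0, then partition 1, and so on."""
    return [row for partition in partitions for row in partition]


if __name__ == "__main__":
    # Hypothetical sample data: 20 rows spread over 5 customer ids.
    rows = [{"cust_id": i % 5, "amount": i} for i in range(20)]
    for number, part in enumerate(hash_partition(rows, "cust_id", 3)):
        print(number, sorted({r["cust_id"] for r in part}))
```

Key-based operations such as aggregation or duplicate removal generally need a key-based method like hashing, so that related rows meet in one partition; round robin is a reasonable choice when only an even load matters.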

Working with Parallel Job Stages

• Database stages: Oracle, ODBC, Dynamic RDBMS
• File stages: Sequential File, Data Set, File Set, Lookup File Set
• Processing stages: Copy, Filter, Funnel, Sort, Remove Duplicates, Aggregator, Switch, Pivot, Lookup, Join, Merge
• Differences between Lookup, Join, and Merge
• Change Capture, External Filter, Surrogate Key Generator
• Transformer: real-time scenarios using different processing stages, implementing different logic with the Transformer
• Debug stages: Head, Tail, Peek, Column Generator, Row Generator

Advanced Stages in Parallel Jobs

• Explanation of Type 1 and Type 2 slowly changing dimension (SCD) processes
• Implementation of Type 1 and Type 2 logic using the Change Capture stage and the SCD stage
• Range Lookup process
• Surrogate key generator stage
• FTP stage

• Performance tuning
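
The Type 1 and Type 2 processes listed above refer to slowly changing dimensions. The sketch below illustrates the difference in plain Python; it is not the SCD stage itself. The function names and the columns `sk`, `is_current`, `start_date`, and `end_date` are assumptions chosen for illustration. Type 1 overwrites the existing row and keeps no history, while Type 2 expires the current row and appends a new version with a fresh surrogate key.

```python
def apply_type1(dimension, incoming, key):
    """Type 1: overwrite matching rows in place; no history is kept."""
    by_key = {row[key]: row for row in dimension}
    for new in incoming:
        if new[key] in by_key:
            by_key[new[key]].update(new)  # old values are lost
        else:
            row = dict(new)
            dimension.append(row)
            by_key[new[key]] = row
    return dimension


def apply_type2(dimension, incoming, key, tracked, today):
    """Type 2: expire the current version and append a new one.

    `tracked` lists the columns whose change triggers a new version.
    """
    next_sk = max((row["sk"] for row in dimension), default=0) + 1
    current = {row[key]: row for row in dimension if row["is_current"]}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # no tracked column changed, nothing to do
        if old is not None:
            old["is_current"] = False
            old["end_date"] = today
        row = {**new, "sk": next_sk, "start_date": today,
               "end_date": None, "is_current": True}
        dimension.append(row)
        current[new[key]] = row
        next_sk += 1
    return dimension
```

With Type 2, a customer who moves city ends up with two rows: the expired one carrying the old city and an end date, and the current one carrying the new city.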

Job Sequencers

• Arrange job activities in Sequencer
• Triggers in Sequencer
• Restartability
• Recoverability
• Notification activity
• Terminator activity

IBM Information Server Administration Guide

• IBM WebSphere DataStage administration
• Opening the IBM Information Server Web console
• Setting up a project in the console
• Customizing the project dashboard
• Setting up security
• Creating users in the console
• Assigning security roles to users and groups
• Project process and setup explanation

What is DataStage?

DataStage is a tool used to design, develop, and run applications that populate tables in a data warehouse or data mart.

How is a source file populated?

A source file can be populated in several ways, for example by running a SQL query in Oracle or by using the Row Generator stage.

Which command-line tools are used to import and export DS jobs?

dsimport.exe is used to import DS jobs, and dsexport.exe is used to export them.

What is the difference between DataStage 7.5 and 7.0?

DataStage 7.5 adds several new stages for greater robustness and smoother performance, such as the Procedure stage and the Command stage.

What is Merge?

Merge combines two or more tables. The tables are merged on the basis of the primary key columns present in each of them.
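
As a rough illustration of merging on a key, the plain-Python sketch below combines a master table with one update table. It is a simplified model, not the Merge stage itself: the function name `merge_on_key` and the sample columns are hypothetical, and it assumes each key value appears at most once per table.

```python
def merge_on_key(master, update, key):
    """Combine master rows with update rows that share the same key value.

    Returns (merged_rows, unmatched_master_rows).
    """
    update_by_key = {row[key]: row for row in update}
    merged, unmatched = [], []
    for master_row in master:
        update_row = update_by_key.get(master_row[key])
        if update_row is None:
            unmatched.append(master_row)
        else:
            # On a column-name clash the master value is kept.
            merged.append({**update_row, **master_row})
    return merged, unmatched


if __name__ == "__main__":
    master = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
    update = [{"id": 1, "city": "X"}]
    merged, unmatched = merge_on_key(master, update, "id")
    print(merged)
    print(unmatched)
```

Returning the unmatched master rows separately leaves the caller free to keep or drop them, which is one of the choices that distinguishes Merge from Lookup and Join.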

