Monday, June 17, 2019

50 Top DataStage Interview Questions and Answers {Updated}

Commonly asked DataStage job interview questions with answers, for both freshers and experienced candidates.


Read DataStage Interview Questions and Answers

What can we do with DataStage Director?
With DataStage Director we can validate, schedule, run, and monitor jobs (server jobs).

What are stage variables?
Stage variables are variables declared in the Transformer stage to store values. They are active only at run time, because their memory is allocated at run time.
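
To make the idea concrete, here is a minimal Python sketch of what a stage variable does: it keeps a value alive from one row to the next while the job runs. This is only an analogy, not DataStage code; the function name `flag_key_changes` and the column names are hypothetical.

```python
_UNSET = object()  # sentinel meaning "no previous row yet"

def flag_key_changes(rows, key):
    """Yield each row plus a flag saying whether its key differs from the previous row's."""
    prev_key = _UNSET  # plays the role of a stage variable: it survives between rows
    for row in rows:
        is_new = row[key] != prev_key
        prev_key = row[key]  # updated once per row, like a stage variable derivation
        yield {**row, "is_new_key": is_new}

rows = [{"cust": "A"}, {"cust": "A"}, {"cust": "B"}]
flagged = list(flag_key_changes(rows, "cust"))
```

The variable exists only while the loop is running, which mirrors the point that stage variables are allocated at run time.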

How do you handle version control?
Ascential has a separate product called Version Control, which is used to handle version control.

Did you use the conditional scheduling in your project?
Using a Sequencer job, we can set up conditional scheduling.

What are the command-line tools that import and export DataStage jobs?
dsimport.exe: imports DataStage components.
dsexport.exe: exports DataStage components.

What is the transformer stage?
Transformer stages do not extract data or write data to a target database. They are used to handle extracted data, perform any conversions required, and pass data to another Transformer stage or a stage that writes data to a target data table.

How can we reuse components?
Using Shared and Local Containers.

How do you eliminate duplicate rows?
DataStage provides a Remove Duplicates stage in the Enterprise Edition. Using that stage, we can eliminate duplicates based on a key column.
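
As a rough illustration of key-based de-duplication, the Python sketch below keeps the first row seen for each key. It is not how the Remove Duplicates stage is implemented; the function name `remove_duplicates` and the sample columns are hypothetical, and it tracks keys in a set rather than relying on sorted input.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key column values."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

rows = [
    {"id": 1, "name": "first"},
    {"id": 2, "name": "other"},
    {"id": 1, "name": "repeat"},
]
unique_rows = remove_duplicates(rows, ["id"])
```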

What is the difference between primary key and partition key?
A primary key is a column that is both unique and not null. When it is made up of several columns, it is called a composite primary key. A partition key is the column, often part of the primary key, that decides which partition a row goes to. There are several partitioning methods, such as Hash, DB2, and Random. When using Hash partitioning, we specify the partition key.
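
The role of the partition key in hash partitioning can be sketched in Python as below. This is an illustration under assumptions, not DataStage's actual hashing: the function name `hash_partition` is hypothetical and CRC32 is used only because it is stable between runs.

```python
import zlib

def hash_partition(rows, key, num_partitions):
    """Assign each row to a partition based on a hash of its partition key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 gives the same result on every run, unlike Python's hash() for strings
        index = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions

rows = [{"cust": "A", "amt": 1}, {"cust": "B", "amt": 2}, {"cust": "A", "amt": 3}]
parts = hash_partition(rows, "cust", 4)
```

Rows with the same key value always land in the same partition, which is why key-based operations such as joins and duplicate removal can then run on each partition independently.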

How do you schedule or monitor a job?
Using the DataStage Director we can schedule or monitor the job.

What is the default cache size?
Default cache size is 256 MB.

What is the difference between join stage and merge stage?
JOIN: Performs join operations on two or more data sets input to the stage and then outputs the resulting dataset.
MERGE: Combines a sorted master data set with one or more sorted update data sets. The columns from the records in the master and update data sets are merged so that the output record contains all the columns from the master record plus any additional columns required from each update record.

What is metadata repository?
Metadata is data about data. The metadata repository also contains:
* Query statistics
* ETL statistics
* Business subject area
* Source Information
* Target Information
* Source to Target mapping Information

What is the difference between active and passive stages?
Passive stages are used for data extraction and loading.
Active stages are used to implement and process the business rules.

What is orabulk Stage?
This Stage is used to Bulk Load the Oracle Target Database.

What is the difference between a local and a shared container?
A local container is local to the particular job in which it was created.
A shared container can be used in other jobs as well.

What is a container?
Containers are reusable sets of stages.

What are the advantages of data warehousing?
A data warehousing strategy provides the following advantages :
* Capitalizes on the potential value of the organization’s information.
* Improves the quality and accessibility of data.
* Combines valuable archive data with the latest data in operational sources.
* Increases the amount of information available to users.
* Reduces the requirement of users to access operational data.
* Reduces the strain on IT departments, as they can produce one database to serve all user groups.
* Allows new reports and studies to be introduced without disrupting operational systems.
* Promotes users to be self-sufficient.

What is a fact table?
The chief feature of a star schema is the table at the center, called the fact table.

What is the DataStage Server?
Runs executable jobs that extract, transform, and load data into a data warehouse.

What is a repository?
A central store that contains all the information required to build a data mart or data warehouse.

What is the DataStage Administrator?
A user interface used to set up DataStage users, create and move projects, and set up purging criteria.

What is the DataStage Manager?
A user interface used to view and edit the contents of the repository.

What is the DataStage Designer?
A design interface used to create DataStage applications (known as jobs). Each job specifies the data sources, the transforms required, and the destination of the data. Jobs are compiled to create executables that are scheduled by the Director and run by the Server.

What is the merge stage?
The Merge stage combines a sorted master data set with one or more sorted update data sets. The columns from the records in the master and update data sets are merged so that the output record contains all the columns from the master record plus any additional columns from each update record.
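
The column-merging behaviour described above can be sketched in Python as follows. This is a simplified illustration, not the stage's implementation: the name `merge_master_updates` is hypothetical, unmatched master rows are kept as-is (an assumption), and dictionaries are used for matching instead of the sorted-input matching the real stage relies on.

```python
def merge_master_updates(master, updates, key):
    """Add columns from matching update rows to each master row; master columns win."""
    # One lookup table per update data set, keyed on the merge key
    lookups = [{row[key]: row for row in update_set} for update_set in updates]
    merged = []
    for master_row in master:
        out = dict(master_row)
        for lookup in lookups:
            match = lookup.get(master_row[key])
            if match is not None:
                for col, val in match.items():
                    out.setdefault(col, val)  # only add columns the output lacks
        merged.append(out)
    return merged

master = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
updates = [[{"id": 1, "city": "X"}]]
result = merge_master_updates(master, updates, "id")
```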

How do you convert columns to rows in DataStage?
Using the Pivot stage.
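
What a columns-to-rows pivot produces can be shown with a short Python sketch. It only illustrates the transformation, not the Pivot stage itself; the function name `pivot_columns_to_rows` and the output column names `attribute` and `value` are hypothetical.

```python
def pivot_columns_to_rows(rows, key_columns, pivot_columns):
    """Emit one output row per pivoted column, repeating the key columns each time."""
    out = []
    for row in rows:
        for col in pivot_columns:
            new_row = {k: row[k] for k in key_columns}
            new_row["attribute"] = col   # which source column this value came from
            new_row["value"] = row[col]
            out.append(new_row)
    return out

wide = [{"id": 1, "q1": 10, "q2": 20}]
tall = pivot_columns_to_rows(wide, ["id"], ["q1", "q2"])
```

One input row with two pivoted columns becomes two output rows.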

What are Routines?
Routines are functions that we write in BASIC for required tasks that DataStage does not fully support on its own (typically complex ones).

What is a staging variable?
These are temporary variables created in the Transformer stage for calculations.

What is the hash file stage?
A hash file is a binary file used for lookups. The hash file stage reads or writes such a file to give better lookup performance.

What are the types of Containers?
There are two types of containers:
Local Container
Shared Container

What are the components of Ascential Data Stage?
Client Components : Administrator, Director, Manager, and Designer.
Server Components: Repository, Server and Plugins.

What is the difference between DataStage version 5.2 and 6.0?
Version 5.2 doesn’t have the IPC stage, Link Partitioner stage, Link Collector stage, or Parallel Extender.

What is a dynamic array?
Dynamic arrays map the structure of DataStage file records to character string data. Any character string can be a dynamic array. A dynamic array is a character string containing elements that are substrings separated by delimiters.
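
A delimiter-separated string of this kind can be taken apart with a few lines of Python. The sketch assumes the UniVerse-style delimiters, a field mark of character 254 and a value mark of character 253; the function name `parse_dynamic_array` and the sample record are hypothetical.

```python
FIELD_MARK = chr(254)  # assumed delimiter between fields
VALUE_MARK = chr(253)  # assumed delimiter between values inside one field

def parse_dynamic_array(text):
    """Split a dynamic-array string into a list of fields, each a list of values."""
    return [field.split(VALUE_MARK) for field in text.split(FIELD_MARK)]

record = "Smith" + FIELD_MARK + "NY" + VALUE_MARK + "LA"
fields = parse_dynamic_array(record)
```

Here the record has two fields, and the second field holds two values.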

What is the UniVerse stage?
A stage that extracts data from or loads data into a UniVerse database using SQL. It is used to represent a data source, an aggregation step, or a target data table.

What are the types of stages?
A stage can be passive or active. A passive stage handles access to databases for the extraction or writing of data. Active stages model the flow of data and provide mechanisms for combining data streams, aggregating data, and converting data from one data type to another. Stages also fall into two groups by origin:
Built-in stages: Supplied with DataStage and used for extracting, aggregating, transforming, or writing data.
Plug-in stages: Additional stages defined in the DataStage Manager to perform tasks that the built-in stages do not support.

What is a DataStage job?
DataStage jobs consist of individual stages. Each stage describes a particular database or process. For example, one stage may extract data from a data source, while another transforms it. Stages are added to a job and linked together using the designer.

What are the types of server components?
* Repository
* DataStage Server
* DataStage Package Installer

What is data aggregation?
An operational data source usually contains records of individual transactions such as product sales. If the user of a data warehouse only needs a summed total, you can reduce records to a more manageable number by aggregating the data.
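
For example, summing individual sales per product could look like the Python sketch below. The function name `total_by` and the sample columns are hypothetical; in DataStage this kind of work is done by an aggregation stage rather than hand-written code.

```python
from collections import defaultdict

def total_by(rows, group_col, amount_col):
    """Reduce transaction rows to one summed total per group value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += row[amount_col]
    return dict(totals)

sales = [
    {"product": "widget", "amount": 10.0},
    {"product": "gadget", "amount": 7.5},
    {"product": "widget", "amount": 5.0},
]
totals = total_by(sales, "product", "amount")
```

Three transaction records are reduced to two summary records, one per product.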

What is the difference between DataStage and Informatica?
* DataStage supports parallel processing, which Informatica doesn’t.
* Links are objects in DataStage; in Informatica, connectivity is port to port.
* In Informatica it is easy to implement Slowly Changing Dimensions, which is a little more complex in DataStage.
* DataStage doesn’t support complete error handling.

What is the DataStage Director?
The DataStage Director is the client component that validates, runs, schedules, and monitors jobs run by the DataStage Server.

Define a job.
A collection of linked stages, data elements, and transforms that define how to extract, cleanse, transform, integrate, and load data into a target database.

What is the difference between maps and locales?
Maps: Defines the character sets that the project can use.
Locales: Defines the local formats for dates, times, sorting order, and so on that the project can use.

What is metadata?
Data about data. A table definition which describes the structure of the table is an example of metadata.

What are the types of DataStage clients?
* DataStage Administrator
* DataStage Designer
* DataStage Manager
* DataStage Director

Where does DataStage store its repository?
DataStage stores its repository in the IBM UniVerse database.

What is the difference between server jobs and parallel jobs?
Server jobs process data sequentially, while parallel jobs process it in parallel. Parallel Extender works on the principles of pipelining and partitioning for I/O processing.
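
The pipelining idea, where a row moves on to the next step before the previous step has finished with all rows, can be imitated with Python generators. This is only a single-threaded analogy with hypothetical function names; a real parallel job runs the steps as separate processes.

```python
def extract(records, log):
    for record in records:
        log.append(f"extract {record}")
        yield record

def transform(rows, log):
    for row in rows:
        log.append(f"transform {row}")
        yield row.upper()

def load(rows, target, log):
    for row in rows:
        log.append(f"load {row}")
        target.append(row)

log, target = [], []
load(transform(extract(["a", "b"], log), log), target, log)
```

The log shows the first row being extracted, transformed, and loaded before the second row is even read, instead of each step waiting for the whole data set.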

What are the types of jobs available in DataStage?
* Server Job
* Parallel Job
* Sequencer Job
* Container Job

What are the main features of DataStage?
DataStage has the following features to aid the design and processing required to build a data warehouse :
* Uses graphical design tools. With simple point-and-click techniques, you can draw a scheme to represent your processing requirements.
* Extracts data from any number or type of database.
* Handles all the metadata definitions required to define your data warehouse.
* You can view and modify the table definitions at any point during the design of your application.
* Aggregates data.
* You can modify SQL SELECT statements used to extract data.
* Transforms data. DataStage has a set of predefined transforms and functions that you can use to convert your data. You can easily extend the functionality by defining your own transforms.
* Loads the data warehouse.

What are the components of DataStage?
DataStage consists of a number of client and server components. DataStage has four client components:
* DataStage Designer
* DataStage Director
* DataStage Manager
* DataStage Administrator
