Latest Questions
Post Top Ad
Your Ad Spot
Saturday, June 15, 2019

45 Top ETL Testing Interview Questions and Answers {Updated}

ETL Testing Interview Questions and Answers for experienced PDF, Read commonly asked ETL Testing Job Interview Questions with Answer PDF for Freshers.


Read ETL Testing Interview Questions and Answers

What is a Data Warehouse?
A Data Warehouse is a collection of data marts representing historical data from a different operational data source (OLTP). The data from this OLTP are structured and optimized for querying and data analysis in a Data Warehouse.

What is Data mart?
A Data Mart is a subset of a data warehouse that can provide data for reporting and analysis on a section, unit or a department like Sales Dept, HR Dept, etc. The Data Mart is sometimes also called as HQPS (Higher Performance Query Structure).

What is OLAP?
OLAP stands for Online Analytical Processing. It uses database tables (Fact and Dimension tables) to enable multidimensional viewing, analysis, and querying of a large amount of data.

What is OLTP?
OLTP stands for Online Transaction Processing Except for data warehouse databases the other databases are OLTPs. These OLTP uses normalized schema structure. These OLTP databases are designed for recording the daily operations and transactions of a business.

What are Dimensions?
Dimensions are categories by which summarized data can be viewed. For example, a profit Fact table can be viewed by a time dimension.

What are Confirmed Dimensions?
The Dimensions which are reusable and fixed in nature Example customer, time, geography dimensions.

What are the Fact Tables?
A Fact Table is a table that contains summarized numerical (facts) and historical data. This Fact Table has a foreign key-primary key relation with a dimension table. The Fact Table maintains the information in 3rd normal form.
A star schema is defined as a logical database design in which there will be a centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for Data Warehouse or Data Mart.

What are the types of Facts?
The types of Facts are as follows.
1. Additive Facts: A Fact which can be summed up for any of the dimension available in the fact table.
2. Semi-Additive Facts: A Fact which can be summed up to a few dimensions and not for all dimensions available in the fact table.
3. Non-Additive Fact: A Fact which cannot be summed up for any of the dimensions available in the fact table.

What are the types of Fact Tables?
The types of Fact Tables are:
1. Cumulative Fact Table: This type of fact tables generally describes what has happened over the period of time. They contain additive facts.
2. Snapshot Fact Table: This type of fact table deals with a particular period of time. They contain non-additive and semi-additive facts.

What is the Grain of Fact?
The Grain of Fact is defined as the level at which the fact information is stored in a fact table. This is also called as Fact Granularity or Fact Event Level.

What is Factless Fact table?
The Fact Table which does not contains facts is called as Fact Table. Generally when we need to combine two data marts, then one data mart will have a factless fact table and another one with the common fact table.

What are Measures?
Measures are numeric data based on columns in a fact table.

What are the Cubes?
Cubes are data processing units composed of fact tables and dimensions from the data warehouse. They provided a multidimensional analysis.

What are Virtual Cubes?
These are a combination of one or more real cubes and require no disk space to store them. They store the only definition and not the data.

What is a Star schema design?
A Star schema is defined as a logical database design in which there will be a centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for Data Warehouse or Data Mart.

What is Snow Flake schema Design?
In a Snow Flake design, the dimension table (de-normalized table) will be further divided into one or more dimensions (normalized tables) to organize the information in a better structural format. To design snowflake, we should first design star schema design.

What is the Operational Data Store [ODS]?
It is a collection of integrated databases designed to support operational monitoring. Unlike the OLTP databases, the data in the ODS are integrated, subject-oriented and enterprise-wide data.

What is Denormalization?
Denormalization means a table with a multi duplicate key. The dimension table follows Denormalization method with the technique of surrogate key.

What is a Surrogate Key?
A Surrogate Key is a sequence generated a key which is assigned to be a primary key in the system (table).

What are the client components of Informatica 7.1.1?
Informatica 7.1.1 Client Components:
1. Informatica Designer
2. Informatica Work Flow Manager
3. Informatica Work Flow Monitor
4. Informatica Repository Manager
5. Informatica Repository Server Administration Console.

What are the server components of Informatica 7.1.1?
Informatica 7.1.1 Server Components:
1. Informatica Server
2. Informatica Repository Server.

What is Metadata?
Data about data is called Metadata. The Metadata contains the definition of data.

What is a Repository?
The repository is a centrally stored container which stores the metadata, which is used by the Informatica Power center server and Power Center client tools. The Informatica stores Repository in relational database format.
Informatica 7.1.1 Repository has 247 database objects
Informatica 6.1.1 Repository has 172 database objects
Informatica 5.1.1 Repository has 145 database objects
Informatica 4.1.1 Repository has 111 database objects

What is Data Acquisition Process?
The process of extracting the data from different source (operational databases) systems, integrating the data and transforming the data into a homogenous format and loading into the target warehouse database. Simple called ETL (Extraction, Transformation, and Loading). The Data Acquisition process designs are called in different manners by different ETL vendors.
Informatica —-> Mapping
Data Stage —-> Job
Abinitio —-> Graph

What are the GUI based ETL tools?
The following are the GUI based ETL tools:
1. Informatica
2. DataStage
3. Data Junction
4. Oracle Warehouse Builder
5. Abinitio
6. Business Object Data Integrator
7. Cognos Decision Stream.

What are programmatic based ETL tools?
1. Pl/SQL
2. SAS BASE
3. SAS ACCESS
4. Tera Data Utilities
a. BTEQ
b. Fast Load
c. Multi-Load
d. Fast Export
e. T (Trickle) Pump

What is Transformation?
A transformation is a repository object that generates, modifies, or passes data. Transformations in a mapping represent the operations the PowerCenter Server performs on the data. Data passes into and out of transformations through ports that you link in a mapping or mapplet. Transformations can be active or passive. An active transformation can change the number of rows that pass through it. A passive transformation does not change the number of rows that pass through it.

What are the various tools? Name a few?
A few more
– Cognos Decision Stream
– Oracle Warehouse Builder
– Business Objects XI (Extreme Insight)
– SAP Business Warehouse
– SAS Enterprise ETL Server
Along with the above, need to include the below tools –Informatica –Abintio –DataStage

What are features of Informatica Repository Server?
Features of Informatica Repository Server.
1. Informatica client application and Informatica server access the repository database tables through the Repository Server.
2. Informatica client connects to the repository server through the hostname/ IP address and its port number.
3. The Repository Server can manage multiple repositories on different machines on the network.
4. For each repository database registered with the Repository Server, it configures and manages a Repository Agent process.
5. The Repository Agent is a multi-threaded process that performs the action needed to retrieve, insert and update metadata in the repository database tables.

What is a Work Flow?
A Work Flow is a set of instructions on how to execute tasks such as sessions, emails, and shell commands. A WorkFlow is created from the Workflow Manager.

What are the uses of Lookup Transformation?
The Lookup Transformation is useful for:
1. Getting a related value form a table using a key column value
2. Update slowly changing dimension table
3. To check whether records already exist in the table.

What are the different sources of Source systems of Data Warehouse?
RDBMS
Flat Files
XML Files
SAP R/3
PeopleSoft
SAP BW
Web Methods
Web Services
Seibel
Cobol Files
Legacy Systems.

Types of Slowly Changing Dimensions:
Type – 1 (Recent updates)
Type – 11 (Full historical information)
Type – 111 (Partial historical information)

What is Update Strategy’s target table options?
Update as Update: Updates each row flagged for an update if it exists in the table.
Update as Insert: Inserts a new row for each update.
Update else Inserts Updates if the row exists, else inserts.

What does a Mapping document contain?
The Mapping document contains the following information :
Source Definition – from where the database has to be loaded
Target Definition – to where the database has to be loaded
Business Logic – what logic has to be implemented in the staging area.

What does the Top Down Approach say?
The Top Down Approach is coined by Bill Immon. According to his approach, he says “First we need to implement the Enterprise data warehouse by extracting the data from individual departments and from the Enterprise data warehouse develop subject-oriented databases called as “Data Marts”.

What does the Bottom Up Approach or Ralph Kimball Approach say?
The Bottom Down Approach is coined by Ralph Kimball. According to his approach, he says “First we need to develop a subject-oriented database called “Data Marts” then integrate all the Data Marts to develop the Enterprise data warehouse.

Who is the first person in the organization to start the Data Warehouse project?
The first person to start the Data Warehouse project in an organization is a Business Analyst.

What is Dimension Modeling?
A Dimensional Modeling is a high-level methodology used to implement the start schema structure which is done by the Data Modeler.

What are the types of OLAPs?
DOLAP: The OLAP tool which words with desktop databases are called as DOLAP. Example: Cognos EP 7 Series and Business Objects, Micro strategy.
ROLAP: The OLAP which works with Relational databases are called as ROLAP. Example: Business Object, Micro strategy, Cognos ReportNet, and BRIO.
MOLAP: The OLAP which is responsible for creating multidimensional structures called cubes are called as MOLAP. Example: Cognos ReportNet.
HOLAP: The OLAP which uses the combined features of ROLAP and MOLAP are called as HOLAP. Example Cognos ReportNet.

What is worklet?
The worklet is a group of sessions. To execute the worklet we have to create the workflow.

Why we use the lookup transformation?
Look up Transformations can access data from relational tables that are not sourced in mapping. With Lookup transformation, we can accomplish the following tasks.

What is a Power Center Repository?
The Power Center Repository allows you to share metadata across repositories to create a data mart domain. In a data mart domain, you can create a single global repository to store metadata used across an enterprise and a number of local repositories to share the global metadata as needed.

What are snapshots? What are materialized views & where do we use them? What is a materialized view?
A materialized view is a view in which data is also stored in some temp table. i.e if we will go with the View concept in DB in that we only store query and once we call View it extract data from DB. But In materialized View data is stored in some temp tables.

Where do we use connected and unconnected lookups?
If return port only one then we can go for unconnected. More than one return port is not possible with Unconnected. If more than one return port then goes for Connected.
If you require dynamic cache i.e where your data will change dynamically then you can go for connected lookup. If your data is static where your data won’t change when the session loads you can go for unconnected lookups.

Post Top Ad

Your Ad Spot

Pages