informatica interview Question Part-2 - ETL- iNFORMATICA DEVELOPER

Monday, June 17, 2019

informatica interview Question Part-2

ktnewslive

Q. What is a code page?

A. A code page contains encoding to specify characters in a set of one or more languages. The code page is selected based on source of the data. For example if source contains Japanese text then the code page should be selected to support Japanese text.

When a code page is chosen, the program or application for which the code page is set, refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.

Q. Which all databases PowerCenter Server on Windows can connect to?

A. PowerCenter Server on Windows can connect to following databases:
IBM DB2
Informix
Microsoft Access
Microsoft Excel
Microsoft SQL Server
Oracle
Sybase
Teradata

Q. Which all databases PowerCenter Server on UNIX can connect to?

A. PowerCenter Server on UNIX can connect to following databases:
IBM DB2
Informix
Oracle
Sybase
Teradata

Q. How to execute PL/SQL script from Informatica mapping?

A. Stored Procedure (SP) transformation can be used to execute PL/SQL Scripts. In SP
Transformation PL/SQL procedure name can be specified. Whenever the session is executed, the session will call the pl/sql procedure.



The informatica server follows instructions coded into update strategy transformations within the session mapping which determine how to flag records for insert, update, delete or reject. If we do not choose data driven option setting, the informatica server ignores all update strategy transformations in the mapping.

Q. What are the types of mapping wizards that are provided in Informatica?

The designer provide two mapping wizard.
1. Getting Started Wizard - Creates mapping to load static facts and dimension tables as well as slowly growing dimension tables.
2. Slowly Changing Dimensions Wizard - Creates mappings to load slowly changing dimension tables based on the amount of historical dimension data we want to keep and the method we choose to handle historical dimension data.

Q. What is Load Manager?
A. While running a Workflow, the PowerCenter Server uses the Load Manager
process and the Data Transformation Manager Process (DTM) to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:
1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.
When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled. Checks
Query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mappings, reader, writer, and transformation threads to extract,
transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.

Q. What is Data Transformation Manager?

A. After the load manager performs validations for the session, it creates the DTM
process. The DTM process is the second process associated with the session run. The
primary purpose of the DTM process is to create and manage threads that carry out
the session tasks.
• The DTM allocates process memory for the session and divide it into buffers. This
is also known as buffer memory. It creates the main thread, which is called the
master thread. The master thread creates and manages all other threads.
• If we partition a session, the DTM creates a set of threads for each partition to
allow concurrent processing.. When Informatica server writes messages to the
session log it includes thread type and thread ID.
Following are the types of threads that DTM creates:
Master Thread - Main thread of the DTM process. Creates and manages all other
threads.
Mapping Thread - One Thread to Each Session. Fetches Session and Mapping
Information.
Pre and Post Session Thread - One Thread each to Perform Pre and Post Session
Operations.
Reader Thread - One Thread for Each Partition for Each Source Pipeline.
Writer Thread - One Thread for Each Partition if target exist in the source pipeline
write to the target.
Transformation Thread - One or More Transformation Thread For Each Partition.

Q. What is Session and Batches?

Session - A Session Is A set of instructions that tells the Informatica Server How
And When To Move Data From Sources To Targets. After creating the session, we
can use either the server manager or the command line program pmcmd to start
or stop the session.
Batches - It Provides A Way to Group Sessions For Either Serial Or Parallel Execution By The Informatica Server. There Are Two Types Of Batches:
1. Sequential - Run Session One after the Other.
2. Concurrent - Run Session At The Same Time.

Q. How many ways you can update a relational source definition and what
are they?


A. Two ways
1. Edit the definition
2. Reimport the definition

Q. What is a transformation?

A. It is a repository object that generates, modifies or passes data.

Q. What are the designer tools for creating transformations?

A. Mapping designer
Transformation developer
Mapplet designer

Q. In how many ways can you create ports?

A. Two ways
1. Drag the port from another transformation
2. Click the add button on the ports tab.

Q. What are reusable transformations?

A. A transformation that can be reused is called a reusable transformation
They can be created using two methods:
1. Using transformation developer
2. Create normal one and promote it to reusable

Q. Is aggregate cache in aggregator transformation?
A. The aggregator stores data in the aggregate cache until it completes aggregate calculations. When u run a session that uses an aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation. If the Informatica server requires more space, it stores overflow values in cache files.

Q. What r the settings that u use to configure the joiner transformation?
·         Master and detail source
·         Type of join
·         Condition of the join

Q. What are the join types in joiner transformation?

A. Normal (Default) -- only matching rows from both master and detail
Master outer -- all detail rows and only matching rows from master
Detail outer -- all master rows and only matching rows from detail
Full outer -- all rows from both master and detail (matching or non matching)

Q. What are the joiner caches?

A. When a Joiner transformation occurs in a session, the Informatica Server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records
from the detail source and performs joins.

Q. What r the types of lookup caches?
·         Static cache: You can configure a static or read-only cache for only lookup table. By default Informatica server creates a static cache. It caches the lookup table and lookup values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica server does not update the cache while it processes the lookup transformation.
·         Dynamic cache: If you want to cache the target table and insert new rows into cache and the target, you can create a look up transformation to use dynamic cache. The Informatica server dynamically inserts data to the target table.
·         Persistent cache: You can save the lookup cache files and reuse them the next time the Informatica server processes a lookup transformation configured to use the cache.
·         Recache from database: If the persistent cache is not synchronized with the lookup table, you can configure the lookup transformation to rebuild the lookup cache.
·         Shared cache: You can share the lookup cache between multiple transactions. You can share unnamed cache between transformations in the same mapping.

Q. What is Transformation?
A: Transformation is a repository object that generates, modifies, or passes data.
Transformation performs specific function. They are two types of transformations:
1. Active
Rows, which are affected during the transformation or can change the no of rows that pass through it. Eg: Aggregator, Filter, Joiner, Normalizer, Rank, Router, Source qualifier, Update Strategy, ERP Source Qualifier, Advance External Procedure.
2. Passive
Does not change the number of rows that pass through it. Eg: Expression, External Procedure, Input, Lookup, Stored Procedure, Output, Sequence Generator, XML Source Qualifier.

Q. What are Options/Type to run a Stored Procedure?

A: Normal: During a session, the stored procedure runs where the
transformation exists in the mapping on a row-by-row basis. This is useful for calling the stored procedure for each row of data that passes through the mapping, such as running a calculation against an input port. Connected stored procedures run only in normal mode.
Pre-load of the Source. Before the session retrieves data from the source, the stored procedure runs. This is useful for verifying the existence of tables or performing joins of data in a temporary table.
Post-load of the Source. After the session retrieves data from the source, the stored procedure runs. This is useful for removing temporary tables.
Pre-load of the Target. Before the session sends data to the target, the stored procedure runs. This is useful for verifying target tables or disk space on the target system.
Post-load of the Target. After the session sends data to the target, the stored procedure runs. This is useful for re-creating indexes on the database. It must contain at least one Input and one Output port.

Q. What kinds of sources and of targets can be used in Informatica?

·         Sources may be Flat file, relational db or XML.
·         Target may be relational tables, XML or flat files.

Q: What is Session Process?

A: The Load Manager process. Starts the session, creates the DTM process, and
sends post-session email when the session completes.

Q. What is DTM process?

A: The DTM process creates threads to initialize the session, read, write, transform
data and handle pre and post-session operations.

Q. What is the different type of tracing levels?
Tracing level represents the amount of information that Informatica Server writes in a log file. Tracing levels store information about mapping and transformations. There are 4 types of tracing levels supported
1. Normal: It specifies the initialization and status information and summarization of the success rows and target rows and the information about the skipped rows due to transformation errors.
2. Terse: Specifies Normal + Notification of data
3. Verbose Initialization: In addition to the Normal tracing, specifies the location of the data cache files and index cache files that are treated and detailed transformation statistics for each and every transformation within the mapping.
4. Verbose Data: Along with verbose initialization records each and every record processed by the informatica server.

A dimension table consists of the attributes about the facts. Dimensions store the textual descriptions of the business.
Conformed Dimension:
Conformed dimensions mean the exact same thing with every possible fact table to which they are joined.
Eg: The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.
Junk Dimension:
A junk dimension is a collection of random transactional codes flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes.
Eg: Assume that we have a gender dimension and marital status dimension. In the fact table we need to maintain two keys referring to these dimensions. Instead of that create a junk dimension which has all the combinations of gender and marital status (cross join gender and marital status table and create a junk table). Now we can maintain only one key in the fact table.
Degenerated Dimension:
A degenerate dimension is a dimension which is derived from the fact table and doesn’t have its own dimension table.
Eg: A transactional code in a fact table.
Slowly changing dimension:
Slowly changing dimensions are dimension tables that have slowly increasing
data as well as updates to existing data.

Q. What are the output files that the Informatica server creates during the
session running?
Informatica server log: Informatica server (on UNIX) creates a log for all status and
error messages (default name: pm.server.log). It also creates an error log for error
messages. These files will be created in Informatica home directory
Session log file: Informatica server creates session log file for each session. It writes
information about session into log files such as initialization process, creation of sql
commands for reader and writer threads, errors encountered and load summary. The
amount of detail in session log file depends on the tracing level that you set.
Session detail file: This file contains load statistics for each target in mapping.
Session detail includes information such as table name, number of rows written or
rejected. You can view this file by double clicking on the session in monitor window.
Performance detail file: This file contains information known as session performance
details which helps you where performance can be improved. To generate this file
select the performance detail option in the session property sheet.
Reject file: This file contains the rows of data that the writer does not write to
targets.
Control file: Informatica server creates control file and a target file when you run a
session that uses the external loader. The control file contains the information about
the target flat file such as data format and loading instructions for the external
loader.
Post session email: Post session email allows you to automatically communicate
information about a session run to designated recipients. You can create two
different messages. One if the session completed successfully the other if the session
fails.
Indicator file: If you use the flat file as a target, you can configure the Informatica
server to create indicator file. For each target row, the indicator file contains a
number to indicate whether the row was marked for insert, update, delete or reject.
Output file: If session writes to a target file, the Informatica server creates the
target file based on file properties entered in the session property sheet.
Cache files: When the Informatica server creates memory cache it also creates cache
files.
For the following circumstances Informatica server creates index and data cache
files:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation


No comments:

Post a Comment