Q.
What is Factless fact table? In which purpose we are using this in our DWH
projects?
It is a fact table which does not contain
any measurable data.
EX: Student attendance fact (it contains
only Boolean values, whether student attended class or not ? Yes or No.)
A Factless fact table contains only the
keys but there is no measures or in other way we can say that it contains no
facts. Generally it is used to integrate the fact tables
Factless fact table contains only foreign
keys. We can have two kinds of aggregate functions from the factless fact one
is count and other is distinct count.
2 purposes of factless fact
1. Coverage: to indicate what did NOT happen. Like to
Like: which product did not sell well in a
particular region?
2. Event tracking: To know if the event took place or not.
Like: Fact for tracking student’s
attendance will not contain any measures.
Q.
What is staging area?
Staging area is nothing but to apply our
logic to extract the data from source and cleansing the data and put the data
into meaningful and summaries of the data for data warehouse.
Q.
What is constraint based loading
Constraint based load order defines the
order of loading the data into the multiple targets based on primary and
foreign keys constraints.
Q.
Why union transformation is active transformation?
the only condition for a transformation
to bcum active is row number changes.
Now the thing is how a row number can
change. Then there are
2 conditions:
1. either the no of rows coming in and
going out is diff.
eg: in case of filter we have the data
like
id name dept row_num
1 aa 4 1
2 bb 3 2
3 cc 4 3
and we have a filter condition like
dept=4 then the o/p wld
b like
id name dept row_num
1 aa 4 1
3 cc 4 2
So row num changed and it is an active
transformation
2. or the order of the row changes
eg: when Union transformation pulls in
data, suppose we have
2 sources
sources1:
id name dept row_num
1 aa 4 1
2 bb 3 2
3 cc 4 3
source2:
id name dept row_num
4 aaa 4 4
5 bbb 3 5
6 ccc 4 6
it never restricts the data from any
source so the data can
come in any manner
id name dept row_num old row_num
1 aa 4 1 1
4 aaa 4 2 4
5 bbb 3 3 5
2 bb 3 4 2
3 cc 4 5 3
6 ccc 4 6 6
so
the row_num are changing . Thus we say that union is an active transformation
Q. What is use of batch file in
informatica? How many types of batch file in informatica?
With the batch file, we can run
sessions either in sequential or in concurrently.
Grouping of Sessions is known as Batch.
Two types of batches:
1)Sequential: Runs Sessions one after
another.
2)Concurrent: Run the Sessions at the
same time.
If u have sessions with source-target
dependencies u have to go for sequential batch to start the sessions one after
another. If u have several independent sessions u can use concurrent batches
Which run all the sessions at the same time
Q.
What is joiner cache?
When we use the joiner transformation
an integration service maintains the cache, all the records are stored in
joiner cache. Joiner caches have 2 types of cache 1.Index cache 2. Joiner
cache.
Index cache stores all the port values
which are participated in the join condition and data cache have stored all
ports which are not participated in the join condition.
Q.
What is the location of parameter file in Informatica?
$PMBWPARAM
Q. How can you display only hidden
files in UNIX
$ ls -la
total 16
8 drwxrwxrwx 2 zzz yyy 4096 Apr 26
12:00 ./
8 drwxrwxrwx 9 zzz yyy 4096 Jul 31
16:59 ../
Correct answer is
ls -a|grep "^\."
$ls -a
Q.
How to delete the data in the target table after loaded.
SQ---> Properties tab-->Post SQL
delete from target_tablename
SQL statements executed using the
source database connection, after a pipeline is run write post sql in target
table as truncate table name. we have the property in session truncate option.
Q. What is polling in informatica?
It displays the updated information
about the session in the monitor window. The monitor window displays the status
of each session when you poll the Informatica server.
Q.
How i will stop my workflow after 10 errors
Session level property error
handling mention condition stop on errors:
10
--->Config object –>
Error Handling –> Stop on errors
Q.
How can we calculate fact table size?
A fact table is multiple of combination
of dimension tables
ie if we want 2 find the fact table
size of 3years of historical date with 200 products and 200 stores
3*365*200*200=fact table size
Q.
Without using emailtask how will send a mail from informatica?
by using 'mailx' command in unix of
shell scripting
Q.
How will compare two mappings in two different repositories?
in the designer client , goto mapping
tab there is one
option that is 'compare', here we will
compare two mappings in two different repository
in informatica designer go to mapping
tab--->compare..
we can compare 2 folders within the
same repository ..
we can compare 2 folders within
different repository ..
Q.
What is constraint based load order
Constraint based load order defines the
order in which data loads into the multiple targets based on primary key and
foreign key relationship.
Q.
What is target load plan
Suppose i have 3 pipelines in a single
mapping designer
emp source--->sq--->tar1
dept source--->sq--->tar2
bonus source--->sq--->tar3
my requirement is to load first in tar2
then tar1 and then finally tar3
for this type of loading to control the
extraction of data from source by source qualifier we use target load plan.
Q.
What is meant by data driven.. in which scenario we use that..?
Data driven is available at session
level. it says that when we r using update strategy t/r ,how the integration
service fetches the data and how to update/insert row in the database log.
Data driven is nothing but instruct the
source rows that should take action on target i.e(update,delete,reject,insert).
If we use the update strategy transformation in a mapping then will select the
data driven option in session.
Q.
How to run workflow in unix?
Syntax: pmcmd startworkflow -sv <service
name> -d <domain name> -u <user name> -p <password> -f
<folder name> <workflow name>
Example
Pmcmd start workflow –service
${INFA_SERVICE} -domain
${INFA_DOMAIN} -uv xxx_PMCMD_ID -pv
PSWD -folder
${ETLFolder} -wait ${ETLWorkflow} \
Q.
What is the main difference between a Joiner Transformation and Union
Transformation?
Joiner Transformation merge horizontally
Union Transformation merge vertically
A joiner Transformation is used to join
data from hertogenous database ie (Sql database and flat file) where has Union
transformation is used to join data from
the same relational sources.....(oracle
table and another Oracle table)
Join Transformation combines data
record horizontally based on join condition.
And combine data from two different
sources having different metadata.
Join transformation supports
heterogeneous, homogeneous data source.
Union Transformation combines data
record vertically from multiple sources, having same metadata.
Union transformation also support
heterogeneous data source.
Union transformation functions as UNION
ALL set operator.
Q.
What is constraint based loading exactly? And how to do this? I think it is
when we have primary key-foreign key relationship. Is it correct?
Constraint Based Load order defines
load the data into multiple targets depend on the primary key foreign key
relation.
set the option is: Double click the
session
Configure Object check the Constraint Based Loading
Q. Difference between top down(w.h
inmon)and bottom up(ralph kimball)approach?
Top Down approach:-
As per W.H.INWON, first we need to
build the Data warehouse after that we need to build up the
DataMart but this is so what difficult to maintain the DWH.
Bottom up approach;-
As per Ralph Kimbal, first we need to
build up the Data Marts then we need to build up the Datawarehouse..
this approach is most useful in real
time while creating the Data warehouse.
Q.
What are the different caches used in informatica?
·
Static cache
·
Dynamic cache
·
Shared cache
·
Persistent cache
Q. What is the
command to get the list of files in a directory in unix?
$ls -lrt
Q. How to import
multiple flat files in to single target where there is no common column in the
flat files
in workflow session properties in
Mapping tab in properties choose Source filetype - Indirect
Give the Source filename :
<file_path>
This <file_path> file should
contain all the multiple files which you want to Load
Q. How to connect two
or more table with single source qualifier?
Create a Oracle source with how much
ever column you want and write the join query in SQL query override. But the
column order and data type should be same as in the SQL query.
Q. How to call
unconnected lookup in expression transformation?
:LKP.LKP_NAME(PORTS)
Q. What is diff
between connected and unconnected lookup?
Connected lookup:
It is used to join the two tables
it returns multiple rows
it must be in mapping pipeline
u can implement lookup condition
using connect lookup u can generate
sequence numbers by
enabling dynamic lookup cache.
Unconnected lookup:
it returns single output through return
port
it acts as a lookup function(:lkp)
it is called by another t/r.
not connected either source r target.
------
CONNECTED LOOKUP:
>> It will participated in data pipeline
>> It contains multiple inputs and
multiple outputs.
>> It supported static and
dynamic cache.
UNCONNECTED LOOKUP:
>> It will not participated in data
pipeline
>> It contains multiple inputs and
single output.
>> It supported static
cache only.
Q. Types of
partitioning in Informatica?
Partition 5 types
1. Simple pass through
2. Key range
3. Hash
4. Round robin
5. Database
Q. Which
transformation uses cache?
1. Lookup transformation
2. Aggregator transformation
3. Rank transformation
4. Sorter transformation
5. Joiner transformation
Q. Explain about union transformation?
A union transformation is a multiple
input group transformation, which is used to merge the data from multiple
sources similar to UNION All SQL statements to combine the results from 2 or
more sql statements.
Similar to UNION All
statement, the union transformation doesn't remove duplicate rows. It is an
active transformation.
Q. Explain about Joiner transformation?
Joiner transformation is used to join
source data from two related heterogeneous sources. However this can also be
used to join data from the same source. Joiner t/r join sources with at least
one matching column. It uses a condition that matches one or more pair of
columns between the 2 sources.
To configure a
Joiner t/r various settings that we do are as below:
1) Master and detail source
2) Types of join
3) Condition of the join
Q. Explain about Lookup transformation?
Lookup t/r is used in a mapping to look
up data in a relational table, flat file, view or synonym.
The informatica server queries the look
up source based on the look up ports in the transformation. It compares look up
t/r port values to look up source column values based on the look up condition.
Look up t/r is used to perform the
below mentioned tasks:
1) To get a related value.
2) To perform a calculation.
3) To update SCD tables.
Q. How to identify
this row for insert and this row for update in dynamic lookup cache?
Based on NEW LOOKUP ROW.. Informatica
server indicates which one is insert and which one is update.
Newlookuprow- 0...no change
Newlookuprow- 1...Insert
Newlookuprow- 2...update
Q. How many ways can we implement SCD2?
1) Date range
2) Flag
3) Versioning
Q. How will you check the bottle necks
in informatica? From where do you start checking?
You start as per this order
1. Target
2. Source
3. Mapping
4. Session
5. System
Q. What is incremental aggregation?
When the aggregator transformation
executes all the output data will get stored in the temporary location called
aggregator cache. When the next time the mapping runs the aggregator
transformation runs for the new records loaded after the first run. These
output values will get incremented with the values in the aggregator cache.
This is called incremental aggregation. By this way we can improve
performance...
---------------------------
Incremental aggregation means applying
only the captured changes in the source to aggregate calculations in a session.
When the source changes
only incrementally and if we can capture those changes, then we can configure
the session to process only those changes. This allows informatica server to
update target table incrementally, rather than forcing it to process the entire
source and recalculate the same calculations each time you run the session. By
doing this obviously the session performance increases.
No comments:
Post a Comment