
Monday, June 24, 2019

Informatica Interview Question Part -11



Constraint-Based Loading

In the Workflow Manager, you can specify constraint-based loading for a session. When you select this option, the Integration Service orders the target load on a row-by-row basis. For every row generated by an active source, the Integration Service loads the corresponding transformed row first to the primary key table, then to any foreign key tables. Constraint-based loading depends on the following requirements:

Active source: Related target tables must have the same active source.
Key relationships: Target tables must have key relationships.
Target connection groups: Targets must be in one target connection group.
Treat rows as insert: Use this option when you insert into the target. You cannot use updates with constraint-based loading.
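
As a hypothetical illustration of why the load order matters (the table and column names below are invented, not taken from any mapping in this post), a child table with a foreign key can only accept rows after the matching parent rows exist, which is exactly the order constraint-based loading enforces:

-- Illustrative target definitions only; adjust names and types to your schema.
CREATE TABLE CUSTOMERS (
    CUSTOMER_ID   NUMBER        PRIMARY KEY,
    CUSTOMER_NAME VARCHAR2(60)
);

CREATE TABLE ORDERS (
    ORDER_ID    NUMBER PRIMARY KEY,
    CUSTOMER_ID NUMBER REFERENCES CUSTOMERS (CUSTOMER_ID)   -- foreign key to the parent
);

-- Inserting an ORDERS row before its CUSTOMERS row would violate the foreign key,
-- so constraint-based loading writes CUSTOMERS (the primary key table) rows first.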

Active Source:
When target tables receive rows from different active sources, the Integration Service reverts to normal loading for those tables, but loads all other targets in the session using constraint-based loading when possible. For example, a mapping contains three distinct pipelines. The first two contain a source, source qualifier, and target. Since these two targets receive data from different active sources, the Integration Service reverts to normal loading for both targets. The third pipeline contains a source, Normalizer, and two targets. Since these two targets share a single active source (the Normalizer), the Integration Service performs constraint-based loading: loading the primary key table first, then the foreign key table.

Key Relationships:

When target tables have no key relationships, the Integration Service does not perform constraint-based loading.
Similarly, when target tables have circular key relationships, the Integration Service reverts to a normal load. For example, you have one target containing a primary key and a foreign key related to the primary key in a second target. The second target also contains a foreign key that references the primary key in the first target. The Integration Service cannot enforce constraint-based loading for these tables. It reverts to a normal load.
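
As a hypothetical sketch of such a circular relationship (names invented), each table below references the other, so no single load order can satisfy both constraints and the Integration Service reverts to a normal load:

CREATE TABLE DEPARTMENTS (
    DEPT_ID    NUMBER PRIMARY KEY,
    MANAGER_ID NUMBER              -- will reference EMPLOYEES; constraint added below
);

CREATE TABLE EMPLOYEES (
    EMP_ID  NUMBER PRIMARY KEY,
    DEPT_ID NUMBER REFERENCES DEPARTMENTS (DEPT_ID)
);

ALTER TABLE DEPARTMENTS
    ADD CONSTRAINT FK_DEPT_MANAGER FOREIGN KEY (MANAGER_ID) REFERENCES EMPLOYEES (EMP_ID);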

Target Connection Groups:

The Integration Service enforces constraint-based loading for targets in the same target connection group. If you want to specify constraint-based loading for multiple targets that receive data from the same active source, you must verify the tables are in the same target connection group. If the tables with the primary key-foreign key relationship are in different target connection groups, the Integration Service cannot enforce constraint-based loading when you run the workflow. To verify that all targets are in the same target connection group, complete the following tasks:
·         Verify all targets are in the same target load order group and receive data from the same active source.
·         Use the default partition properties and do not add partitions or partition points.
·         Define the same target type for all targets in the session properties.
·         Define the same database connection name for all targets in the session properties.
·         Choose normal mode for the target load type for all targets in the session properties.

Treat Rows as Insert:

Use constraint-based loading when the session option Treat Source Rows As is set to insert. You might get inconsistent data if you select a different Treat Source Rows As option and you configure the session for constraint-based loading.
When the mapping contains Update Strategy transformations and you need to load data to a primary key table first, split the mapping using one of the following options:
·         Load primary key table in one mapping and dependent tables in another mapping. Use constraint-based loading to load the primary table.
·         Perform inserts in one mapping and updates in another mapping.
Constraint-based loading does not affect the target load ordering of the mapping. Target load ordering defines the order the Integration Service reads the sources in each target load order group in the mapping. A target load order group is a collection of source qualifiers, transformations, and targets linked together in a mapping. Constraint-based loading establishes the order in which the Integration Service loads individual targets within a set of targets receiving data from a single source qualifier.



Example

The following mapping is configured to perform constraint-based loading:
In the first pipeline, target T_1 has a primary key, and T_2 and T_3 contain foreign keys referencing the T_1 primary key. T_3 also has a primary key that T_4 references as a foreign key.
Since these tables receive records from a single active source, SQ_A, the Integration Service loads rows to the targets in the following order:


1. T_1
2. T_2 and T_3 (in no particular order)
3. T_4
The Integration Service loads T_1 first because it has no foreign key dependencies and contains a primary key referenced by T_2 and T_3. The Integration Service then loads T_2 and T_3, but since T_2 and T_3 have no dependencies on each other, they are not loaded in any particular order. The Integration Service loads T_4 last, because it has a foreign key that references a primary key in T_3.
After loading the first set of targets, the Integration Service begins reading source B. If there are no key relationships between T_5 and T_6, the Integration Service reverts to a normal load for both targets.
If T_6 has a foreign key that references a primary key in T_5, then because T_5 and T_6 receive data from a single active source (the Aggregator AGGTRANS), the Integration Service loads rows to the tables in the following order:
1. T_5
2. T_6
T_1, T_2, T_3, and T_4 are in one target connection group if you use the same database connection for each target, and you use the default partition properties. T_5 and T_6 are in another target connection group together if you use the same database connection for each target and you use the default partition properties. The Integration Service includes T_5 and T_6 in a different target connection group because they are in a different target load order group from the first four targets.
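
For reference, the key relationships described for the first pipeline can be sketched as DDL (the column names are illustrative; only the primary key/foreign key structure comes from the example above):

CREATE TABLE T_1 (PK1 NUMBER PRIMARY KEY);
CREATE TABLE T_2 (PK2 NUMBER PRIMARY KEY, FK1 NUMBER REFERENCES T_1 (PK1));
CREATE TABLE T_3 (PK3 NUMBER PRIMARY KEY, FK1 NUMBER REFERENCES T_1 (PK1));
CREATE TABLE T_4 (PK4 NUMBER PRIMARY KEY, FK3 NUMBER REFERENCES T_3 (PK3));
-- T_1 has no foreign keys, so it loads first; T_2 and T_3 depend only on T_1;
-- T_4 references T_3, so it loads last.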

Enabling Constraint-Based Loading:

When you enable constraint-based loading, the Integration Service orders the target load on a row-by-row basis. To enable constraint-based loading:
1.   In the General Options settings of the Properties tab, choose Insert for the Treat Source Rows As property.
2.   Click the Config Object tab. In the Advanced settings, select Constraint Based Load Ordering.
3.   Click OK.



Target Load Plan

When you use a mapplet in a mapping, the Mapping Designer lets you set the target load plan for sources within the mapplet.

Setting the Target Load Order

You can configure the target load order for a mapping containing any type of target definition. In the Designer, you can set the order in which the Integration Service sends rows to targets in different target load order groups in a mapping. A target load order group is the collection of source qualifiers, transformations, and targets linked together in a mapping. You can set the target load order if you want to maintain referential integrity when inserting, deleting, or updating tables that have the primary key and foreign key constraints.
The Integration Service reads sources in a target load order group concurrently, and it processes target load order groups sequentially.
To specify the order in which the Integration Service sends data to targets, create one source qualifier for each target within a mapping. To set the target load order, you then determine in which order the Integration Service reads each source in the mapping.

The following figure shows two target load order groups in one mapping:

In this mapping, the first target load order group includes ITEMS, SQ_ITEMS, and T_ITEMS. The second target load order group includes all other objects in the mapping, including the TOTAL_ORDERS target. The Integration Service processes the first target load order group, and then the second target load order group.
When it processes the second target load order group, it reads data from both sources at the same time.

To set the target load order:
1.   Create a mapping that contains multiple target load order groups.
2.   Click Mappings > Target Load Plan. The Target Load Plan dialog box lists all Source Qualifier transformations in the mapping and the targets that receive data from each source qualifier.
3.   Select a source qualifier from the list.
4.   Click the Up and Down buttons to move the source qualifier within the load order.
5.   Repeat steps 3 to 4 for other source qualifiers you want to reorder.
6.   Click OK.



Mapping Parameters & Variables


Mapping parameters and variables represent values in mappings and mapplets.
When we use a mapping parameter or variable in a mapping, first we declare the mapping parameter or variable for use in each mapplet or mapping. Then, we define a value for the mapping parameter or variable before we run the session.

Mapping Parameters

A mapping parameter represents a constant value that we can define before running a session.
A mapping parameter retains the same value throughout the entire session.


Example: When we want to extract records of a particular month during the ETL process, we can create a mapping parameter of date/time data type and use it in the query to compare it with the timestamp field in the SQL override.
After we create a parameter, it appears in the Expression Editor.
We can then use the parameter in any expression in the mapplet or mapping.
We can also use parameters in a source qualifier filter, user-defined join, or extract override, and in the Expression Editor of reusable transformations.
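
A minimal sketch of the monthly-extract idea above, assuming a hypothetical source table TRANSACTIONS with a timestamp column TXN_TIMESTAMP and two date/time mapping parameters, $$MonthStart and $$MonthEnd (all names invented). Because parameter values are substituted as text into the SQL override before the query runs, the dates are wrapped in TO_DATE:

SELECT *
FROM   TRANSACTIONS
WHERE  TXN_TIMESTAMP >= TO_DATE('$$MonthStart', 'MM/DD/YYYY')
AND    TXN_TIMESTAMP <  TO_DATE('$$MonthEnd', 'MM/DD/YYYY')

In the parameter file, $$MonthStart and $$MonthEnd would then be set to the first day of the month to extract and the first day of the following month.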

Mapping Variables

Unlike mapping parameters, mapping variables are values that can change between sessions.
·         The Integration Service saves the latest value of a mapping variable to the repository at the end of each successful session.
·         We can override a saved value with the parameter file.
·         We can also clear all saved values for the session in the Workflow Manager.
We might use a mapping variable to perform an incremental read of the source. For example, we have a source table containing time-stamped transactions and we want to evaluate the transactions on a daily basis. Instead of manually entering a session override to filter source data each time we run the session, we can create a mapping variable, $$IncludeDateTime. In the source qualifier, create a filter to read only rows whose transaction date equals $$IncludeDateTime, such as:
TIMESTAMP = $$IncludeDateTime
In the mapping, use a variable function to set the variable value to increment one day each time the session runs. If we set the initial value of $$IncludeDateTime to 8/1/2004, the first time the Integration Service runs the session, it reads only rows dated 8/1/2004. During the session, the Integration Service sets $$IncludeDateTime to 8/2/2004. It saves 8/2/2004 to the repository at the end of the session. The next time it runs the session, it reads only rows from August 2, 2004.
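
A minimal sketch of that incremental pattern, assuming the date/time mapping variable $$IncludeDateTime described above uses the Max aggregation type (the output port name is invented):

Source Qualifier filter condition:
TIMESTAMP = $$IncludeDateTime

Expression transformation output port (advances the variable by one day each run):
out_NEXT_RUN_DATE = SETVARIABLE($$IncludeDateTime, ADD_TO_DATE($$IncludeDateTime, 'DD', 1))

At the end of a successful session the incremented value is saved to the repository, so the next run automatically reads the following day's rows.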


Mapping variables can be used in the following transformations:


Expression
Filter
Router
Update Strategy


Initial and Default Value:

When we declare a mapping parameter or variable in a mapping or a mapplet, we can enter an initial value. When the Integration Service needs an initial value, and we did not declare an initial value for the parameter or variable, the Integration Service uses a default value based on the data type of the parameter or variable.
Data type -> Default value
Numeric -> 0
String -> Empty string
Datetime -> 1/1/1
Variable Values: Start value and current value of a mapping variable

Start Value:

The start value is the value of the variable at the start of the session. The Integration Service looks for the start value in the following order:
·         Value in parameter file
·         Value saved in the repository
·         Initial value
·         Default value

Current Value:


The current value is the value of the variable as the session progresses. When a session starts, the current value of a variable is the same as the start value. The final current value for a variable is saved to the repository at the end of a successful session. When a session fails to complete, the Integration Service does not update the value of the variable in the repository.
Note: If a variable function is not used to calculate the current value of a mapping variable, the start value of the variable is saved to the repository.
Variable Data Type and Aggregation Type:

When we declare a mapping variable in a mapping, we need to configure the data type and aggregation type for the variable. The Integration Service uses the aggregation type of a mapping variable to determine the final current value of the mapping variable.

Aggregation types are:

Count: Only the integer and small integer data types are valid.

Max: All transformation data types except binary are valid.

Min: All transformation data types except binary are valid.

Variable Functions
Variable functions determine how the Integration Service calculates the current value of a mapping variable in a pipeline.

SetMaxVariable: Sets the variable to the maximum value of a group of values. It ignores rows marked for update, delete, or reject. Aggregation type set to Max.

SetMinVariable: Sets the variable to the minimum value of a group of values. It ignores rows marked for update, delete, or reject. Aggregation type set to Min.

SetCountVariable: Increments the variable value by one. It adds one to the variable value when a row is marked for insertion, and subtracts one when the row is marked for deletion. It ignores rows marked for update or reject. Aggregation type set to Count.

SetVariable: Sets the variable to the configured value. At the end of a session, it compares the final current value of the variable to the start value of the variable. Based on the aggregate type of the variable, it saves a final value to the repository.

Creating Mapping Parameters and Variables
·         Open the folder where we want to create the parameter or variable.
·         In the Mapping Designer, click Mappings > Parameters and Variables. -or- In the Mapplet Designer, click Mapplet > Parameters and Variables.
·         Click the Add button.
·         Enter a name. Do not remove $$ from the name.
·         Select the type and data type. Select the aggregation type for mapping variables.
·         Give an initial value. Click OK.

Example: Use of Mapping Parameters and Variables

EMP will be the source table.

Create a target table MP_MV_EXAMPLE having columns: EMPNO, ENAME, DEPTNO, TOTAL_SAL, MAX_VAR, MIN_VAR, COUNT_VAR and SET_VAR.
TOTAL_SAL = SAL + COMM + $$Bonus ($$Bonus is a mapping parameter that changes every month)
SET_VAR: We will add one month to the HIREDATE of every employee.
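
One hedged note on the TOTAL_SAL expression: in the classic EMP table, COMM is NULL for most employees, and adding NULL to a number yields NULL. If that matters for the example, the expression can guard the null with ISNULL/IIF (this variation is an aside, not part of the original walkthrough):

TOTAL_SAL = SAL + IIF(ISNULL(COMM), 0, COMM) + $$Bonus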

Create shortcuts as necessary.


Creating Mapping
1.   Open the folder where we want to create the mapping.
2.   Click Tools -> Mapping Designer.
3.   Click Mappings -> Create -> Give a name. Ex: m_mp_mv_example
4.   Drag EMP and target table.
5.   Click Transformation -> Create -> Select Expression from the list -> Create -> Done.
6.   Drag EMPNO, ENAME, HIREDATE, SAL, COMM and DEPTNO to Expression.
7.   Create parameter $$Bonus and give an initial value of 200.
8.   Create variable $$var_max of MAX aggregation type and initial value 1500.
9.   Create variable $$var_min of MIN aggregation type and initial value 1500.
10. Create variable $$var_count of COUNT aggregation type and initial value 0. COUNT is available only when the data type is INT or SMALLINT.
11. Create variable $$var_set of MAX aggregation type.
12. Create 5 output ports: out_TOTAL_SAL, out_MAX_VAR, out_MIN_VAR, out_COUNT_VAR and out_SET_VAR.
13. Open the Expression Editor for out_TOTAL_SAL. Build SAL + COMM as we did earlier; to add $$Bonus to it, select the Variables tab and pick the parameter under Mapping Parameters: SAL + COMM + $$Bonus
14. Open the Expression Editor for out_MAX_VAR.
15. Select the variable function SETMAXVARIABLE from the left pane. Select $$var_max from the Variables tab and SAL from the Ports tab: SETMAXVARIABLE($$var_max, SAL)
16. Open the Expression Editor for out_MIN_VAR and write the following expression:
SETMINVARIABLE($$var_min, SAL). Validate the expression.
17. Open the Expression Editor for out_COUNT_VAR and write the following expression:
SETCOUNTVARIABLE($$var_count). Validate the expression.
18. Open the Expression Editor for out_SET_VAR and write the following expression:
SETVARIABLE($$var_set, ADD_TO_DATE(HIREDATE, 'MM', 1)). Validate the expression.
19. Click OK.
20. Link all ports from the Expression transformation to the target, validate the mapping, and save it.



PARAMETER FILE


A parameter file is a list of parameters and associated values for a workflow, worklet, or session.
Parameter files provide the flexibility to change parameter and variable values each time we run a workflow or session.
We can create multiple parameter files and change the file we use for a session or workflow. We can create a parameter file using a text editor such as WordPad or Notepad.
Enter the parameter file name and directory in the workflow or session properties.

A parameter file contains the following types of parameters and variables:

Workflow variable: References values and records information in a workflow.

Worklet variable: References values and records information in a worklet. Use predefined worklet variables in a parent workflow, but we cannot use workflow variables from the parent workflow in a worklet.

Session parameter: Defines a value that can change from session to session, such as a database connection or file name.
Mapping parameter and mapping variable: Define values in a mapping or mapplet. A mapping parameter holds a constant value for the whole session, while a mapping variable can change between sessions.
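
To make the section headings concrete, here is a hypothetical parameter file fragment (the folder, workflow, session, and connection names are invented) showing a workflow variable, a session parameter, and a mapping parameter grouped under their headings:

[Practice.WF:wf_Load_Example]
$$WorkflowRunDate=08/01/2004

[Practice.WF:wf_Load_Example.ST:s_m_MP_MV_Example]
$DBConnection_Target=ORA_DWH_TARGET
$$Bonus=1000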

USING A PARAMETER FILE
Parameter files contain several sections preceded by a heading. The heading identifies the Integration Service, Integration Service process, workflow, worklet, or session to which we want to assign parameters or variables.
·         Make the session and workflow.
·         Give connection information for the source and target tables.
·         Run the workflow and see the result.

Sample parameter file for our example:
In the parameter file, folder and session names are case sensitive.
Create a text file in Notepad named Para_File.txt:
[Practice.ST:s_m_MP_MV_Example]
$$Bonus=1000
$$var_max=500
$$var_min=1200
$$var_count=0

CONFIGURING PARAMETER FILE

We can specify the parameter file name and directory in the workflow or session properties.
To enter a parameter file in the workflow properties:
1. Open a Workflow in the Workflow Manager.
2. Click Workflows > Edit.
3. Click the Properties tab.
4. Enter the parameter directory and name in the Parameter Filename field.
5. Click OK.

To enter a parameter file in the session properties:


1. Open a session in the Workflow Manager.
2. Click the Properties tab and open the General Options settings.
3. Enter the parameter directory and name in the Parameter Filename field.
4. Example: D:\Files\Para_File.txt or $PMSourceFileDir\Para_File.txt
5. Click OK.
