Informatica Interview Question Part -14 - ETL- iNFORMATICA DEVELOPER

Friday, June 28, 2019

Informatica Interview Question Part -14

ktnewslive


Scheduler


We can schedule a workflow to run continuously, repeat at a given time or interval, or we can manually start a workflow. The Integration Service runs a scheduled workflow as configured.
By default, the workflow runs on demand. We can change the schedule settings by editing the scheduler. If we change schedule settings, the Integration Service reschedules the workflow according to the new settings.
  • A scheduler is a repository object that contains a set of schedule settings.
  • Scheduler can be non-reusable or reusable.
  • The Workflow Manager marks a workflow invalid if we delete the scheduler associated with the workflow.
  • If we choose a different Integration Service for the workflow or restart the Integration Service, it reschedules all workflows.
  • If we delete a folder, the Integration Service removes workflows from the schedule.
  • The Integration Service does not run the workflow if:
  • The prior workflow run fails.
  • We remove the workflow from the schedule
  • The Integration Service is running in safe mode


Creating a Reusable Scheduler
  • For each folder, the Workflow Manager lets us create reusable schedulers so we can reuse the same set of scheduling settings for workflows in the folder.
  • Use a reusable scheduler so we do not need to configure the same set of scheduling settings in each workflow.
  • When we delete a reusable scheduler, all workflows that use the deleted scheduler becomes invalid. To make the workflows valid, we must edit them and replace the missing scheduler.




Steps:
  1. Open the folder where we want to create the scheduler.
  2. In the Workflow Designer, click Workflows > Schedulers.
  3. Click Add to add a new scheduler.
  4. In the General tab, enter a name for the scheduler.
  5. Configure the scheduler settings in the Scheduler tab.
  6. Click Apply and OK.


Configuring Scheduler Settings


Configure the Schedule tab of the scheduler to set run options, schedule options, start options, and end options for the schedule.
There are 3 run options:
  1. Run on Demand
  2. Run Continuously
  3. Run on Server initialization



1. Run on Demand:
Integration Service runs the workflow when we start the workflow manually.
2. Run Continuously:
Integration Service runs the workflow as soon as the service initializes. The Integration Service then starts the next run of the workflow as soon as it finishes the previous run.
3. Run on Server initialization
Integration Service runs the workflow as soon as the service is initialized. The Integration Service then starts the next run of the workflow according to settings in Schedule Options.


Schedule options for Run on Server initialization:
  •  Run Once: To run the workflow just once.
  •  Run every: Run the workflow at regular intervals, as configured.
  •  Customized Repeat: Integration Service runs the workflow on the dates and times specified in the Repeat dialog box.
Start options for Run on Server initialization:
·         Start Date
·         Start Time


End options for Run on Server initialization:
  • End on: IS stops scheduling the workflow in the selected date.
  • End After: IS stops scheduling the workflow after the set number of
  • Workflow runs.
  • Forever: IS schedules the workflow as long as the workflow does not fail.


Creating a Non-Reusable Scheduler
  1. In the Workflow Designer, open the workflow.
  2. Click Workflows > Edit.
  3. In the Scheduler tab, choose Non-reusable. Select Reusable if we want to select an existing reusable scheduler for the workflow.
  4. Note: If we do not have a reusable scheduler in the folder, we must
  5. Create one before we choose Reusable.
  6. Click the right side of the Scheduler field to edit scheduling settings for the non- reusable scheduler
  7. If we select Reusable, choose a reusable scheduler from the Scheduler
  8. Browser dialog box.
  9. Click Ok.

Points to Ponder:
  • To remove a workflow from its schedule, right-click the workflow in the Navigator window and choose Unscheduled Workflow.
  • To reschedule a workflow on its original schedule, right-click the workflow in the Navigator window and choose Schedule Workflow.

Pushdown Optimization Overview
·        
   You can push transformation logic to the source or target database using pushdown optimization. When you run a session configured for pushdown optimization, the Integration Service translates the transformation logic into SQL queries and sends the SQL queries to the database. The source or target database executes the SQL queries to process the transformations.
·         The amount of transformation logic you can push to the database depends on the database, transformation logic, and mapping and session configuration. The Integration Service processes all transformation logic that it cannot push to a database.
·         Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic that the Integration Service can push to the source or target database. You can also use the Pushdown Optimization Viewer to view the messages related to pushdown optimization.
·         The following figure shows a mapping containing transformation logic that can be pushed to the source database:
·         This mapping contains a Filter transformation that filters out all items except those with an ID greater than 1005. The Integration Service can push the transformation logic to the database. It generates the following SQL statement to process the transformation logic:
·         INSERT INTO ITEMS(ITEM_ID, ITEM_NAME, ITEM_DESC, n_PRICE) SELECT ITEMS.ITEM_ID, ITEMS.ITEM_NAME, ITEMS.ITEM_DESC, CAST(ITEMS.PRICE AS INTEGER) FROM ITEMS WHERE (ITEMS.ITEM_ID >1005)
·         The Integration Service generates an INSERT SELECT statement to get the ID, NAME, and DESCRIPTION columns from the source table. It filters the data using a WHERE clause. The Integration Service does not extract data from the database at this time.


Pushdown Optimization Types

You can configure the following types of pushdown optimization:
Source-side pushdown optimization. The Integration Service pushes as much transformation logic as possible to the source database.
Target-side pushdown optimization. The Integration Service pushes as much transformation logic as possible to the target database.

Full pushdown optimization. The Integration Service attempts to push all transformation logic to the target database. If the Integration Service cannot push all transformation logic to the database, it performs both source-side and target-side pushdown optimization.


Running Source-Side Pushdown Optimization Sessions
When you run a session configured for source-side pushdown optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the database.
The Integration Service generates and executes a SELECT statement based on the transformation logic for each transformation it can push to the database. Then, it reads the results of this SQL query and processes the remaining transformations.

Running Target-Side Pushdown Optimization Sessions

When you run a session configured for target-side pushdown optimization, the Integration Service analyzes the mapping from the target to the source or until it reaches an upstream transformation it cannot push to the database. It generates an INSERT, DELETE, or UPDATE statement based on the transformation logic for each transformation it can push to the database. The Integration Service processes the transformation logic up to the point that it can push the transformation logic to the target database. Then, it executes the generated SQL.

Running Full Pushdown Optimization Sessions

To use full pushdown optimization, the source and target databases must be in the same relational database management system. When you run a session configured for full pushdown optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the target database. It generates and executes SQL statements against the source or target based on the transformation logic it can push to the database.
When you run a session with large quantities of data and full pushdown optimization, the database server must run a long transaction. Consider the following database performance issues when you generate a long transaction:
A long transaction uses more database resources.
A long transaction locks the database for longer periods of time. This reduces database concurrency and increases the likelihood of deadlock.

A long transaction increases the likelihood of an unexpected event.
To minimize database performance issues for long transactions, consider using source-side or target-side pushdown optimization.

Integration Service Behavior with Full Optimization

When you configure a session for full optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the target database. If the Integration Service cannot push all transformation logic to the target database, it tries to push all transformation logic to the source database. If it cannot push all transformation logic to the source or target, the Integration Service pushes as much transformation logic to the source database, processes intermediate transformations that it cannot push to any database, and then pushes the remaining transformation logic to the target database. The Integration Service generates and executes an INSERT SELECT, DELETE, or UPDATE statement for each database to which it pushes transformation logic.
For example, a mapping contains the following transformations:
The Rank transformation cannot be pushed to the source or target database. If you configure the session for full pushdown optimization, the Integration Service pushes the Source Qualifier transformation and the Aggregator transformation to the source, processes the Rank transformation, and pushes the Expression transformation and target to the target database. The Integration Service does not fail the session if it can push only part of the transformation logic to the database.

Scheduler
We can schedule a workflow to run continuously, repeat at a given time or interval, or we can manually start a workflow. The Integration Service runs a scheduled workflow as configured.
By default, the workflow runs on demand. We can change the schedule settings by editing the scheduler. If we change schedule settings, the Integration Service reschedules the workflow according to the new settings.
  • A scheduler is a repository object that contains a set of schedule settings.
  • Scheduler can be non-reusable or reusable.
  • The Workflow Manager marks a workflow invalid if we delete the scheduler associated with the workflow.
  • If we choose a different Integration Service for the workflow or restart the Integration Service, it reschedules all workflows.
  • If we delete a folder, the Integration Service removes workflows from the schedule.
  • The Integration Service does not run the workflow if:
  • The prior workflow run fails.
  • We remove the workflow from the schedule
  • The Integration Service is running in safe mode
Creating a Reusable Scheduler
  • For each folder, the Workflow Manager lets us create reusable schedulers so we can reuse the same set of scheduling settings for workflows in the folder.
  • Use a reusable scheduler so we do not need to configure the same set of scheduling settings in each workflow.
  • When we delete a reusable scheduler, all workflows that use the deleted scheduler becomes invalid. To make the workflows valid, we must edit them and replace the missing scheduler.


Steps:
  1. Open the folder where we want to create the scheduler.
  2. In the Workflow Designer, click Workflows > Schedulers.
  3. Click Add to add a new scheduler.
  4. In the General tab, enter a name for the scheduler.
  5. Configure the scheduler settings in the Scheduler tab.
  6. Click Apply and OK.
Configuring Scheduler Settings
Configure the Schedule tab of the scheduler to set run options, schedule options, start options, and end options for the schedule.
There are 3 run options:
  1. Run on Demand
  2. Run Continuously
  3. Run on Server initialization


1. Run on Demand:
Integration Service runs the workflow when we start the workflow manually.
2. Run Continuously:
Integration Service runs the workflow as soon as the service initializes. The Integration Service then starts the next run of the workflow as soon as it finishes the previous run.
3. Run on Server initialization
Integration Service runs the workflow as soon as the service is initialized. The Integration Service then starts the next run of the workflow according to settings in Schedule Options.
Schedule options for Run on Server initialization:
  •  Run Once: To run the workflow just once.
  •  Run every: Run the workflow at regular intervals, as configured.
  •  Customized Repeat: Integration Service runs the workflow on the dates and times specified in the Repeat dialog box.
Start options for Run on Server initialization:
·         Start Date
·         Start Time
End options for Run on Server initialization:
  • End on: IS stops scheduling the workflow in the selected date.
  • End After: IS stops scheduling the workflow after the set number of
  • Workflow runs.
  • Forever: IS schedules the workflow as long as the workflow does not fail.
Creating a Non-Reusable Scheduler
  1. In the Workflow Designer, open the workflow.
  2. Click Workflows > Edit.
  3. In the Scheduler tab, choose Non-reusable. Select Reusable if we want to select an existing reusable scheduler for the workflow.
  4. Note: If we do not have a reusable scheduler in the folder, we must
  5. Create one before we choose Reusable.
  6. Click the right side of the Scheduler field to edit scheduling settings for the non- reusable scheduler
  7. If we select Reusable, choose a reusable scheduler from the Scheduler
  8. Browser dialog box.
  9. Click Ok.
Points to Ponder:
  • To remove a workflow from its schedule, right-click the workflow in the Navigator window and choose Unscheduled Workflow.
  • To reschedule a workflow on its original schedule, right-click the workflow in the Navigator window and choose Schedule Workflow.

Pushdown Optimization Overview
·        You can push transformation logic to the source or target database using pushdown optimization. When you run a session configured for pushdown optimization, the Integration Service translates the transformation logic into SQL queries and sends the SQL queries to the database. The source or target database executes the SQL queries to process the transformations.
·         The amount of transformation logic you can push to the database depends on the database, transformation logic, and mapping and session configuration. The Integration Service processes all transformation logic that it cannot push to a database.
·         Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic that the Integration Service can push to the source or target database. You can also use the Pushdown Optimization Viewer to view the messages related to pushdown optimization.
·         The following figure shows a mapping containing transformation logic that can be pushed to the source database:
·         This mapping contains a Filter transformation that filters out all items except those with an ID greater than 1005. The Integration Service can push the transformation logic to the database. It generates the following SQL statement to process the transformation logic:
·         INSERT INTO ITEMS(ITEM_ID, ITEM_NAME, ITEM_DESC, n_PRICE) SELECT ITEMS.ITEM_ID, ITEMS.ITEM_NAME, ITEMS.ITEM_DESC, CAST(ITEMS.PRICE AS INTEGER) FROM ITEMS WHERE (ITEMS.ITEM_ID >1005)
·         The Integration Service generates an INSERT SELECT statement to get the ID, NAME, and DESCRIPTION columns from the source table. It filters the data using a WHERE clause. The Integration Service does not extract data from the database at this time.
Pushdown Optimization Types
You can configure the following types of pushdown optimization:
Source-side pushdown optimization. The Integration Service pushes as much transformation logic as possible to the source database.
Target-side pushdown optimization. The Integration Service pushes as much transformation logic as possible to the target database.

Full pushdown optimization. The Integration Service attempts to push all transformation logic to the target database. If the Integration Service cannot push all transformation logic to the database, it performs both source-side and target-side pushdown optimization.
Running Source-Side Pushdown Optimization Sessions
When you run a session configured for source-side pushdown optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the database.
The Integration Service generates and executes a SELECT statement based on the transformation logic for each transformation it can push to the database. Then, it reads the results of this SQL query and processes the remaining transformations.
Running Target-Side Pushdown Optimization Sessions
When you run a session configured for target-side pushdown optimization, the Integration Service analyzes the mapping from the target to the source or until it reaches an upstream transformation it cannot push to the database. It generates an INSERT, DELETE, or UPDATE statement based on the transformation logic for each transformation it can push to the database. The Integration Service processes the transformation logic up to the point that it can push the transformation logic to the target database. Then, it executes the generated SQL.
Running Full Pushdown Optimization Sessions
To use full pushdown optimization, the source and target databases must be in the same relational database management system. When you run a session configured for full pushdown optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the target database. It generates and executes SQL statements against the source or target based on the transformation logic it can push to the database.
When you run a session with large quantities of data and full pushdown optimization, the database server must run a long transaction. Consider the following database performance issues when you generate a long transaction:
A long transaction uses more database resources.
A long transaction locks the database for longer periods of time. This reduces database concurrency and increases the likelihood of deadlock.

A long transaction increases the likelihood of an unexpected event.
To minimize database performance issues for long transactions, consider using source-side or target-side pushdown optimization.



Integration Service Behavior with Full Optimization

When you configure a session for full optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the target database. If the Integration Service cannot push all transformation logic to the target database, it tries to push all transformation logic to the source database. If it cannot push all transformation logic to the source or target, the Integration Service pushes as much transformation logic to the source database, processes intermediate transformations that it cannot push to any database, and then pushes the remaining transformation logic to the target database. The Integration Service generates and executes an INSERT SELECT, DELETE, or UPDATE statement for each database to which it pushes transformation logic.


For example, a mapping contains the following transformations:
The Rank transformation cannot be pushed to the source or target database. If you configure the session for full pushdown optimization, the Integration Service pushes the Source Qualifier transformation and the Aggregator transformation to the source, processes the Rank transformation, and pushes the Expression transformation and target to the target database. The Integration Service does not fail the session if it can push only part of the transformation logic to the database.

Active and Idle Databases
During pushdown optimization, the Integration Service pushes the transformation logic to one database, which is called the active database. A database that does not process transformation logic is called an idle database. For example, a mapping contains two sources that are joined by a Joiner transformation. If the session is configured for source-side pushdown optimization, the Integration Service pushes the Joiner transformation logic to the source in the detail pipeline, which is the active database. The source in the master pipeline is the idle database because it does not process transformation logic.
The Integration Service uses the following criteria to determine which database is active or idle:
1. When using full pushdown optimization, the target database is active and the source database is idle.
2. In sessions that contain a Lookup transformation, the source or target database is active, and the lookup database is idle.

3. In sessions that contain a Joiner transformation, the source in the detail pipeline is active, and the source in the master pipeline is idle.
4. In sessions that contain a Union transformation, the source in the first input group is active. The sources in other input groups are idle.
To push transformation logic to an active database, the database user account of the active database must be able to read from the idle databases.


No comments:

Post a Comment