Thursday, December 15, 2011

Transactions in SSIS

What is the transaction support feature in SSIS?

When you execute SSIS package, every task executes in its own Transaction. We can group tasks together to execute as a single Transaction.

This can be achieved by setting Transaction property. In this way we can ensure either all of the tasks complete successfully and can roll back if any of the task fails.

What properties do you need to configure in order to use the transaction feature in SSIS?

  • Suppose you want to execute 5 tasks in a single transaction, in this case you can place all 5 tasks in a Sequence Container and set the TransactionOption and IsolationLevel properties appropriately.
    • The TransactionOption property expects one of these three values:
      • Supported - The container/task does not create a separate transaction, but if the parent object has already initiated a transaction then participate in it
      • Required - The container/task creates a new transaction irrespective of any transaction initiated by the parent object
      • NotSupported - The container/task neither creates a transaction nor participates in any transaction initiated by the parent object
  • Isolation level dictates how two more transaction maintains consistency and concurrency when they are running in parallel. To learn more about Transaction and Isolation Level.

Transaction Isolation levels:

The transaction isolation level determines the duration that locks are held. We'll use SQL Server as an example. The following transaction isolation levels are available in SQL Server:

  • READ UNCOMMITTED - reads do not acquire share locks and they don't wait on locks. This is often referred to as a dirty read because you can read modified data that hasn't been committed yet and it could get rolled back after you read it.
  • READ COMMITTED - reads acquire share locks and wait on any data modified by a transaction in process. This is the SQL Server default.
  • REPEATABLE READ - same as READ COMMITTED but in addition share locks are retained on rows read for the duration of the transaction. In other words any row that is read cannot be modified by another connection until the transaction commits or rolls back.
  • SERIALIZABLE - same as REPEATABLE READ but in addition no other connection can insert rows if the new rows would appear in a SELECT statement already issued. In other words if you issue a select statement in a transaction using the SERIALIZABLE isolation level you will get the same exact result set if you issue the select statement again within the same transaction.

The built-in transaction support in SSIS makes use of the Distributed Transaction Coordinator (MSDTC) service which must be running. MSDTC also allows you to perform distributed transactions; e.g. updating a SQL Server database and an Oracle database in the same transaction. If you execute an SSIS package that utilizes the built-in transaction support and MSDTC is not running, you will get an error message like the following:

Error: 0xC001401A at Transaction: The SSIS Runtime has failed  to start the distributed transaction due to error 0x8004D01B  "The Transaction Manager is not available.". The DTC transaction  failed to start. This could occur because the MSDTC Service is not running.

Note also that the SSIS package elements also have an IsolationLevel property with a default of Serializable. As discussed above in the section on Transaction Isolation Levels, this setting impacts the duration of locks as well as whether shared locks are acquired.

When I enabled transactions in an SSIS package, it failed with this exception: "The Transaction Manager is not available. The DTC transaction failed to start." What caused this exception and how can it be fixed?

  • SSIS uses the MS DTC (Microsoft Distributed Transaction Coordinator) Windows Service for transaction support. As such, you need to ensure this service is running on the machine where you are actually executing the SSIS packages or the package execution will fail with the exception message as indicated in this question.

What is event handling in SSIS?

  • Like many other programming languages, SSIS and its components raise different events during the execution of the code. You can write an even handler to capture the event and handle it in a few different ways. For example consider you have a data flow task and before execution of this data flow task you want to make some environmental changes such as creating a table to write data into, deleting/truncating a table you want to write, etc. Along the same lines, after execution of the data flow task you want to cleanup some staging tables. In this circumstance you can write an event handler for the OnPreExcute event of the data flow task which gets executed before the actual execution of the data flow. Similar to that you can also write an event handler for OnPostExecute event of the data flow task which gets executed after the execution of the actual data flow task. Please note, not all the tasks raise the same events as others. There might be some specific events related to a specific task that you can use with one object and not with others.

No comments: