Data flow transformations in SSIS use memory/buffers in different ways. The way transformation uses memory can impact the performance of your package. Transformations memory/Buffer usage are classified into 3 categories:
1.Non Blocking 2.Semi Blocking 3. Full Blocking
All the dataflow components can be categorized to be either Synchronous or Asynchronous.
Synchronous vs Asynchronous :
- Synchronous components The output of an synchronous component uses the same buffer as the input. Reusing of the input buffer is possible because the output of an synchronous component always contain exactly the same number of records as the input. Number of records IN == Number of records OUT.
- Asynchronous components The output of an asynchronous component uses a new buffer. It’s not possible to reuse the input buffer because an asynchronous component can have more or less output records than input records.
One Main thing is that synchronous components reuse buffers and therefore are generally faster than asynchronous components.
All source adapters are asynchronous, they create two buffers; one for the success output and one for the error output.
All destination adapters on the other hand, are synchronous.
Here are some tips that will help you minimize your use of asynchronous transformations:
- Instead of using a Merge Join in your Data Flow, perform the joins in the source query or in a transform environment.
- If you absolutely have to use Merge Join, make sure that you sort the data in your source query. In the Advanced Editor, set the “IsSorted” property to true, and set the Sort Key Position on the Output Columns to the appropriate values.
- Instead of using an Aggregate transform do perform a count, consider using Row Count, which is a synchronous transformation.
- If you are using a Pivot or Unpivot transformation and it is performing poorly, consider using staging tables in your solution in order to leverage your SQL server environment to perform these transformations instead of doing it in SSIS.
- Even though Union All is a semi-blocking asynchronous transformation, you will probably not achieve better performance by replacing the Union All with multiple inserts.
Non-blocking | Semi-blocking | Fully-blocking | |
Synchronous or asynchronous | Synchronous | Asynchronous | Asynchronous |
Number of rows in == number of rows out | True | Usually False | Usually False |
Must read all input before they can output | False | False | True |
New buffer created? | False | True | True |
SSIS transformations categorized:
Non-Blocking transformations | Semi-blocking transformations | Blocking transformations |
Audit | Data Mining Query | Aggregate |
Character Map | Merge | Fuzzy Grouping |
Conditional Split | Merge Join | Fuzzy Lookup |
Copy Column | Pivot | Row Sampling |
Data Conversion | Unpivot | Sort |
Derived Column | Term Lookup | Term Extraction |
Lookup | Union All | |
Multicast | ||
Percent Sampling | ||
Row Count | ||
Script Component | ||
Export Column | ||
Import Column | ||
Slowly Changing Dimension | ||
OLE DB Command |
One comment