Tag Archives: SSIS

SSIS – Non Blocking , Partially Blocking and Full Blocking

Data flow transformations in SSIS use memory/buffers in different ways.  The way transformation uses memory can impact the performance of your package.  Transformations memory/Buffer usage are classified into 3 categories:

1.Non Blocking  2.Semi Blocking  3. Full Blocking

All the dataflow components can be categorized to be either Synchronous or Asynchronous.

Synchronous vs Asynchronous : 

  • Synchronous components The output of an synchronous component uses the same buffer as the input.  Reusing of the input buffer is possible because the output of an synchronous component always contain exactly the same number of records as the input.                                                                                                                               Number of records IN == Number of records OUT.
  • Asynchronous components The output of an asynchronous component uses a new buffer. It’s not possible to reuse the input buffer because an asynchronous component can have more or less output records than input records.

One Main thing is that synchronous components reuse buffers and therefore are generally faster than asynchronous components.

All source adapters are asynchronous, they create two buffers; one for the success output and one for the error output.

All destination adapters on the other hand, are synchronous.

Here are some tips that will help you minimize your use of asynchronous transformations:

  • Instead of using a Merge Join in your Data Flow, perform the joins in the source query or in a transform environment.
  • If you absolutely have to use Merge Join, make sure that you sort the data in your source query. In the Advanced Editor, set the “IsSorted” property to true, and set the Sort Key Position on the Output Columns to the appropriate values.
  • Instead of using an Aggregate transform do perform a count, consider using Row Count, which is a synchronous transformation.
  • If you are using a Pivot or Unpivot transformation and it is performing poorly, consider using staging tables in your solution in order to leverage your SQL server environment to perform these transformations instead of doing it in SSIS.
  • Even though Union All is a semi-blocking asynchronous transformation, you will probably not achieve better performance by replacing the Union All with multiple inserts.
Non-blocking Semi-blocking Fully-blocking
Synchronous or asynchronous Synchronous Asynchronous Asynchronous
Number of rows in == number of rows out True Usually False Usually False
Must read all input before they can output False False True
New buffer created? False True True

SSIS transformations categorized:

Non-Blocking transformations Semi-blocking transformations Blocking transformations
Audit Data Mining Query Aggregate
Character Map Merge Fuzzy Grouping
Conditional Split Merge Join Fuzzy Lookup
Copy Column Pivot Row Sampling
Data Conversion Unpivot Sort
Derived Column Term Lookup Term Extraction
Lookup Union All
Multicast
Percent Sampling
Row Count
Script Component
Export Column
Import Column
Slowly Changing Dimension
OLE DB Command
Advertisements

SSIS to SQL Server Data Types Matching

SSIS Data Type

SSIS Expression

SQL Server

single-byte signed integer

(DT_I1)

 

two-byte signed integer

(DT_I2)

smallint

four-byte signed integer

(DT_I4)

int

eight-byte signed integer

(DT_I8)

bigint

single-byte unsigned integer

(DT_UI1)

tinyint

two-byte unsigned integer

(DT_UI2)

 

four-byte unsigned integer

(DT_UI4)

 

eight-byte unsigned integer

(DT_UI8)

 

float

(DT_R4)

real

double-precision float

(DT_R8)

float

string

(DT_STR, «length», «code_page»)

char, varchar

Unicode text stream

(DT_WSTR, «length»)

nchar, nvarchar, sql_variant, xml

date

(DT_DATE)

date

Boolean

(DT_BOOL)

bit

numeric

(DT_NUMERIC, «precision», «scale»)

decimal, numeric

decimal

(DT_DECIMAL, «scale»)

decimal

currency

(DT_CY)

smallmoney, money

unique identifier

(DT_GUID)

uniqueidentifier

byte stream

(DT_BYTES, «length»)

binary, varbinary, timestamp

database date

(DT_DBDATE)

date

database time

(DT_DBTIME)

 

database time with precision

(DT_DBTIME2, «scale»)

time(p)

database timestamp

(DT_DBTIMESTAMP)

datetime, smalldatetime

database timestamp with precision

(DT_DBTIMESTAMP2, «scale»)

datetime2

database timestamp with timezone

(DT_DBTIMESTAMPOFFSET, «scale»)

datetimeoffset(p)

file timestamp

(DT_FILETIME)

 

image

(DT_IMAGE)

image

text stream

(DT_TEXT, «code_page»)

text

Unicode string

(DT_NTEXT)

ntext

SSIS Data Flow Tasks with Descriptions

S No Transformation Description
1 Aggregate Summing / Averaging a total of the products purchased by a customer online to produce the final amount.
2 Audit When you want to load audit information like (Created Date, UserName , server Name..etc) to your Destination table.
3 Character Map String Manipulation, like Lowercase, Uppercase etc..
4 Conditional Split When you want to split your data to many destinations based on conditions.
5 Copy Column To have same column data with alias name
6 Data Conversion To convert the data type, (Unicode String àString…)
7 Data Mining Query Evaluating the input data against the analysis model to get a proper set.
8 Derived Column Adding a title of courtesy (Mr., Mrs., Dr, etc) before the name and removing the trailing and ending spaces.
9 Export Column When we get the normal files/pdf files/image files from different systems and save it under a particular folder and map it to the table master
10 Fuzzy Grouping Matching the name of a customer with master and child table and use it to group and get the desired set
11 Fuzzy Lookup Matching the name of a customer with master and child table and use it to group and get the desired set
12 Import Column When we get the normal files/pdf files/image files from different systems and save it under a particular folder and map it to the table master
13 Lookup Employee table information saved in a master file and the region wise data available across the table which can be mapped and joined to perform a joined querying operation
14 Merge Combine data from multiple data source like master and child employee table and get result in single dataset.
15 Merge Join Combine data from multiple data source like master and child employee table and get result in single dataset. Can use any type of join like inner, outer, left , right etc
16 Multicast Similar to the conditional split but this splits across all the parts
17 OLE DB Command Used when we need to do updates to all the rows of a table like update If a message sent to the entire customer who have made a payment today.
18 Percentage Sampling Similar to “SELECT TOP 10 PERCENT” in TSQL
19 Pivot To Convert Rows to Column and Column to Rows
20 Row Count Gives the Count of Rows
21 Row Sampling Similar to “SELECT TOP 10 ” in TSQL
22 Script Component Used for places where we need to use .net framework specific assemblies.
23 Slowly Changing Dimension When we need to use some historic dimensions of data
24 Sort To make some sorting to get the desired result. Sorting like customer who made the highest payment in a particular day.
25 Term Extraction Used to get a data from a large set of data and get the extracted output in a formatted set.
26 Term Lookup Used to get a data from a large set of data and get the extracted output in a formatted set.
27 Union All Used to get data from different data sources and get in a single dimensional format.
28 Unpivot Reverse of PIVOT Operation