SSIS: The Complete Guide Tutorial | CHECK-OUT
SSIS Tutorial

SSIS: The Complete Guide Tutorial For Free | CHECK-OUT

Last updated on 12th Jun 2020, Blog, Tutorials

About author

Selvakumar (SSIS Developer )

Selvakumar is an industry expert and subject specialist. He is a qualified professional with more than seven years of ad-hoc SQL queries, sub-queries, CTEs, flow, data flow, error handling, data transformation tools, data conversion, executing SQL tasks, sending email tasks, and data flow task experience. His articles help the learners get insights into the domain.

(5.0) | 19288 Ratings 2331

WHAT IS SSIS? 

SQL Server Integration Services (SSIS) is the anchor in a growing suite of products that make up the Microsoft SQL Server Business Intelligence (BI) platform. What makes SSIS so important is without the data movement and cleansing features that SSIS brings to the table, the other SQL Server BI products can’t operate. In its simplest form, SSIS is an enterprise-level, in-memory ETL tool. However, SSIS is not just a fancy wrapper around an import wizard. In a drag-and-drop development environment, ETL developers can snap together intricate workflows and out-of-the-box data-cleansing flows that rival custom coding and expensive million-dollar, third-party tools. The best thing about SSIS is that you have already paid for it when you license SQL Server.

ssis-image

SSIS is a platform for data integration and workflow applications. It features a data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data.

First released with Microsoft SQL Server 2005, SSIS replaced Data Transformation Services, which had been a feature of SQL Server since Version 7.0. Unlike DTS, which was included in all versions, SSIS is only available in the “Standard”, “Business Intelligence” and “Enterprise” editions. With Microsoft “Visual Studio Dev Essentials” it is now possible to use SSIS with Visual Studio 2017 free of cost so long as it is for development and learning purposes only

Tools of the trade:

Data Transformation Services

Data Transformation Services, or DTS, is a set of objects and utilities to allow the automation of extract, transform and load operations to or from a database. The objects are DTS packages and their components, and the utilities are called DTS tools. DTS was included with earlier versions of Microsoft SQL Server, and was almost always used with SQL Server databases, although it could be used independently with other databases.

DTS allows data to be transformed and loaded from heterogeneous sources using OLE DB, ODBC, or text-only files, into any supported database. DTS can also allow automation of data import or transformation on a scheduled basis, and can perform additional functions such as FTPing files and executing external programs. In addition, DTS provides an alternative method of version control and backup for packages when used in conjunction with a version control system, such as Microsoft Visual SourceSafe.

DTS has been superseded by SQL Server Integration Services in later releases of Microsoft SQL Server though there was some backwards compatibility and ability to run DTS packages in the new SSIS for a time.

    Subscribe For Free Demo

    [custom_views_post_title]

    DTS Packages

    The DTS package is the fundamental logical component of DTS; every DTS object is a child component of the package. Packages are used whenever one modifies data using DTS. All the metadata about the data transformation is contained within the package. Packages can be saved directly in a SQL Server, or can be saved in the Microsoft Repository or in COM files. SQL Server 2000 also allows a programmer to save packages in a Visual Basic or other language file (when stored to a VB file, the package is actually scripted—that is, a VB script is executed to dynamically create the package objects and its component objects).

    A package can contain any number of connection objects, but does not have to contain any. These allow the package to read data from any OLE DB-compliant data source, and can be expanded to handle other sorts of data. The functionality of a package is organized into tasks and steps.

    A DTS Task is a discrete set of functionalities executed as a single step in a DTS package. Each task defines a work item to be performed as part of the data movement and data transformation process or as a job to be executed.

    dts-packages

    In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis and is considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports for workers throughout the enterprise.

    dts-package-overview

    Import and Export Wizard

    The SSIS Import/Export Wizard lets the user create packages that move data from a single data source to a destination with no transformations. The Wizard can quickly move data from a variety of source types to a variety of destination types, including text files and other SQL Server instances.

    Developers tasked with creating or maintaining SSIS packages use a visual development tool based on Microsoft Visual Studio called the SQL Server Business Intelligence Development Studio (BIDS). It allows users to edit SSIS packages using a drag-and-drop user interface. A scripting environment for writing programming code is also available in the tool. A package holds a variety of elements that define a workflow. Upon package execution, the tool provides color-coded real-time monitoring. (Note: In more recent versions MS SQL Server, BIDS has been replaced with “SQL Server Data Tools – Business Intelligence” (SSDT-BI).)

    SSIS Architecture:

    ssis-architecture-overview
    Course Curriculum

    Learn Hands-on Experience from SSIS Certification Course

    Weekday / Weekend BatchesSee Batch Details

    Following are components of SSIS architecture:

    • Control Flow (Stores containers and Tasks)
    • Data Flow (Source, Destination, Transformations)
    • Event Handler (sending of messages, Emails)
    • Package Explorer (Offers a single view for all in package)
    • Parameters (User Interaction)

    Let’s understand each component in detail:

    Control Flow

    Control flow is a brain of SSIS package. It helps you to arranges the order of execution for all its components. The components contain containers and tasks which are managed by precedence constraints.

    Precedence Constraints

    Precedence constrain are package component which direct tasks to execute in a predefined order. It also defines the workflow of the entire SSIS package. It controls the execution of the two linked tasks by executing the destination tasks based on the result of the earlier task — business rules which are defined using special expressions.

    Task

    A ‘Task’ is an individual unit of work. It is the same as a method/function used in a programming language. However, in SSIS, you don’t use coding methods. Instead, you will use drag & drop technique to design surface and to configure them.

    Containers

    The container is units for grouping tasks together into units of work. Apart from offering visual consistency, it also allows you to declare variables and event handlers which should be in the scope of that specific container.

    Three types of containers in SSIS are:

    • Sequence Container: allows you to organize subsidiary tasks by grouping them, and allows you to you apply transactions or assign logging to the container.
    • For loop container: Provides the same functionality as the sequence Container except that it also lets you run the tasks multiple times. However, it is based on an evaluation condition, like a looping from 1 to 100.
    • For each Loop Container: It also allows looping. But the difference that instead of using a condition expression, loop s done over a set of objects, likes files in a folder.

    Data Flow:

    The main use of the SSIS tool is to extract data into the server’s memory, transform it, and write it to another destination. If Control Flow is the brain, Data Flow is the heart of SSIS

    Packages

    Another core component of SSIS is the notion of a package. It is a collection of tasks which execute in an orderly fashion. Here, president constraints help manage the order in which the task will execute.

    A package can help you to saves files onto a SQL Server, in the msdb or package catalog database. It can save as a .dtsx file, which is a structured file very similar to .rdl files are to Reporting Services.

    ssis-packages

    Parameters

    Parameters behave much like a variable but with a few main exceptions. It can be set outside the package easily. It can be designated as values that must be passed in for the package to start.

    Features of SSIS:

    The features of SSIS are as follows:

    • Studio environments
    • Packages
    • Expressions
    • Event handler

    Studio environments

    SQL Server Integration Services included two studios

    SSDT:  SQL  Server Data Tools

    It helps to develop the integration service  packages  and We can perform the following  jobs  in SSDT

    • Copy basic package data from source to destination
    • Complete flow control and data flow are included when to create the packages
    • We can update the properties of packages while run time
    • Deployment creating
    • Save package copies to SQL server msdb

    Step 1: Open the SQL Server data tools.
    Step 2: Click on File –> New –> Project

    SSMS: SQL Server Management Studio

    It helps to manage packages in a production environment and we can perform the following jobs in SSMS.

    • Folder creation to organize the packages
    • With the help of Execute package utility, we can run the packages and stored  on the local computer
    • Generate a command line when we run the execute package utility
    • Storing and fetching packages to and from the SQL Server msdb

    Packages

    An SSIS package is a collection of Control flow and data flow ,control flow includes the two things such as task and data flow task and Data flow includes source, transformation, and destination.

    Expressions

    SSIS Expression is a combination of literals, identifiers, and operators
    Literals: An interpreted variable is known as a literal, different type of literals are

    • Numeric literal: expression supports integral literal and nonintegral literal
    • String literal: It has zero or more characters enclosed with the quotation and also it always followed by the \ symbol.
    • Boolean literal: It has only two options true or false.

    Event Handling

    A workflow can be designed for a number of events in the different scopes where they might occur. In this way, tasks may be executed in response to happenings within the package — such as cleaning up after errors. Creating event handler is similar to building packages and SSIS event handlers are

    • OnError
    • OnExecStatusChanged
    • OnInformation
    • OnPostExecute
    • OnPostValidate
    • OnPreExecute
    • OnProgress
    • OnTaskFailed
    • OnVariableValueChanged
    • OnWarning

    Example:  Let us consider two excel sheets as  follows

    File 1:

    event-handlers-file1

    File 2:

    event-handlers-file2

    Connections:

    A connection includes the information necessary to connect to a particular data source. Tasks can reference the connection by its name, allowing the details of the connection to be changed or configured at run time.

    Parameters (SQL Server 2012 Integration Services):

    Parameters allow you to assign values to properties within packages at the time of package execution. You can have project parameters and package parameters. In general, if you are deploying a package using the package deployment model, you should use configurations instead of parameters.

    Variables:

    • Tasks may reference variables to store results, make decisions, or affect their configuration.
    • A package may be saved to a file or to a store with a hierarchical namespace within a SQL Server instance. In either case, the package content is persisted in XML.
    • Once completed, the designer also allows the user to start the package’s execution. Once started, the package may be readily debugged or monitored.

    Other included tools:

    Aside from the Import/Export Wizard and the designer, the product includes a few other notable tools.

    • DTEXEC executes a package from the command line wherever it may be stored.Before running the package, the tool may be instructed to apply configuration information, which will allow the same package to be reused with slightly different parameters, including different connection strings for its endpoints.
    • DTUTIL provides the ability to manage packages from the command prompt. The tool can copy or move a package from a file into the server store, or back out again. Among a few other sundry functions, it can be used to delete, rename, encrypt, or decrypt packages.

    Extensibility & Programmability:

    • Users may write code to define their own connection objects, log providers, transforms, and tasks.
    • SSIS features a programmable object model that allows developers to write their own hosts for package execution. Such a host can respond to events, start and stop packages, and so on.
    • The object model also allows developers to create, store, and load packages, as well as create, destroy, and modify any of the contained objects.
    • Within limits, SSIS packages can load and call CLI assembly DLLs, providing access to virtually any kind of operation permissible by the .NET CLR.
    • SSIS can be used on all SQL Server 2005, 2008, 2008 R2, 2012, 2014 and 2016 editions except Express and Workgroup.
    Course Curriculum

    Enroll in Hands-on Practical SSIS Training from Industry Experts

    • Instructor-led Sessions
    • Real-life Case Studies
    • Assignments
    Explore Curriculum
    Audit For loop Container Row Count
    Cache Transform Foreach Loop Container Row Sampling
    Copy/Map (Fuzzy) Lookup Script Component
    Data Conversion Fuzzy Grouping SlowlyChanging Dimension
    Data Mining Model Training OLE DB Command Term Extraction
    Data Mining Query Partition Processing Term Lookup
    Derived Column Percentage Sampling Unpivot
    Dimension Processing RowSampling Transformation Unpivot

    The Conditional Split transformation is used to conditionally route rows to other transformation objects based on a particular condition. It is similar to the “if else” construct in the C language.

    DTS wizards:

    The DTS wizards can be used to perform simple or common DTS tasks. These include the Import/Export Wizard and the Copy of Database Wizard. They provide the simplest method of copying data between OLE DB data sources. There is a great deal of functionality that is not available by merely using a wizard. However, a package created with a wizard can be saved and later altered with one of the other DTS tools.

    A Create Publishing Wizard is also available to schedule packages to run at certain times. This only works if SQL Server Agent is running; otherwise the package will be scheduled, but will not be executed.

    DTS Designer:

    The DTS Designer is a graphical tool used to build complex DTS Packages with workflows and event-driven logic. DTS Designer can also be used to edit and customize DTS Packages created with the DTS wizard.

    Each connection and task in DTS Designer is shown with a specific icon. These icons are joined with precedence constraints, which specify the order and requirements for tasks to be run. One task may run, for instance, only if another task succeeds (or fails). Other tasks may run concurrently.

    The DTS Designer has been criticized for having unusual quirks and limitations, such as the inability to visually copy and paste multiple tasks at one time. Many of these shortcomings have been overcome in SQL Server Integration Services, DTS’s successor.

    DTS Query Designer:

    A graphical tool used to build queries in DTS.

    DTS Run Utility:

    • DTS Packages can be run from the command line using the DTSRUN Utility.
    • The utility is invoked using the following syntax:

    • dtsrun /S server_name[\instance_name]
    •         { {/[~]U user_name [/[~]P password]} | /E }
    •     ]
    •     {    
    •         {/[~]N package_name }
    •         | {/[~]G package_guid_string}
    •         | {/[~]V package_version_guid_string}
    •     }
    •     [/[~]M package_password]
    •     [/[~]F filename]
    •     [/[~]R repository_database_name]
    •     [/A global_variable_name:typeid=value] 
    •     [/L log_file_name]
    •     [/W NT_event_log_completion_status]
    •     [/Z] [/!X] [/!D] [/!Y] [/!C]
    • ]

    When passing in parameters which are mapped to Global Variables, you are required to include the typeid. This is rather difficult to find on the Microsoft site. Below are the TypeIds used in passing in these values.

    SSIS Sample Resumes! Download & Edit, Get Noticed by Top Employers! Download

    Uses of SSIS:

    SSIS includes graphical tools and wizards for building and debugging packages; tasks for performing workflow functions such as FTP operations, executing SQL statements, and sending e-mail messages; data sources and destinations for extracting and loading data; transformations for cleaning, aggregating, merging, and 

    SSIS is a platform for data integration and workflow applications. It features a data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data.

    To connect to the Integration Services Service

    • Open SQL Server Management Studio.
    • Click Object Explorer on the View menu.
    • On the Object Explorer toolbar, click Connect, and then click Integration Services.
    • In the Connect to Server dialog box, provide a server name. …
    • Click Connect.

    Since the database is created in a normal SQL Server instance, so it’s an OLTP database as OLAP databases are created in Analysis Server instances only.

    Advantages:

    • SSIS can handle data from heterogeneous data sources at the same package. 
    • SSIS consumes data which are difficult like FTP, HTTP,MSMQ, and Analysis services etc.…
    • SSIS provides transformation functionality.
    • Easier to maintain and package configuration.
    • Tightly integrated with Microsoft Visual Studio and SQL Server.

    Disadvantages SSIS:

    • To see package execution reports need Management Studio rather than being published to reporting services or another way.
    • If multiple packages are available that need to run parallel then you have trouble.
    • SSIS memory usage is high and it conflicts with SQL.

    Conclusion:

    SQL Server Integration Services provide tasks to transform and validate data during the load process and transformations to insert data into your destination. Rather than create a stored procedure with T-SQL to validate or change data, it is good to know about the different SSIS tasks and how they can be used.Hope you have found all the details that you were looking for, in this article.

    Are you looking training with Right Jobs?

    Contact Us
    Get Training Quote for Free