The
Percentage Sampling (15 rows is not 25% of 77 in this present example) may not
provide an exact percentage of the input as it uses special algorithms for random
data selection. Interested readers may refer to the SQL Server Books online. The
Advanced Editor for the Aggregate Data Flow item has more advanced features that
were not explored in this exercise.
Using a Data Conversion
Data Flow Transformation
Migration of data from one system to another is always fraught with lots of
problems. As sources (older generation in general) and destinations (newer, cutting
edge) are neither: on the same kind of database, nor on similar kind of platforms,
data conversion can become an arduous effort. Data quality at the source can be poor
for any number of reasons, misspelled, wrong type of data in a field, missing data,
etc. These are handled at the transformation stage of the ETL process by resorting to
changing the attributes of the data and other cleansing procedures.
Knowledge of data structures of source and destination, data mapping, and the
conversion process are all important for a successful data conversion operation.
SSIS has built in tool to handle this process, Data Conversion. Data Transformation is
therefore one of the main pieces in ETL.
In this exercise, we will extract a small table residing in an MS Excel spreadsheet file
and transform this file before loading to an OLE DB Destination (an MDB file). The
schema at the destination and the parameters chosen for data conversion are such
that some data conversion errors are likely to be generated.
Pages:
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122