UTF != UTF

I recently ran into a very annoying issue with one of the ETL processes. Very annoying issues are often easily resolvable in simple ways[citation needed] and thankfully this was one of them. However, before I managed to solve it, I wasted a whole bunch of otherwise perfectly usable minutes, which is something I always loathe.

By writing this entry I am hoping to save you from wasting those precious minutes yourself.

The issue:

You are trying to create a Flat File source in an SSIS package for a UTF-8 encoded text file with {CR}{LF} line separators. So, you select the text file, leave the default line separator ({CR}{LF}) in place, then tick the "Unicode" check-box…

…and switch to the "Columns" tab. And suddenly, out of the blue, the UI throws a sinister message at you: despite reading 64KB of data from the file, it could not find a single instance of {CR}{LF}.

You open the source file with your favourite text editor (Notepad++ for example) and you double-, triple- and quadruple-check that {CR}{LF}s are present at the end of each and every line. You scratch your head. You repeat the whole process, like, five times. If you are really desperate, you may even try to convert Windows EOL characters to Linux ones, hoping that halving the number of EOL bytes will magically solve the issue. But there's no magic. Nothing happens and you reach for the Ultimate Weapon: Google.

After googling for a while you start to feel nervous. A couple of people apparently had this issue before and posted about it online, but there doesn't seem to be a single, correct answer. Some people suggest using BCP. Others suggest using a different ETL tool. And so on.

Finally, you find the blog entry you are reading right now.

Your life is just about to improve a little!

The solution:

Rather than selecting the "Unicode" check-box, leave it blank and pick the "65001 (UTF-8)" option from the "Code page" drop-down.

Switch to the "Columns" tab. Set all the column properties you need. Save, execute, admire! Dance naked! Sing "hosanna" to the Moon! Have a pint!
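If you are wondering why the check-box fails in the first place, here is my best guess, sketched in Python. It assumes that "Unicode" in this dialog means UTF-16, as it usually does in Microsoft tooling, so the UI would be hunting for a four-byte {CR}{LF} that a UTF-8 file simply does not contain. The helper names (`guess_encoding`, `count_row_delimiters`) and the sample rows are made up for illustration; none of this is part of SSIS.

```python
import codecs

# {CR}{LF} as raw bytes under each interpretation.
CRLF_UTF8 = "\r\n".encode("utf-8")       # 0D 0A
CRLF_UTF16 = "\r\n".encode("utf-16-le")  # 0D 00 0A 00


def guess_encoding(sample: bytes) -> str:
    """Guess the encoding from the byte-order mark, if there is one."""
    if sample.startswith(codecs.BOM_UTF8):
        return "utf-8 (BOM)"
    if sample.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16 (BOM)"
    return "no BOM: probably UTF-8 or an ANSI code page"


def count_row_delimiters(sample: bytes) -> dict:
    """Count {CR}{LF} occurrences as each interpretation would see them."""
    return {
        "as_utf8": sample.count(CRLF_UTF8),
        "as_utf16": sample.count(CRLF_UTF16),
    }


# Hypothetical sample: three rows saved as UTF-8 with Windows line endings.
sample = "id;name\r\n1;Zażółć\r\n2;Gęślą\r\n".encode("utf-8")

print(guess_encoding(sample))        # no BOM: probably UTF-8 or an ANSI code page
print(count_row_delimiters(sample))  # {'as_utf8': 3, 'as_utf16': 0}
```

To try it on a real file, swap the sample for something like `open(path, "rb").read(65536)`, which mirrors the 64KB the UI claims to have read. If `as_utf16` comes back as zero while `as_utf8` does not, the "Code page" drop-down is the one you want.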

Enjoy 😉

Author: xpil
