Western European (Windows) and Unicode (UTF-8) File Encoding


Relativity allows admins to import load files that have either Western European (Windows) or Unicode (UTF-8) encoding. What is the difference between the two? Western European (Windows), also known as ANSI or Windows-1252, is a single-byte encoding that slightly extends the standard ASCII English character set to include characters used in other Latin-alphabet European languages. This chart shows the full character set:

UTF-8 is an encoding of Unicode, a character set of more than 128,000 characters that accounts for Greek, Chinese, Cyrillic, Japanese and many other non-Latin writing systems. For a fuller discussion, see the Tip of the Night for November 25, 2015. In Relativity, if you attempt to import Unicode text into a field that is not Unicode-enabled, you'll get scrambled results. You can set a field for Unicode by going to Administration . . . Fields.
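To see why the two encodings are not interchangeable, here is a minimal Python sketch. The sample strings are made up for illustration, and it shows only how the encodings behave in general, not how Relativity handles text internally. The same accented word takes a different number of bytes in each encoding, Greek text cannot be stored in Windows-1252 at all, and UTF-8 bytes read as Windows-1252 come out scrambled.

```python
# Illustrative sample text (not taken from any real load file).
latin = "café"
greek = "αβγ"

# Windows-1252 stores every character it supports in exactly one byte.
cp1252_bytes = latin.encode("cp1252")
# UTF-8 uses one byte for ASCII characters and two or more for the rest.
utf8_bytes = latin.encode("utf-8")
print(len(cp1252_bytes), len(utf8_bytes))

# Greek letters are outside Windows-1252, so encoding them fails.
try:
    greek.encode("cp1252")
    greek_fits = True
except UnicodeEncodeError:
    greek_fits = False
print("Greek fits in Windows-1252:", greek_fits)

# Reading UTF-8 bytes as if they were Windows-1252 scrambles the text.
scrambled = utf8_bytes.decode("cp1252")
print(scrambled)
```

The last line prints the word with its accented letter replaced by two unrelated characters. This is the sort of scrambling you can expect when text is read with the wrong encoding.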

If you need to quickly determine which encoding a load file uses, download File Encoding Checker from CodePlex (https://encodingchecker.codeplex.com/). In the file mask box, enter a string such as *.dat to find all of the load files, then click 'View'. You'll get a list showing each file's encoding.
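If you would rather script the same kind of check, the following is a rough Python sketch. It is not how File Encoding Checker works. It uses a simple heuristic of my own: look for a byte order mark, then see whether the bytes are valid ASCII or UTF-8, and otherwise assume Windows-1252. The function names guess_encoding and check_folder are hypothetical, and the result is a best guess rather than a guarantee.

```python
import codecs
from pathlib import Path


def guess_encoding(path):
    """Return a best-guess encoding label for one file (heuristic only)."""
    data = Path(path).read_bytes()
    # A byte order mark at the start of the file is the strongest hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Plain ASCII reads the same in Windows-1252 and UTF-8.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Non-ASCII bytes that form valid UTF-8 are very likely UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything else is treated as Windows-1252.
        return "cp1252 (assumed)"


def check_folder(folder, mask="*.dat"):
    """Map each file name matching the mask to its guessed encoding."""
    return {p.name: guess_encoding(p) for p in sorted(Path(folder).glob(mask))}


if __name__ == "__main__":
    # Check every .dat file in the current folder.
    for name, encoding in check_folder(".").items():
        print(name, encoding)
```

The mask argument plays the same role as the file mask box in the tool, so *.dat can be swapped for any other pattern.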




© 2015 by Sean O'Shea