Western European (Windows) and Unicode (UTF-8) File Encoding

June 29, 2016

Relativity allows admins to import load files encoded as either Western European (Windows) or Unicode (UTF-8). What is the difference between the two? Western European (Windows), also called ANSI or Windows-1252, is a single-byte encoding that extends the standard ASCII English character set with characters used in other European languages written in the Latin alphabet. This chart shows the full character set:

 

 

Unicode, which UTF-8 encodes, defines more than 128,000 characters, covering Greek, Chinese, Cyrillic, Japanese and many other non-Latin scripts. For a fuller discussion, see the Tip of the Night for November 25, 2015. In Relativity, if you attempt to import Unicode text into a field that is not Unicode enabled, you'll get scrambled results. You can enable Unicode for a field by going to Administration . . . Fields.
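To see why the two encodings are not interchangeable, here is a minimal Python sketch. It is only an illustration of the general mechanism, not a reproduction of what Relativity does internally; the sample strings are made up for the example. It shows the same word stored under each encoding, what happens when UTF-8 bytes are read as Windows-1252, and what happens when text falls outside the Windows-1252 set.

```python
# Illustrative only: the sample text is an assumption, not taken from a real load file.
text = "café"

cp1252_bytes = text.encode("cp1252")  # Windows-1252: one byte per character
utf8_bytes = text.encode("utf-8")     # UTF-8: "é" takes two bytes

print(cp1252_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'

# Reading UTF-8 bytes as if they were Windows-1252 scrambles non-ASCII characters.
scrambled = utf8_bytes.decode("cp1252")
print(scrambled)  # cafÃ©

# Characters outside the Windows-1252 set (here, Cyrillic) cannot be encoded at all.
try:
    "Привет".encode("cp1252")
    cyrillic_fits = True
except UnicodeEncodeError:
    cyrillic_fits = False
print(cyrillic_fits)  # False
```

Plain ASCII text comes out identical under both encodings, which is why an encoding mismatch often goes unnoticed until accented or non-Latin characters appear.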

 

 

 

If you need to quickly determine which encoding a load file uses, download File Encoding Checker from CodePlex at https://encodingchecker.codeplex.com/. In the file mask box, enter a pattern such as *.dat to find all of the load files, then click 'View'. You'll get a list showing each file's encoding.
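If you would rather script a similar check, the following Python sketch gives a rough guess for each file matching a mask. It is a heuristic of my own, not the method File Encoding Checker uses: it looks for a byte order mark, then tries a strict UTF-8 decode, and otherwise assumes Windows-1252. The function names `guess_encoding` and `check_folder` are hypothetical.

```python
import codecs
from pathlib import Path


def guess_encoding(path):
    """Heuristic guess: check for a byte order mark, then try a strict UTF-8 decode."""
    data = Path(path).read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume a single-byte Windows code page.
        return "cp1252"
    # Pure ASCII bytes are valid under both encodings.
    return "ascii" if data.isascii() else "utf-8"


def check_folder(folder, mask="*.dat"):
    """Return {file name: guessed encoding} for files in folder matching mask."""
    return {p.name: guess_encoding(p) for p in sorted(Path(folder).glob(mask))}
```

For example, `check_folder(r"C:\loadfiles")` (a made-up path) would return a dictionary mapping each .dat file name to its guessed encoding. A result of "cp1252" only means the file is not valid UTF-8; the sketch cannot tell Windows-1252 apart from other single-byte code pages.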

 

 

 

 

 


Contact Me With Your Litigation Support Questions:

seankevinoshea@hotmail.com


© 2015 by Sean O'Shea . Proudly created with Wix.com