Western European (Windows) and Unicode (UTF-8) File Encoding


Relativity allows admins to import load files that have either Western European (Windows) or Unicode (UTF-8) encoding. What is the difference between the two? Western European (Windows), also called ANSI or Windows-1252, is a single-byte encoding that extends the standard ASCII English character set with characters used in other European languages written in the Latin alphabet. This chart shows the full character set:

[Chart not reproduced: the Windows-1252 character set]
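
If you'd like to check the relationship between ASCII and Windows-1252 yourself, here is a minimal Python sketch. It relies on Python's built-in cp1252 codec (Python's name for Windows-1252). The function name cp1252_table is made up for this illustration and is not part of Relativity or any other tool.

```python
def cp1252_table():
    """Return {byte_value: character} for every byte the cp1252 codec defines.

    Hypothetical helper, written only for this illustration.
    """
    table = {}
    for value in range(256):
        try:
            table[value] = bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            # Byte values the codec does not define are skipped.
            continue
    return table


table = cp1252_table()

# The lower half (0x00-0x7F) should decode exactly as it does in ASCII.
ascii_matches = all(
    table[value] == bytes([value]).decode("ascii") for value in range(128)
)

# The upper half holds the extra Western European characters.
extended = "".join(ch for value, ch in sorted(table.items()) if value >= 0xA0)

print("Lower half identical to ASCII:", ascii_matches)
print("Characters defined from 0xA0 upward:", len(extended))
```

Printing the extended string in a terminal that supports UTF-8 lists the accented letters and symbols that Windows-1252 adds on top of ASCII.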

UTF-8 is an encoding of Unicode, which consists of more than 128,000 characters, accounting for Greek, Chinese, Cyrillic, Japanese and many other non-Latin scripts. For a fuller discussion, see the Tip of the Night for November 25, 2015. In Relativity, if you attempt to import Unicode text into a field that is not Unicode enabled, you'll get scrambled results. You can enable Unicode for a field by going to Administration . . . Fields.
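
To see why mismatched encodings produce scrambled text, here is a small Python sketch. It does not reproduce what Relativity does internally, which this post does not cover; it only illustrates the general effect of reading UTF-8 bytes as if they were Windows-1252. The function names read_as_cp1252 and fits_in_cp1252 are hypothetical.

```python
def read_as_cp1252(text):
    """Encode text as UTF-8, then (wrongly) decode the bytes as Windows-1252."""
    # errors="replace" substitutes a placeholder for bytes cp1252 cannot map.
    return text.encode("utf-8").decode("cp1252", errors="replace")


def fits_in_cp1252(text):
    """Return True if every character in text exists in Windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# One accented letter becomes two unrelated characters.
scrambled = read_as_cp1252("café")  # 'cafÃ©'

# Latin-alphabet text fits in Windows-1252; Greek text does not.
print(fits_in_cp1252("café"))
print(fits_in_cp1252("Ελληνικά"))
```

The first check prints True and the second prints False, which is the reason non-Latin text needs Unicode rather than Western European (Windows) encoding.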

If you need to quickly determine which encoding a load file uses, download File Encoding Checker from CodePlex (see https://encodingchecker.codeplex.com/). In the file mask box, enter a pattern such as *.dat to find all of the load files, then click 'View'. You'll get a list showing each file's encoding.
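
If you'd rather script a rough check, here is a minimal Python sketch. It is not how File Encoding Checker works; it is a simple heuristic that looks for a byte order mark and then tests whether the bytes are valid UTF-8. The names classify_bytes and scan_folder are hypothetical, and the "possibly Windows-1252" label is only an assumption, since a file that is not valid UTF-8 could be in some other single-byte encoding.

```python
import codecs
from pathlib import Path


def classify_bytes(data):
    """Guess the encoding of raw file contents (heuristic, not definitive)."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 (with BOM)"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16"
    if data.isascii():
        # Plain ASCII is valid in both UTF-8 and Windows-1252.
        return "ASCII"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not UTF-8 (possibly Windows-1252)"
    return "UTF-8 (no BOM)"


def scan_folder(folder, mask="*.dat"):
    """Return {file name: guessed encoding} for files matching the mask."""
    return {
        path.name: classify_bytes(path.read_bytes())
        for path in sorted(Path(folder).glob(mask))
    }


if __name__ == "__main__":
    # Check every .dat load file in the current folder.
    for name, encoding in scan_folder(".").items():
        print(f"{name}: {encoding}")
```

Treat the result as a hint, and confirm it by opening the load file in a text editor that displays the encoding.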


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


© 2015 by Sean O'Shea.
