
You can get a free 60-day trial version of iConect's XERA electronic discovery platform here. The platform is pre-loaded with documents related to the assassination of President Kennedy. iConect XERA has a very basic layout, but it does give you the ability to navigate through a large document set, filter and sort on the metadata associated with the documents, and perform advanced searching.

The search form is structured to remind the user to run searches on keyword variations.

Another distinctive feature of the platform (absent in Concordance or Relativity) is that the default layout gives a list of documents which either have the same date or the same name as the current document.

This encourages a reviewer to consider related documents which don't fall within the results of keyword search. The user can select documents from these lists and compare them to the document being viewed.

iConect XERA's design may not make it look like the most advanced document review platform, but it includes useful features that facilitate document review and that are not available in Concordance or in all Relativity workspaces.


 
 
  • Nov 6, 2018

Word Counter is a great online utility which you can use to find the keywords which are used most frequently in a given document.

Simply create an account and then put the text of the document you want to analyze in Word Counter.

The utility will not only detect the total word and character counts, but it will also generate a list of which words are used most frequently.

You can change the keyword density to include only two-word or three-word phrases.
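For anyone curious about what this kind of analysis involves, here is a minimal sketch in Python. It is not Word Counter's actual code; the function name `ngram_counts` and the sample text are my own assumptions, and the tokenizing rule (lowercase letters, digits and apostrophes) is just one simple choice.

```python
import re
from collections import Counter

def ngram_counts(text, n=1):
    """Count how often each n-word phrase occurs in text (case-insensitive)."""
    # Simple tokenizer: runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Slide a window of n words across the token list.
    # Note: phrases are counted across sentence boundaries as well.
    phrases = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(phrases)

# Hypothetical sample text.
sample = "The search terms were tested. The search terms were then revised."

# The most frequent two-word phrases, with their counts.
print(ngram_counts(sample, 2).most_common(3))
```

Setting `n` to 1, 2 or 3 corresponds to single keywords, two-word phrases and three-word phrases.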

My friend Nikolai of HashTag Legal demonstrates in a YouTube video how the analysis performed by Word Counter can be implemented in Relativity. The application he has designed makes keyword density analysis far more useful.

Keyword Density is shown in a separate field, which gives a count for each string.

HashTag Legal has made numerous useful tools available here. I highly recommend checking out Nick's site. The general spirit of this site is similar to Litigation Support Tip of the Night.


 
 

Here's a continuation of my outline of the 2016 edition of Craig Ball's Electronic Discovery Workbook which I last posted about on September 28, 2018.

The chapter entitled "The Step-by-Step of Smart Search" provides a 10-step approach for effective keyword searching.

A. Statements By Judges on Keyword Searching

1. Judge Facciola - lawyers doing keyword searching without expert guidance are going "where angels fear to tread".

2. Judge Grimm - search methods must be tested for quality assurance.

3. Judge Peck - "wake-up call to the Bar" for their inexpert search terms.

4. Jason R. Baron of NARA - leading figure in e-discovery search.

B. 10 Step Approach

1. Start with the Request for Production

a. ESI search should really begin when litigation is anticipated.

b. Use both terms of art from the RFPs, and rephrase demands in ordinary English.

c. Push back against overbroad requests.

d. If requests are vague, tell other side how you will interpret them and put them in the position of having to object.

2. Seek Input from Key Players

a. Custodians are SMEs for their own data.

b. TREC Legal Track challenge showed correlation between precision & recall and questioning key players.

3. Look at What You've Got and the Tools You'll Use

a. TIFF images require different search technique than emails or Word documents.

b. Test search tools against actual data.

c. Search tools must be able to search through container files and nested content & email attachments.

d. Search tools must identify encrypted files or non-standard types that can't be searched.

4. Communicate and Collaborate

a. Tell the other side the tools and terms you are using.

b. Ask for targeted suggestions and run them on sample data. They highlight terms that you overlooked.

c. Let the other side have two rounds of keyword search and review on your data.

5. Incorporate Misspellings, Variants and Synonyms

a. Common variants are more effective than fuzzy searching, which gets too many false hits.

b. Dumb Dictionary and Wikipedia lists of common misspellings.
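Here is a rough Python sketch of searching on a curated list of variants rather than relying on fuzzy matching. The `VARIANTS` dictionary, the function names and the sample text are hypothetical; in practice the list would be built from key players and from lists of common misspellings.

```python
import re

# Hypothetical variant list, keyed by the base search term.
VARIANTS = {
    "separate": ["separate", "seperate", "separated", "seperated"],
    "accommodate": ["accommodate", "accomodate", "acommodate"],
}

def variant_pattern(term):
    """Build one whole-word, case-insensitive regex covering every variant."""
    # Longest alternatives first, so longer variants are tried before shorter ones.
    alternatives = sorted(VARIANTS.get(term, [term]), key=len, reverse=True)
    body = "|".join(re.escape(alt) for alt in alternatives)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)

def find_variants(term, text):
    """Return every variant of term found in text, as written in the text."""
    return [m.group(0) for m in variant_pattern(term).finditer(text)]

text = "Please seperate the files and keep a Separate log."
print(find_variants("separate", text))
```

Because only listed variants match, the search stays predictable, which is the point of preferring common variants over fuzzy searching.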

6. Filter and Deduplicate First

a. Filter out music and image files, which generate false hits because their content is stored as alphanumeric characters.

b. de-NIST by known hash values

c. Deduplication before indexing.

d. Be able to repopulate suppressed iterations.

e. Use keywords to exclude irrelevant ESI. e.g., "baby shower"
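The de-NISTing and deduplication points above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: files are passed in as an in-memory dictionary, `known_hashes` stands in for a set of known system-file hash values, and SHA-1 is used simply because it is a common choice for such lists.

```python
import hashlib

def filter_and_dedupe(files, known_hashes):
    """files: dict of path -> bytes. Returns (unique, suppressed).

    unique maps each hash to the first path seen with that content.
    suppressed maps each hash to the duplicate paths that were held back,
    so the suppressed copies can be repopulated later.
    """
    unique = {}
    suppressed = {}
    for path, content in files.items():
        digest = hashlib.sha1(content).hexdigest()
        if digest in known_hashes:
            continue  # known system file: drop it before indexing
        if digest in unique:
            suppressed.setdefault(digest, []).append(path)
        else:
            unique[digest] = path
    return unique, suppressed

# Hypothetical data set.
files = {
    "a/report.docx": b"quarterly report",
    "b/report-copy.docx": b"quarterly report",
    "c/notepad.exe": b"system binary",
}
known_hashes = {hashlib.sha1(b"system binary").hexdigest()}

unique, suppressed = filter_and_dedupe(files, known_hashes)
print(list(unique.values()))
print(list(suppressed.values()))
```

Keeping the `suppressed` mapping, rather than discarding duplicates outright, is what makes it possible to restore every suppressed copy when production requires it.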

7. Test, test, test!

a. Test on data representative of custodian data with responsive evidence.

b. Can a large number of hits be found in system files, business units not subject of litigation, or other irrelevant ESI?

8. Review the Hits

a. Create spreadsheet showing hits in context - 20-30 words on each side.

b. Review responsive documents for additional keywords.

c. Search is an iterative process.
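A hits-in-context spreadsheet like the one described above could be produced along these lines. This is a minimal sketch that handles single-word keywords only; the function names, the column headings and the default window of 25 words are my own assumptions.

```python
import csv
import io

PUNCTUATION = ".,;:!?\"'()"

def hits_in_context(doc_id, text, keyword, window=25):
    """Return one (doc_id, before, hit, after) row per occurrence of keyword."""
    words = text.split()
    target = keyword.lower()
    rows = []
    for i, word in enumerate(words):
        # Compare ignoring case and surrounding punctuation.
        if word.strip(PUNCTUATION).lower() == target:
            before = " ".join(words[max(0, i - window):i])
            after = " ".join(words[i + 1:i + 1 + window])
            rows.append((doc_id, before, word, after))
    return rows

def to_csv(rows):
    """Render the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["doc_id", "before", "hit", "after"])
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical document; a window of 2 keeps the example short.
rows = hits_in_context("DOC-1", "The merger was discussed. Later, the Merger closed.", "merger", window=2)
print(to_csv(rows))
```

Scanning the before and after columns is a quick way to spot noise hits and to pick up additional keywords for the next round.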

9. Tweak the Queries and Retest

a. Do keywords cluster in pairs? If so, use a Boolean AND or a proximity connector to reduce noise hits.
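To show what a proximity connector does, here is a small Python sketch. Review platforms have their own proximity syntax; the function `within` below is a hypothetical stand-in that checks whether two single-word terms occur within a given number of words of each other.

```python
import re

def within(text, term_a, term_b, distance):
    """True if term_a and term_b occur within `distance` words of each other."""
    words = re.findall(r"\w+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    # Order does not matter: either term may come first.
    return any(abs(a - b) <= distance for a in positions_a for b in positions_b)

# Hypothetical sentence.
text = "The contract was signed before the breach occurred"

print(within(text, "contract", "breach", 5))
print(within(text, "contract", "breach", 3))
```

A plain Boolean AND would hit any document containing both terms anywhere; tightening to a proximity test is one way to cut the noise when the terms really do cluster.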

10. Check the Discards

a. Sampling method must be rational compromise between quality assurance and cost.
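Checking the discards can be sketched as drawing a random sample of the documents that did not hit on any keyword and measuring how many a reviewer finds responsive. The function names and the sample data are hypothetical, and the sample size is a judgment call that the workbook leaves to the balance of quality assurance and cost.

```python
import random

def sample_discards(discard_ids, sample_size, seed=None):
    """Draw a simple random sample (without replacement) from the discard pile."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(discard_ids, min(sample_size, len(discard_ids)))

def responsive_rate(reviewed):
    """reviewed: dict of doc id -> True if the reviewer found it responsive."""
    if not reviewed:
        return 0.0
    return sum(reviewed.values()) / len(reviewed)

# Hypothetical discard pile of 1,000 document IDs.
discards = [f"DOC-{i:04d}" for i in range(1000)]
to_review = sample_discards(discards, 50, seed=1)

# Hypothetical review results for four sampled documents.
results = {"DOC-0001": True, "DOC-0002": False, "DOC-0003": False, "DOC-0004": False}
print(len(to_review), responsive_rate(results))
```

A high responsive rate in the sample suggests the keywords are missing relevant documents and the queries need another round of tweaking.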


 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.

If you have a question or comment about this blog, please make a submission using the form to the right. 


© 2015 by Sean O'Shea . Proudly created with Wix.com
