Litigation Support Tip of the Night

March 19, 2019

Ranks NL has posted stop word lists in several different languages here: https://www.ranks.nl/stopwords. There are multiple lists in English, including the one used by MySQL. Relativity recommends the Ranks NL lists for use in a conceptual index. See Relativity's guide to Analytics indexes.

Stop word lists are also included for many languages other than English.

April 20, 2018

In its Analytics Guide, Relativity recommends a noise word list posted on a Dutch site, Ranks NL. Ranks has noise word lists available in several languages, including Japanese, Chinese, French and German.

There are actually several noise word lists for English.

1. A long list of 668 words.  It includes less obvious words such as accordance; beginning; looking; predominantly; and usefulness, as well as each single letter in the alphabet. 

2. A short list of 174 words.  

3. The MySQL stop word list of 543 words, which does not include single-letter words.

Surprisingly, the MySQL list and the Ranks long list are quite different. There are 206 words in the Ranks long list that are not in the MySQL list, and 82 words in the MySQL list that are omitted from the Ranks long list. As you can see from the lists below, the MySQL list tends to include more contractions, and the Ranks long list has more common adverbs, as well as misspellings and grammatical mistakes.

Ranks Long List Not in MySQL

a
abst
accordance
act
added
adj
affected
affecting
affects
ah
announce
anymore
apparently
approximately
aren
arent
arise
auth
b
back
begin
beginning
beginnings
begins
biol
briefly
c
ca
couldnt
d
date
due
e
ed
effect
eighty
end
ending
et-al
f
ff
fix
found
g
gave
give
giving
h
hed
heres
hes
hid
home
hundred
i
id
im
immediately
importance
important
index
information
invention
itd
j
k
kg
km
l
largely
lets
line
'll
m
made
make
makes
means
meantime
mg
million
miss
ml
mr
mrs
mug
n
na
nay
necessarily
ninety
nonetheless
nos
noted
o
obtain
obtained
omitted
ord
owing
p
page
pages
part
past
poorly
possibly
potentially
pp
predominantly
present
previously
primarily
promptly
proud
put
q
quickly
r
ran
readily
recent
recently
ref
refs
related
research
resulted
resulting
results
run
s
sec
section
shed
she'll
shes
show
showed
shown
showns
shows
significant
significantly
similar
similarly
slightly
somethan
specifically
stop
strongly
substantially
successfully
sufficiently
suggest
t
taking
that'll
that've
thered
there'll
thereof
therere
thereto
there've
theyd
theyre
thou
thoughh
thousand
throug
til
tip
ts
u
unlike
ups
usefully
usefulness
v
've
vol
vols
w
wasnt
wed
werent
what'll
whats
wheres
whim
whod
who'll
whomever
whos
widely
wont
words
world
wouldnt
www
x
y
youd
youre
z
 

MySQL not Ranks Long List

a's
ain't
allow
allows
apart
appear
appreciate
appropriate
aren't
associated
best
better
c'mon
c's
cant
changes
clearly
concerning
consequently
consider
considering
corresponding
couldn't
course
currently
definitely
described
despite
entirely
exactly
example
going
greetings
hadn't
he's
hello
help
here's
hopefully
i'd
i'm
ignored
inasmuch
indicate
indicated
indicates
inner
insofar
it'd
it's
let's
novel
presumably
reasonably
second
secondly
sensible
serious
seriously
t's
that's
there's
they'd
they're
third
thorough
thoroughly
three
wasn't
we'd
we're
well
weren't
what's
where's
who's
will
won't
wonder
wouldn't
you'd
you're
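
A comparison like the one above can be reproduced with a simple set difference. This is only a minimal sketch in Python, not the method used to build the lists above. The function names and the two tiny sample lists are made up for illustration; in practice you would paste in the full lists copied from the Ranks site.

```python
def load_stop_words(text):
    """Return the set of non-empty, lower-cased lines in a stop word list."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def compare_lists(list_a, list_b):
    """Return (only_in_a, only_in_b) as alphabetically sorted lists."""
    a, b = load_stop_words(list_a), load_stop_words(list_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical samples, one word per line, standing in for the full lists.
ranks_long_sample = "a\nabout\naccordance\narent\nthe\n"
mysql_sample = "about\naren't\nthe\nwill\n"

only_ranks, only_mysql = compare_lists(ranks_long_sample, mysql_sample)
print("In Ranks sample, not in MySQL sample:", only_ranks)
print("In MySQL sample, not in Ranks sample:", only_mysql)
```

Note that the comparison is an exact string match, so "arent" and "aren't" count as different words. That is one reason lists that look similar can differ by so many entries.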
 

October 23, 2017

The old dBase .dbf file format is still used by many applications for storing structured data, and will turn up in document productions.  If you need to view such a file, consider installing GTK DBF Editor.  A free download is available.  
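
If you only need a quick look at what a .dbf file contains before installing anything, the header can be read with a few lines of Python. This is a rough sketch that assumes the common dBase III-style layout (a 32-byte header followed by 32-byte field descriptors ending in a 0x0D byte); other variants of the format may differ. The function name and the sample bytes are made up for illustration.

```python
import struct


def read_dbf_header(data):
    """Parse the fixed header and field descriptors of a dBase III-style file."""
    # Byte 0: version; bytes 1-3: last update (YY MM DD); then the record
    # count (4 bytes), header length (2 bytes) and record length (2 bytes).
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack_from(
        "<4BIHH", data, 0
    )
    fields = []
    offset = 32
    # Field descriptors are 32 bytes each; a 0x0D byte ends the descriptor array.
    while offset < header_len and offset + 32 <= len(data) and data[offset] != 0x0D:
        name = data[offset:offset + 11].split(b"\x00", 1)[0].decode("ascii", "replace")
        field_type = chr(data[offset + 11])
        length = data[offset + 16]
        fields.append((name, field_type, length))
        offset += 32
    return {
        "version": version,
        # Assumption: the year byte counts years since 1900.
        "last_update": (1900 + yy, mm, dd),
        "record_count": n_records,
        "record_length": record_len,
        "fields": fields,
    }


# Hypothetical in-memory sample: one 10-character field named BATES, 2 records.
sample = (
    struct.pack("<4BIHH", 0x03, 119, 3, 19, 2, 65, 11)
    + bytes(20)
    + b"BATES".ljust(11, b"\x00") + b"C" + bytes(4) + bytes([10]) + bytes(15)
    + b"\x0d"
)
info = read_dbf_header(sample)
print(info)
```

For a real file you would pass in the bytes read from disk, for example `read_dbf_header(open(path, "rb").read())`, where `path` is wherever your file is saved.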

June 10, 2016

Those of you using Driven's ONE document database may want to know how to search for multiple strings in any one field in the database. The menus in the software don't make this entirely clear. If you want, for example, to search for a long list of Bates numbers, these are the steps you should take.

1. Go to Search . . . Advanced Search.

2. In the Field Type menu select Metadata and then choose the Field Name you want to search in.

3. In the Operation field select 'In', and then press F2 in the Value field.

4. A new dialog box entitled 'Get the Data' will open. At the bottom, click Select File, then browse to a text file you previously created that contains just the strings you want to search for, separated only by line breaks.

5. Your search terms will load. Now just click Save (F10).

6. The terms will be loaded in the first line of Advanced Search. Now all you need to do is click 'Execute' to get your documents in the results.
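
The text file used in step 4 can be prepared by hand, but for a long list a short script may help. Below is a minimal Python sketch that writes one term per line, with blanks and duplicates removed. The function name, the output file name and the sample Bates numbers are all made up for illustration, and it assumes the software does not need a trailing line break after the last term.

```python
import tempfile
from pathlib import Path


def write_search_terms(terms, out_path):
    """Write one term per line, dropping blanks and duplicates but keeping order."""
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        if term and term not in seen:
            seen.add(term)
            cleaned.append(term)
    # Terms are separated only by line breaks, with no trailing blank line.
    Path(out_path).write_text("\n".join(cleaned), encoding="utf-8")
    return cleaned


# Hypothetical Bates numbers, including a blank entry and a duplicate.
out_file = Path(tempfile.gettempdir()) / "bates_terms.txt"
terms = write_search_terms(
    ["ABC0000001", " ABC0000002 ", "", "ABC0000001"], out_file
)
print(len(terms), "terms written to", out_file)
```

The resulting file is the one you would browse to with Select File.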


February 6, 2016

Microsoft has an interface called Open Database Connectivity (ODBC) for accessing data in a variety of different databases. A description of an eDiscovery application may refer to its ability to access SQL and other databases using an ODBC driver. With ODBC, administrators can access and edit data from multiple databases in different formats. Keep this in mind when dealing with requests for the production of databases that are not in familiar formats such as the MS Access .mdb format.
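
To give a sense of what connecting through an ODBC driver looks like, here is a rough Python sketch. It uses Microsoft's Access driver only because its name is widely known; a database in another format would need the name of its own installed driver. The function names and the file path are made up for illustration, and the second function assumes the third-party pyodbc package is installed.

```python
def access_connection_string(db_path):
    """Build an ODBC connection string for an Access file.

    Assumes Microsoft's Access ODBC driver is installed on the machine.
    """
    return "DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=" + db_path + ";"


def list_tables(conn_str):
    """Return the table names visible through an ODBC connection."""
    import pyodbc  # imported here so the rest of the sketch runs without it

    conn = pyodbc.connect(conn_str)
    try:
        return [row.table_name for row in conn.cursor().tables(tableType="TABLE")]
    finally:
        conn.close()


# Hypothetical path to a produced database file.
conn_str = access_connection_string(r"C:\data\sample.mdb")
print(conn_str)
```

Calling `list_tables(conn_str)` on a machine with the driver and pyodbc installed should give a list of the tables in the file, which is often the first thing you need to know about a produced database.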


Sean O'Shea has more than 15 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.

 

All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information.

 

This policy is subject to change at any time.