Noise Word Lists

April 20, 2018

In its Analytics Guide, Relativity recommends a noise word list posted on Ranks, a Dutch site. Ranks offers noise word lists in several languages, including Japanese, Chinese, French and German.

 

There are actually several noise word lists for English.

 

1. A long list of 668 words. It includes less obvious words such as accordance, beginning, looking, predominantly and usefulness, as well as every single letter of the alphabet.

 

2. A short list of 174 words.  

 

3. The MySQL stop word list of 543 words that does not include single letter words. 

 

Surprisingly, the MySQL list and the Ranks long list are quite different. There are 206 words in the Ranks long list that are not in the MySQL list, and 82 words in the MySQL list that are omitted from the Ranks long list. As you can see from the lists below, the MySQL list tends to include more contractions, while the Ranks long list has more common adverbs, along with misspellings and grammatical mistakes such as showns, thoughh and throug.
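A comparison like this can be reproduced with two set differences. The following is only a minimal sketch in Python, not the method actually used for this post: the function names load_words and compare are hypothetical, and the two samples are tiny illustrative excerpts rather than the full Ranks and MySQL lists.

```python
from __future__ import annotations


def load_words(text: str) -> set[str]:
    """Parse one word per line, ignoring blank lines, padding and case."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def compare(list_a: set[str], list_b: set[str]) -> tuple[list[str], list[str]]:
    """Return (words only in A, words only in B), each sorted alphabetically."""
    return sorted(list_a - list_b), sorted(list_b - list_a)


# Hypothetical excerpts for illustration only, not the complete lists.
ranks_sample = load_words("a\nabout\nabove\naccordance\narent\n")
mysql_sample = load_words("a's\nabout\nabove\nallow\naren't\n")

only_ranks, only_mysql = compare(ranks_sample, mysql_sample)
print(len(only_ranks), only_ranks)  # words in the Ranks sample but not the MySQL sample
print(len(only_mysql), only_mysql)  # words in the MySQL sample but not the Ranks sample
```

Note that the comparison is an exact string match, so arent and aren't count as different words. That is one reason contractions show up on both sides of the lists below.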

 

 

Ranks Long List Not in MySQL

a
abst
accordance
act
added
adj
affected
affecting
affects
ah
announce
anymore
apparently
approximately
aren
arent
arise
auth
b
back
begin
beginning
beginnings
begins
biol
briefly
c
ca
couldnt
d
date
due
e
ed
effect
eighty
end
ending
et-al
f
ff
fix
found
g
gave
give
giving
h
hed
heres
hes
hid
home
hundred
i
id
im
immediately
importance
important
index
information
invention
itd
j
k
kg
km
l
largely
lets
line
'll
m
made
make
makes
means
meantime
mg
million
miss
ml
mr
mrs
mug
n
na
nay
necessarily
ninety
nonetheless
nos
noted
o
obtain
obtained
omitted
ord
owing
p
page
pages
part
past
poorly
possibly
potentially
pp
predominantly
present
previously
primarily
promptly
proud
put
q
quickly
r
ran
readily
recent
recently
ref
refs
related
research
resulted
resulting
results
run
s
sec
section
shed
she'll
shes
show
showed
shown
showns
shows
significant
significantly
similar
similarly
slightly
somethan
specifically
stop
strongly
substantially
successfully
sufficiently
suggest
t

taking
that'll
that've
thered
there'll
thereof
therere
thereto
there've
theyd
theyre
thou
thoughh
thousand
throug
til
tip
ts
u
unlike
ups
usefully
usefulness
v
've
vol
vols
w
wasnt
wed
werent
what'll
whats
wheres
whim
whod
who'll
whomever
whos
widely
wont
words
world
wouldnt
www
x
y
youd
youre
z
 

MySQL Not in Ranks Long List

a's
ain't
allow
allows
apart
appear
appreciate
appropriate
aren't
associated
best
better
c'mon
c's
cant
changes
clearly
concerning
consequently
consider
considering
corresponding
couldn't
course
currently
definitely
described
despite
entirely
exactly
example
going
greetings
hadn't
he's
hello
help
here's
hopefully
i'd
i'm
ignored
inasmuch
indicate
indicated
indicates
inner
insofar
it'd
it's
let's
novel
presumably
reasonably
second
secondly
sensible
serious
seriously
t's
that's
there's
they'd
they're
third
thorough
thoroughly
three
wasn't
we'd
we're
well
weren't
what's
where's
who's
will
won't
wonder
wouldn't
you'd
you're
 

 

 

 

 


Contact Me With Your Litigation Support Questions:

seankevinoshea@hotmail.com


© 2015 by Sean O'Shea