The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer. All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information. This policy is subject to change at any time. The owner is not an attorney, and nothing posted on this site should be construed as legal advice. Litigation Support Tip of the Night does not provide confirmation that any e-discovery technique or conduct is compliant with legal, regulatory, contractual or ethical requirements.
Here's a demonstration of the concept of co-reference resolution: finding the multiple references to the same entity that appear in different variations in unstructured text.
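To make the idea concrete, here is a small hand-built sketch in Python of what a co-reference result looks like: each entity becomes a cluster of token spans. The sentence, the hand-labeled spans, and the helper name `cluster_mentions` are all invented for illustration. No model is run here, and this is not the Allen Institute's code.

```python
# Hypothetical example text, tokenized by hand.
tokens = ["Alice", "Smith", "joined", "the", "firm", "in", "May", ".",
          "She", "said", "it", "hired", "her", "quickly", "."]

# Hand-labeled clusters (an assumption, not model output). Each span is
# (start, end) with inclusive token indices into `tokens`.
clusters = [
    [(0, 1), (8, 8), (12, 12)],   # "Alice Smith", "She", "her"
    [(3, 4), (10, 10)],           # "the firm", "it"
]

def cluster_mentions(tokens, clusters):
    """Turn index spans into readable mention strings, one list per entity."""
    return [[" ".join(tokens[start:end + 1]) for start, end in cluster]
            for cluster in clusters]

mentions = cluster_mentions(tokens, clusters)
for i, group in enumerate(mentions):
    print(f"Entity {i}: {group}")
```

The full name, the pronoun "She", and the pronoun "her" end up in one group, which is the kind of grouping a co-reference tool is meant to produce automatically.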
The Allen Institute for AI has an online tool that finds references to entities in text you enter. See here.
As we can see, co-reference resolution finds references to people, locations, and organizations, whether they appear in complete, incomplete, or pronoun form, and groups those forms together. The Allen Institute uses its end-to-end neural coreference resolution model, which has achieved an F1 score as high as 78.87% on some data sets. As discussed in the Tip of the Night for June 11, 2016, an F1 score is the harmonic mean of precision and recall, so it is high only when both are high. The Allen Institute F1 score reflects a precision of around 80% (how many hits in the results are true hits) and a recall of about 73% (how many of the total true hits in the source data show up in the results). See the research paper posted here. The Allen Institute was founded by Microsoft co-founder Paul Allen and has created an open source NLP library.
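F1 is conventionally computed as the harmonic mean of precision and recall, 2PR / (P + R). The short Python sketch below works through that arithmetic. The function name `f1_score` is my own, and the inputs 0.80 and 0.73 are the rounded, approximate figures quoted in this post, not exact values from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)

# Rounded, approximate inputs (assumed for illustration, not exact values).
approx_f1 = f1_score(0.80, 0.73)
print(f"{approx_f1:.4f}")  # prints 0.7634
```

With these rounded inputs the result is roughly 0.76, so the precision, recall, and F1 figures quoted here should be read as approximate and may not all come from the same evaluation.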