Named entities

Introduction

The named entities module extracts named entities in three ways: using NLTK, using Stanford named entity detection, and by tagging the text and extracting personal pronouns. Named entities are useful for identifying characters in stories, personal experiences, and events.

Note: To use the Stanford named entity detection, you need to have Java installed.
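
Before calling the Stanford extractor, it can help to confirm that a Java launcher is on the PATH. The following is a minimal sketch using only the Python standard library; the helper name `java_available` is hypothetical and is not part of coast_core.

```python
import shutil


def java_available(executable="java"):
    """Return True if the given Java launcher can be found on the PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if java_available():
        print("Java found on PATH.")
    else:
        print("Java not found on PATH; install a Java runtime first.")
```

This only checks that an executable named `java` can be located; it does not verify the Java version.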

Usage

To use the module:

>>> import coast_core
>>> coast_core.named_entities.function(to_use)

or:

>>> from coast_core import named_entities
>>> named_entities.function(to_use)

Functions

Title: named_entities.py

Author: Ashley Williams

Description: Extract named entities from text.

This module is imported by __init__, so there is no need to import it specifically.

coast_core.named_entities.extract_all_named_entities(article_text)

Extract all named entities from the given article text.

Parameters: article_text – The article text to operate on.
Returns: An object containing all named entities.
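
The structure of the returned object is not specified here. As a rough sketch of how the three extractors documented in this section could be combined into a single result, the example below assumes a dictionary keyed by extractor name. The helper `collect_named_entities`, its dictionary keys, and the choice to pass the module in as an argument are all illustrative assumptions, not the module's actual implementation.

```python
def collect_named_entities(ne_module, article_text, exception_list=None):
    """Run each documented extractor and gather the results in a dict.

    ne_module is expected to be coast_core.named_entities, or any object
    exposing the same three functions. The dict layout is an assumption.
    """
    # Build a fresh list so the caller's list is never shared or mutated.
    ignore = list(exception_list) if exception_list else []
    return {
        "nltk": ne_module.get_nltk_named_entities(article_text, ignore),
        "stanford": ne_module.get_stanford_named_entities(article_text, ignore),
        "pronouns": ne_module.get_pronouns(article_text, ignore),
    }
```

With coast_core installed, this could be called as `collect_named_entities(coast_core.named_entities, article_text)`.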

coast_core.named_entities.getNodes(parent)

Never called externally; used internally to extract entities using NLTK.
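
NLTK's named entity chunker returns a tree in which entity subtrees carry a label such as PERSON and leaves are (word, tag) pairs. The sketch below shows one way such a tree could be walked to collect entity strings. To stay self-contained it uses a small stand-in `Node` class instead of `nltk.Tree`, and `collect_entities` is a hypothetical helper, not the body of getNodes.

```python
class Node:
    """Minimal stand-in for nltk.Tree: a label plus a list of children."""

    def __init__(self, label, children):
        self.label = label
        self.children = children


def collect_entities(parent, root_label="S"):
    """Return the text of every labelled subtree below the root."""
    entities = []
    for child in parent.children:
        if isinstance(child, Node):
            if child.label != root_label:
                # Leaves are (word, tag) pairs; join the words of the entity.
                words = [leaf[0] for leaf in child.children
                         if not isinstance(leaf, Node)]
                entities.append(" ".join(words))
            # Recurse in case entities are nested deeper in the tree.
            entities.extend(collect_entities(child, root_label))
    return entities


# Hand-built example tree; a real tree would come from NLTK's chunker.
sentence = Node("S", [
    Node("PERSON", [("Jane", "NNP"), ("Smith", "NNP")]),
    ("visited", "VBD"),
    Node("GPE", [("London", "NNP")]),
])
print(collect_entities(sentence))  # ['Jane Smith', 'London']
```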

coast_core.named_entities.get_nltk_named_entities(text, exception_list=[])

Returns a list of named entities in a given block of text using NLTK’s averaged_perceptron_tagger.

Parameters:
  • text – The text to analyse.
  • exception_list – A list of named entities to ignore.
Returns:

A list of named entities.

coast_core.named_entities.get_pronouns(text, exception_list=[])

Returns a list of personal pronouns in a given block of text.

  • PRP – Personal pronoun
  • PRP$ – Possessive pronoun

Parameters:
  • text – The text to analyse.
  • exception_list – A list of pronouns to ignore.
Returns:

A list of personal pronouns.
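
The tag filtering described for get_pronouns can be sketched as follows. To avoid depending on NLTK's tagger data, this example takes already-tagged (word, tag) pairs as input, whereas get_pronouns accepts raw text. `filter_pronouns` is a hypothetical helper, and the case-insensitive handling of the exception list is an assumption.

```python
# Penn Treebank tags for personal and possessive pronouns.
PRONOUN_TAGS = {"PRP", "PRP$"}


def filter_pronouns(tagged_tokens, exception_list=None):
    """Keep words tagged PRP or PRP$ that are not in the exception list."""
    # Lower-case the exceptions so matching ignores capitalisation (assumption).
    ignore = {word.lower() for word in (exception_list or [])}
    return [word for word, tag in tagged_tokens
            if tag in PRONOUN_TAGS and word.lower() not in ignore]


# Hand-tagged example; a real pipeline would obtain tags from a POS tagger.
tagged = [
    ("She", "PRP"), ("lost", "VBD"), ("her", "PRP$"), ("keys", "NNS"),
    ("and", "CC"), ("I", "PRP"), ("found", "VBD"), ("them", "PRP"),
]
print(filter_pronouns(tagged, exception_list=["them"]))  # ['She', 'her', 'I']
```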

coast_core.named_entities.get_stanford_named_entities(text, exception_list=[])

Returns a list of named entities in a given block of text.

Parameters:
  • text – The text to analyse.
  • exception_list – A list of named entities to ignore.
Returns:

A list of named entities.