The utils module contains helper functions used by the other coast_core modules. You are also free to call them directly in your own code.


To use the utils module:

>>> import coast_core
>>> coast_core.utils.function(to_use)

or:

>>> from coast_core import utils
>>> utils.function(to_use)


A collection of generic utility functions used throughout coast by the modules that handle NLP tasks, reporting, and performance measures.


Reads a file and returns its lines as a list of strings. Notes:

  1. All double quotes are replaced with single quotes.
  2. Newline characters are removed.

Parameters: path – The path to the file you wish to read.
Returns: A list of strings, where each string is a line in the file.
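The behaviour described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's implementation: the name `read_lines_sketch` is hypothetical, and UTF-8 encoding is assumed.

```python
def read_lines_sketch(path):
    """Hypothetical stand-in for the line reader described above."""
    lines = []
    # Assumption: the file is UTF-8 encoded text.
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Drop newline characters, then swap double quotes for single quotes.
            cleaned = line.replace("\r", "").replace("\n", "").replace('"', "'")
            lines.append(cleaned)
    return lines
```

For a file containing the two lines `say "hi"` and `plain`, this sketch returns `["say 'hi'", "plain"]`.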

Reads a JSON file and returns its contents as an object.

Parameters: path – The path to the JSON file you wish to read.
Returns: A JSON object generated from the contents of the file. In the event of an error, the error is printed to stdout.
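A rough sketch of this read-and-report pattern follows. The name `read_json_sketch` is hypothetical, and returning `None` after printing the error is an assumption; the documentation only states that the error is printed.

```python
import json


def read_json_sketch(path):
    """Hypothetical stand-in for the JSON reader described above."""
    try:
        # Assumption: the file is UTF-8 encoded.
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError) as error:
        # OSError covers missing or unreadable files; ValueError covers
        # malformed JSON (json.JSONDecodeError is a ValueError subclass).
        print(error)
        # Assumption: nothing useful can be returned after a failure.
        return None
```

Because a failure may only be reported on stdout, callers of a function like this should check the result before using it.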

coast_core.utils.get_ngrams(text, number)

Split a given body of text into ngrams.

  • text – The body of text to operate on.
  • number – The size of each ngram (e.g. 1 for unigrams, 2 for bigrams).

A list of ngrams.
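To make the idea concrete, here is a minimal sketch of n-gram extraction. It is not the implementation behind `get_ngrams`: the name `get_ngrams_sketch` is hypothetical, tokens are assumed to be whitespace-separated, and each n-gram is assumed to be returned as a tuple.

```python
def get_ngrams_sketch(text, number):
    """Hypothetical illustration of splitting text into n-grams."""
    if number < 1:
        raise ValueError("number must be at least 1")
    # Assumption: whitespace tokenisation; the real function may tokenise differently.
    tokens = text.split()
    # Slide a window of `number` tokens across the token list.
    return [tuple(tokens[i:i + number]) for i in range(len(tokens) - number + 1)]


bigrams = get_ngrams_sketch("the quick brown fox", 2)
# bigrams == [("the", "quick"), ("quick", "brown"), ("brown", "fox")]
```

If the text has fewer tokens than `number`, the sketch returns an empty list.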


Import punkt

coast_core.utils.penn_treebank_filter(article_text, filter_list, exception_list=[])

Returns a list of tuples whose words are tagged with any Penn Treebank tag from the filter list.

  • article_text – The text to analyse.
  • filter_list – The tags to return.
  • exception_list – A list of exceptions (defaults to an empty list).

A list of words tagged with any of the tags in the filter list.
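The filtering step can be illustrated on tokens that have already been tagged. This sketch rests on several assumptions: the name `filter_tagged_sketch` is hypothetical, the input is a list of `(word, tag)` tuples rather than raw article text, and `exception_list` is taken to mean words that should be skipped even when their tag matches. The documentation does not confirm that last point.

```python
def filter_tagged_sketch(tagged_tokens, filter_list, exception_list=None):
    """Hypothetical illustration of filtering (word, tag) tuples by tag."""
    # Assumption: exception_list holds words to exclude from the result.
    exceptions = set(exception_list or [])
    wanted_tags = set(filter_list)
    return [
        (word, tag)
        for word, tag in tagged_tokens
        if tag in wanted_tags and word not in exceptions
    ]


tagged = [("Alice", "NNP"), ("runs", "VBZ"), ("quickly", "RB"), ("Bob", "NNP")]
proper_nouns = filter_tagged_sketch(tagged, ["NNP"], exception_list=["Bob"])
# proper_nouns == [("Alice", "NNP")]
```

The sketch defaults `exception_list` to `None` rather than `[]` so that no mutable default is shared between calls.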