Utils
Introduction
The utils module contains helper functions that are used by the other modules. You are free to use them directly as well.
Usage
To use the utils module:
>>> import coast_core
>>> coast_core.utils.function(to_use)
or:
>>> from coast_core import utils
>>> utils.function(to_use)
Functions
A collection of generic utility functions used throughout coast by the modules that handle NLP tasks, reporting, and performance measures.
coast_core.utils.get_from_file(path)
Reads a file and returns its lines as a list of strings. Notes:
- All double quotes are replaced with single quotes.
- New line characters are removed.
Parameters: path – The path to the file you wish to read.
Returns: A list of strings, where each string is a line in the file.
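The behaviour described in the notes can be illustrated with a minimal sketch. This is not the coast_core implementation; `read_lines_sketch` is a hypothetical stand-in written only from the description above, and the UTF-8 encoding is an assumption.

```python
import os
import tempfile


def read_lines_sketch(path):
    """Hypothetical stand-in for get_from_file, based only on its documented notes."""
    with open(path, encoding="utf-8") as handle:  # encoding is an assumption
        # Swap double quotes for single quotes and drop the trailing newline.
        return [line.replace('"', "'").rstrip("\n") for line in handle]


# Write a small sample file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('He said "hello"\nsecond line\n')

lines = read_lines_sketch(tmp.name)
os.remove(tmp.name)
print(lines)  # ["He said 'hello'", 'second line']
```

With the real module, the equivalent call would be `coast_core.utils.get_from_file(path)`.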
coast_core.utils.get_json_from_file(path)
Reads a JSON file and returns its contents as an object.
Parameters: path – The path to the JSON file you wish to read.
Returns: A JSON object, generated from the contents of the file. In the event of an error, the error is printed to stdout.
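A rough sketch of the documented behaviour follows, using only the standard library. `read_json_sketch` is a hypothetical name, and returning `None` after printing the error is an assumption, since the documentation does not say what is returned on failure.

```python
import json
import os
import tempfile


def read_json_sketch(path):
    """Hypothetical stand-in for get_json_from_file."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, json.JSONDecodeError) as error:
        # Documented behaviour: the error is printed to stdout.
        print(error)
        return None  # assumption: the real return value on error is not documented


# Create a sample JSON file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    json.dump({"name": "coast", "tags": ["nlp", "utils"]}, tmp)

data = read_json_sketch(tmp.name)
os.remove(tmp.name)

# The file no longer exists, so the error is printed and None comes back.
missing = read_json_sketch(tmp.name)
```

Because errors are printed rather than raised, callers may want to check the result before using it.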
coast_core.utils.get_ngrams(text, number)
Splits a given body of text into n-grams.
Parameters:
- text – The body of text to operate on.
- number – The size of each n-gram (e.g. 1 for unigrams, 2 for bigrams).
Returns: A list of n-grams.
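To make the idea of an n-gram concrete, here is a small sliding-window sketch. It is not the coast_core implementation: `ngrams_sketch` is a hypothetical name, and both the whitespace tokenisation and the tuple output format are assumptions.

```python
def ngrams_sketch(text, number):
    """Hypothetical illustration of n-gram extraction over whitespace tokens."""
    tokens = text.split()  # assumption: the real function may tokenise differently
    # Slide a window of `number` tokens across the text, one token at a time.
    return [
        tuple(tokens[i:i + number])
        for i in range(len(tokens) - number + 1)
    ]


bigrams = ngrams_sketch("the quick brown fox", 2)
print(bigrams)  # [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```

If `number` exceeds the number of tokens, this sketch returns an empty list.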
coast_core.utils.import_punkt()
Imports punkt.
coast_core.utils.penn_treebank_filter(article_text, filter_list, exception_list=[])
Returns a list of tuples that are tagged with any Penn Treebank tag from the filter list.
Parameters:
- article_text – The text to analyse.
- filter_list – The tags to return.
- exception_list – A list of exceptions.
Returns: A list of words tagged with any of the tags in the filter list.
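The filtering step can be sketched on its own, assuming the text has already been part-of-speech tagged into `(word, tag)` tuples. `filter_tagged_sketch` is a hypothetical name, the tagging step is left out entirely, and treating `exception_list` as a list of words to skip is an assumption, since the documentation does not define it.

```python
def filter_tagged_sketch(tagged_tokens, filter_list, exception_list=None):
    """Hypothetical illustration of filtering (word, tag) tuples by tag."""
    wanted = set(filter_list)
    # Assumption: exceptions are words to leave out even when their tag matches.
    exceptions = set(exception_list or [])
    return [
        (word, tag)
        for word, tag in tagged_tokens
        if tag in wanted and word not in exceptions
    ]


# Hand-tagged sample input using Penn Treebank tags.
tagged = [("Alice", "NNP"), ("quickly", "RB"), ("wrote", "VBD"), ("code", "NN")]

nouns = filter_tagged_sketch(tagged, ["NNP", "NN"])
nouns_without_code = filter_tagged_sketch(tagged, ["NNP", "NN"], exception_list=["code"])
print(nouns)               # [('Alice', 'NNP'), ('code', 'NN')]
print(nouns_without_code)  # [('Alice', 'NNP')]
```

The sketch defaults `exception_list` to `None` rather than `[]` so that no list object is shared between calls.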