cosmel.corpus.parsed module

class cosmel.corpus.parsed.ParsedArticle(file_path, article)

Bases: collections.abc.Sequence

The parsed article object, which holds a list of parsed sentences.

  • Item: the parsed sentence (str).

Parameters:
  • file_path (str) – the path to the article.
  • article (Article) – the article of this bundle.

article

the article of this bundle.

Type: Article

aid

the article ID (with leading author name and underscore).

Type: str

path

the related file path.

Type: str
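
Because ParsedArticle derives from collections.abc.Sequence, it should support indexing, len(), iteration, and membership tests over its parsed sentences. The following is a minimal sketch of that protocol, not the cosmel implementation: SketchParsedArticle, its constructor arguments, the file path, and the sentence strings are all hypothetical stand-ins.

```python
from collections.abc import Sequence


class SketchParsedArticle(Sequence):
    """Hypothetical stand-in for ParsedArticle (not the cosmel code)."""

    def __init__(self, file_path, sentences):
        # Assumption: the real class takes (file_path, article) and reads
        # the sentences itself; here they are passed in directly so the
        # sketch runs without the library or any files on disk.
        self.path = file_path
        self._sentences = list(sentences)

    def __getitem__(self, index):
        return self._sentences[index]

    def __len__(self):
        return len(self._sentences)


# Made-up path and sentence strings, for illustration only.
parsed = SketchParsedArticle(
    "parsed/part-00000/author_0001.txt",
    ["first parsed sentence", "second parsed sentence"],
)

first_sentence = parsed[0]                        # indexing
sentence_count = len(parsed)                      # length
all_sentences = list(parsed)                      # iteration (Sequence mixin)
has_second = "second parsed sentence" in parsed   # membership (Sequence mixin)
```

Only __getitem__ and __len__ are written by hand; Sequence supplies iteration, membership, index() and count() on top of them.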
class cosmel.corpus.parsed.ParsedArticleSet(parsed_root, article_set)

Bases: collections.abc.Collection

The set of parsed articles.

Parameters:
  • parsed_root (str) – the path to the folder containing parsed article files.
  • article_set (ArticleSet) – the set of articles.

Notes

  • Loads all articles from parsed_root/part for every part in parts.

path

the root path of the articles.

Type: str
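
Since ParsedArticleSet derives from collections.abc.Collection rather than Sequence, the base class only promises len(), iteration, and membership tests; indexing is not part of that contract. The sketch below illustrates the difference with a hypothetical stand-in class. SketchParsedArticleSet, its constructor arguments, and the item strings are assumptions made for illustration and do not reflect how cosmel loads files.

```python
from collections.abc import Collection


class SketchParsedArticleSet(Collection):
    """Hypothetical stand-in for ParsedArticleSet (not the cosmel code)."""

    def __init__(self, parsed_root, parsed_articles):
        # Assumption: the real class takes (parsed_root, article_set) and
        # loads parsed files from disk; here the items are given directly.
        self.path = parsed_root
        self._articles = list(parsed_articles)

    def __contains__(self, item):
        return item in self._articles

    def __iter__(self):
        return iter(self._articles)

    def __len__(self):
        return len(self._articles)


# Made-up root folder and placeholder items.
article_set = SketchParsedArticleSet("parsed", ["article-a", "article-b"])

set_size = len(article_set)
contains_a = "article-a" in article_set
visited = [item for item in article_set]

# A Collection does not promise indexing, so subscripting fails here.
try:
    article_set[0]
    supports_indexing = True
except TypeError:
    supports_indexing = False
```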
class cosmel.corpus.parsed.Id2ParsedArticle(id_to_article)

Bases: collections.abc.Mapping

A dictionary that maps article IDs to parsed articles.

  • Key: the article ID (str).
  • Item: the parsed article (ParsedArticle).
Parameters: id_to_article (Id2Article) – a dictionary that maps article IDs to article objects.
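
Id2ParsedArticle derives from collections.abc.Mapping, so it should behave like a read-only dictionary keyed by article ID. The sketch below shows what that interface looks like in practice. SketchId2ParsedArticle is a hypothetical stand-in that wraps a plain dict (the real constructor takes an Id2Article), and the IDs and values are invented for illustration.

```python
from collections.abc import Mapping


class SketchId2ParsedArticle(Mapping):
    """Hypothetical stand-in for Id2ParsedArticle (not the cosmel code)."""

    def __init__(self, id_to_parsed):
        # Assumption: a plain dict replaces the Id2Article argument.
        self._data = dict(id_to_parsed)

    def __getitem__(self, aid):
        return self._data[aid]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


# Invented article IDs in the "author name + underscore" style.
lookup = SketchId2ParsedArticle({
    "alice_0001": "parsed article A",
    "bob_0002": "parsed article B",
})

found = lookup["alice_0001"]          # key lookup
missing = lookup.get("carol_0003")    # get() comes from the Mapping mixin
ids = sorted(lookup)                  # iterating yields the keys

# Mapping defines no __setitem__, so assignment is rejected.
try:
    lookup["carol_0003"] = "parsed article C"
    is_read_only = False
except TypeError:
    is_read_only = True
```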