cosmel.corpus.mention module

class cosmel.corpus.mention.Mention(article, sid, mid, *args, gid='', nid='', rid='', rule='', **kwargs)[source]

Bases: object

The mention class.

Parameters:
  • article (Article) – the article containing this mention.
  • sid (int) – the sentence index in the article.
  • mid (int) – the mention index in the sentence.
  • gid (str) – the golden product ID.
  • nid (str) – the network-predicted product ID.
  • rid (str) – the rule-labeled product ID.
  • rule (str) – the rule.
  • idxs (slice) – the index slice of this mention.
article

the article containing this mention.

Type:Article
sentence

the sentence containing this mention.

Type:WsWords
sentence_pre

the words before this mention in the sentence.

Type:WsWords
sentence_pre_(with_mention=True)[source]

WsWords: the words before this mention in the sentence (with/without mention itself).

sentence_post

the words after this mention in the sentence.

Type:WsWords
sentence_post_(with_mention=True)[source]

WsWords: the words after this mention in the sentence (with/without mention itself).
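As a rough illustration of how the with_mention flag might change the returned words, the sketch below slices a plain list of tokens. It assumes the mention occupies the half-open range [start, end) in the sentence. The helper names words_before and words_after and the sample tokens are hypothetical stand-ins, not part of cosmel, and a plain list stands in for WsWords.

```python
def words_before(sentence, start, end, with_mention=True):
    """Return the words before the mention, optionally including the mention."""
    # Assumption: the mention covers sentence[start:end] (end is exclusive).
    return sentence[:end] if with_mention else sentence[:start]


def words_after(sentence, start, end, with_mention=True):
    """Return the words after the mention, optionally including the mention."""
    return sentence[start:] if with_mention else sentence[end:]


# Hypothetical sentence; the mention is "red lipstick" at positions 3..4.
sentence = ["I", "bought", "the", "red", "lipstick", "yesterday"]
start, end = 3, 5

pre_with = words_before(sentence, start, end)
pre_without = words_before(sentence, start, end, with_mention=False)
post_with = words_after(sentence, start, end)
post_without = words_after(sentence, start, end, with_mention=False)
```

Under these assumptions, the "with mention" variants overlap on exactly the mention words, while the "without mention" variants partition the rest of the sentence.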

mention

this mention.

Type:WsWords
bundle

the mention bundle containing this mention.

Type:MentionBundle
asmid

the tuple of article ID, sentence ID, and mention ID.

Type:tuple
ids
aid

the article ID.

Type:str
sid

the sentence ID (the sentence index in the article).

Type:int
start_idx

the starting index of the mention in the sentence.

Type:int
end_idx

the ending index of the mention in the sentence.

Type:int
last_idx

the index of the last word of the mention in the sentence.

Type:int
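The three index attributes above can be related to the idxs slice roughly as follows. This is a minimal sketch that assumes end_idx is exclusive (as in Python slices) and that last_idx is therefore end_idx - 1; the reference does not state this explicitly, and the sample sentence is hypothetical.

```python
# Hypothetical sentence and mention slice; a plain list stands in for WsWords.
sentence = ["I", "bought", "the", "red", "lipstick", "yesterday"]
idxs = slice(3, 5)

start_idx = idxs.start   # index of the first word of the mention
end_idx = idxs.stop      # assumed exclusive: one past the last word
last_idx = end_idx - 1   # index of the last word of the mention

mention_words = sentence[idxs]
```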
mid

the mention ID (the mention index in the sentence).

Type:int
gid

the golden product ID.

Type:str
nid

the network-predicted product ID.

Type:str
rid

the rule-labeled product ID.

Type:str
rule

the rule for the product ID.

Type:str
head_ws

the word-segmented head word.

Type:WsWords
head

the head word.

Type:str
head_txt

the head word.

Type:str
head_tag

the head POS tag.

Type:str
head_role

the head role.

Type:str
attrs

the XML attributes.

start_xml

the starting XML tag.

Type:str
start_xml_(**kwargs)[source]

str: the starting XML tag with custom attributes.
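One way a starting tag with overridden attributes could be assembled is sketched below. The tag name "mention", the attribute set, and the helper start_tag are all assumptions made for illustration; the actual tag and attribute names used by cosmel are not given in this reference. Only quoteattr from the standard library is a real API here.

```python
from xml.sax.saxutils import quoteattr


def start_tag(name, **attrs):
    """Build an opening XML tag, quoting and escaping each attribute value."""
    parts = [name] + [f"{key}={quoteattr(str(value))}" for key, value in attrs.items()]
    return "<" + " ".join(parts) + ">"


# Hypothetical default attributes of a mention.
base_attrs = {"sid": 0, "mid": 2, "gid": "1234"}

# Custom keyword arguments override the defaults, in the spirit of start_xml_(**kwargs).
tag = start_tag("mention", **{**base_attrs, "gid": "5678"})
```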

end_xml

the ending XML tag.

Type:str
json

Converts this mention to JSON.

set_gid(gid)[source]

Sets the golden product ID.

set_nid(nid)[source]

Sets the network-predicted product ID.

set_rid(rid)[source]

Sets the rule-labeled product ID.

set_rule(rule)[source]

Sets the rule for the product ID.

class cosmel.corpus.mention.MentionSet(mention_bundles)[source]

Bases: collections.abc.Collection

The set of mentions.

Parameters:mention_bundles (MentionBundleSet) – the set of mention bundles.
path

the root path of the mentions.

Type:str
class cosmel.corpus.mention.MentionBundle(file_path, article)[source]

Bases: collections.abc.Sequence

The bundle of mentions in an article.

Parameters:
  • file_path (str) – the path to the mention bundle.
  • article (Article) – the article containing this mention bundle.
article

the article of this bundle.

Type:Article
aid

the article ID (with leading author name and underscore).

Type:str
path

the related file path.

Type:str
save(file_path)[source]

Saves the mention bundle to a JSON file.

class cosmel.corpus.mention.MentionBundleSet(mention_root, article_set)[source]

Bases: collections.abc.Collection

The set of mention bundles.

Parameters:
  • article_root (str) – the path to the folder containing word segmented article files.
  • mention_root (str) – the path to the folder containing mention files.
  • article_set (ArticleSet) – the set of articles.
path

the root path of the mentions.

Type:str
save(output_root)[source]

Saves all mention bundles to files.

class cosmel.corpus.mention.Id2Mention(mention_set)[source]

Bases: collections.abc.Mapping

A dictionary that maps the tuple of article ID, sentence ID, and mention ID to the mention object.

  • Key: the article ID, sentence ID, and mention ID (tuple).
  • Item: the mention object (Mention).
Parameters:mention_set (MentionSet) – the mention set.
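A read-only mapping keyed by an (article ID, sentence ID, mention ID) tuple can be sketched with collections.abc.Mapping as below. The class TupleKeyedMentions and the dict-based mention records are hypothetical stand-ins for Id2Mention and Mention; the sketch only shows the general pattern, not the library's implementation.

```python
from collections.abc import Mapping


class TupleKeyedMentions(Mapping):
    """Read-only mapping from (aid, sid, mid) to a mention record."""

    def __init__(self, mentions):
        # Each record is assumed to carry its own article, sentence, and mention IDs.
        self._data = {(m["aid"], m["sid"], m["mid"]): m for m in mentions}

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


# Hypothetical mention records.
mentions = [
    {"aid": "author_001", "sid": 0, "mid": 0, "head": "lipstick"},
    {"aid": "author_001", "sid": 2, "mid": 1, "head": "mascara"},
]
id_to_mention = TupleKeyedMentions(mentions)
found = id_to_mention[("author_001", 2, 1)]
```

Subclassing Mapping supplies get, keys, items, and membership tests for free once the three methods above are defined.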
class cosmel.corpus.mention.Id2MentionBundle(id_to_article)[source]

Bases: collections.abc.Mapping

A dictionary that maps the article ID to the mention bundle.

  • Key: the article ID (str).
  • Item: the mention bundle (MentionBundle).
Parameters:id_to_article (Id2Article) – the dictionary that maps the article ID to the article object.
class cosmel.corpus.mention.Head2MentionList(mention_set)[source]

Bases: collections.abc.Mapping

A dictionary that maps the head word to the list of mention objects.

Parameters:mention_set (MentionSet) – the mention set.
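Grouping mentions by head word can be sketched with a defaultdict, as below. The function group_by_head and the dict-based mention records are hypothetical and only illustrate the head-to-list shape of the mapping, not how Head2MentionList is built internally.

```python
from collections import defaultdict


def group_by_head(mentions):
    """Map each head word to the list of mentions sharing that head."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[mention["head"]].append(mention)
    return dict(groups)


# Hypothetical mention records.
mentions = [
    {"aid": "author_001", "sid": 0, "mid": 0, "head": "lipstick"},
    {"aid": "author_001", "sid": 2, "mid": 1, "head": "mascara"},
    {"aid": "author_002", "sid": 1, "mid": 0, "head": "lipstick"},
]
head_to_mentions = group_by_head(mentions)
```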