Data handling containers

Pipeline

 class Pipeline(exp=None, name=None, out=None) 

Shell for storing and handling an experiment pipeline.

| Parameter | Type | Doc |
| --- | --- | --- |
| exp | class, optional, default None | Instance of Experiment with fitted pipes. If not supplied, name and source should be set. |
| out | tuple, optional, default None | Tuple with storage options: "json" (JSON serialization) or "db" (database storage, requires blitzdb). |
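The save/load flow described by these options can be sketched with a minimal stand-in class. Note that `MiniPipeline`, its constructor arguments, and its JSON layout are illustrative assumptions, not the library's actual implementation:

```python
import json

class MiniPipeline:
    """Illustrative shell mirroring the interface above: it stores an
    experiment and saves/loads it when "json" is among the storage options.
    Names and internals here are assumptions, not the library's own code."""

    def __init__(self, exp=None, name=None, out=("json",)):
        self.exp, self.name, self.out = exp, name, out

    def save(self, path):
        # "db" storage via blitzdb is omitted in this sketch.
        if "json" in self.out:
            with open(path, "w") as f:
                json.dump({"name": self.name, "exp": self.exp}, f)

    def load(self, path):
        with open(path) as f:
            state = json.load(f)
        self.name, self.exp = state["name"], state["exp"]
```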

Methods

| Function | Doc |
| --- | --- |
| _make_tab | Table-level experiment representation. |
| save | Save experiment and classifier in the format specified. |
| load | Load experiment and classifier from the source specified. |
| classify | Given a data point, return a (label, probability) tuple. |

_make_tab

    _make_tab() 

Table-level experiment representation.

Generates a table-level representation of an experiment. This stores JSON-native information ONLY, and is used for the experiment table in the front-end, since deserializing many experiments would be expensive in terms of loading time.
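The idea of keeping only JSON-native information can be sketched as a filter over an experiment's fields. The `make_tab` helper and the dict structure below are hypothetical, chosen only to illustrate the pattern:

```python
import json

def make_tab(experiment):
    """Reduce an experiment dict (hypothetical structure) to its JSON-native
    fields, so a front-end table can render it without full deserialization."""
    json_native = (str, int, float, bool, type(None))
    return {k: v for k, v in experiment.items() if isinstance(v, json_native)}

# The fitted classifier object is dropped; only cheap-to-load fields remain.
tab = make_tab({"name": "exp1", "accuracy": 0.91, "clf": object()})
json.dumps(tab)  # serializes without error
```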

save

    save() 

Save experiment and classifier in the format specified.

load

    load() 

Load experiment and classifier from the source specified.

classify

    classify(data) 

Given a data point, return a (label, probability) tuple.
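The final step of such a method, selecting the winning (label, probability) tuple, can be sketched as below. The per-label probability dict is an assumed input shape; the real `classify(data)` first runs the fitted pipeline on a raw data point:

```python
def top_label(probabilities):
    """Return the (label, probability) tuple for the most likely label.
    `top_label` is a hypothetical helper, not the library's own function."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]
```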

CSV

 class CSV 

Quick and dirty CSV loader.

| Parameter | Type | Doc |
| --- | --- | --- |
| text | integer | Index of the .csv column where the text is located. |
| parse | integer, optional, default None | Index of the .csv column where the annotations are provided. These are currently assumed to be, per instance, a list of (token, lemma, POS) tuples, one per word. Frog and spaCy are implemented to provide these for you. |
| header | boolean, optional, default False | If the file has a header, you can skip it by setting this to True. |

Methods

| Function | Doc |
| --- | --- |
| iter | Standard iter method. |
| next | Iterate through the .csv file. |

iter

    __iter__() 

Standard iter method.

next

    __next__() 

Iterate through the .csv file.
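The iterator protocol described above can be sketched with a minimal loader. `MiniCSV` and its internals are illustrative assumptions (the `parse` handling is omitted), not the library's own code:

```python
import csv

class MiniCSV:
    """Minimal sketch of the loader interface: `text` is the column index of
    the text field and `header=True` skips the first row."""

    def __init__(self, path, text=0, header=False):
        self._file = open(path, newline="")
        self._reader = csv.reader(self._file)
        self._text = text
        if header:
            next(self._reader)  # skip the header row

    def __iter__(self):
        return self

    def __next__(self):
        try:
            row = next(self._reader)
        except StopIteration:
            self._file.close()  # close the file once the rows run out
            raise
        return row[self._text]
```

Because `__iter__` returns the object itself and `__next__` raises StopIteration at end of file, instances can be consumed directly in a `for` loop.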