magichour.lib.PARIS package
Submodules
magichour.lib.PARIS.paris module
magichour.lib.PARIS.paris.PARIS(D, r_slack, num_iterations=3)
magichour.lib.PARIS.paris.PCF(D, A, R, tau=0.5, r_slack=0, verbose=False)
    Calculate the full PARIS cost function given a set of Documents, Atoms, and a representation.
magichour.lib.PARIS.paris.design_atom(E, r_slack=0)
    Implementation of the atom design function described in the PARIS paper.
magichour.lib.PARIS.paris.get_best_representation(di, A, verbose=False, r_slack=None)
    Get the best possible representation for di given a set of atoms A.
magichour.lib.PARIS.paris.get_error(D, A, R, a_index_to_ignore)
    Compute the elements of each di in D that are not represented by the current Atom/Representation (ignoring atoms we are supposed to ignore).
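The description above can be illustrated with a minimal sketch. This is not the library's code: `get_error_sketch` is a hypothetical name, and the data layout is an assumption (D as a list of sets, A as a list of atom sets, R as a list holding, for each document, the indices of the atoms that represent it, and a_index_to_ignore as a single atom index).

```python
def get_error_sketch(D, A, R, a_index_to_ignore):
    """Hypothetical illustration, not the magichour implementation.

    Assumes D is a list of sets, A is a list of sets (atoms), and R[i]
    lists the indices of the atoms used to represent D[i].
    """
    errors = []
    for di, ri in zip(D, R):
        covered = set()
        for a_index in ri:
            # Skip the atom we were asked to ignore.
            if a_index != a_index_to_ignore:
                covered |= set(A[a_index])
        # Whatever the remaining atoms do not cover is the error for di.
        errors.append(set(di) - covered)
    return errors


D = [{1, 2, 3}, {3, 4}]
A = [{1, 2}, {3}]
R = [[0, 1], [1]]
print(get_error_sketch(D, A, R, a_index_to_ignore=0))
```

Under these assumptions, ignoring atom 0 leaves elements 1 and 2 of the first document uncovered, while element 4 of the second document is uncovered regardless of which atom is ignored.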
magichour.lib.PARIS.paris.get_error2(D, A, R, a_index_to_ignore)
    Compute the elements of each di in D that are not represented by the current Atom/Representation (ignoring atoms we are supposed to ignore).
magichour.lib.PARIS.paris.main()
magichour.lib.PARIS.paris.paris_distance(di, A, R, r_count)
    Compute the PARIS distance function.
magichour.lib.PARIS.paris.run_paris_on_document(log_file, window_size=20.0, line_count_limit=None, groups_to_skip=set([-1]))
magichour.lib.PARIS.paris.test_with_syntheticdata()
magichour.lib.PARIS.paris.xor(s1, s2, skip_in_s2=0)
    XOR at the core of the PARIS distance function. Take in two sets and compute their symmetric difference. Allow up to skip_in_s2 items to be missing.
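As a rough illustration of a symmetric difference with a skip allowance, here is a minimal sketch. It is not the library's code: `xor_sketch` is a hypothetical name, and two points are assumptions rather than documented behavior: that the result is a count, and that "missing" means elements of s2 that are absent from s1.

```python
def xor_sketch(s1, s2, skip_in_s2=0):
    """Hypothetical illustration, not the magichour implementation.

    Returns the size of the symmetric difference of s1 and s2, forgiving
    up to skip_in_s2 elements of s2 that do not appear in s1 (assumed
    reading of "allow up to skip_in_s2 items to be missing").
    """
    s1, s2 = set(s1), set(s2)
    only_in_s1 = s1 - s2
    only_in_s2 = s2 - s1
    # Never forgive more elements than are actually unmatched in s2.
    forgiven = min(skip_in_s2, len(only_in_s2))
    return len(only_in_s1) + len(only_in_s2) - forgiven


print(xor_sketch({1, 2, 3}, {2, 3, 4}))                # 2
print(xor_sketch({1, 2, 3}, {2, 3, 4}, skip_in_s2=1))  # 1
```

With no allowance, the result matches `len(s1 ^ s2)` from Python's built-in set type. The allowance only reduces the contribution from the s2 side, so elements present in s1 but absent from s2 are always counted.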