magichour.lib.PARIS package

Submodules

magichour.lib.PARIS.paris module

magichour.lib.PARIS.paris.PARIS(D, r_slack, num_iterations=3)
magichour.lib.PARIS.paris.PCF(D, A, R, tau=0.5, r_slack=0, verbose=False)

Calculate the full PARIS cost function given a set of documents D, atoms A, and a representation R

magichour.lib.PARIS.paris.design_atom(E, r_slack=0)

Implementation of the atom design function described in the PARIS paper

magichour.lib.PARIS.paris.get_best_representation(di, A, verbose=False, r_slack=None)

Get the best possible representation for di given a set of atoms A

magichour.lib.PARIS.paris.get_error(D, A, R, a_index_to_ignore)

For each document in D, compute the elements not represented by the current Atom/Representation (ignoring the atoms we are supposed to ignore)
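
The description above can be illustrated with a minimal sketch. It is not the library's code: the name get_error_sketch is hypothetical, and the data layout is an assumption (D as a list of sets, A as a list of atom sets, R as a list holding, per document, the indices of the atoms used to represent it, and a_index_to_ignore as a single atom index).

```python
def get_error_sketch(D, A, R, a_index_to_ignore):
    """Return, per document, the elements not covered by its atoms.

    Assumed layout (hypothetical, not confirmed by the source):
      D: list of sets, one per document
      A: list of sets, one per atom
      R: list of iterables of atom indices, one per document
      a_index_to_ignore: a single atom index to leave out
    """
    errors = []
    for di, ri in zip(D, R):
        covered = set()
        for a_index in ri:
            # Skip the atom we were asked to ignore.
            if a_index != a_index_to_ignore:
                covered |= A[a_index]
        # Whatever the remaining atoms do not cover is the error.
        errors.append(set(di) - covered)
    return errors


example_D = [{1, 2, 3}, {2, 4}]
example_A = [{1, 2}, {3}, {4}]
example_R = [{0, 1}, {2}]
example_errors = get_error_sketch(example_D, example_A, example_R, 1)
```

Under these assumptions, ignoring atom 1 leaves element 3 of the first document uncovered, while the second document is missing element 2 regardless.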

magichour.lib.PARIS.paris.get_error2(D, A, R, a_index_to_ignore)

For each document in D, compute the elements not represented by the current Atom/Representation (ignoring the atoms we are supposed to ignore)

magichour.lib.PARIS.paris.main()
magichour.lib.PARIS.paris.paris_distance(di, A, R, r_count)

Compute the PARIS distance function

magichour.lib.PARIS.paris.run_paris_on_document(log_file, window_size=20.0, line_count_limit=None, groups_to_skip=set([-1]))
magichour.lib.PARIS.paris.test_with_syntheticdata()
magichour.lib.PARIS.paris.xor(s1, s2, skip_in_s2=0)

XOR at the core of the PARIS distance function. Takes in two sets and computes their symmetric difference, allowing up to skip_in_s2 items to be missing.
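
A minimal sketch of one reading of this behavior follows. It is an assumption rather than the library's implementation: the name xor_sketch is hypothetical, the result is taken to be a count, and "missing" is interpreted as items of s2 that are absent from s1.

```python
def xor_sketch(s1, s2, skip_in_s2=0):
    """Size of the symmetric difference of s1 and s2, with slack.

    Assumption (not confirmed by the source): up to skip_in_s2 items
    that appear only in s2 are forgiven and do not count as differences.
    """
    only_in_s1 = s1 - s2
    only_in_s2 = s2 - s1
    # Never forgive more items than are actually unmatched in s2.
    forgiven = min(skip_in_s2, len(only_in_s2))
    return len(only_in_s1) + len(only_in_s2) - forgiven


plain = xor_sketch({1, 2, 3}, {2, 3, 4, 5})
with_slack = xor_sketch({1, 2, 3}, {2, 3, 4, 5}, skip_in_s2=1)
```

With skip_in_s2=0 this reduces to the size of the ordinary symmetric difference, len(s1 ^ s2).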

Module contents