magichour.api.local.util.compress package

Submodules

magichour.api.local.util.compress.compress module

class magichour.api.local.util.compress.compress.LogLine(ts, text, processed, dictionary, supportId)

Bases: tuple

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__getstate__()

Exclude the OrderedDict from pickling

__repr__()

Return a nicely formatted representation string

dictionary

Alias for field number 3

processed

Alias for field number 2

supportId

Alias for field number 4

text

Alias for field number 1

ts

Alias for field number 0
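
The field aliases above are what `collections.namedtuple` generates. As a minimal sketch of how such a record might be declared and used (the sample values are invented for illustration and do not come from the module):

```python
from collections import namedtuple

# Field order matches the documented aliases:
# ts=0, text=1, processed=2, dictionary=3, supportId=4.
LogLine = namedtuple('LogLine', ['ts', 'text', 'processed', 'dictionary', 'supportId'])

# Hypothetical sample values, for illustration only.
line = LogLine(ts=1455000000.0,
               text='sshd[42]: session opened',
               processed='sshd[42]: session opened',
               dictionary=None,
               supportId=None)

# Named tuples are immutable; _replace returns an updated copy.
matched = line._replace(supportId=3)
print(matched.ts, matched[4])
```

Fields can be read by name or by position, so `matched.supportId` and `matched[4]` refer to the same value.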

class magichour.api.local.util.compress.compress.LogSupport(supportId, pattern)

Bases: tuple

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__getstate__()

Exclude the OrderedDict from pickling

__repr__()

Return a nicely formatted representation string

pattern

Alias for field number 1

supportId

Alias for field number 0

class magichour.api.local.util.compress.compress.OutLine(ts, supportId, dictionary)

Bases: tuple

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__getstate__()

Exclude the OrderedDict from pickling

__repr__()

Return a nicely formatted representation string

dictionary

Alias for field number 2

supportId

Alias for field number 1

ts

Alias for field number 0

class magichour.api.local.util.compress.compress.TransformLine(id, type, NAME, transform)

Bases: tuple

NAME

Alias for field number 2

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__getstate__()

Exclude the OrderedDict from pickling

__repr__()

Return a nicely formatted representation string

id

Alias for field number 0

transform

Alias for field number 3

type

Alias for field number 1

magichour.api.local.util.compress.compress.escapeCrap(x)

Escape a string so that it can be used safely inside a regex.

Parameters:x (string) – string with possibly unsafe characters
Returns:retval – string with the unsafe characters escaped
Return type:string
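
The standard library's `re.escape` gives this kind of behavior. Whether `escapeCrap` uses it internally is not stated here, so the helper below (`escape_for_regex`, a hypothetical name) only illustrates the idea and is not the module's code:

```python
import re

def escape_for_regex(x):
    # Hypothetical stand-in for escapeCrap: backslash-escape regex
    # metacharacters so the text is matched literally.
    return re.escape(x)

raw = 'value=(1+2)*3'
pattern = re.compile(escape_for_regex(raw))

# The escaped pattern matches the original text literally, while the
# unescaped text, read as a regex, does not match itself.
literal_ok = pattern.fullmatch(raw) is not None
unescaped_ok = re.compile(raw).fullmatch(raw) is not None
print(literal_ok, unescaped_ok)
```
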
magichour.api.local.util.compress.compress.getWordSkipNames(s)

Find the skip-word patterns in a compiled log-line regex.

Parameters:s (_sre.SRE_Pattern) – compiled regex to match a logline
Returns:retval – list of the skip patterns found in s
Return type:list(string)
magichour.api.local.util.compress.compress.main(argv)

Create a section of format strings [header], followed by a section of timestamp,formatstringID,args records [data].

Use the supports from LogCluster and the preprocessing rules to generate the list of format strings for the header.

For each log line, find the replacements that would be made, along with the skip words. Store the timestamp of the original message, a reference to the format string, and the arguments needed to fill in the format string.

Arguments are currently stored as a dict(list). Storing them in usage order would save space, as would using real integers instead of string representations of integers.

Output layout:

    [header]
    number of headerValues
    header value1
    ...
    header valueN
    [format args]
    timestamp,headerID (index to header), args

Parameters:argv (list(string)) – arguments sent to the program
Returns:None
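
The header-then-data layout described for `main` can be sketched as follows. The exact delimiters and argument encoding are not specified above, so the comma-joined records and the helper name `write_compressed` are assumptions made for illustration:

```python
import io

def write_compressed(format_strings, records, out):
    # Header section: the number of format strings, then one per line.
    out.write('%d\n' % len(format_strings))
    for fmt in format_strings:
        out.write(fmt + '\n')
    # Data section: timestamp, index into the header, then the arguments.
    # Comma-joining the fields is an assumed encoding.
    for ts, header_id, args in records:
        out.write(','.join([str(ts), str(header_id)] + list(args)) + '\n')

buf = io.StringIO()
write_compressed(['user %s logged in from %s'],
                 [(1455000000, 0, ['alice', '10.0.0.1'])],
                 buf)
print(buf.getvalue())
```

Each data record carries only an index into the header plus its arguments, so a format string shared by many log lines is written once.
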
magichour.api.local.util.compress.compress.makeReplacement(s)

Find an escaped version of the skip{m,n} words and replace it with the unescaped version.

Parameters:s (string) – string to search
Returns:retval – string with replacement
Return type:string
magichour.api.local.util.compress.compress.makeTransformedLine(l, transforms)

Perform a series of regex replacements on a LogLine. Store the list of pre-processing replacements in a dictionary.

Parameters:
  • l (LogLine) – namedTuple containing a logline
  • transforms (list(TransformLine)) – series of regex transforms to apply
Returns:

retval – LogLine namedTuple with replacements made

Return type:

LogLine
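
A minimal sketch of this kind of transform pass, assuming each `TransformLine.transform` holds a compiled regex and each match is replaced by the transform's `NAME`. The function name `apply_transforms`, the `'REPLACE'` type value, and the sample data are hypothetical:

```python
import re
from collections import namedtuple

LogLine = namedtuple('LogLine', ['ts', 'text', 'processed', 'dictionary', 'supportId'])
TransformLine = namedtuple('TransformLine', ['id', 'type', 'NAME', 'transform'])

def apply_transforms(line, transforms):
    # Hypothetical stand-in for makeTransformedLine.
    processed = line.text
    replaced = {}
    for t in transforms:
        # Record what each transform matched, keyed by the transform's NAME.
        found = t.transform.findall(processed)
        if found:
            replaced.setdefault(t.NAME, []).extend(found)
            processed = t.transform.sub(t.NAME, processed)
    # Named tuples are immutable, so return an updated copy.
    return line._replace(processed=processed, dictionary=replaced)

ip = TransformLine(1, 'REPLACE', 'IPADDR', re.compile(r'\d{1,3}(?:\.\d{1,3}){3}'))
line = LogLine(0.0, 'login from 10.0.0.1', None, None, None)
result = apply_transforms(line, [ip])
print(result.processed, result.dictionary)
```

Keeping the replaced values in a dictionary of lists means the original text can be rebuilt later from the normalized line plus its arguments.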

magichour.api.local.util.compress.compress.matchSupport(procLogLine, support)

Determine whether the support is matched, and whether there are word-skip replacements which need to be tracked for a matched logsupport line.

Parameters:
  • procLogLine (LogLine) – logline to further investigate
  • support (_sre.SRE_Pattern) – compiled regex of the support to test against
Returns:

retval – logline named tuple with skip words and preprocessing words assigned to a specific support or -1 if no support exists

Return type:

LogLine

magichour.api.local.util.compress.compress.openFile(name, mode)

wrapper for file open, will handle reading/writing gzip files

Parameters:
  • name (string) – name of file to open, if name ends in ‘.gz’ file is treated as a gzip file for i/o
  • mode (string) – mode to open the file as
Returns:

retval – filehandle

Return type:

file
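
A wrapper of this kind can be sketched with the standard `gzip` module. This is an illustration written for Python 3 text modes (`'rt'`, `'wt'`), and the name `open_maybe_gzip` is hypothetical; it is not the module's own code:

```python
import gzip
import os
import tempfile

def open_maybe_gzip(name, mode):
    # Names ending in '.gz' go through gzip; everything else uses open().
    if name.endswith('.gz'):
        return gzip.open(name, mode)
    return open(name, mode)

# Round trip through a temporary gzip file.
path = os.path.join(tempfile.mkdtemp(), 'sample.log.gz')
with open_maybe_gzip(path, 'wt') as fh:
    fh.write('hello\n')
with open_maybe_gzip(path, 'rt') as fh:
    content = fh.read()
print(repr(content))
```

Callers use the returned handle the same way in both cases, so the rest of the pipeline does not need to know whether a file is compressed.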

magichour.api.local.util.compress.compress.procSupports(l)

Read the LogCluster supports output file format and turn each line into a regex so that log lines can be categorized.

Parameters:l (list(strings)) – list of lines in the procSupports output file
Returns:retval – list of compiled regex to search for
Return type:list(_sre.SRE_Pattern)
magichour.api.local.util.compress.compress.readTransforms(sFile)

Read the preprocessing transforms from a file. Lines starting with a # are considered comments and are skipped.

Parameters:sFile (file) – filehandle to the transform file
Returns:retval – a listing of TransformLine named tuples
each line describes a normalization regex
Return type:list(TransformLine)
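
The comment-skipping read can be sketched as below. The on-disk layout of the transform file is not described here, so the comma-separated `id,type,NAME,regex` layout, the `'REPLACE'` type value, and the name `read_transforms` are assumptions for illustration only:

```python
import io
import re
from collections import namedtuple

TransformLine = namedtuple('TransformLine', ['id', 'type', 'NAME', 'transform'])

def read_transforms(handle):
    transforms = []
    for raw in handle:
        line = raw.strip()
        # Skip comments; blank lines are also skipped (an added assumption).
        if not line or line.startswith('#'):
            continue
        # Assumed layout: id,type,NAME,regex. Split at most three times
        # so that the regex itself may contain commas.
        t_id, t_type, name, pattern = line.split(',', 3)
        transforms.append(TransformLine(int(t_id), t_type, name, re.compile(pattern)))
    return transforms

sample = io.StringIO('# id,type,NAME,regex\n1,REPLACE,PORT,port \\d{2,5}\n')
loaded = read_transforms(sample)
print(loaded[0].NAME, loaded[0].transform.pattern)
```
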
magichour.api.local.util.compress.compress.writeHeader(supports, oFile)

write the format strings in order

Parameters:
  • supports (list(_sre.SRE_Pattern)) – list of compiled regex
  • oFile (file) – file handle for io
Returns:

None

magichour.api.local.util.compress.compress.writeOutput(o, oFile)

Write arguments to the output file

Parameters:
  • o (OutLine) – namedTuple to output
  • oFile (file) – filehandle for io
Returns:

None

Module contents