magichour.api.local.util.compress package
Submodules
magichour.api.local.util.compress.compress module
-
class magichour.api.local.util.compress.compress.LogLine(ts, text, processed, dictionary, supportId)
Bases: tuple
-
__getnewargs__()
Return self as a plain tuple. Used by copy and pickle.
-
__getstate__()
Exclude the OrderedDict from pickling
-
__repr__()
Return a nicely formatted representation string
-
dictionary
Alias for field number 3
-
processed
Alias for field number 2
-
supportId
Alias for field number 4
-
text
Alias for field number 1
-
ts
Alias for field number 0
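The field aliases listed for LogLine are what `collections.namedtuple` generates. The following is a minimal sketch, assuming the class is declared that way; the sample values are invented for illustration and are not taken from the package.

```python
from collections import namedtuple

# Assumed declaration; the field order follows the "Alias for field number" entries.
LogLine = namedtuple("LogLine", ["ts", "text", "processed", "dictionary", "supportId"])

# Sample values are made up for this illustration.
line = LogLine(ts=1.0, text="eth0 down", processed="eth0 down",
               dictionary=None, supportId=None)

# Each field is reachable by name or by position.
same_field = line.text == line[1]

# Named tuples are immutable; _replace returns an updated copy.
tagged = line._replace(supportId=3)
```

The same pattern would apply to LogSupport, OutLine and TransformLine with their own field lists.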
-
-
class magichour.api.local.util.compress.compress.LogSupport(supportId, pattern)
Bases: tuple
-
__getnewargs__()
Return self as a plain tuple. Used by copy and pickle.
-
__getstate__()
Exclude the OrderedDict from pickling
-
__repr__()
Return a nicely formatted representation string
-
pattern
Alias for field number 1
-
supportId
Alias for field number 0
-
-
class magichour.api.local.util.compress.compress.OutLine(ts, supportId, dictionary)
Bases: tuple
-
__getnewargs__()
Return self as a plain tuple. Used by copy and pickle.
-
__getstate__()
Exclude the OrderedDict from pickling
-
__repr__()
Return a nicely formatted representation string
-
dictionary
Alias for field number 2
-
supportId
Alias for field number 1
-
ts
Alias for field number 0
-
-
class magichour.api.local.util.compress.compress.TransformLine(id, type, NAME, transform)
Bases: tuple
-
NAME
Alias for field number 2
-
__getnewargs__()
Return self as a plain tuple. Used by copy and pickle.
-
__getstate__()
Exclude the OrderedDict from pickling
-
__repr__()
Return a nicely formatted representation string
-
id
Alias for field number 0
-
transform
Alias for field number 3
-
type
Alias for field number 1
-
-
magichour.api.local.util.compress.compress.escapeCrap(x)
Perform regex-safe escaping.
Parameters: x (string) – string with possibly unsafe characters
Returns: retval – string with escapements
Return type: string
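The exact set of characters escapeCrap handles is not documented here. As a rough sketch of the idea, the standard library's `re.escape` performs regex-safe escaping; `escape_for_regex` below is a hypothetical stand-in, not the package's implementation.

```python
import re

def escape_for_regex(x):
    """Hypothetical stand-in for escapeCrap: backslash-escape regex metacharacters."""
    return re.escape(x)

raw = "rate=1.5 (approx)"
pattern = re.compile(escape_for_regex(raw))

# After escaping, the pattern matches the original text literally...
matches_literal = pattern.fullmatch(raw) is not None
# ...and "." no longer acts as a wildcard.
matches_other = pattern.fullmatch("rate=1x5 (approx)") is not None
```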
-
magichour.api.local.util.compress.compress.getWordSkipNames(s)
Find the skip word patterns.
Parameters: s (_sre.SRE_Pattern) – compiled regex to match a logline
Returns: retval – list of the skip patterns found in s
Return type: list(string)
-
magichour.api.local.util.compress.compress.main(argv)
Create a section of format strings [header] followed by a section of timestamp,formatstringID,args [data].
Use the supports from LogCluster and the preprocessing rules to generate a listing of format strings for the header.
For each of the log lines, find the replacements which would be made, along with the skip words. Store the timestamp of the original message, a reference to the format string, and the arguments needed to fill in the format string.
Currently this stores a dict(list); space could be saved by storing things in usage order instead, and by using real integers instead of string representations of integers.
[header]
number of headerValues
header value1
...
header valueN
[format args]
timestamp,headerID (index to header),args
Parameters: argv (list(string)) – arguments sent to the program
Returns: None
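To make the two-section output concrete, here is a minimal sketch of a writer for that layout. It assumes that [header] and [format args] are labels for the sections rather than literal lines in the file, and that each record goes on its own line; `write_compressed` and the sample values are hypothetical.

```python
import io

def write_compressed(header_values, rows, out):
    """Write the header section, then the data section (assumed layout)."""
    # Header: a count, then one format string per line.
    out.write("%d\n" % len(header_values))
    for value in header_values:
        out.write(value + "\n")
    # Data: timestamp, index into the header, then the arguments.
    for ts, header_id, args in rows:
        out.write("%s,%d,%s\n" % (ts, header_id, args))

buf = io.StringIO()
write_compressed(
    ["Interface * down", "Login from *"],                        # invented format strings
    [("1462000000", 0, "eth0"), ("1462000005", 1, "10.0.0.7")],  # invented records
    buf,
)
lines = buf.getvalue().splitlines()
```

Each data row stores only an index and the arguments, so a format string shared by many log lines is written once.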
-
magichour.api.local.util.compress.compress.makeReplacement(s)
Find an escaped version of skip{m,n} words and replace it with the unescaped version.
Parameters: s (string) – string to search
Returns: retval – string with replacement
Return type: string
-
magichour.api.local.util.compress.compress.makeTransformedLine(l, transforms)
- perform a series of regex replacements on a LogLine
- store a list of pre-processing replacements in a dictionary
Parameters: - l (LogLine) – namedTuple containing a logline
- transforms (list(TransformLine)) – series of regex transforms to apply
Returns: retval – LogLine namedTuple with replacements made
Return type: LogLine
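The two steps described for makeTransformedLine can be sketched as follows. This is an illustration under assumptions, not the package's code: it assumes that `transform` holds a compiled regex, that matches are replaced by the transform's NAME, and that the dictionary maps NAME to the list of replaced values. The "REPLACE" type and the IPv4 pattern are invented.

```python
import re
from collections import namedtuple

LogLine = namedtuple("LogLine", ["ts", "text", "processed", "dictionary", "supportId"])
TransformLine = namedtuple("TransformLine", ["id", "type", "NAME", "transform"])

def apply_transforms(line, transforms):
    """Replace each regex match with the transform NAME and record what was replaced."""
    processed = line.text
    replaced = {}
    for t in transforms:
        found = t.transform.findall(processed)
        if found:
            # Keep the original values so they can be restored later.
            replaced.setdefault(t.NAME, []).extend(found)
            processed = t.transform.sub(t.NAME, processed)
    return line._replace(processed=processed, dictionary=replaced)

# Hypothetical transform: normalise IPv4 addresses to the token IPADDR.
ipv4 = TransformLine(1, "REPLACE", "IPADDR", re.compile(r"\d{1,3}(?:\.\d{1,3}){3}"))
line = LogLine(0.0, "login from 10.0.0.7 to 10.0.0.1", None, None, None)
result = apply_transforms(line, [ipv4])
```

Keeping the replaced values in order lets the original line be rebuilt from the processed text plus the dictionary.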
-
magichour.api.local.util.compress.compress.matchSupport(procLogLine, support)
Determine whether the support is matched, and whether there are word-skip replacements which need to be tracked for a matched LogSupport line.
Parameters: - procLogLine (LogLine) – logline to further investigate
- support (_sre.SRE_Pattern) –
Returns: retval – logline named tuple with skip words and preprocessing words assigned to a specific support, or -1 if no support exists
Return type:
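The match-or-return--1 contract might look like the sketch below. It makes several assumptions that the documentation does not confirm: skip words are captured as named regex groups, they are merged into the line's dictionary, and the support id is passed in explicitly through a hypothetical `support_id` parameter that the real function does not take.

```python
import re
from collections import namedtuple

LogLine = namedtuple("LogLine", ["ts", "text", "processed", "dictionary", "supportId"])

def match_support(line, support, support_id):
    """Return an updated LogLine if the support regex matches, otherwise -1."""
    m = support.match(line.processed)
    if m is None:
        return -1
    merged = dict(line.dictionary or {})
    # Assumption: skip words are exposed as named groups of the support regex.
    merged.update(m.groupdict())
    return line._replace(dictionary=merged, supportId=support_id)

support = re.compile(r"Interface (?P<skip0>\S+) down")  # invented support pattern
line = LogLine(0.0, "Interface eth0 down", "Interface eth0 down", {}, None)

hit = match_support(line, support, 2)
miss = match_support(line._replace(processed="Login ok"), support, 2)
```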
-
magichour.api.local.util.compress.compress.openFile(name, mode)
Wrapper for file open that also handles reading and writing gzip files.
Parameters: - name (string) – name of file to open; if name ends in ‘.gz’ the file is treated as a gzip file for I/O
- mode (string) – mode to open the file as
Returns: retval – filehandle
Return type: file
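A wrapper of this kind can be sketched with the standard `gzip` module. `open_maybe_gzip` is a hypothetical name, and the text modes "wt" and "rt" assume Python 3; the original function may behave differently.

```python
import gzip
import os
import tempfile

def open_maybe_gzip(name, mode):
    """Open name with gzip when it ends in '.gz', otherwise with the built-in open."""
    if name.endswith(".gz"):
        return gzip.open(name, mode)
    return open(name, mode)

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.log.gz")

# Write and read back through the same wrapper.
with open_maybe_gzip(path, "wt") as handle:
    handle.write("hello\n")
with open_maybe_gzip(path, "rt") as handle:
    round_trip = handle.read()

# The file on disk starts with the gzip magic bytes.
with open(path, "rb") as handle:
    magic = handle.read(2)
```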
-
magichour.api.local.util.compress.compress.procSupports(l)
Read the LogCluster supports output file format and turn each line into a regex so log lines can be categorized.
Parameters: l (list(string)) – list of lines in the procSupports output file
Returns: retval – list of compiled regexes to search for
Return type: list(_sre.SRE_Pattern)
-
magichour.api.local.util.compress.compress.readTransforms(sFile)
Read the preprocessing transforms from file. Lines starting with a # are considered comments and are skipped.
Parameters: sFile (file) – filehandle to the transform file
Returns: retval – a listing of TransformLine named tuples; each line describes a normalization regex
Return type: list(TransformLine)
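The on-disk layout of the transform file is not specified in this reference. The sketch below assumes one comma-separated record per line in the order id,type,NAME,transform; that layout, the helper name `read_transforms` and the sample record are illustrative assumptions only.

```python
import io
from collections import namedtuple

TransformLine = namedtuple("TransformLine", ["id", "type", "NAME", "transform"])

def read_transforms(handle):
    """Parse transform records, skipping lines that start with '#'."""
    transforms = []
    for raw in handle:
        stripped = raw.strip()
        # Comment lines are skipped; blank lines are skipped too (an assumption).
        if not stripped or stripped.startswith("#"):
            continue
        # maxsplit=3 keeps any commas inside the regex itself intact.
        ident, kind, name, regex = stripped.split(",", 3)
        transforms.append(TransformLine(int(ident), kind, name, regex))
    return transforms

sample = io.StringIO("# id,type,NAME,transform\n1,REPLACE,PORT,port \\d{1,5}\n")
parsed = read_transforms(sample)
```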
-
magichour.api.local.util.compress.compress.writeHeader(supports, oFile)
Write the format strings in order.
Parameters: - supports (list(_sre.SRE_Pattern)) – list of compiled regex
- oFile (file) – file handle for io
Returns: None