My coworker approached me the other day and asked which open source log analysis tools I would recommend. I don't have much experience with general-purpose open source log analysis tools, so I would probably have pointed him to Splunk. Since I had recently written some custom log analysis software, I became curious and asked what he intended to do with such a tool.
My coworker said that he needed to analyze Tibco EMS logs. Tibco EMS logs incoming messages in the order it receives them. He is interested in sets of related messages that are identified by a message ID tag. His particular problem is that the log entries he cares about are interspersed with entries he doesn't care about. He wanted a log file in which the entries are grouped by message ID, with each group kept in chronological order.
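To make the requirement concrete, here is a minimal sketch of the grouping step on its own. The `Line` type, `groupById`, and the sample data are hypothetical and only for illustration; real Tibco entries are richer than an (id, text) pair. It relies on `sortBy` from `Data.List` being a stable sort, so lines that share an ID keep the order in which they were logged.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Hypothetical simplified log line: (message id, text).
type Line = (String, String)

-- sortBy is stable, so lines with the same id stay in their original order.
groupById :: [Line] -> [Line]
groupById = sortBy (comparing fst)

-- Made-up interleaved input, in the order it was "logged".
sample :: [Line]
sample =
  [ ("msg-2", "received")
  , ("msg-1", "received")
  , ("msg-2", "delivered")
  , ("msg-1", "delivered")
  ]

main :: IO ()
main = mapM_ (\(i, t) -> putStrLn (i ++ " " ++ t)) (groupById sample)
```

Running this lists both `msg-1` lines first, then both `msg-2` lines, each pair still in received-then-delivered order.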
Once I understood his needs, I realized that he did not need Splunk and that I could quickly adapt the F# log analysis software from my previous blog post. When I gave him the modified F# code, he asked whether I could port it to Linux. That threw me for a loop. I briefly entertained the idea of setting up Mono and compiling the F# code on it, but decided against it for now. I thought it would be easier to just port it to Haskell, which I already have on Linux.
Here's the ported Haskell log analysis software, modified to work with Tibco log entries.
```haskell
import Data.Time.Calendar
import Data.Time.LocalTime
import Data.Time.Parse          -- strptime, from the strptime package
import Data.List (sortBy)
import System.Environment

type Category  = String
type Entry     = [String]
type TimeStamp = (LocalTime, String)
type LogHeader = (TimeStamp, Category)

{- Placeholder timestamp for the dummy entry that seeds the parser -}
alphaTime = LocalTime (fromGregorian 2000 1 1) midnight

data LogEntry = LogEntry (TimeStamp, String) [String]
  deriving (Show)

{- Grab label: the third whitespace-separated token -}
categorize (_ : _ : label : _) = label
categorize words = ""

{- Grab timestamp: the first two tokens, if they parse as a date and time -}
timestamp (date : time : _) = strptime "%Y-%m-%d %H:%M:%S" (date ++ " " ++ time)
timestamp words = Nothing

{- header :: String -> (Maybe (LocalTime, String), String) -}
header line = (timestamp tokens, categorize tokens)
  where tokens = words line

{- Concrete implementation of Tibco log parser -}
logparser :: [String] -> LogHeader -> [String] -> [LogEntry] -> [LogEntry]
logparser (line : rest) xheader entry entries = process (header line)
  where
    {- A line with a timestamp closes the current entry and starts a new one -}
    process (Just ts, label) =
      logparser rest (ts, label) [line] (LogEntry xheader (reverse entry) : entries)
    {- Any other line is a continuation of the current entry -}
    process (Nothing, _) = logparser rest xheader (line : entry) entries
{- Lines are accumulated newest-first, so reverse them when closing the last entry -}
logparser [] xheader entry entries =
  reverse (LogEntry xheader (reverse entry) : entries)

{- Utility methods to pull items out of LogEntry -}
entry (LogEntry _ entries) = entries
category (LogEntry (_, label) _) = label

{- comparator based on category -}
categorysort (LogEntry (_, a) _) (LogEntry (_, b) _)
  | a > b     = GT
  | a < b     = LT
  | otherwise = EQ

parselog parser lines = parser lines ((alphaTime, ".000"), "STARTFLAG") [] []

{- sortBy is stable, so entries within a category stay in log order -}
processlog = unlines . map (unlines . entry) . sortBy categorysort . (parselog logparser) . lines

main = do
  (filename : _) <- getArgs
  contents <- readFile filename
  putStr (processlog contents)
```
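The accumulator-passing style of `logparser` can be hard to follow at a glance. As a rough sketch of the same idea, the snippet below splits lines into entries with `break` instead of explicit accumulators. It is not the code used above: `isHeader`, `splitEntries`, and the sample lines are hypothetical, and the timestamp check is replaced by the much cruder assumption that a header line starts with a digit.

```haskell
import Data.Char (isDigit)

-- Hypothetical stand-in for the strptime check:
-- treat a line as a header if it starts with a digit.
isHeader :: String -> Bool
isHeader (c : _) = isDigit c
isHeader []      = False

-- Each entry is a starting line plus every following non-header line.
-- A leading run of non-header lines becomes an entry of its own.
splitEntries :: [String] -> [[String]]
splitEntries []       = []
splitEntries (l : ls) = (l : continuation) : splitEntries rest
  where (continuation, rest) = break isHeader ls

-- Made-up sample input, not real Tibco EMS output.
sample :: [String]
sample =
  [ "2010-01-01 10:00:00 id1 start"
  , "  continuation line"
  , "2010-01-01 10:00:01 id2 start"
  ]

main :: IO ()
main = mapM_ print (splitEntries sample)
```

With the sample input, the first entry holds the `id1` header and its continuation line, and the second entry holds only the `id2` header.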