Monday, March 21, 2011

Log Analysis with F#

I am often pulled into investigative teams to resolve performance issues. In one particular instance, I was involved in troubleshooting a Java-based batch job that had experience performance issue after it migrated from an older database platform to a newer database platform. Typically in these performance related issues, I would analyze the logs to see if I can spot tell tale signs of performance issues. Some of these log files can range up to millions of lines of text. In this particular case, I received a log file with of almost 4 million lines of text. I do not want to wade through these log files by hand. In the past, I would have use a scripting language like perl or ruby to help parse these logs. But lately, I have leaned more towards F# to analyze log files. The following describes my experience with developing my log analysis software in F#.

The log files that I have received have sample entries with the following pattern:

02/04/11 21:15:55 xyz 123 CATEGORY_LABEL Lorem ipsum dolor sit amet 
    Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
    Aliquam varius mollis purus, vel molestie lorem pellentesque sed. 

where CATEGORY_LABEL categorizes the jobs performed and each log entry always starts with a timestamp. Typically, multiple jobs are running at the same time so the log entries have data interspersed. To start this effort, I want to capture the timestamp information, the category label and the entire log entry. To model this, I created the following type to capture a log entry:

type LogEntry =
        Timestamp : DateTime;
        Category  : string;
        Entry     : string list;

I want to be able to easily access Timestamp and Category label information while retaining the entire log entry for that timestamp/category. I decided to model the log entry as a list of strings. I had hope to leverage list operations later on to perform filtering capability. One of the drawbacks to this model for log entry is if the log file had text that split across 2 lines.

The first problem is to extract the timestamp and the category information. To capture the timestamp information, I implemented the following functions:

let parsedate datestring = DateTime.TryParse(datestring)

let getTimestamp (words:string []) =
    if words.Length > 2 then
        let timestamp = sprintf "%s %s" words.[0] words.[1]
        match parsedate timestamp with
            | true, d -> Some(d)
            | false,_ -> None
    else None

I leveraged F#'s Option type to return either a valid timestamp or None if no timestamp are found. You can see later how I use this as a filter to determine if I'm at the start of a log entry or not. Here's the code to get the category label:

let getCategory (words:string []) =
    if words.Length > 4 then 
        let tokens = words.[4].Split[|':'|]
    // Once in a while, there is log entry with no labels
    else ""

I know that the category label always is the 5th token, but once in a while the category label shows up in the log as CATEGORY_LABEL:Lorem, therefore I have to split the word with the token ":" to extract just the category label.

With these 2 functions, I can now build my getheader function which would extract the timestamp and category label:

// Get the timestamp and the job category     
let getHeader (line:string) =
    let words = line.Trim().Split[| ' ' |]
    let timestamp = getTimestamp words
    let category = getCategory words

I now have all the pieces to be able to build my log parser:

// Utility function to create log entry
let createLogEntry (timestamp,category)  entry =
    {Timestamp = timestamp;  Category=category;  Entry = List.rev entry}      
// Core parser engine to parse the log files
let rec logparser lines oldheader entry (acc:LogEntry list) =
    match lines with
    | line :: rest -> 
        // Check with we got new timestamp with this line
        let header = getHeader line
        match header with
        | (Some(ts),category) -> 
            // Found new timestamp,create log entry from the collected lines
            let logentry = createLogEntry oldheader (line::entry)
            //  Begin new log entry collections
            logparser rest (ts,category) [] (logentry::acc)
        | (None,_) -> 
            // Add new entry to line
            logparser rest oldheader (line::entry) acc
    // We're done with all line processing!
    | [] -> 
        let logentry = createLogEntry oldheader entry
        logentry :: acc |> List.rev

This log parser would recursively processed each line in the log file and build a list of LogEntry to return. When I find a header, I would begin a new LogEntry. Because I built the list using the :: (cons) operator, I need to reverse the list when I finish building the list. I would use the above parser engine as shown in the following code, which would return to me list of all log entries.

let parselog file parser =
    let lines = File.ReadAllLines(file)
    let past = DateTime.Now
    parser (Array.toList lines) (past,"") [] []

Now I have all the pieces to begin performing log analysis. I typically begin by chopping the log into smaller pieces. In the following example, I'm looking for a specific category on specific dates, one before the database migration and one after the database migration.

let (_,premigration) = DateTime.TryParse("03/11/2011")
let (_,postmigration) = DateTime.TryParse("03/14/2011")

// Extraction engine to pull logs related to specific job @ specific date
let extract targetdate label file processor =
    parselog file logparser
    |> List.filter (fun x -> x.Timestamp.Date.Equals(targetdate))
    |> List.filter (fun x -> x.Category.Trim().Equals(label))
    |> processor

// End log entry text to file
let send2file file entries=
    |> (fun x -> x.Entry)
    |> List.concat
    |> List.toArray
    |> writefile file

// smaller working file - premigration for JOB A
(send2file @"JOBA.premigration.log") |>
extract premigration.Date "JOBA" @"jobs.log" 

// smaller working file - postmigration for JOB A
(send2file @"JOBA.postmigration.log") |>
extract postmigration.Date "JOBA" @"jobs.log" 

Here's another example usage where I wanted to grab clusters of log entries along the time dimension (e.g. unique sets of Timestamp,Category pairs).

let getEntryClusters targetdate infile outfile  =
    parselog infile logparser
    |> List.filter (fun x -> x.Timestamp.Date.Equals(targetdate))
    |> (fun entry -> (entry.Timestamp,entry.Category))
    |> Set.ofList |> Set.toList
    |> (fun (ts,category) -> sprintf "%s,%s" (ts.ToString()) category)
    |> List.toArray 
    |> writefile outfile

getEntryClusters postmigration.Date @"jobs.log" @"clusters.log"

The interactive capability of F# in Visual Studio made building this log analysis tool a lot easier than I expected. While I can do similar interactive work with Ruby's irb, or Python's interactive mode, Visual Studio and F# just makes it feel much more productive than either of Ruby or Python's interactive environment. If I had to analyze another log file with a different format, I can certainly reuse my code and maybe just changed the logparser implementation. Indeed, just a few days after later, I was able to apply this code to look into some Tibco logs with minimal amount of changes.