I have been reading through the book Clojure Programming. While reading it, I've looked for every opportunity to apply Clojure at work. Most of the time, I've used Clojure to implement convenience scripts that help with my day job. I typically use Clojure whenever I have to work with Java-based platforms and F# for .NET-based platforms. Occasionally, I develop scripts that have no platform dependency, so I can implement them in any language. What typically happens is that I choose the programming language I used last. This strategy, unfortunately, ends up biasing me toward one programming language, and lately it has been biasing me toward Clojure.
After noticing this trend, I decided to deliberately implement such scripts in the less frequently used language so I don't become completely rusty in the other programming languages.
Recently, I had the opportunity to write a small script. I was managing an infrastructure upgrade and needed to know the downstream impact. The infrastructure component had a lot of persistent inbound connections but, unfortunately, those connections were neither monitored nor documented. One way to check the connections is to ask the network engineers to set up monitoring on the servers and collect information on the incoming connections. Our network engineers are generally pretty busy, and we hate to add to their existing workload. However, we can effectively do the same thing by running netstat -an on each of the target servers, taking the output dump, and parsing it for incoming connections. We would do this over a period of time to try to capture most of the client connections.
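As a quick illustration of the parsing step, here is a minimal sketch that applies the regular expression used in this post to a single line. The sample line is made up for illustration and assumes the dump prints addresses in the address.port layout that the expression expects; real netstat output differs between platforms.

```clojure
; Hypothetical netstat -an line (not taken from a real dump),
; assuming addresses are printed as address.port
(def sample-line "tcp  0  0  10.0.0.5.8080  10.0.0.17.51234  ESTABLISHED")

; Capture the IPv4 part of every address.port pair on the line
(def ips
  (map second
       (re-seq #"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.\d+" sample-line)))
```

Both the local and the remote address match the pattern, so the server's own address shows up in the result alongside its clients.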
The following Clojure script loads all the netstat dump output files and generates a list of all the hosts that are connected to the target servers:
(import '(java.net InetAddress))
(require '[clojure.string :refer [join]])
(require '[clojure.java.io :as io])

; Load all the data from all *.data files in c:\work\servers folder
(def data (->> "c:\\work\\servers"
               (io/file)
               (file-seq)
               (map #(.getAbsolutePath %))
               (filter #(re-matches #".*\.data$" %))
               (map slurp)
               (join " ")))

; Find all ip addresses in the netstat dump
; Discard duplicates, perform hostname lookup, sort the hostnames
(def hosts (->> data
                (re-seq #"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.\d+")
                (map second)
                (set)
                (map #(.getCanonicalHostName (InetAddress/getByName %)))
                (sort)
                (join "\n")))

; Dump output to results.out file
(spit "c:\\work\\servers\\results.out" hosts)
The above script assumes that all the data fits into memory. If that becomes a problem, it is fairly trivial to read and process the netstat dumps sequentially, one file at a time, and combine the results before writing the output.
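A minimal sketch of that file-at-a-time approach might look like the following. The names ip-port, ips-in-file and ips-in-dir are hypothetical helpers introduced here, not part of the original script. Each file is read line by line, which is safe because the pattern never spans a line break, and only the set of distinct addresses is kept in memory.

```clojure
(require '[clojure.java.io :as io])

; Same pattern as the script above: an IPv4 address followed by ".port"
(def ip-port #"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.\d+")

; Hypothetical helper: the set of IP addresses found in one dump file.
; `set` is eager, so every line is consumed before the reader closes.
(defn ips-in-file [f]
  (with-open [rdr (io/reader f)]
    (set (mapcat #(map second (re-seq ip-port %))
                 (line-seq rdr)))))

; Hypothetical helper: union of the per-file sets for all *.data files
(defn ips-in-dir [dir]
  (->> (file-seq (io/file dir))
       (filter #(.endsWith (.getName %) ".data"))
       (map ips-in-file)
       (reduce into #{})))
```

Calling (ips-in-dir "c:\\work\\servers") would then yield the distinct addresses, which can go through the same hostname lookup, sort and join steps as before.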
The F# version is similar to the Clojure version. Grabbing the files from the folder is easier, but the need to handle exceptions explicitly adds lines back, so the code ends up about as verbose as the Clojure version.
open System.IO
open System.Net
open System.Text.RegularExpressions

// Load all the data from all *.data files in c:\work\servers folder
let data =
    Directory.GetFiles(@"c:\work\servers", "*.data")
    |> Seq.map File.ReadAllText
    |> String.concat " "

// Return hostname if it can be resolved
// otherwise return the ip address
let getHostEntry (ipaddress: string) =
    try
        Dns.GetHostEntry(ipaddress).HostName
    with
    | _ -> ipaddress

// Find all ip addresses in the netstat dump
// Discard duplicates, perform hostname lookup, sort the hostnames
let hosts =
    Regex.Matches(data, @"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.\d+")
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Groups.[1].Value)
    |> Set.ofSeq
    |> Seq.map getHostEntry
    |> Seq.sort
    |> String.concat "\n"

// Dump output to results.out file
File.WriteAllText(@"c:\work\servers\results.out", hosts)