Christoph finds exceptional log lines and takes a more literal approach.

Christoph finds exceptional log lines and takes a more literal approach.

Previously, we upgraded our log parsing to handle finding two errors instead of just one.
"It's amazing what you don't find when you're not looking."
We ended up with a set of functions that can parse out multiple errors.
The result is a nice mixed sequence of errors that we can aggregate and pivot with Clojure core.
"By the power of Clojure core, I will map and reduce!"
(02:43) New Problem: there are exceptions in the log, and they span multiple lines!
The exception continuation lines don't parse.
They don't have a date, a log level, or anything else that regular log lines have.
How do we collect up all those lines? They are all part of one logical error.
Our error parsers don't get a chance to see these lines, because the general line parser threw them away.
(06:27) Solution 1: Don't pre-parse the lines, but instead have each error parsing function do the general parse.
Each parsing function would receive the raw, unparsed, line as a string.
Each function would have to run parse-line on the inputs before doing their specific parsing.
Now each function in our inventory needs to re-do the general parsing of the line.
This ends up being much more inefficient.
(10:13) Solution 2: Relax the general parsing.
What is the general parser to do with a line that doesn't match the regexp?
"How about if the general parse does the best job it can, and whatever it can't do, it doesn't do?"
One of the keys in the data returned from parse-line is :raw/line, which is the entire line, and that's always there.
So, if the regexp fails, we can always return at least that.
The map doesn't have the keys that are not found.
Now, parse-line will always return a map for each line.
We've uplifted the data, at least slightly, from a string to a map.
We can know two things

Every map will have :raw/line
If a map doesn't have the other keys (like :log/date), that means it didn't parse, and is probably a continuation of a previous line.

This is us expanding our program's view of reality to more closely match actual reality.
We left out part of reality, and got stuck because of it.
Being more literal to the data source gives us more flexibility.
Clojure's dynamic nature shines in this situation. We don't have to have all the keys all the time.
"A map is a bucket of dimensions."
Each function that operates on the data will inspect the data and see if it has all the keys it needs. It doesn't matter if there are other keys.
Namespaced keys really help in this situation, each new level of parsing can add keys without overwriting keys from other dimensions.
How does this impact our parsing functions?
They are unchanged, because they grab the message out using some->>, which shortcuts on nil.
(19:08) Our exception error parsing function detect the start in a number of ways.

The next line is a bare line?
The first line ends in an opening curly brace?

Then, to find the end of the exception, it can use take-while to find all the lines that are bare.
To assist in finding bare lines, we can introduce a predicate function instead of embedding that logic.
When all lines are found, it then combines them into a new :log/message key.
When updating the error map in our parsing functions, we are careful to only operate on the data we know is there, so that other keys are not impacted.

Related episodes:

028: Fail Donut
029: Problem Unknown: Log Lines
030: Lazy Does It
031: Eager Abstraction
032: Call Me Lazy
033: Cake or Ice Cream? Yes!

Clojure in this episode:

some->>, take-while

Code sample from this episode:


(ns devops.week-06
(:require
[clojure.string :as string]
[devops.week-02 :refer [process-log]]
[devops.week-05 :refer [parse-357-error parse-sprinkle]]
))

(def general-re #"(\d\d\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s(.*)")

(defn parse-line
[line]
(if-let [[whole dt tm thread-name level ns message] (re-matches general-re line)]
{:raw/line whole
:log/date dt
:log/time tm
:log/thread thread-name
:log/level level
:log/namespace ns
:log/message message}
{:raw/line line
:log/message line}))

(defn bare-line?
[line]
(nil? (:log/date line)))

(defn parse-exception-info
[lines]
(let [first-line (first lines)
[_whole classname] (some->> first-line :log/message (re-matches #"([^ ]+) #error \{"))]
(when classname
(let [error-lines (cons first-line (take-while bare-line? (rest lines)))
error-str (string/join "\n" (map :log/message error-lines))]
(merge first-line
{:kind :error
:error/class classname
:log/message error-str})))))

(defn parse-next
[lines]
(or (parse-357-error lines)
(parse-sprinkle lines)
(parse-exception-info lines)))

(defn parse-all
[lines]
(lazy-seq
(when (seq lines)
(if-some [found (parse-next lines)]
(cons found (parse-all (rest lines)))
(parse-all (rest lines))))))

(comment
(process-log "sample.log" #(->> % (map parse-line) parse-all doall))
)

Log file sample:


2019-05-14 16:48:57 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | failed to add sprinkle to donut 50493
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: unknown state
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | Poster #error {
:cause "Failed to lock the synchronizer"
:data {}
:via
[{:type clojure.lang.ExceptionInfo
:message "Failed to lock the synchronizer"
:data {}
:at [process.poster$eval50560 invokeStatic "poster.clj" 40]}]
:trace
[[process.poster$eval50560 invokeStatic "poster.clj" 40]
[clojure.lang.AFn run "AFn.java" 22]
[java.lang.Thread run "Thread.java" 748]]}