Previous Episode: Ep 030: Lazy Does It
Next Episode: Ep 032: Call Me Lazy

Nate finds that trouble comes in pairs.

Nate finds that trouble comes in pairs.

Last week, we were able to make our parsing lazy.
Each step was lazy, and we side-stepped the landmine of waiting to perform computation till after the file is closed.
(02:00) There's a new error: Sprinkles are failing!
Sprinkles are the "likes" of DonutGram, and are the metric by which all donuts are judged.
Log lines are showing up, like failed to add sprinkle to donut 23948.
Let's write another regexp and handler. This is easy!
But wait, we notice that there is another log line right after: sprinkle fail reason: db timeout exceeded.
We want to capture that reason as part of the error, but it's on a different line.
Our current code (from Episode 029), has a nifty abstraction with a paired regexp and handler function. If the regexp matches, then the matches are passed to the handler.
The underlying assumption is that each line stands alone.
Now we have two lines, and that breaks our pre-abstraction.
Our function needs more context, give our function two lines instead of just one.
Our parse-details function goes out the window, what can we do to get more context for each parsing?
(07:05) Let's go to the Clojure library and check out the partition function.
Used to break up a list into chunks. Eg. Chunking arg lists into key/value pairs.
The step argument varies how far you reach into the collection for the next chunk.
For instance, (partition 2 1 (range 1 7)) yields ((1 2) (2 3) (3 4) (4 5) (5 6)). Each element is paired with it's following element.
Doing this with our log lines means we have a second line in the group that we hand to our parsing function.
The parsing function can look in the first line for the error, and then if it finds it, look in the second line for the reason. If both are found, we return a sprinkle error record.
If either regex fails, nil is returned, so our list becomes a sequence of sprinkle errors or nils.
Then we can filter out the nils and summarize the sprinkle errors.
"Some of these errors hurt more than others."
The sequence is not fully realized, because partition is lazy. It constructs chunks on demand.
"This is the beauty of Clojure core, all this stuff is lazy. Except, of course, for the eager parts."
(15:45) Problem: when looking through log lines, there are often other log lines in between.
What happens if our sprinkle fail reason is not on the line immediately after the sprinkle error?
We might need to look ahead a little further ahead than the next line.
One solution is to increase the partition size when chunking the lines. How about (partition 10 1 lines) or (partition 100 1 lines)?
The first line is the indication that there's an error, and then look through the rest of the lines for the reason.
We could log out "should never happen" if we don't find the error in the next 10 lines.
It becomes a magic number problem, how far do we need to look ahead?
At least these large partitions use structural sharing to re-use the record memory. We only pay extra for the containing collections.
"Is there a partition-infinity?"
"It feels like something we can be lazy about, and push off to our next episode."
"It's like when a show jumps the shark and you know it will come to an end, but you're just not sure which season that's going to be in."
What if parsing the first line determines how many further lines we need to consume?
"Fun is a loaded term."
"It always puts you in an interesting situation when your pre-optimization doesn't quite hold up a month down the road."
"I had the perfect abstraction for the wrong shape."
"The world changed around me." "Clearly it's not the fault of my code."

Related episodes:

028: Fail Donut
029: Problem Unknown: Log Lines
030: Lazy Does It

Clojure in this episode:

partition
->>
map, filter
group-by, frequencies

Code sample from this episode:


(ns devops.week-03

(:require
[devops.week-01 :refer [parse-line]]
[devops.week-02 :refer [process-log]]
))

(defn parse-sprinkle-error
[line-pairs]
(let [[first-line second-line] line-pairs
[_whole donut-id] (some->> first-line :log/message (re-matches #"failed to add sprinkle to donut (\d+)"))
[_whole error] (some->> second-line :log/message (re-matches #"sprinkle fail reason: (.*)"))]
(when (and donut-id error)
(merge first-line
{:kind :sprinkle
:sprinkle/donut-id donut-id
:sprinkle/error error}))))

(defn sprinkle-errors
[lines]
(->> lines
(partition 2 1)
(map parse-sprinkle-error)
(remove nil?)))

(defn sprinkle-errors-by-type
[errors]
(->> errors
(map :sprinkle/error)
(frequencies)))


(comment
(process-log "sample.log" #(->> % (map parse-line) sprinkle-errors sprinkle-errors-by-type))
)


Log file sample:


2019-05-14 16:48:55 | process-Poster | INFO  | com.donutgram.poster | failed to add sprinkle to donut 23948

2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: should never happen
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | failed to add sprinkle to donut 94238
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: timeout exceeded threshold
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | failed to add sprinkle to donut 24839
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: too many requests
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | failed to add sprinkle to donut 19238
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: should never happen
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | failed to add sprinkle to donut 50493
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: unknown state