Here is what needs to happen: a URN arrives as part of an HTTP request and needs to be parsed/split on its last “:”. The left part is the key, and the right part is the value (we’ll call it “id” in this case). Here is an example:
user=> (def urn "company:org:account:347-68F3726A84C")
After parsing we should get a neat map:
{:by "company:org:account", :id "347-68F3726A84C"}
It is tempting to start “regex”ing the problem, since that feels more readable:
user=> (require '[clojure.string :as cstr])
user=> (def re-colon (re-pattern #":"))
user=> (cstr/split "company:org:account:347-68F3726A84C" re-colon)
["company" "org" "account" "347-68F3726A84C"]
But just splitting on a simple single-character regex (above) takes almost a microsecond (i.e. in this case about 2242 CPU cycles):
user=> (bench (cstr/split "company:org:account:347-68F3726A84C" re-colon))
Execution time mean : 830.235720 ns
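The split above also yields four pieces rather than the two we are after. If one did stay with regexes, a greedy capture group could cut at the last colon instead. This is only a minimal sketch: the name parse-urn-id-re is made up for illustration, and this variant was not benchmarked here.

```clojure
(defn parse-urn-id-re
  "Hypothetical regex variant: the greedy (.*) swallows everything up to
   the last colon, and ([^:]*) captures whatever follows it."
  [urn]
  ;; re-matches returns [whole-match group-1 group-2], or nil when there is no match
  (when-let [[_ by id] (re-matches #"(.*):([^:]*)" urn)]
    {:by by :id id}))

(parse-urn-id-re "company:org:account:347-68F3726A84C")
;; => {:by "company:org:account", :id "347-68F3726A84C"}
```

A URN without any colon does not match the pattern, so this sketch returns nil for it.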
In general it is always best to use language “builtins”, so we’d turn to Java’s own lastIndexOf:
(defn parse-urn-id [urn]
  (let [last-colon (.lastIndexOf urn ":")]
    {:by (subs urn 0 last-colon)
     :id (subs urn (+ last-colon 1))}))
Setting “validation” aside for a moment, this actually does what is needed:
user=> (parse-urn-id urn)
{:by "company:org:account", :id "347-68F3726A84C"}
user=> (bench (parse-urn-id urn))
Execution time mean : 5.588747 µs
Wow.. builtins seem to fail us. How come?
The culprit is not “lastIndexOf” itself, but the way Clojure resolves a method call on an “untyped” “urn”. The compiler does not know the class of “urn”, so it cannot pick the “lastIndexOf” method at compile time and falls back to reflection at runtime, on every call. Such reflective calls are slow and not amenable to HotSpot optimizations. An interesting read on what actually happens: “Why Clojure doesn’t need invokedynamic, but it might be nice”.
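One way to confirm that reflection is involved is to ask the compiler to report it. The sketch below uses Clojure’s `*warn-on-reflection*` flag. The function name last-colon is made up for illustration, and the exact wording of the warning may differ between Clojure versions.

```clojure
;; Ask the compiler to report every interop call it cannot resolve at compile time.
(set! *warn-on-reflection* true)

;; The argument carries no type information, so compiling this definition should
;; print a warning along the lines of:
;;   Reflection warning ... call to method lastIndexOf can't be resolved
;;   (target class is unknown).
(defn last-colon [s]
  (.lastIndexOf s ":"))

;; The call still works; the method is just looked up reflectively at runtime.
(last-colon "a:b:c")
;; => 3
```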
While, in most cases, spending 6 microseconds on parsing a String is perfectly fine, there is a simple hint that can make it run more than 60 times faster. It’s a hint.. It’s a type hint:
(defn parse-urn-id [^String urn]
  (let [last-colon (.lastIndexOf urn ":")]
    {:by (subs urn 0 last-colon)
     :id (subs urn (+ last-colon 1))}))
user=> (bench (parse-urn-id urn))
Execution time mean : 83.409471 ns
By hinting the type of “urn” as “^String”, this function is now 67 times faster (from about 5.59 µs down to about 83.4 ns).
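As for the “validation” that was set aside earlier: when there is no colon, “lastIndexOf” returns -1 and “subs” throws. A minimal sketch of one way to guard against that follows. The name parse-urn-id-safe and the choice to return nil for malformed input are assumptions made for illustration, the argument is assumed to be a non-nil String, and this variant was not benchmarked.

```clojure
(defn parse-urn-id-safe
  "Hypothetical validating variant: returns nil unless there is non-empty
   text on both sides of the last colon."
  [^String urn]
  (let [last-colon (.lastIndexOf urn ":")]
    ;; pos? rejects both -1 (no colon at all) and 0 (empty :by part);
    ;; the second check rejects a trailing colon (empty :id part)
    (when (and (pos? last-colon)
               (< last-colon (dec (count urn))))
      {:by (subs urn 0 last-colon)
       :id (subs urn (inc last-colon))})))

(parse-urn-id-safe "company:org:account:347-68F3726A84C")
;; => {:by "company:org:account", :id "347-68F3726A84C"}

(parse-urn-id-safe "no-colon-here")
;; => nil
```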
Achieve a warm and fuzzy feeling... [Done]