"; */ ?>

Posts Tagged: clojure


2
Dec 15

Time Series Database in One Line of Clojure

If you ever worked in the financial sector, specifically high frequency trading, a time series database is a well known tool that orders up all those quotes, orders, trades for financial pleasure.

The are many of these databases available. The Wall Street being The Wall Street would of course primarily use proprietary ones, since, well, it’s proprietary :), but giving them a credit: they do outperform open source ones by a lot, at least presently (talking about millions per second).

Disrupting Time Series Business


So I decided to write an open source time series database that will outperform them all not necessarily by performance, but definitely by clarity and size. Get ready for this one line.

If you read this far that means you are ready, so let’s begin by creating a database:

(def db (sorted-map-by >))

Oh, by the way we are done. It’s the one and only: The Time Series Database.

Map is King of Data


Let’s use it. First we’ll need some data:

(def data
  {1449088877092 {:GOOG {:bid 762.74 :offer 762.79}}
   1449088876590 {:AAPL {:bid 116.60 :offer 116.70}}
   1449088877601 {:MSFT {:bid 55.22 :offer 55.27}}
   1449088877203 {:TSLA {:bid 232.57 :offer 232.72}}
   1449088875914 {:NFLX {:bid 128.95 :offer 129.05}}
   1449088870005 {:FB {:bid 105.96 :offer 106.6}}})

The format is simple {timestamp data}.

Now a query to have a database as a value with this data:

(defn with [db data] (merge db data))

And finally some time based queries, like before and after:

(defn before [database ts] (into {} (subseq database > ts)))
(defn after [database ts] (into {} (subseq database < ts)))

done.

Action!


(before (with db data) 1449088877091)
 
{1449088876590 {:AAPL {:bid 116.6, :offer 116.7}},
 1449088875914 {:NFLX {:bid 128.95, :offer 129.05}},
 1449088870005 {:FB {:bid 105.96, :offer 106.6}}}
(after (with db data) 1449088877091)
 
{1449088877601 {:MSFT {:bid 55.22, :offer 55.27}},
 1449088877203 {:TSLA {:bid 232.57, :offer 232.72}},
 1449088877092 {:GOOG {:bid 762.74, :offer 762.79}}}

Beware, you, other time series databases!

P.S. Of course there is a possibility of events that came in at the exact same millisecond, so here is another line that solves it


24
Nov 15

Clojure Libraries in The Matrix

Clojure universe is mostly built on top of libraries rather than “frameworks” or “platforms”, which makes it really flexible and lots of fun to work with. Any library can be swapped, contributed to, or even created from scratch.

There are several things that make libraries great. The quality of its solution is of course the main focus which delivers the most value, but there are others. The one I’d like to mention is not how much a library does, but how little it should.

I like apples, you like me, you like apples


Dependencies are often overlooked when developing libraries. There are quite a few libraries that suffer from depending on something for either convenience, or for its built in example, or just in case, etc.

This results in downloading the whole maven repository when working on the project that depends on just a few of such libraries.

This also could create conflicts between the dependencies libraries bring and the real project required dependencies.

We can do better, and we should.

Those people don’t know what they are doing


The reason I bring it up is not because I am tired of these libraries, or it is time for a rant, but it is simply because I do it myself. And usually by the time I notice I did it, it requires significant rework to make sure developers that use/depend on my libraries do not bring “apples” that I like and they might not.

Useful vs. The Core


A great example of this is me including an excellent clojure/tools.logging as a top level dependency of mount. Mount manages application state lifecycle, and it would only make sense if every time a state is started or stopped, mount would log it:

dev=> (mount/start)
14:34:10.813 [nREPL-worker-0] INFO  mount.core - >> starting..  app-config
14:34:10.814 [nREPL-worker-0] INFO  mount.core - >> starting..  conn
14:34:10.838 [nREPL-worker-0] INFO  mount.core - >> starting..  nyse-app
14:34:10.844 [nREPL-worker-0] INFO  mount.core - >> starting..  nrepl
:started
 
dev=> (mount/stop-except #'app.www/nyse-app)
14:34:47.766 [nREPL-worker-0] INFO  mount.core - << stopping..  nrepl
14:34:47.766 [nREPL-worker-0] INFO  mount.core - << stopping..  conn
14:34:47.766 [nREPL-worker-0] INFO  mount.core - << stopping..  app-config
:stopped

It’s useful, right? Of course it is. As a developer that depends on mount, you don’t have to do it, it is already there for you, very informative and clean.

But here is the catch:

* what if you don’t like the way it logs it?
* what if you don’t want it to log at all?
* what if you use a different library for logging?
* etc..

In other words: “what if you don’t like or need apples and you eat bananas instead?”.

It ends up that “useful” is most of the time a red flag. Stop and think whether this “useful” feature is really the core piece of functionality, or is a bolted on “nice to have”.

Novelty Freshness of Refactoring


While it is not desired to have extra dependencies, and the above idea to include logging was not great, what was great are new thoughts during refactoring:

“Ok, I’ll remove logging, but now mount users won’t know anything about states..”

“Maybe they can use something like (states-with-deps) that would give them the current state of the application”:

dev=> (states-with-deps)
 
({:name app-config, :order 1, 
                    :started? true
                    :suspended? false
                    :ns #object[clojure.lang.Namespace 0x6e126efc "app.config"], 
                    :deps ()} 
 
 {:name conn, :order 2, 
              :started? true
              :suspended? false
              :ns #object[clojure.lang.Namespace 0xf1a66a6 "app.nyse"], 
              :deps ([app-config #'app.config/app-config])} 
 
 {:name nrepl, :order 3, 
               :started? true
               :suspended? false
               :ns #object[clojure.lang.Namespace 0x2c134117 "app"], 
               :deps ([app-config #'app.config/app-config])})

“That’s not bad, but what if they start/stop states selectively, or they suspended/resumed some states.. no visibility”

“Well, it’s simple, why not just return all the states that were affected by a lifecycle method?”

And that’s what I did. But I did not go through this thought process when I had logging in, since logging created an illusion of visibility and control, while in reality it gave “an ok” visibility and no control.

The solution just returns a vector of states that were affected:

dev=> (mount/start)
{:started [#'app.config/app-config 
           #'app.nyse/conn 
           #'app/nrepl
           #'check.suspend-resume-test/web-server
           #'check.suspend-resume-test/q-listener]}

The cool additional thing, and the reason it is a vector and not a set, is these states are in the vector in the order they were touched, in this case “started”.

Rules of The Matrix


While I made a mistake, I am glad I did. It gave me lots of food for thought as well as made me do some other cool tricks with robert hooke to demonstrate how to bring the same logging back if needed.

It does feel great to only depend on the Clojure itself, and a tiny tools.macro, which I use a single function from, and could potentially just grab from there, and cut my dependencies to The One.