Clojure Adventures: An URL Shortener
Lately, I’ve been tinkering with Clojure, and my latest weekend project is a URL shortener à la bitly. Turns out, URL shorteners are not black magic!
How do URL shortener work? They need a bijective function. A bijective function is function for which
- There is no x1, x2 where x1 != x2such that f(x1) = (fx2)
- for every y, there must exist an x such that f(x) = y.
In other words, this means that if f(x) = “abc”, then there exist a function g(“abc”) = x.
I defined the codomain of the function f as the set of strings consisting of the alphabet [a-zA-Z0-9], which is 62 characters.
Once such a function is implemented, the application simply has to assign a unique number to a given URL, encode it using the function f, and voilà! The resulting string is the shortened URL. The simplest way to do this is simply to use a counter and increment it with every URL to shorten.
The main function is therefore pretty simple. long-to-mod-list recursively creates a list of the characters to use by dividing the number x by m and keeping its reminder in a list. Dehydrate, the entry point, simply joins the character of the list and sets the size of the alphabet to 62.
(defn long-to-mod-list [x m]
(if (= 0 x)
'(0)
(loop [n x
digits '()]
(if (<= n 0)
digits
(recur (long (/ n m))
(conj digits (mod n m)))))))
(defn dehydrate [x]
(clojure.string/join "" (map long-to-char (long-to-mod-list x 62))))
Then, I simply store in a Redis key-value storage the mapping between the short-url and the long URL.
(defn create-url [url]
(when-not (str/blank? url)
(let [short-url (shorten url)]
(view/show-short-url short-url))))
My very simple URL shortener is backed by a Redis key-value storage. Why Redis? Because it performs incredibly well since it’s all in-memory. Even more importantly, I never used it before and weekend projects don’t need the same justification as real-production ones, so I get to play with the technologies I want to. Redis made it really simple to store the mapping between the short URL and the long URL to return to the user.
(defn shorten [long-url]
(let [persisted-url (retrieve-short-from-long-url long-url)]
(if (nil? persisted-url)
(register-url long-url)
(update-url persisted-url))))
The shorten function simply checks to see if the URL was previously shortened. If the short URL already exists, it will update a visit counter and return the same short URL. Otherwise, it will register the new URL mapping.
(defn register-url [long-url]
(let [id (get-and-inc-id)
short-url (dehydrate id)]
(persist-url short-url long-url)
short-url))
The register-url function will get and increment the counter, generate the short URL using the dehydrate function, and persist the mapping.
That’s pretty much all there is to a URL shortener. Of course, this approach is not necessarily optimal as it might not scale well. We might for example need a better way to get a unique number to identify the URLs if we want to distribute the application. One such way could be to assign a chunk of numbers, for example, [1-10000] for each slave nodes so that they don’t always need to ask a master node for a new id. Anyway, this is a bit out of the scope of this small project.
Of course, a URL shortening algorithm isn’t worth much by itself, so I created a small Web application using Compojure. The application is pretty simple. There are only 3 valid routes.
(defroutes app-routes
(GET "/" [] (index))
(POST "/" [url] (create-url url))
(GET "/:short-url" [short-url] (handle-request short-url))
(route/not-found "Not Found"))
The first one being index, which is the main page where the user can add a new URL to shorten or see the short URL they generated.
(defn index []
(view/index))
Sending a POST request creates a new short URL from the URL passed in parameter.
(defn create-url [url]
(when-not (str/blank? url)
(let [short-url (shorten url)]
(view/show-short-url short-url))))
(defn show-short-url [short-url]
(view/show-short-url short-url))
Finally, a GET on another page will lookup in if the short url exists. If it does, a redirect will be sent, otherwise, a 404.
(defn handle-request [short-url]
(let [url (retrieve-url short-url)]
(if (nil? url)
(route/not-found "Not Found")
{:status 302
:headers {"Location" (str "http://" url)}})))
The last piece of the puzzle was to generate the actual Web pages. To do so, I used Hiccup, which is a simple and convenient library to generate html in a Clojuresque way.
(defn shorten-form []
[:div {:id "url-shortener-form"}
(form/form-to [:post "/"]
(form/label "url" "What url do you want to shorten?")
(form/text-area "url")
(form/submit-button {:class "btn btn-primary"} "SHORTEN!"))])
Fun right? You can find the full source code on GitHub.