Lately, I’ve been tinkering with Clojure, and my latest weekend project is a URL shortener à la bitly. Turns out, URL shorteners are not black magic!

How do URL shortener work? They need a bijective function. A bijective function is function for which

  1. There is no x1, x2 where x1 != x2such that f(x1) = (fx2)
  2. for every y, there must exist an x such that f(x) = y.

In other words, this means that if f(x) = “abc”, then there exist a function g(“abc”) = x.

I defined the codomain of the function f as the set of strings consisting of the alphabet [a-zA-Z0-9], which is 62 characters.

Once such a function is implemented, the application simply has to assign a unique number to a given URL, encode it using the function f, and voilà! The resulting string is the shortened URL. The simplest way to do this is simply to use a counter and increment it with every URL to shorten.

The main function is therefore pretty simple. long-to-mod-list recursively creates a list of the characters to use by dividing the number x by m and keeping its reminder in a list. Dehydrate, the entry point, simply joins the character of the list and sets the size of the alphabet to 62.

(defn long-to-mod-list [x m]
  (if (= 0 x)
    '(0)
  (loop [n x
         digits '()]
    (if (<= n 0)
      digits
      (recur (long (/ n m))
             (conj digits (mod n m)))))))

(defn dehydrate [x]
  (clojure.string/join "" (map long-to-char (long-to-mod-list x 62))))

Then, I simply store in a Redis key-value storage the mapping between the short-url and the long URL.

(defn create-url [url]
  (when-not (str/blank? url)
    (let [short-url (shorten url)]
  (view/show-short-url short-url))))

My very simple URL shortener is backed by a Redis key-value storage. Why Redis? Because it performs incredibly well since it’s all in-memory. Even more importantly, I never used it before and weekend projects don’t need the same justification as real-production ones, so I get to play with the technologies I want to. Redis made it really simple to store the mapping between the short URL and the long URL to return to the user.

(defn shorten [long-url]
  (let [persisted-url (retrieve-short-from-long-url long-url)]
    (if (nil? persisted-url)
        (register-url long-url)
        (update-url persisted-url))))

The shorten function simply checks to see if the URL was previously shortened. If the short URL already exists, it will update a visit counter and return the same short URL. Otherwise, it will register the new URL mapping.

(defn register-url [long-url]
  (let [id (get-and-inc-id)
            short-url (dehydrate id)]
        (persist-url short-url long-url)
        short-url))

The register-url function will get and increment the counter, generate the short URL using the dehydrate function, and persist the mapping.

That’s pretty much all there is to a URL shortener. Of course, this approach is not necessarily optimal as it might not scale well. We might for example need a better way to get a unique number to identify the URLs if we want to distribute the application. One such way could be to assign a chunk of numbers, for example, [1-10000] for each slave nodes so that they don’t always need to ask a master node for a new id. Anyway, this is a bit out of the scope of this small project.

Of course, a URL shortening algorithm isn’t worth much by itself, so I created a small Web application using Compojure. The application is pretty simple. There are only 3 valid routes.

(defroutes app-routes
  (GET "/" [] (index))
  (POST "/" [url] (create-url url))
  (GET "/:short-url" [short-url] (handle-request short-url))
  (route/not-found "Not Found"))

The first one being index, which is the main page where the user can add a new URL to shorten or see the short URL they generated.

(defn index []
  (view/index))

Sending a POST request creates a new short URL from the URL passed in parameter.

(defn create-url [url]
  (when-not (str/blank? url)
    (let [short-url (shorten url)]
  (view/show-short-url short-url))))

(defn show-short-url [short-url]
  (view/show-short-url short-url))

Finally, a GET on another page will lookup in if the short url exists. If it does, a redirect will be sent, otherwise, a 404.

(defn handle-request [short-url]
  (let [url (retrieve-url short-url)]
    (if (nil? url)
      (route/not-found "Not Found")
      {:status 302
        :headers {"Location" (str "http://" url)}})))

The last piece of the puzzle was to generate the actual Web pages. To do so, I used Hiccup, which is a simple and convenient library to generate html in a Clojuresque way.

(defn shorten-form []
  [:div {:id "url-shortener-form"}
   (form/form-to [:post "/"]
                 (form/label "url" "What url do you want to shorten?")
                 (form/text-area "url")
                 (form/submit-button {:class "btn btn-primary"} "SHORTEN!"))])

Fun right? You can find the full source code on GitHub.