I Shut Down DeskHoppa, here’s why. #DeskHoppa #startups

Yesterday I shut down DeskHoppa. It wasn’t an easy decision but it was the right decision. It surprised a few people that I’d do such a thing and there were a few messages from dear friends wondering if I’d made the right call.

And no, I didn’t delete the code but I did delete all the data.

Marketplace Startups Are Hard

That’s the plain and simple fact. While it’s all very well knowing that there are buyers and sellers out in the marketplace, actually tying them together via your service is really hard. You are effectively marketing to two sides of the coin, and it’s not a simple equation to complete either.

One of the hardest things to solve is in the initial stages. In DeskHoppa’s case you need hosts listing in order to get users searching. Hosts were the hardest customers to get on board: they require convincing, and the harsh reality is that most don’t trust you until you can really convince them.

It’s a Numbers Game

Everyone I spoke to was lovely, “That’s a great idea, I needed that yesterday!”. The problem is that kind words do not put money in the bank. So you have to start with a figure in mind, £100,000 turnover for example and work backwards….

There’s 260 working days a year so that’s my frame of reference. £100,000 / 260 = £384.61 a day, that’s what I need to be doing as an average.

If my fees are £1.20 for every £10 booked (card fees are applied after so they don’t chew in to my margin), then I’m looking at 321 bookings a day. Now look at the real world side of that market, the funnel of users.

Search -> View -> Book. 

Assuming my booked users are the 321, I’m guessing the conversion rate is 3% from view to book (users just looking around do that, just look around). I need 10,700 host views a day based on my 321 @ 3%. That, however, is not the end of the story. Not everyone is going to be looking all the time. So far the assumption is that 100% of the users are searching, and that’s unrealistic. It’s probably 3% again, at best.

So what I’m really saying is I need 356,666 users signed up and booking daily to make £100k/year. Or 3.56m users to make a million revenue a year. That doesn’t even take hosts into account….
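The working above can be sketched as a small function. This is only an illustration of the arithmetic in this post: the names (funnel, fee-per-booking, view-to-book, search-rate) are made up for the example, and both 3% rates are the guesses stated above, not measured figures.

```clojure
(def working-days 260.0)     ; working days a year, as used above
(def fee-per-booking 1.20)   ; £1.20 fee on every £10 booked
(def view-to-book 0.03)      ; guessed view -> book conversion
(def search-rate 0.03)       ; guessed share of users searching on a given day

(defn funnel
  "Works backwards from a yearly revenue target to the daily numbers needed."
  [target-revenue]
  (let [revenue-per-day (/ target-revenue working-days)
        ;; round up: a fraction of a booking is still a booking to find
        bookings (long (Math/ceil (/ revenue-per-day fee-per-booking)))
        views    (Math/round (double (/ bookings view-to-book)))
        ;; truncated rather than rounded, matching the figure quoted above
        users    (long (/ views search-rate))]
    {:revenue-per-day  revenue-per-day
     :bookings-per-day bookings
     :views-per-day    views
     :users            users}))

(funnel 100000)
```

Calling (funnel 1000000) gives the same breakdown for the million pound target.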

Facebook, Instagram, Twitter, Linkedin and GoogleAds…..

This is the first time I ran some experiments on ads. The ability to narrow in on target segments is critical, get that wrong and your spend vanishes in hours as a bunch of underage users poke around to see what you are doing.

Linkedin spends are quite expensive and at least they give a rough idea of the conversion (for my scenarios it was about 0.79%).

Ultimately boosting posts didn’t return anything beyond some nice users in the US and two hosts who enquired. Once again though it’s a high volume numbers game, you need money to make money. I knew that all along.

There’s a Skill to Knowing When to Call it a Day

When I embarked on DeskHoppa I was under no illusions, building the service is the easy bit (well it is for me, I can write code quick). The key was always eyeballs and they’re really hard to get. If you fool yourself that folk always care then there’s a hard reality, the majority don’t, it takes time to get their attention and trust.

Knowing when to say, “that’s enough”, is done through various iterations of history. I’ve let things run too long before. Idea validation is the hard part and I don’t believe it’s about product market fit, it’s about market product fit. You have to build the market first, if that market doesn’t exist then you’ll spend a long while creating it. The first person I heard that flipped the whole Product/Market thing was Gretta Van Reil of SkinnyMeTea.

After reviewing the numbers and looking at what it might take to get things where they need to be, the right decision was made. There’s a worse position to be in, a service that just trickles money in but doesn’t quite break even. The signal that something is happening but not at the volumes you need….. things can become a millstone quickly.


There are some wonderful, supportive people out there. Ones who gave feedback, lists of improvements, shouted out repeatedly on social media. Ones who were blunt and told me the reasons why they wouldn’t host desks…. it was all valuable.

I emailed everyone the final email, to say thank you. You can’t just close a service and not say thanks. Some of the responses were lovely.

Thank you.


“Where have you been hiding?…..” #nitech #ni #machinelearning #ai #customerloyalty #clojure #java

For those who’ve been asking why I’m not so active in NI…..

Errrm, I haven’t. So far 2019 has thrown some joyous curve balls. Some good, some challenging but the pointers to learn from were in plain sight.

Not Much Conference Talking….

Last year, 2018, was full on tech talks and as much as I love doing them it felt like I was treading old ground, a bit like keeping the old classics in the set even though you hate doing them.

In terms of local talks, I stopped. The transaction costs weren’t that high but I certainly wasn’t getting any value back. Plus the number of sponsored meetups, hackathons and events was pushing out any realistic assessment of the AI/ML landscape locally, just an opinion.

I’d lost my joy for conference talks. I wasn’t talking about the things that mattered, and it wasn’t until I backtracked to my roots that I realised how much I was missing talking about real world retail and customer loyalty….. I know some of you had asked about me doing more of that, I’m still finding an interesting angle. (That and no one asks me now).

This year also saw me more involved in the international conferences that I do love. I’m now part of the programme committees for O’Reilly’s Strata Data Conference in London and San Jose, and also ClojureX in London.

And remember, no one should feel pressured to talk. If you want to do it, do it. If you don’t, then don’t.

Machine Learning Book 2nd Edition

Work has now started on the update to Machine Learning: Hands On for Developers and Technical Professionals. More machine learning at scale on the JVM (in Java and Clojure) and more on Deep Learning, Kafka, Image recognition and text mining.

Release won’t be until the end of the year or into 2020. Not my call, depends on how fast I can type…. If you don’t see me then I’m probably typing.


Apache Storm: From Clojure to Java….. some thoughts. #clojure #java #storm

The route to Clojure can be an odd one. For some it just falls straight in to the developer’s way of working (“It’s Lisp, I geddit!”). Others, like me, with our old Java based OOP heads struggled for the penny to drop.

If Java is your main language then the move to Clojure can be difficult, it’s a different way of thinking. If Clojure is your first/main language then doing Java Interop in Clojure is going to melt your head (I’ve seen this a lot, I found it surprising too).

For me the penny dropped when my then boss, Bruce Durling, put it to me like this: “Data goes in to the function, data goes out of the function”. After that everything made sense and if you make functions small, separate and testable then it’s a joy to use.

There’s one issue though that has always been a challenge, not just for Clojure but other languages, mainstream adoption.

It’s better for a developer to have two or three languages in their toolbox, not just one. The reason…. well the Apache Storm project dropped the mic.


“While Storm’s Clojure implementation served it well for many years, it was often cited as a barrier for entry to new contributors.”

Yup get that completely.

Clojure Takes Time….

Clojure takes time to learn and to do well. There’s a group of folk in society that just get confused by too many parentheses, I was one of them. Another thing I’ve found is that the adoption route can be made harder by the documentation in projects. Too many times I’ve come across things that you were just supposed to know, and that just wasn’t helpful.

I suffered huge huge huge imposter syndrome with the Clojure community, they talked in a different language, my mental reaction was “I don’t fit in here”. They spoke about solutions that were just plain confusing. Over the last four years of this blog I’ve done my best to break stuff down and explain it in English to give the next poor sod a chance. I was actually scared of doing my first talk at ClojureX, petrified actually. The audience in the room knew far more than I did.

Finding Clojure developers is pretty much an uphill struggle, it’s a small circle. Finding good ones is harder, though that could be said of Scala and the like too. It’s easier to cross train someone from Java into Clojure but that takes time and most companies are not in a position to wait, there’s work to be done. Recently I was talking to a company who were potentially interested in hiring but they made one thing very clear, “We wouldn’t want you to do anything in Clojure, no one here can support it.” I totally agree, the bus number is key.

So with something like Apache Storm this does not come as a surprise, Apache projects need adopters and that is a numbers game. Do a project with minority adoption and then there’s a good chance the project will wither and die. Actually I didn’t realise Storm was written in Clojure until I read the announcement.

The Bottom Line is I Love Clojure

Knowing what I know now I find it hard to move away from Clojure. DeskHoppa is 100% Clojure and I know I’ll be developing that for the time being. I’ve realised that it’s a niche, especially when it comes to things like Strata Data Conference where I’ve always put things in Java and some Clojure. I’ve had to, otherwise my talks get rejected.

I never wanted to learn Haskell…….

Finding #pi with #montecarlo method and #Clojure – #math #justmath

I was reading a post from Toward Data Science blog this morning on mathematical programming to build up skills in data science by Tirthajyoti Sarkar. While the article was based around Python it didn’t use any of the popular frameworks like NumPy or SciPy.

Now with a bit of a lull I wanted to keep my brain ticking nicely so the thought of using math within Clojure appeals nicely to me. And I’m not saying one is better than the other. The best language for data science is the one you know. The main key of data science is having a good grounding in the math behind it, not the frameworks that make it easier.

Calculating Pi By Simulating Random Dart Board Throws

The Monte Carlo method is the concept of emulating a random process. When the process is repeated a large number of times it gives rise to an approximation of some mathematical quantity of interest.

If you imagine a square dart board…..

Now imagine a square dart board with a circle inside the square, the edges of the circle touch the square…..

If you throw enough darts at the board some will land within the circle and some outside of it. As the original article graphically put it:

These are random throws, you might throw 10 times, you might throw 1 million times. At the end of the dart throws you count the number of darts within the circle, divide that by the number of throws (10, 1m etc) and then multiply it by 4.

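As the original article states: the probability of a dart falling inside the circle is just the ratio of the area of the circle to that of the area of the square board. For a circle of radius r inside a square of side 2r that ratio is πr² / (2r)² = π/4, which is why the fraction of hits is multiplied by 4.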

The more throws we do the better chance we get of finding a number near Pi. The law of large numbers at work.

Throwing a Dart at The Board

I’m going to create a function that simulates a single dart throw. I want to break down my Clojure code into as many simple functions as possible. This makes testing and bug finding far easier in my opinion.

(defn throw-dart []
  {:x (calc-position 0)
   :y (calc-position 0)})

What I’m creating is an x,y coordinate with a 0,0 centre point, then passing the coord for the x and the y through another function to calculate the position (calc-position).

(def side-of-square 2)

(defn calc-position [v]
  (* (/ (+ v side-of-square) 2) (+ (- 1) (* 2 (Math/random)))))

The calc-position function takes the value of either x or y and applies the calculation, the result is somewhere between -side-of-square/2 and +side-of-square/2 around the centre point.

Running this function in a REPL we can see the x or y positions.

mathematical.programming.examples.montecarlo> (calc-position 0)

Is The Dart Within The Circle?

Now I have a x,y position as a map {:x some random throw value :y some random throw value} I want to confirm that the throw is within the circle.

Using the side-of-square value again (hence it’s a def ) I can figure out if the dart hits within. I’ll pass the map with x,y coords in and take the square root of the added squared coordinates.

(defn is-within-circle [m]
  (let [distance-from-center (Math/sqrt (+ (Math/pow (:x m) 2) (Math/pow (:y m) 2)))]
     (< distance-from-center (/ side-of-square 2))))

This function will return true or false. If I check this in the REPL it looks like this:

mathematical.programming.examples.montecarlo> (throw-dart)
{:x 0.22535085231582297, :y 0.04203583357796781}
mathematical.programming.examples.montecarlo> (is-within-circle *1)

Now Throw Lots of Darts

So far there are functions to simulate a dart throw and confirm it’s within the circle. Now I need to repeat this process as many times as required.

I’m creating two functions, compute-pi-throwing-dart to run a desired number of throws and throw-range to do the actual working to find the number of true hits in the circle.

(defn throw-range [throws]
  (filter (fn [t] (is-within-circle (throw-dart))) (range 0 throws)))

(defn compute-pi-throwing-dart [throws]
  (double (* 4 (/ (count (throw-range throws)) throws))))

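The throw-range function executes the throw-dart function and is-within-circle evaluates the map to see if the value is either true or false. The filter function keeps the items of the range for which that check returned true, so the result is a list of throw indexes and not a list of true values. So, for example, if out of ten throws the first, third and fifth are within the circle I’ll get (0 2 4) as the result from the function, because the range starts at zero. Only the count of that list matters.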

Calling the function compute-pi-throwing-dart sets all this into motion. Like I said at the start, taking the number of darts in the circle and dividing that by the number of throws taken, multiplying that by four should give a number close to Pi.
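For comparison, here is a self-contained sketch of the same estimate in one function. It is not the code from this post: estimate-pi and its seed argument are made up for the example, and it uses a seeded java.util.Random instead of Math/random so that a run can be repeated exactly. It assumes a circle of radius 1 centred on 0,0, which is the same board as side-of-square 2.

```clojure
(import 'java.util.Random)

(defn estimate-pi
  "Estimates Pi from `throws` random darts. The seed makes runs repeatable."
  [throws seed]
  (let [rng   (Random. (long seed))
        ;; one coordinate, uniform in [-1, 1)
        coord (fn [] (- (* 2.0 (.nextDouble rng)) 1.0))
        ;; comparing squared distance with 1 avoids the square root
        hit?  (fn [_]
                (let [x (coord)
                      y (coord)]
                  (< (+ (* x x) (* y y)) 1.0)))
        hits  (count (filter hit? (range throws)))]
    (* 4.0 (/ hits throws))))

(estimate-pi 100000 42)
```

The same seed always gives the same estimate, which is handy when testing. A different seed gives a different but similar value.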

The more throws you do, the closer it should get.

mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 100)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 1000)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10000)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 100000)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 1000000)
mathematical.programming.examples.montecarlo> (compute-pi-throwing-dart 10000000)

Let’s Build a Simulation

Via the REPL there is proof of an emergent behaviour, the value of Pi comes from the large number of throws we did at the dart board.

The last thing I’ll do is build a function to run the simulation.

(defn run-simulation [iter]
  (map (fn [i]
    (let [throws (long (Math/pow 10 i))]
      (compute-pi-throwing-dart throws))) (range 0 iter)))

If I run 4 simulations I’ll get 1, 10, 100 and 1000 throws computed, these are then returned as a list. If I run 9 simulations (which can take some time depending on the machine you’re using) in the REPL I get the following:

mathematical.programming.examples.montecarlo> (run-simulation 9)
(0.0 3.6 3.28 3.128 3.1176 3.1428 3.142932 3.1425368 3.14173752)

That’s a nice approximation, Pi is 3.14159265 so to get a Monte Carlo method to compute Pi by random evaluations is good.
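To put a number on how close each run got, the results can be compared against Math/PI. This is a small sketch using the nine values printed above. The names simulation-results and abs-errors are made up for the example.

```clojure
;; The values returned by (run-simulation 9) in the REPL session above.
(def simulation-results
  [0.0 3.6 3.28 3.128 3.1176 3.1428 3.142932 3.1425368 3.14173752])

(defn abs-errors
  "Pairs each estimate with its number of throws (10^i) and its distance from Pi."
  [estimates]
  (map (fn [i estimate]
         {:throws (long (Math/pow 10 i))
          :error  (Math/abs (- Math/PI (double estimate)))})
       (range)
       estimates))

(abs-errors simulation-results)
```

The error shrinks overall but not at every step: in this run the 1,000 throw estimate (3.128) is closer to Pi than the 10,000 throw one (3.1176). That is the randomness showing through.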




Using your table tennis table to create startup revenue.

Photo by Dennis Cortés on Unsplash

(Originally posted on the DeskHoppa Medium site.)

The table tennis table, the startup’s secret weapon to get team members to work together and collaborate, allegedly. Recruiters love putting the humble table tennis area as one of the big bonuses of startup hiring, along with the beer fridge and oversized bean bags.

However, utilisation of the slab of board is usually low and if you have remote workers it’s really difficult to have a game of ping pong with them. With a full size playing area of about 19 feet by 11 feet it takes up a large amount of square footage too.

Putting The Table Tennis Table To Better Use

Here’s the DeskHoppa simple guide to putting the table tennis table to better use.

  1. Fold it up (or sell it, or burn it* outside out of harm’s way).
  2. 19ft x 11ft is 209 square feet. With an average desk and chair taking 30 square feet you can fit six working areas in to the same space.
  3. Create a host account on DeskHoppa. In the Live Availability section type 6 in to the “Desks Available” field and 10 in to the “Price Per Hour” field. Click on “Update Availability” and you’re ready. You can sell day, week and month passes with DeskHoppa too.
  4. In “General Host Info” section create a funky strapline, “We got rid of our table tennis table so we could meet you!” and a general description of your workspace.
  5. You can add features, things such as free tea and coffee, working Wifi, a whiteboard etc in the “Features” section.
  6. Start shouting about your listing on LinkedIn, Twitter, Instagram and any other social channel you are using.

Here Are The Numbers

With six workspaces on DeskHoppa listing at £10 per hour, six guests staying one hour a day will give you an estimated £14,400 per annum. If those same six guests worked a four-hour morning then that’s a potential £57,600 incremental revenue.
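The two estimates are simple multiplication, sketched here. The function name annual-desk-revenue is made up for the example, and the 240 days a year is an assumption: the post doesn’t state the day count, it is just the figure that makes £60 a day come to £14,400.

```clojure
(defn annual-desk-revenue
  "Desks x hourly price x booked hours a day x days a year."
  [desks price-per-hour hours-per-day days-per-year]
  (* desks price-per-hour hours-per-day days-per-year))

(annual-desk-revenue 6 10 1 240)   ; one hour a day    => 14400
(annual-desk-revenue 6 10 4 240)   ; four-hour morning => 57600
```

Change the day count or the price to match your own office and the estimate follows.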

If there are certain skill sets that you are on the look out for then using DeskHoppa is the perfect tool for finding them. Guests have profiles that you, the host, review.

DeskHoppa was created to help hosts create revenue and for guests to find somewhere peaceful to work. A professional place where the clatter of coffee cups and loud discussions are reduced.

If you want to learn more about hosting on DeskHoppa please take a look at our “Becoming a DeskHost” section on the DeskHoppa website.

* If you do decide to burn your table tennis table then please note we can’t take any responsibility for anything that may arise from you doing so. Your decision, not ours.

DeskHoppa Engineering — Twitter, Kafka and being Data Driven.

This post was originally published on the DeskHoppa Engineering Blog on Medium.

We built DeskHoppa on data driven decisions. The technology though is used to augment our decision making, not wholly make it for us. How we choose the hosts we contact is based on data, algorithms and probability.

The search and match processes to put a guest together with a host is a pursuit of accuracy that can only be done over time with data, training and evaluation.

Putting those things together is not easy, much of the ground work is done by others who put the time in on their own dime. Open source software powers a lot of what we do.

Giving Something Back To The Community

Deciding to publish any code and setups that are useful to others was a very simple decision to make. What seems simple to us may be days of work for someone else. Uncovering the gotchas and documenting them can save a developer days, weeks or even months of unpicking. We’ve been there and have been down the same development rabbit holes that others have.

We’ve put our publishable repositories on our Github account. Some of it will be code written by us, some of it is just handy scripts that might have come from other places but collated in a way that’s easy for the developer to implement.

Using Kafka and Twitter Data

There’s a natural fit for Kafka and streams of Twitter data. Using a mixture of Kafka Connect to make a connection to Twitter Streams API and then using KSQL streaming query language to transform and query the stream is powerful even in the most simplistic of contexts.

While we do an awful lot more with the data past the KSQL stages we wanted to share a really quick setup for anyone to use. For our first community release to Github we wanted to start with raw data, it’s important to collate relevant data from the outset. Our Kafka/Twitter configuration, based on the excellent blog post by Robin Moffatt on the Confluent Blog is our baseline.

The configuration and required files are on Github https://github.com/deskhoppa/kafka-twitter-collector with a README of what to put where. Assuming you’re using the community edition of the Confluent Kafka Platform everything should slot in to place without any bother.

DeskHoppa: Making Any Business a Co-Working Space #coworking #deskspace #startups #freelancers #hotdesk

When I launched DeskHoppa at the start of February the aim was clear: to enable any startup or business to rent out their desks to anyone who needed one.

Co-working spaces are great, they are allowed to use DeskHoppa too, but the monthly membership was always a barrier for me, the cost would outweigh the usage by a factor of 5 to 1. I needed something more on-demand.

Why Not Use a Cafe?

I stood in a street in Belfast and I only needed a desk for an hour, just to check in with the team I was working with at the time. When I say this to friends and founders alike the first response I get is, “You should have used a cafe”.

There are a few reasons I really don’t like working out of cafes. Firstly there’s the noise, now it’s not loud banging or crashing noise but the ambient soundscape of the daily operation of a cafe, it’s plates being stacked or a Gaggia coffee machine steaming away. It’s very difficult to conduct a call with background noise.

Laptop theft is a huge concern, it’s also a data privacy issue if you are working for a company. It also can happen very quickly as illustrated in the “Caught on Camera: Berkeley Laptop Theft” video.

Lastly, have you ever been in a crowded cafe and been drawn to a conversation in earshot? I’ve got a knack for hearing a good brainstorming session or business meeting. And like the majority of people I know I have a notebook and a pen to hand. I’m sure hundreds of startups have been beaten to market because of this.

So the question is, where can you work for an hour? How about within a business with a spare desk?

Desks By The Hour, Day, Week or Month

Many businesses, startups and co-working spaces (the host) have spare capacity and it’s costing them money. It would make sense for a host to maximise the revenue potential of the desk as a money making asset by renting it out for a period of time. That’s what DeskHoppa does, it gives the host a system to rent out desks and create incremental revenue from them.

As a visitor, DeskHoppa becomes the platform for finding somewhere to work. A network of hosts in a city, a choice of locations to work from.

As a host you have full control of how many desks you list, what price you charge and what facilities are available to guests. If you want to sell day, week or month passes to guests that’s available too. DeskHoppa handles the booking, the payment and the host’s booking request process. You can review every booking or automatically accept bookings.

The benefit of offering desks to guests is that you build up a network of potential suppliers. They may be video content producers, software developers or graphic designers. For businesses who are looking to fill skill shortages within the organisation, DeskHoppa may become the first stage in building the relationship.

If you want to sign up either as a guest or a host then please go to https://www.deskhoppa.com

(This post was originally posted on the DeskHoppa Blog on Medium).



Does Craig’s 10 predict the winner? #data #voting #strictly #strictlycomedancing #clojure

It started with a conversation on Clojurians Slack…..

Now, we’ve got some experience with the Strictly scores, we know that linear regression completely trumps neural networks on predicting Darcey’s score from Craig’s score.

This however is different and yet still interesting. And as we know we have data available to us up to season 14.

Does Craig’s elusive ten do much to the outcome? Who knows…..

Load Thy Data….

The data I’ve put in the resources directory of the project. To load it in to our program and make it into a nice handy map…. we have the following two functions. Historical data is from Ultimately Strictly.

(def filename "SCD+Results+S14.csv")

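(defn format-key [str-key]
  (when (string? str-key)
    (-> str-key
        (clojure.string/replace #" " "-")
        (clojure.string/lower-case)   ; header names become lower case...
        (keyword))))                  ; ...keywords, e.g. "Craig" -> :craig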

(defn load-csv-file []
  (let [file-info (csv/read-csv (slurp (io/resource filename)) :quot-char \" :separator \,)
        headers (map format-key (first file-info))]
     (map #(zipmap headers %) (rest file-info))))

The format-key function turns a column name from the header row (the top line of the CSV file) into a keyword. So when the load-csv-file function is called we get a sequence of maps, one per row, with the header names as keywords.

The only downside here is that the numeric scores are strings. As the data spans all the judges from all fourteen series there are also plenty of “-” scores where a judge didn’t take part. Not a big deal but worth keeping in mind.
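If numeric scores were needed later on, one way to handle the strings is a small parser that returns nil for the “-” placeholder. This is a sketch only: parse-score is a made up helper and it isn’t used by the rest of the code in this post, which compares against the string "10" directly.

```clojure
(defn parse-score
  "Turns a score string such as \"10\" into a number.
   Returns nil for \"-\" or anything else that is not all digits."
  [s]
  (when (and (string? s) (re-matches #"\d+" s))
    (Integer/parseInt ^String s)))

(parse-score "10")   ; a judge who scored
(parse-score "-")    ; a judge who didn't take part
```

Returning nil instead of zero keeps a missing judge from looking like a terrible score.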

Grouping Judging Data

What I’d like is a map of weeks, this will give me a breakdown of series, the judges’ scores, who was dancing and the song etc. As far as the scores are concerned I’m only interested in 10’s in order to test Thomas’ hypothesis.

(defn get-week-groups-for-judge [k data]
  (group-by :week (filter #(= "10" (k %)) data)))

I’d also like a collection of weeks so I can figure out which was the first week that a judge gave a score of 10.

(defn get-weeks [m]
  (map #(key %) m))

(defn get-min-week [v]
  (->> (get-weeks v)
       (map #(Integer/valueOf %))
       (apply min)))   ; earliest week in which the judge gave a 10

Finally a couple of reporting things. A series report for a given week and also a full report for a judge.

(defn report-for-judge [w data]
  (filter #(= w (first %)) data))

(defn report-for-week [jk w data]
  (map #(select-keys % [:series :week jk :couple]) (data w)))

Now we can have a play around with the data and see how it looks.

With Thy REPL I Shall Inspect…

So, Craig’s scores. First of all let’s get our code in to play.

user> (require '[scdtens.core :as scd])

Load our raw CSV data in…

user> (def strictlydata (scd/load-csv-file))
user> (count strictlydata)

Now I want to extract scores from the raw data where Craig was the judge who scored a 10.

user> (def craigs-data (scd/get-week-groups-for-judge :craig strictlydata))
user> (count craigs-data)

So there’s seven weeks but which was the first week?

user> (scd/get-min-week craigs-data)

Week 8, but we don’t know how many series that covers. We can see that though, a function was created for it.

user> (scd/report-for-week :craig "8" craigs-data)
({:series "2", :week "8", :craig "10", :couple "Jill & Darren"} {:series "7", :week "8", :craig "10", :couple "Ali & Brian"})
user> (p/pprint *1)
({:series "2", :week "8", :craig "10", :couple "Jill & Darren"}
{:series "7", :week "8", :craig "10", :couple "Ali & Brian"})

So in two series, 2 and 7, Craig did score a 10. That’s all good so far, the question is did Craig’s score “predict” the winner of the series?

Looking at the final for series 2, Jill and Darren did win. And for series 7, Ali and Brian didn’t win the competition but they did top the leader board for week 8 as the data shows.

What if we pick another judge?

Craig’s scores are one thing but it turns out that Darcey is a blinder with the 10’s.

user> (def darceys-data (scd/get-week-groups-for-judge :darcey strictlydata))
user> (scd/get-min-week darceys-data)
user> (scd/report-for-week :darcey "4" darceys-data)
({:series "14", :week "4", :darcey "10", :couple "Ore & Joanne"})

Week four, no messing. And guess who won series 14….. Ore and Joanne.

Bruno perhaps?

user> (def brunos-data (scd/get-week-groups-for-judge :bruno strictlydata))
user> (scd/get-min-week brunos-data)
user> (scd/report-for-week :bruno "3" brunos-data)
({:series "4", :week "3", :order "11", :bruno "10", :couple "Louisa & Vincent"} {:series "13", :week "3", :order "14", :bruno "10", :couple "Jay & Aliona"})
user> (p/pprint *1)
({:series "4",
:week "3",
:order "11",
:bruno "10",
:couple "Louisa & Vincent"}
{:series "13",
:week "3",
:order "14",
:bruno "10",
:couple "Jay & Aliona"})

Turns out Bruno was impressed from week three. And all the better was that Jay and Aliona won series 13.

Does Craig scoring a 10 have any steer at all?

In all honesty, I think it’s very little, I mean it’s up there with a Hollywood handshake but they’re being thrown out like sandwiches at a festival now.

The earliest week that Craig scored a 10 was week 8 and only had a 50% hit rate in predicting the series winner from that score.

The judges’ scores only tell half the story and this is where I think things get interesting, especially in series 16, this current series. And once again it comes back down to where people are putting their money. Risk and reward.

Thomas’ question came about because Craig’s first 10 score cropped up last weekend. Ashley and Pasha got the first 40 of the series but the bookies’ data sees things slightly differently.

Do external data forces such as social media followers have any sway and volume on the public vote? Now that’s the question I think that needs to be looked at. Joe Sugg is a YouTube personality and there’s nothing like going on social media and begging for votes for competitions and awards. So it stands to reason that Joe has a very good chance of winning the competition while being outvoted on the judges’ scores.

The risk of using Craig’s ten indicator as saying Ashley is going to win, well it does come with risk but increased reward. At 7/1 this is basically saying, based on previous betting movements, that there’s 12.5% chance of Ashley winning. Now only if there was a rational way of deciding…..
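The 12.5% comes straight from the odds: fractional odds of a/b imply a probability of b / (a + b). A minimal sketch, with implied-probability as a made up name for the example:

```clojure
(defn implied-probability
  "Probability implied by fractional odds of numerator/denominator, e.g. 7/1."
  [numerator denominator]
  (double (/ denominator (+ numerator denominator))))

(implied-probability 7 1)   ; 7/1 => 0.125, i.e. 12.5%
```

Real bookies build a margin into their prices, so treat the result as a rough guide and not a true probability.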

Get me Neumann and Morgenstern on the phone! Now! Please!

Is there a potential upside to deciding to go with Craig’s score? Let’s see if we can find out. The one book I still want for Christmas, or any other gift giving event, is The Theory of Games and Economic Behavior by John von Neumann and Oskar Morgenstern. It’s my kinda gig.

Back to Ashley, we can work out the expected utility to see if Craig’s ten and the bookies info is worth a punt.

Expected utility: You multiply the probability of winning by the potential gains and multiply the probability of losing by the potential losses. Adding the two gives you the expected utility of the gamble.

A Warning and Disclaimer

It doesn’t have to be money, I’m not encouraging you to go and place a bet with your own money. That’s your decision to make and I’m assuming no responsibility on that one. I shall, however, continue. Got that, good, now….

Within any gamble there are four elements: The potential gain, the potential loss, the chance of winning and the status quo.

The Status Quo

Forgive me, I had to, there are rules….

The status quo is the current situation we are in, which is exactly what will happen if we do not decide to participate in a gamble.

The Potential Gain

Our reward if the gamble pays off. This has to be better than the status quo.

The Potential Loss

What we lose if the gamble does not go in our favour. This should be worse than the status quo.

The Chance of Winning

The probability of the pay off, it also tells us the chance of it NOT paying off.

Ashley’s Expected Utility

With the bookies’ general probability of Ashley winning at 12.5% and a tenner in my back pocket, at 7/1 odds I’d get £80 back (£70 winnings + my original wager of £10). So I’m going to use 80 as my potential gain and 10 as my potential loss. Your gain/loss numbers can be anything, it doesn’t have to be money. It’s just that with these numbers in mind you have a mechanism for coming to a figure of expected utility.

The expected utility of winning is 80 multiplied by 12.5% = 10

The expected utility of losing is 10 multiplied by 87.5% = 8.75

The expected utility of the gamble is 10 – 8.75 = 1.25

As the expected utility is above zero (is greater than the status quo) then it’s worth a go. If it was below zero, down down deeper and down the status quo then you’d not want to do anything.
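
The same sum can be written as a small function. This is a minimal sketch in Python (not the Clojure behind the post), with a made-up function name, treating the status quo as zero and using the post's own figures of 80 for the gain and 10 for the loss.

```python
def expected_utility(p_win: float, gain: float, loss: float) -> float:
    """Expected utility of a gamble, measured against a status quo of zero."""
    p_lose = 1.0 - p_win
    # The loss is entered as a positive number and subtracted here.
    return p_win * gain - p_lose * loss


ashley = expected_utility(p_win=0.125, gain=80, loss=10)
print(ashley)  # 1.25
print("worth a go" if ashley > 0 else "stick with the status quo")
```

The result is sensitive to what you count as the gain. With the £70 net winnings in place of the £80 total return, the same function gives exactly zero, which is what you would expect when the probability is read straight off the odds.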

Interestingly Darcey’s been throwing out the 10’s to Ashley for a while. I wish I’d seen the bookies’ odds at week six and not week eight. There may have been a more concrete expected utility to strengthen my position.

Conclusion. Well there isn’t one yet.

This series of Strictly is still raging on so we won’t know the actual outcome until the 15th of December. It has been very interesting though to look at the various judges’ 10 scores and see if we can predict outcomes with additional information.

If you want to poke around the Clojure code for this post you can do.



Collecting Royalties Without the Middleman, a Concept for @DGMLive and David Singleton

This post is really in response to a Facebook post by David Singleton. The joy of Facebook algorithms means that I didn’t see the actual post until this morning. It’s worth a read, especially if you are an artist and you want to get paid fairly. There’s a link at the bottom of the page.

What I present here is a proof of concept and probably a shaky blueprint at best but hopefully it outlines some concepts that someone within the industry can run with.

I’ll take the angle of a musician but it could apply to anyone who creates something.

Everything in life is a transaction

A radio play of a song is a transaction, a YouTube video play is a transaction (this throws up a few more questions which I’ll get on to later), a concert ticket sale is a transaction…. you get the picture.

There are actors in every part of the process, some of them wield more power than others. With that imbalance of power the distribution effect can be manipulated, skewed and downright ignored. Over the years with the joys of the internet artists have tried, and rightly so, to regain control over their income and artistic rights. Being able to sell direct has been the goal, with offshoots of subscriptions, exclusive club releases and so on. And they’ve worked, on the whole, fairly well.

However, along with the rise of those types of services you still have the larger monopolies such as iTunes, Spotify and Amazon who control their own ecosystems. And with the same message as the National Lottery, and perhaps the same probability of a positive win, you have to be in it to win it. Once a large volume of consumers piles on to the platform the artist is under a certain amount of pressure to join, for fear of missing out on revenue.

One of the joys, especially for someone like me who loves customer loyalty data, transactional data and the real time nature of these things, is that everything is a transactional data point. This includes every musician in the band, past and present, every track recorded, every concert ticket sold. The question is how to combine all those data sources so everyone gets paid.

Scribbled before heading out of the door….

Yup it’s one of those drawings again…

Have notebook, a pen and a cup of tea. I will scribble.

I see radio stations, streaming services, direct sales and ticket sales as “consumers of the artist”. Now they might not directly consume the product but merely act as a wholesaler to the listener/audient. However there is a transaction and that transaction will be recorded. Breaking it down a stage further, everything is an entity and it relates to another entity.

The Band/Artist

Should I call this the brand? Perhaps I should, as an entity it’s what the end consumer/fan/audient connects with. It gets tribal, I’m a huge fan of King Crimson, St. Vincent and Level 42…. I connect with them all. I also connect with the members of each of those entities so it needs breaking down a little further.

Having the band/artist as an entity is important. Lineups of that entity can change over time, as anyone who knows King Crimson well is aware. Changing lineups may also mean changing publishing rights of the music, and this gets important when it comes to compensating people over the long term.

The Asset

Assets of the brand, the type of entity opens up here and gets interesting. A concert is a live asset with multiple members, an album is an asset made up of assets (songs). Each asset has members that performed and wrote the pieces in question. What was once an administrative nightmare could actually be easy to manage in these digital data driven times.

The Individual Member

Who “works” for the band? Like I said above lineups change over time. Here though a member is a member. Interestingly this could be said for a solo artist. Is Annie Clark the member of the brand “St Vincent”? I think so. It also frees the individual up to work on other projects outside of the main brand. Collaborations therefore become measurable.

In this instance it doesn’t have to just be a musician, it could be a manager or a producer connected with the brand or artist. If you can negotiate a transactional amount then you can allocate reward over time.

A good case in point would be Nile Rodgers who worked on (the asset) Like A Virgin by (the brand) Madonna. Nile waived his advance for producing the album and renegotiated his royalty on sales. My only surprise was that Nassim Taleb didn’t include a paragraph on it in his book “Skin In The Game”, it was the perfect example.

The Consumer

Once again, as with the asset, a consumer can take on multiple personas. It may be an organisation such as Apple, Spotify or Amazon. It might be Google/YouTube or just an average person who likes to purchase the wares.

A consumer at this point may not be the end user of the asset. This may be a wholesale transaction with a different volume of money associated with it. Multiple consumers can have different sale amounts attached to them.

An Asset Transaction

Now we get to the interesting part. The performance of a song is an asset transaction, whether it be live, recorded, streamed or just straight purchased (I still prefer a purchased CD for value for money played over the long term).

With the members attached to an asset, breaking down who is owed what becomes much easier, it’s just a process at that point. Especially with a band like King Crimson, where songwriting credits are spread over many people over a long period of time and many songs from many periods can be played live.

The live performing band can play Discipline knowing that it will be recorded in a ledger of some form (more on that in a moment). This record once processed knows that the writers: Adrian Belew, Bill Bruford, Robert Fripp and Tony Levin will be due some form of performance payment based on the agreed consumer value. The same goes for someone streaming the same song from Spotify for instance, the record of that transaction is saved and the members compensated accordingly, it’s just another consumer with another value attached to it.

This does mean though that every live performance set list needs to be recorded too. And yes I appreciate that whimsical flights of fancy happen when an audience member yells “House of the Rising Sun” and you launch into it. With a finalised set list pre or post performance you have a list of transactions and everything connects with each other.
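
To make the entities a little more concrete, here is one possible way of modelling them, sketched in Python. The class and field names are illustrative guesses rather than a finished schema, and the venue and date are invented. The only real data is the writing credit for Discipline mentioned above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Member:
    name: str


@dataclass(frozen=True)
class Asset:
    title: str
    brand: str
    writers: Tuple[Member, ...]


@dataclass(frozen=True)
class Consumer:
    name: str
    kind: str  # "venue", "streaming", "radio", "direct sale", ...


@dataclass(frozen=True)
class AssetTransaction:
    asset: Asset
    consumer: Consumer
    occurred_on: date


def transactions_from_set_list(
    set_list: List[Asset], venue: Consumer, performed_on: date
) -> List[AssetTransaction]:
    """One ledger row per song played, in set list order."""
    return [AssetTransaction(asset, venue, performed_on) for asset in set_list]


discipline = Asset(
    title="Discipline",
    brand="King Crimson",
    writers=tuple(
        Member(n)
        for n in ("Adrian Belew", "Bill Bruford", "Robert Fripp", "Tony Levin")
    ),
)
venue = Consumer(name="Example Hall", kind="venue")  # made-up venue
ledger = transactions_from_set_list([discipline], venue, date(2018, 11, 1))  # made-up date

for row in ledger:
    print(row.asset.title, "->", [m.name for m in row.asset.writers])
```

A stream or a radio play would be the same AssetTransaction with a different Consumer, which is the point: everything ends up as the same kind of row.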

Calculating Asset Wealth Distribution

Or, “How do I get my money!?”

As we know the transaction amount to a consumer for a specific asset, calculating what is owed to whom becomes just a case of mapping each transaction and ending up with an amount owed to a member.

We end up with a kind of conceptual graph of the relationship between the writers, the artist, the performed asset and the consumer.

(concert) [:performed] <- (asset) -> [:by] (brand) -> [:written_by] (members)

From there it’s purely data mining, finding out who is owed what. With everything recorded in some form of ledger, well you have something to reference. It just becomes a job of performing that function.
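
As a sketch of that mapping, the Python below walks a tiny ledger and totals what each writer is owed. It assumes an equal split between the credited writers, which is purely an assumption for illustration since real agreements will differ, and the amounts are invented.

```python
from collections import defaultdict
from decimal import Decimal

# (asset, consumer, amount) rows as they might sit in a ledger.
# The amounts are made up for the example.
ledger = [
    ("Discipline", "live venue", Decimal("4.00")),
    ("Discipline", "streaming service", Decimal("0.40")),
]

# Writing credits per asset.
writers = {
    "Discipline": ["Adrian Belew", "Bill Bruford", "Robert Fripp", "Tony Levin"],
}


def amounts_owed(ledger, writers):
    """Total owed to each member, assuming an equal split per asset."""
    owed = defaultdict(Decimal)
    for asset, _consumer, amount in ledger:
        credited = writers[asset]
        share = amount / len(credited)
        for member in credited:
            owed[member] += share
    return dict(owed)


owed = amounts_owed(ledger, writers)
for member, amount in sorted(owed.items()):
    print(member, amount)
```

Swapping the equal split for a per-member percentage would only change the line that works out the share.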

How, and at what frequency, the royalties are calculated is another matter. Doing it in real time, while possible, is not feasible from a payment point of view. Payment transactions come with their own cost. Depending on transaction volumes a monthly, quarterly or annual run is perfectly reasonable. The calculations themselves are pretty much unique to the brand in question. What works for KC may not work for Level 42, which also may not work for St Vincent. Like I said, consumers have different agreements with the brands.
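
A periodic run could be as simple as rolling the individual amounts up into one payment per member per period. This is a hedged Python sketch of a monthly run. The function name, the dates and the amounts are all invented for the example.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def batch_by_month(entries):
    """Sum (date, member, amount) entries into one payment per member per month."""
    totals = defaultdict(Decimal)
    for when, member, amount in entries:
        totals[(when.year, when.month, member)] += amount
    return dict(totals)


# Made-up amounts already worked out per transaction.
entries = [
    (date(2018, 11, 3), "Robert Fripp", Decimal("1.00")),
    (date(2018, 11, 17), "Robert Fripp", Decimal("0.10")),
    (date(2018, 12, 1), "Robert Fripp", Decimal("0.25")),
]

payments = batch_by_month(entries)
for (year, month, member), amount in sorted(payments.items()):
    print(year, month, member, amount)
```

Three ledger entries become two payments here, so the per-payment cost is paid twice rather than three times. A quarterly or annual run only changes the grouping key.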

And you might have noticed at no point have I mentioned a middleman collecting the royalties. It should be done direct, peer to peer, no middleman. The reason, I’m wildly assuming, for them existing in the first place was that performance, recorded or otherwise, was just about impossible to monitor. Radio play may have been different but live performance was hard to monitor. So it was easier for every consumer (shop, public space) to pay a fee and hope it would get distributed fairly.

We need to talk about YouTube

Copyrighted material is difficult to police. YouTube adds to the computational woe in the fact that many who publish an artist’s work have nothing to do with the artist at all.

And it’s all very well sharing the love of the band or that song, but at the end of the day no one is really getting paid for it. Now there are certain things you might see like some copyright information and a link to purchase the track or album in question. It’s far from peer to peer transactions and it’s also far from perfect.

With the ledger we know that a video is being viewed. If one of your songs (the asset) is played then it’s just another asset transaction, and because we know the connecting members of that asset, well it means they are due something from that consumer. At this point a consumer is every YouTube account that is publishing your asset. Now that’s opened up tens, hundreds, even thousands of revenue streams.

As far as we’re concerned it’s just another data source.

Challenges, hmmm there’s a few.

Right now there are many middlemen and it’s really bad business for them to be cut out of the loop. I know this from previous startups in aircraft leasing, I’m good at annoying brokers in the chain. What I’ve described above, while not impossible to do (it’s just data at the end of the day), is an implementation challenge for one simple reason.

Not every player I’ve described will be on board.

At the moment the large ecosystems are telling the artist what they will pay them. Spotify streaming sales are fractions of a penny and even with a long long long tail could take years to make any decent money. Power laws come into play here, 20% of the catalogue will (probably) make 80% of the revenue.

To get a large organisation to emit data per play is not impossible. It means that brands have to be savvy enough to pull the data in and process it. Investment in some form of venture to handle all this must happen first.

To decentralise the whole royalty payment system out of a few power brokers, well that’s interesting and risky. At this point you become the main broker of performance data (does that not already exist?). The power merely shifts from one place to another.

Decentralisation is hard (and no one dare mention the word Blockchain to me). Implementation to each partner is hard, time consuming and usually too technical for the lay person to understand.

Live performances I’ve already touched on: a system to record concert performances with a list of assets, so it can be processed to work out exactly who gets paid what. Once again all doable, but who are the partners that do all this? Is it the band before the performance? Is it the venue? Is there a gig register? Cover bands at this point should be worried: if you’re filing set lists you’ll be paying everyone.


There is a lot covered here, some ideas are worth fleshing out and some ideas would take a long time to implement. There are trade offs and some parts of the data model are easier to execute than others.

Back to my main point though, once the concepts are broken down and everything becomes a transaction then it’s easy to figure who is supposed to get what. And to reduce disputes, as David’s post was getting at, then you need a transaction for everything to do with an asset. After that it’s just accounting.

Getting all the players on board is an entirely different conversation.

Now where do I send my money when I belt out Sartori In Tangier on my Chapman Stick when I’m at home? It’s a live performance after all. 🙂

For those that got this far…. here’s David’s post.