The NiTechrank Retrospective. #opendata #nijobs #clojure #opensource #data

The @nitechrank was a simple index to report the changes in programming jobs in Northern Ireland. It wasn’t anything scientific and for all my hailing of the automating of all things possible, meaningless and trivial…. well it was me editing the tweets every morning, with a fresh cup of tea, half asleep and in my dressing gown 99% of the time*. Today I ran the last index as the picture became clear….



The Good News

In linear regression terms the jobs outlook is positive: it slopes upwards over time. That’s not to say there weren’t down days. The start of June was a bit of a surprise, as was the peak, which came the day after the referendum. So with a low of 42 jobs and a high of 92 jobs, yeah, it got a little varied, but nothing troubling.
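For anyone wanting to check the "slopes upwards" claim against the CSV, here is a minimal sketch of a least-squares slope in Clojure. The counts below are made up for illustration (only the low of 42 and the high of 92 come from the index), and `slope` and `job-counts` are names invented here, not code from the repository.

```clojure
;; Hypothetical daily job counts, NOT the real NiTechrank data.
(def job-counts [42 55 61 58 70 92 81])

(defn slope
  "Least-squares slope of ys plotted against their index 0, 1, 2, ..."
  [ys]
  (let [xs    (range (count ys))
        mean  (fn [coll] (/ (reduce + coll) (double (count coll))))
        x-bar (mean xs)
        y-bar (mean ys)
        ;; sum of (x - x-bar)(y - y-bar)
        sxy   (reduce + (map (fn [x y] (* (- x x-bar) (- y y-bar))) xs ys))
        ;; sum of (x - x-bar)^2
        sxx   (reduce + (map (fn [x] (let [d (- x x-bar)] (* d d))) xs))]
    (/ sxy sxx)))

;; true when the fitted line slopes upwards
(pos? (slope job-counts))
```

A positive slope only says the fitted line rises over the period; it says nothing about how noisy the individual days were.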

Keep in mind, though, that the only data source seems to list larger companies, so you can assume with a certain amount of confidence that startups won’t list there and will look for developers by word of mouth. That’s one of the reasons Java may feature so highly in the results as a whole.

For Developers

If you are interested in Clojure then the repository for the code is available on my Github account. Take from it what you will, it’s slung together to get the numbers out. Note there’s no support, the README should be clear enough to get things working. I won’t be doing any updates on this repo, it’s just there for anyone who wants to look. I also included the original shell script that did the ranking, but only for historical purposes.

For Data Folk

Every day the NiTechrank updated a CSV file, so the data directory holds the CSV data of all the indexes that were run. Have fun with it, I’ve not done any analysis on it.


Yeah, it was a giggle.

*I was dressed for the other 1%

From Onyx Template to Working App in Thirty Seconds #onyx #clojure

It’s early days for the Onyx Platform and I’m using it in a number of projects as it plugs in with Kafka well, though getting it up and running can be a bit confusing. So as an aide-mémoire for me more than anything, here’s the thirty second version.

I’m assuming you have leiningen installed. No tutorial or walkthrough, maybes later when I’ve got some more time and can present something a little more worthy but in the meantime here’s the basics.

With regards to Zookeeper, Onyx is using its own in-memory version so there’s no need to spin one up for this quick overview.

Create The Application with Onyx Template

$ lein new onyx-app funky-avocado
Generating fresh Onyx app.
Building a new onyx app with:

Create The Uberjar

$ cd funky-avocado/
funky-avocado $ lein uberjar
Compiling funky-avocado.core
Compiling funky-avocado.core
Created /work/funky-avocado/target/funky-avocado-0.1.0-SNAPSHOT.jar
Created /work/funky-avocado/target/peer.jar

Running The Application

$ java -cp target/peer.jar funky_avocado.core start-peers 10 -c resources/config.edn
Starting peer-group
Starting env
Starting peers
Attempting to connect to Zookeeper @
Started peers. Blocking forever.

Do TripAdvisor ratings mean anything over the long term? #ratings #data

Averages, lovely things. Ratings, lovely things. If it helps the consumer then I’m all for it. Do I trust TripAdvisor ratings? Well, on the whole yes, but there are some gotchas.


Averages Can Distort the Actual Truth

Distorted averages are hardly new, if you want to learn more the Skillswise website has a good explanation. Want to earn an average of £63,000, as discussed in a recent #factbait? Chances are you might, or you might not, who knows. At the end of the day it’s just an average.

They are something to think about in reviews especially when an event may change the way a company operates.

So I’ll use the real world example of Burger Club, I’d read good things, nah, great things about this place. And anyone who knows me I like a good burger. So a visit was in order. Okay the experience didn’t go as planned….

TripAdvisor Averages

So I’m assuming that the rating for a location on TripAdvisor is calculated as an average. Take all the community ratings and then work out the average. That’s all well and good as far as averages go. So Burger Club was a 4/5 rating, which I’m happy with, but you have to delve into the detail a little closer. The recency of ratings will tell an awful lot.

In this case a change of management was leading to a string of one and two star reviews. With it being a recent change it meant the stronger ratings were still giving a higher rating average. Nothing wrong with that if it’s a blip. The warning signs are in the actual reviews which I wasn’t reading until I was waiting for my food (not making that mistake again).

How To Fix It?

There’s a simple fix: give an overall rating (the average) and then ratings for the last six months and the last month. Then it’s down to the consumer to decide whether to go ahead eating/drinking/staying there or not. If I’d taken the time to look at the recent ratings beforehand then things might have been different. That’s my fault and no one else’s, and certainly not TripAdvisor’s.
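As a rough sketch of that fix in Clojure: the reviews below are invented, the review shape (`:stars`, `:days-ago`) is an assumption of mine, and I have no idea how TripAdvisor actually stores or calculates anything. Six months is taken as 183 days and a month as 30.

```clojure
;; Hypothetical reviews: a star rating plus how many days ago it was left.
(def reviews
  [{:stars 5 :days-ago 400}
   {:stars 5 :days-ago 380}
   {:stars 5 :days-ago 350}
   {:stars 5 :days-ago 300}
   {:stars 5 :days-ago 250}
   {:stars 4 :days-ago 200}
   {:stars 2 :days-ago 90}
   {:stars 1 :days-ago 10}])

(defn avg-stars
  "Average star rating of reviews left within max-days, or nil if there are none."
  [reviews max-days]
  (let [recent (filter #(<= (:days-ago %) max-days) reviews)]
    (when (seq recent)
      (/ (reduce + (map :stars recent)) (double (count recent))))))

(defn rating-summary
  "Overall average alongside the last six months and the last month."
  [reviews]
  {:overall    (avg-stars reviews Long/MAX_VALUE)
   :six-months (avg-stars reviews 183)
   :last-month (avg-stars reviews 30)})

(rating-summary reviews)
;; => {:overall 4.0, :six-months 1.5, :last-month 1.0}
```

The overall 4.0 looks healthy; the two recent windows are where the change of management shows up.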

Recency is everything though especially when there’s a radical change in the structure of a place. Different people lead to different outcomes.

I hope the Burger Club just read their reviews (and not the rogue five star review from the chef’s mate) and do great things in Portstewart again.

Airline Seat Auctions… What to bid? Part 1. #travel #airlines #auctions #clojure


Airlines have spent years perfecting the yield algorithms that determine flight prices and the break even point. The basic rule of thumb is the closer you get to the travel date the more expensive your flight will be.

Going, Going, Gone…. again.

So it looks like seat auctions are on their way back, once you’ve already bought your seat. The Economist ran a report yesterday on the increasing use of auctions on premium seats, a rather nifty way of upselling to the traveller once they’ve already parted with the cash. Depending on the airline and whether you need to be enrolled on one of their programmes will determine what your options are.

Interestingly Richard Kerr, aka The Points Guy, offered a handy calculation as a rough guide as what to bid. So I thought I’d give it a whirl. Previously I’ve done auction systems in the airline industry but that was for the entire aircraft, not a single seat. Fun, fun, fun indeed.

A Working Example

Okay, suppose I want to fly from London Heathrow to Dubai (LHR -> DXB). What I first need is the basic economy price and then the price for the premium economy.

A quick look gives me the following:

Economy LHR->DXB: £352

Premium Economy LHR->DXB: £968

So the question is, what to bid? The Points Guy offers a simple equation that gives a sensible guide price.

Bid offer = (premium seat price - economy seat price paid) * percentage

The percentage can be anything you want but the guide set by The Points Guy is between 20-40%. To implement this in Clojure is easy enough, it can be done in one line.

user> (defn calc-bid [premecon econ pc] (* pc (- premecon econ)))

So using the prices I have for the LHR->DXB flight, I’m going to take the low end of the guide, 20% (0.2 as a fraction), and see how that looks.

user> (calc-bid 968 352 0.2)
;; => 123.2

A suggested bid of £123.20 on top of the £352 economy fare already paid, so £475.20 all in, a potential saving of £492.80 on the premium economy class price. That’s promising, but it does not take into account factors such as how many other people are bidding, scarcity and so on.
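To see the whole 20-40% guide range at once, here is a minimal sketch. It assumes the percentage is applied as a share of the price difference, exactly as the equation reads, and takes it as a whole number (20 for 20%) so the arithmetic stays tidy. `bid-for` is a name made up for this example.

```clojure
(defn bid-for
  "Bid in pounds for a whole-number percentage, e.g. 20 for 20%."
  [premium economy pct]
  (/ (* (- premium economy) pct) 100.0))

;; Bids at 20%, 30% and 40% of the LHR->DXB price difference.
(map #(bid-for 968 352 %) [20 30 40])
;; => (123.2 184.8 246.4)
```

Each figure is what you would pay on top of the economy fare already bought, so even the 40% bid keeps the total well under the premium economy price.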

There’s nothing to stop you bidding the absolute minimum of £1 for example, depending on the number of bidders (which is highly controlled when you think about it). Perhaps you were flying on a Boeing 777 which can average a capacity of 382 seats depending on configuration, with an 80% load factor that’s 306 people who could bid but only a small percentage, 8% or so, would actually want to bid (not everyone is competitive or has the cash for example). So theoretically 24 people actually bid….
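The back-of-an-envelope numbers above can be written down as a small sketch. The 382 seats, 80% load factor and 8% bidding rate are the figures from the paragraph; the function name and the rounding to whole people are my own assumptions.

```clojure
(defn expected-bidders
  "Rough head count of passengers and bidders; percentages are whole numbers."
  [seats load-pct bid-pct]
  (let [passengers (Math/round (/ (* seats load-pct) 100.0))]
    {:passengers passengers
     :bidders    (Math/round (/ (* passengers bid-pct) 100.0))}))

(expected-bidders 382 80 8)
;; => {:passengers 306, :bidders 24}
```

Change the bidding rate and the field you are up against changes quickly, which is why the percentage you bid is only half the story.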

Then you’re into bid psychology and proper statistics, this post really needs a part 2…..



When everyone is saying #no doesn’t mean you should stop. #startups #bigdata #hadoop

“I want to be the new Clubcard in town…”

In 2009/2010 I was working on customer loyalty technology with a view to doing something on the phone. So uVoucher was born with probably too much fanfare and talking. The value for me wasn’t the iPhone app for the retailer, it was the data and the learning that could come off the back of it. Data to me was always the cornerstone of business decisions.


At that time the word “Hadoop” kept cropping up in connection with processing large volumes of data on commodity hardware. The idea of making one large computer out of lots of small ones to do processing appealed, especially where volumes of retail data are concerned.

Waterskiing On The Data Lake

The more I looked at Hadoop the more I saw huge potential, but I also saw a huge gap between the technical and the real world users. Configuration was a big pain to do well and, as a technology, it was way above the reach of what I would call the common user.

That spawned Cloudatics, the human version of Hadoop, say where your data is and choose from the drop down list of things you wanted to do to your data, start the job and then wait for the output. Simple….. it seemed obvious to me that was the way things were going, towards data platforms.

I made one fundamental error, I feel, I listened to other people’s opinion too much. At that time I liked to gauge opinion of people that I trusted. Some folk got it and some folk really didn’t get it, and when I say “didn’t get it” I mean at all….. “who’d use that!?”.

One other, “learn from that for future reference”, was I entered a pitch competition. Yet again, one person got it, “yeah, data mining for everyone” and the other slammed it right in my face, “no business would use this”. I left the room with the pair of them arguing. Northern Ireland wasn’t ready for big data or Hadoop…. so pitching it was a bad idea. I’ve never pitched since and never will, build, ship and sell is the only way.

So fast forward five years to the Expo hall on Thursday 2nd/Friday 3rd June 2016 at Strata Conference in London. Pretty much every stand down there is a data platform, or a company that is highly integrated into a data platform, with the exception of O’Reilly.


Have a Hunch and the Data to Back It Up?

Then go for it, and I’m not saying don’t listen to anyone. Yes sometimes you have to be bloody minded and forge ahead to see what the challenges are (in fact, some would say I made a career out of it) but sometimes it is wise just to get that sounding board feedback.

Be careful where and who you pitch to. While every competition is happy for you to stand and do your three minutes, not every judge is going to get it or support what you are saying. And I don’t go for the whole “gut” thing either.

I’m certainly not bitter or unhappy, I’m quite the opposite, I love where I work. I love the BigData community once you get past the marketing “it’s just a gimmick” naysayers.

Ultimately it’s about the right message, at the right time at the right place.

Hadoop In NI Hospitals – #WildSundayThoughts #NHS #NI #data #hadoop

Sunday mornings are for tea, The Sunday Times and thinking. And with changes in my daily work routine, all for the better, that’s got me thinking on large scale things with data again.

I’ve been thinking about Hadoop a lot again over the last week. “Is it dead?” posts in Quora, Spark 2.0 coming out and me working with Terraform, DCOS, Marathon and Mesos to create some quite remarkable things. Hadoop started it all for me and a good few years ago as I was reminded.

“Can I just say, you were the first, and only person speaking about Hadoop and BD in NI for years, ….”

Which was nice…. but over tea this morning I started thinking about the bigger picture. I love Northern Ireland, I love the startup scene though I’m not really involved anymore and I love data enabled stuff. And that got me thinking….


Do Connected Health Ideas Deal With The Big Problems?

There are some excellent companies coming out of the connected health side of things, where the personal collection of data can provide some feedback on performance. Even though I occasionally whine about the reliability of the data out of wearables, there are companies like AppAttic that are attempting to change user behaviour in this feedback loop. Bravo! Add to that the Invent2016 finalists such as Elemental, Take Ten and Kraydel, with additional heavy duty wallop from C-TRiC, and you can see that things are happening.

These products could be classed as external to the main health hub of the NHS and while the notion of data being passed around to each other is enticing, the reality is much harder than that.

In my opinion connected health ideas deal with the prevention, monitoring and catching early and there’s certainly use in that data. Without wanting to de-humanise the notion of a hospital, are we even tapping the 1% of the data available within these institutions?

Probably not.

Data Platforms In Hospital?

Absolutely. While the emphasis on connected health seems to come from outside the health system we can’t discount what goes on within the walls of it. Every hospital, GP surgery, health centre and clinic is a separate data platform. Different sectors of data from different backgrounds and groups of population.

On a compliance level it’s very difficult, nigh on impossible, for a startup to launch a product within a health location, hospitals especially. Not without years of testing, accreditation and reported findings. Being a University spin out at this point becomes very appealing; I’d wager that getting inside the NHS with findings is easier with university backing.


The large BigData vendors love the Healthcare card, selling vast infrastructure to authorities and working closely with them through consultants and so on. Infrastructure on this scale is expensive and with the NHS watching the purse strings it is a hard sell to accept.

If Not The Cloud, Then Where?

So, every piece of machinery, everything that can emit data can more than likely be stored. And I’m not saying, “to the cloud”, privacy dictates otherwise (with the exception of Deep Mind doing analysis on historical data). And I for one, even as a big/open data advocate, would be very concerned if time critical raw data was pushed to the cloud for analysis.

So if not the Cloud, then where? In house….

For first line analysis the data shouldn’t be leaving the building. Now this sounds like an infrastructure nightmare waiting to happen. I think there’s a potentially simple solution.

The plus point of Hadoop infrastructure was it was built for “commodity hardware”, every node in the cluster could be a fairly low grade machine and chug away in the background. And guess what, the hospital is full of them and hardly being used to capacity.


The machines are already networked. It’s just a theoretical case of having a small number of machines to handle the incoming data (a Kafka queue or two) and somewhere to store the data (HDFS). A Hadoop master can then handle the nodes. A node in this instance is a machine under the desk of each ward or office. Take an entire hospital and you have one large data processing cluster without any capital spend.

It’s not a difficult system to put together and it’s mainly geared for batch work, I wouldn’t expect it to do anything in real time, nor would I unleash a Spark job on it (loads of cores, loads of memory, the hospital isn’t geared up for that kind of work). Using the existing infrastructure does mean trade offs and that’s okay, if you’re sensible about the use case and think batch processing then there’s something of use here.


There’s a string of questions to answer I know, but in the pragmatic utopian vision for cheap, scalable data processing that keeps user data private, this is a starting point. Consultants and analysts working together with their questions could potentially see a holistic view of the hospital for that day. “Why did all the baseline temperatures rise by 2 degrees in that ward alone, is that something we need to look at?”.
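To make that last question a bit more concrete, here is a toy sketch of the kind of end-of-day batch check I have in mind. It is plain Clojure over an in-memory list, not a Hadoop job, and every ward name, baseline and reading in it is invented for illustration.

```clojure
;; Hypothetical baselines and readings; none of this is real patient data.
(def baseline {"Ward A" 36.8 "Ward B" 36.9})

(def todays-readings
  [{:ward "Ward A" :temp 36.7}
   {:ward "Ward A" :temp 36.9}
   {:ward "Ward B" :temp 39.4}
   {:ward "Ward B" :temp 39.6}])

(defn mean [xs] (/ (reduce + xs) (double (count xs))))

(defn wards-to-review
  "Wards whose mean temperature today is at least threshold degrees above baseline."
  [baseline readings threshold]
  (for [[ward rs] (group-by :ward readings)
        :let [rise (- (mean (map :temp rs)) (get baseline ward))]
        :when (>= rise threshold)]
    ward))

(wards-to-review baseline todays-readings 2.0)
;; => ("Ward B")
```

On a real cluster the grouping and averaging would be the batch job and the readings would come off HDFS, but the question being asked is the same.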

Merely ideas I know, you never know, someone on the inside might just run with it. If it helps lives then I’m all for it.



My #Invent2016 Predictions…. ego bruised already.

The joys of startup ideas and judging panels, you never know how these things are going to go. Now the post I did was based on gut, not data. Lack of data means lack of risk reduction….. so my predictions on the group and final winners were made with little certainty anyway.

I had my thoughts and the judges had theirs.

So far Jase, not great.

So 50% of my group winner predictions have gone already, plus my predicted finalist is out. Ah well, such is life… I didn’t put money on it so no one lost out. I’ve got my opinions and for once I’m keeping them to myself.

Consumer Internet

Finalists: VenueBooker, Orca Money, E+Press and Locate A Locum

Prediction: Orca Money


Finalists: Embed, JumPack, PetalRod and SmartVent

Prediction: Embed

Agri Science

Finalists: Copeland Spirits, Green Lizard Tech, Oran Oak and Purple Magic

Prediction: Purple Magic (Hunch on name alone)

Life and Health

Finalists: Take Ten, TILT, Kraydel and ECGeasy

Prediction: TILT


Finalists: PacTec, PayOx, Point Energy and SnapIt

Prediction: PacTec

Enterprise Software

Finalists: Elemental, The Shield, Right Revenue and SpaceHop

Prediction: SpaceHop

I salute any startup getting this far in the first place, it’s not easy. Putting heart and soul in to an idea takes guts and determination. So well done to all of them. I can’t put my finger on a winner though, it’s now wide open and I don’t know enough about them. The finals are going to be very interesting.


The Day #IoT Lost All Credibility

Are you working on an IoT startup? A lot of them are starting to sound like ideas devised from a bunch of village idiots dosed up on Buckfast on a Sunday afternoon.


Buckfast: Aiding Entrepreneurial Plans Since 1976……

The Day It Died

The day it ended, the day the IoT community should read about and hang its head in shame, came yesterday. And boy I bit my tongue.


Really! Really! Shock people when their bank balance gets too low? Is this what the world is going to come to? I honestly don’t want part of this stupid dream world.

Currently it’s about consumer choice, but find me a consumer who wants to be tagged to their bank, no one….. did market research ever enter the minds of the “ideas men and women”? And what’s worse is that the BBC gave it an audience, the Tech section needs A GET IN THE SEA tab and quickly.

Just because APIs can talk to each other doesn’t mean they should. Should I get a shock from a really annoying friend to tell me (not remind me, TELL ME) to get a bottle of milk as I’m nearly running out? NO BECAUSE I HAVE EYES! FLIP. I would put money on the fact a blind person would not need IoT either, they’d use their sense of touch to figure the same problem out.

The End Of IoT

What you have witnessed reader is the marked beginning of the end of consumer IoT devices, there are no transforming ideas. People don’t want to be tracked and their data content “put in the cloud” so they can be marketed to.

Enterprise problems are cumbersome and boring but they do make money, mainly because they will never put up with this insane nonsense of new IoT ideas. Mud. Wall. Stick…. I doubt it.


The Eurovision Voting Aftermath – #Eurovision #Voting #Data


The changes to the voting did make Eurovision more exciting, that’s for sure. It also kept the bookies on their toes, just watching the odds for Spain go from 100/1, to 20/1 to 100/1 was enough to tell me how changeable the whole thing was going to be.

The Public Vote Changes Everything

The arguments of politicised voting, well the jury votes seemed fairly level headed all in all and at the half way point Australia were in the lead. That made sense, as the song was strong.

Without coming to conclusions too quickly without further analysis I think it’s fair to say the cultural sway of the voting lies with the public.

Consistency is key

Now not looking at any other previous voting but if a country can consistently poll over 8.4 points then they’re on to a winner. The public weren’t buying into the Australian entry (just in case we had to watch next year’s final at 11pm at night ’til 3am, though that wasn’t the case at all).

Poland was an interesting case. Though by the halfway stage there was no real chance of winning, the public vote was high and it totally re-arranged the scoreboard.

Country      Avg. Jury Vote   Avg. Public Vote   Avg.
Australia    8.42             5.16               6.79
Ukraine      8.79             8.07               8.43
Russia       6.5              8.80               7.65
Poland       1.75             6.34               4.04
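The last column in that table looks like the plain mean of the jury and public averages, so here is a small Clojure sketch that recomputes it and ranks the four countries. That reading of the column is my assumption from the numbers, and `combined` and `ranked` are names made up for the example.

```clojure
(def votes
  {"Australia" {:jury 8.42 :public 5.16}
   "Ukraine"   {:jury 8.79 :public 8.07}
   "Russia"    {:jury 6.5  :public 8.80}
   "Poland"    {:jury 1.75 :public 6.34}})

(defn combined
  "Mean of the average jury vote and the average public vote."
  [{:keys [jury public]}]
  (/ (+ jury public) 2.0))

(defn ranked
  "Country names ordered by combined average, highest first."
  [votes]
  (map first (sort-by (comp - combined val) votes)))

(ranked votes)
;; => ("Ukraine" "Russia" "Australia" "Poland")
```

Sort on the jury column alone and Ukraine still leads these four but Australia moves up to second; sort on the public column alone and Russia edges it. The combined order only falls out once both halves are in.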

Interestingly the Russian jury didn’t give Ukraine a single vote, nor did the Ukrainian jury give anything to Russia. The public on the other hand went the complete opposite way, Russia giving 10 to Ukraine and Ukraine giving 12 to Russia. Kind of dispelling any myth that the respective juries were up to something.

And the UK?

The public know when they don’t like something. Only Australia, Ireland and Malta public votes scored the UK any points and even those weren’t that high. And while the jury vote saved face a little; Albania, Australia, Czech Republic, Denmark, Estonia, Ireland, Malta, Russia, San Marino and Serbia; the message was clear for the rest of Europe, there’s no such thing as Brexit vote handout.

Country          Avg. Jury Vote   Avg. Public Vote   Avg.
United Kingdom   5.4              2.66               4.03

The new Voting Is a Keeper….

I’m fairly sure the bookies will be waking up, poring over the voting and calculating new odds methods for next year. This voting, though, I think is here to stay; it certainly made things more exciting.

Anyone hoping for some sweeteners from Europe for the UK to send positive vibes for June 23rd, well I think it’s fairly clear they really don’t care that much.


Stuff #Eurovision Betting Odds! It’s #Invent2016 Semi Finals! #data #northernireland #startups

Invent is the new Eurovision!

When you know that Australia are 1/4 to end in the top five of Eurovision it’s time to focus on something else. So I turn my attention to Northern Ireland’s finest….

The closest thing we’ve got to a man singing in a dress is Alex Kane (fair play to him for keeping his word), all he needs is the beard and we have NI’s own Conchita, may not be a good idea to order an award cake though.


Finding Data On The Invent Semis…

With tea in hand and the sun shining (like three days in a row now, Cecilia’s on annual leave surely) I went about digging around the 24 finalists. I had good intentions to sit down and methodically go through each entry of the Invent2016 semi finalists and get some data. The original plan was to manually trawl through the excellent Mattermark website and get the growth scores and other juicy numbers….. I gave up.

The simple fact of the matter is that anything early stage has to work hard on getting some searchable data out there, whether that’s a Crunchbase listing or some sort of social media thing. It’s all relative and remember, Northern Ireland startups need the one big gold mine moment where all eyes look here and see more gold to be mined. So exposure to the masses is key.

Because a lot of these companies are very early stage and prototypes it’s very difficult to get any good data on them. And as I’m so out of the loop with the startup sewing circle knowing much about these ideas is pretty much impossible. So I resort to mere guesswork…. why the heck not. It currently beats looking at odds checker and knowing Russia are basically going to win Eurovision.

My Methodology…

Do you really think I have one with a lack of data. Two words: Startup Sexy. That’s my methodology.


Nailing My Predictions To The Mast

I’m going to look at each category and with the minimum of Yorkshire/Hampshire wit and sarcasm predict the winner of each category and then have a go for the overall winner. It seems I am an idiot, I’m sure there’s easier ways to live life.

One observation and I’ve come across this before. Be careful which category you are in. I think Locate A Locum is enterprise and SpaceHop is more consumer, that might have an impact on the group winners. Just a thought….

Consumer Internet

Finalists: VenueBooker, Orca Money, E+Press and Locate A Locum

Prediction: Orca Money


Finalists: Embed, JumPack, PetalRod and SmartVent

Prediction: Embed

Agri Science

Finalists: Copeland Spirits, Green Lizard Tech, Oran Oak and Purple Magic

Prediction: Purple Magic (Hunch on name alone)

Life and Health

Finalists: Take Ten, TILT, Kraydel and ECGeasy

Prediction: TILT


Finalists: PacTec, PayOx, Point Energy and SnapIt

Prediction: PacTec

Enterprise Software

Finalists: Elemental, The Shield, Right Revenue and SpaceHop

Prediction: SpaceHop

The Overall Winner?

Really difficult to know as there’s nothing gut wrenchingly jumping out at me. So I’m going to hazard a guess with Orca Money with an each way on SpaceHop. Though let’s be honest I’d not risk any of my own money on a bet.

I’m too risk averse.

Best of luck to all the competitors.

