You Need to See This!….

A kind of ritual viewing has developed between me and the teen recently, especially as the pair of us share a large interest in statistics and comedy. They’ll suggest one thing, we’ll watch it, and then it’s my turn. So far I’ve gone through Monty Python, Billy Connolly, Jasper Carrott, Bill Bailey and so on… and in return I get shown Game Theory and Film Theory in equal measure.

Then one evening, it hit me….. YOU NEED TO SEE THIS!

Dave Gorman’s Googlewhack Adventure……

(Now as this isn’t an official version of the live show I encourage you to venture here for the DVD and here for the book.)

Francophile namesakes

I owned the DVD when it came out in 2004 and it was a wonder to watch. Even after being involved in the web and data industry since 1995, I was still mesmerised.

It’s the true story of Dave Gorman, tasked by his friend, Dave Gorman, to find a continuous chain of ten Googlewhack connections. He meets each Googlewhacker in person, and each one supplies Dave with two further Googlewhacks of their own finding. It is, in all seriousness, compelling viewing. And I know what you’re thinking…

What’s a Googlewhack? We need to go back in time a bit.

Googlewhack is a contest for finding a Google search query consisting of exactly two words without quotation marks that returns exactly one hit. A Googlewhack must consist of two actual words found in a dictionary. A Googlewhack is considered legitimate if both of the searched-for words appear in the result page. (From Wikipedia)

Watching again in 2019, it’s still a brilliant story, but it’s also interesting to see how much the internet has changed, some of it for the better and some of it for the far worse. There’s a far more important question, though.

Can It Still Be Done?

Have our accelerated lives, data, data shadows and other digital fingerprints rendered all of this history? Or is there a minute glimmer of hope that it could still be done?

The Oxford English Dictionary contains 171,476 words in current use. From this point on it’s a combinatorics problem: how many word pair combinations actually exist? Back in 2004 I wouldn’t even have known how to ask that question, let alone find an answer for it… oh how my life has changed.

Picking two different words from that list, ignoring the order they’re typed in, gives 14,701,923,550 word pairs that could be searched in an attempt to find a Googlewhack. Fourteen billion… and from my point of view that’s not a big data problem, it’s an average sized data problem. How long would it take though?
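The fourteen billion figure is easy enough to check these days. Here’s a minimal Python sketch, assuming the order of the two words doesn’t matter and a word is never paired with itself. The name WORDS_IN_USE is just my own label for the dictionary figure quoted above.

```python
import math

# Words in current use, the dictionary figure quoted above.
WORDS_IN_USE = 171_476

# Assumption: order doesn't matter and a word is never paired with itself,
# so we want "n choose 2", which is n * (n - 1) / 2.
word_pairs = math.comb(WORDS_IN_USE, 2)

print(f"{word_pairs:,}")  # 14,701,923,550
```

If you decided that swapping the two words around counts as a different search, the number simply doubles.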

A quick Google search on “Francophile Namesakes” tells us two interesting facts.

Firstly, there are 59,800 results… no longer a Googlewhack by any stretch of the imagination. Secondly, the result took 0.33 seconds to find. (14701923550 * 0.33) / 60 gives us 80,860,579 minutes to do all the word pair searches, or about 1.35m hours. Basically, a single computer would take around 154 years just to go and hit Google with all the pairs to find a Googlewhack.

In our world of clustered computing, with loads of computers doing the job at the same time, I could deploy 1,000 machines and it would still take over fifty days to do the work.
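For anyone who wants to play with the numbers, the back-of-an-envelope sum can be sketched in a few lines of Python. It’s a rough estimate only: it assumes every search takes the same 0.33 seconds, that searches run one after another on each machine, that the work splits perfectly across the cluster, and that Google never rate limits anybody (it would). The variable names are my own.

```python
# Figures quoted in the post.
WORD_PAIRS = 14_701_923_550
SECONDS_PER_SEARCH = 0.33   # time for the one "Francophile Namesakes" search
MACHINES = 1_000            # size of the imaginary cluster

# One machine, one search after another.
total_seconds = WORD_PAIRS * SECONDS_PER_SEARCH
total_minutes = total_seconds / 60
total_hours = total_minutes / 60
total_days = total_hours / 24
total_years = total_days / 365

# Assumption: the work divides evenly across the machines with no overhead.
days_on_cluster = total_days / MACHINES

print(f"Single machine: {total_minutes:,.0f} minutes")
print(f"Single machine: {total_hours:,.0f} hours")
print(f"Single machine: {total_years:,.1f} years")
print(f"{MACHINES:,} machines: {days_on_cluster:,.1f} days")
```

Change SECONDS_PER_SEARCH or MACHINES and the answer moves in a straight line, but on a single machine it stays somewhere around a century and a half.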

Ultimately, it doesn’t matter. It’s been done already, by Dave, in a time when you could easily do those kinds of searches and when human connection was the default standard of communication. And that’s when I was reminded what the internet has lost for me: the humanity of data. With Facebook, Twitter and all the other social networks, the social aspect is, to me, lost; it’s a broadcast medium for those who want to listen. Back in 2004 the landscape was much different… The Googlewhack Adventure just reminded me how much I missed it.

Thanks Dave. Sadly, I’ve no idea what that’s done to the graph.