Platypus Innovation Blog

26 June 2017

BitCoin: It won't make you look 5lbs thinner

Recently I've been reading up on BitCoin, Ethereum, blockchain, and the assorted crypto-currencies that now bloom across the techscape.

These are exciting times, but not clear times. Consensus ledgers and public cryptographically signed audit trails are interesting, and they'll find some powerful applications. But it's notable that we've all been perfectly happy to use money systems without them -- witness banks, Visa/Mastercard, PayPal, or the numerous companies who use direct-debits to tap your wealth according to their own notion of credit/debt.

Money systems are essentially about trust (that the token you hold can be exchanged for goods & services). Money has to be backed by something we trust, such as the nation state & its economy, or the bank's solvency and integrity at the book-keeping level, or for IOUs, the individual. BitCoin is the first system where trust is grounded in an algorithm.

That alone doesn't change how money is used. Perhaps BitCoin's biggest achievement is to get enough momentum to puncture the Visa/Mastercard monopoly (alongside PayPal, and now Android and Apple payment systems). These are (welcome) variations on an existing theme. We're still waiting to see a digital currency that really shakes up how commerce works.

People talk of low transaction costs -- in fact BitCoin transactions are computationally very expensive! This is by design: BitCoin uses computational cost to defend against attack by a swarm of bots. As I write this, a BitCoin transaction costs about $0.50 and takes about 20 minutes, but these numbers vary.* That BitCoin is cheaper than debit card or bank transfers is not due to technological breakthroughs. It is because the existing system is massively over-priced.

Reading the many earnest thought-pieces on how blockchain will change everything, it seems that people sometimes project their thoughts about the wider potential of digital currency onto the specific technology of blockchain and BitCoin. These thoughts then get repeated by others (and ratified by repetition), until it can become hard for everyone to see what's going on.

I've seen a similar concept-creep happen in AI. Specific techniques get described by analogy, and those analogies are extended, until people see The Terminator in every AI project. Your future toaster might have senses (to see when the toast is cooked), and voice recognition and machine-learning to learn your preferences. And probably some spare intelligence, because it's cheaper to put in generic over-powered components than to custom make simpler ones. But no amount of feature-creep will turn it into The Terminator. Although it will get hacked by some internet d-bag, inscribe rude messages into your toast, then burn your house down. If you want a picture of AI over the next 5 years, imagine everything as smart toasters.

*BitCoin transactions offer a fee to encourage BitCoin miners to process them, and to do so ahead of other transactions (there is a queue). There are plans to alter BitCoin to allow for cheaper transactions, though this requires a reduction in security, and changing a currency on the fly is not easy.

Images: (cc) FamZoo Staff, famzoo on Flickr, and the Talkie Toaster from Red Dwarf.

15 May 2017

How to make a lightbox in DoubleClick for Publishers (DfP), or Wrestling with an AdTech Monster

If you're working in adtech, you'll probably encounter DoubleClick for Publishers (DfP) - it is the most common publisher-side adserver (or SSP). We've recently produced a version of Good-Loop for DfP. This article lists the various booby traps we found along the way. For further details, see our scars.

First up: If you use an adblocker, turn it off. Understandably, that can break DoubleClick, given that it is an adserver.

So you want to write an expanding lightbox -- well DoubleClick supports that, via DoubleClick Studio Enabler (Enabler.js), and how lovely, they even provide a template to get you started.
Except it is broken. We modified the template and it didn't work. We used the original exactly as provided, and it did not work. As far as I can tell, this is obsolete and broken code; they just forgot to update the documentation. Hidden in a disused basement of the DoubleClick website, inside a cabinet labelled "beware of the leopard", there is a note to say that SafeFrame is the preferred method. Do not use DoubleClick Studio Enabler -- use SafeFrame instead.

SafeFrame requires you to know the size of the ad slot. And it provides a query for finding out page size info. However this is not as useful as you'd hope, because you have to specify the slot size first, before you can query the page size. So you need to pass that info into your creative -- which is done via a couple of macros: %%WIDTH%% and %%HEIGHT%%.

You'll need to use the SafeFrame $sf.ext.geom() method to provide the right expansion size. Beware that this can be affected by timing issues. Don't call it until you're about to do the expansion.

At the time of writing, DfP's implementation of SafeFrame was slightly flaky on iOS. To get reliable results, we put in code to retry the lightbox expansion (and the close) a few times, stopping once our handler receives a success message.

You'll probably want to know where your advert is going -- but DfP doesn't tell you by default, because: I don't know. You can get this by adding the %%SITE%% macro to your script url. As an example, Good-Loop's final DfP creative looks like this:

<script src="//"></script>

The next gotcha is the DfP Preview feature. This handy feature provides a link to see your creative in situ. Except DfP preview does not work for rich media ads. This is due to a bug in DfP: the preview simply doesn't properly support the lightbox features -- although the real DfP iframe does. This makes preview confusingly useless for lightbox ads like Good-Loop. So avoid Creative -> Preview.

So how can you preview an advert? The best approach we found was to set up a very targeted line-item. In DfP, create a line-item that targets a specific (and rare) device. Then in your Chrome browser, open the developer console, toggle device mode, and set Chrome to emulate that device type.

An annoying feature of DfP is that it will sometimes fill slots with random adverts, rather than the line item you want. Swearing and mashing the settings is one solution.

Another source of frustration is the delays. Edits in DoubleClick can take 10 minutes to percolate through the slow-as-treacle systems and actually take effect. This multiplies the pain of the trial-and-error stress test that is working with DfP. To minimise this, we moved our html code to being dynamically generated, with the DoubleClick creative reduced to a single .js script tag.

Now we're serving adverts...

Bugs from the host-page CSS: It's easy to break a SafeFrame's ability to expand to full-page. A common CSS rule goes something like:
iframe /* , and some other element types */ {
 max-width: 100%;
}

Looks reasonable, but it restricts the iframe's width to 100% of the parent element's - which is almost always very small. Your SafeFrame does an expand... to the same size it was before.

If the publisher is cooperative, they can fix this by adding a very specific CSS rule such as:

iframe[id^='google_ads_iframe_'] { max-width: none; }

25 April 2017

The Council of the Gods, by Kit Wright

Lay no blame. Have pity.
Put your fingers in the wounds of the Committee.

They never reached your item.
Disputing Item One ad infinitum.

Lay no blame. Be tender.
The retrospective start of the agenda.

Was all they managed treating.
Consider, pray, the feeling of the meeting.

(They felt awful). Not surprising
They never came to matters not arising.

From Matters Arising:
Who took the chair when the standing committee last sat?
Who kept the minutes for hours and hours and hours?
Who tabled the motion,
Who motioned the table
The standing committee
Have pity.
Put your fingers in the wounds of the committee.

The gods have not been sleeping.
All night they sat, in grief and boredom, weeping.

By Kit Wright, Amazon book link

29 December 2016

O Have You Caught the Tiger?, a poem by A.E. Housman

O have you caught the tiger?
And can you hold him tight?
And what immortal hand or eye
Could frame his fearful symmetry?
And does he try to bite?

Yes, I have caught the tiger, 
And he was hard to catch.
O tiger, tiger, do not try
To put your tail into my eye,
And do not bite and scratch.

Yes, I have caught the tiger.
O tiger, do not bray!
And what immortal hand or eye
Could frame his fearful symmetry
I should not like to say.

And may I see the tiger?
I should indeed delight
To see so large an animal
Without a voyage to Bengal
And mind you hold him tight.

Yes, you may see the tiger;
It will amuse you much.
The tiger is, as you will find,
A creature of the feline kind.
And mind you do not touch.

And do you feed the tiger,
And do you keep him clean?
He has a less contented look
Than in the Natural History book,
And seems a trifle lean.

Oh yes, I feed the tiger,
And soon he will be plump;
I give him groundsel fresh and sweet,
And much canary-seed to eat,
And wash him at the pump.

It seems to me the tiger
Has not been lately fed,
Not for a day or two at least;
And that is why the noble beast
Has bitten off your head.

16 August 2016

Computer Generated Haiku - a project by Aji Alham Fikri & Daniel Winterstein

Here are a couple of write-ups of the computational creativity in poetry research that Aji and I did last year:

I'm planning to extend this into a general purpose poetry generator / evaluation. You can see the work-in-progress notes for that here: a JSON specification format for poetry

22 June 2016

Agile Procurement?

Image from Brazil, a gloriously dark comedy about bureaucracy and power by Terry Gilliam. Actual relevance to this post: low, but I like the movie.
When it comes to software, public bodies spend a lot of money yet often have second-rate web-sites and systems. Why?

Partly, because high-profile software projects are difficult, and public-service software often has to handle lots of corner-cases that make off-the-shelf solutions harder to use. But partly it is their own fault: Public procurement sets up systems that almost ensure they will pay too much for second-rate software. Why?

One such model is the framework agreement. Companies first tender to be on the list to tender for actual work. Bureaucratic framework agreements create an overhead that eliminates most small software companies (who are of mixed quality but contain many of the best developers) in favour of large contractors (who tend to charge more, often a lot more, and deliver older and less flexible software).

I expect the bureaucracy is trying to manage risk & overhead: Do some heavy vetting once, then re-use it. But this vetting is of little value. A large contractor will submit their successful past projects, possibly carried out by teams who have no connection to the teams that will then work on the tender. A small contractor has less track record to draw on, so is at a disadvantage. In spite of the care taken by procurement, government software projects often over-run or fail to deliver. In spite of... or because of?

There are better ways to manage risk in software projects! We need Agile Procurement. Not procurement of agile software, but agile ideas in procurement itself. That is, procurement teams who work iteratively with the supplier and consumer teams, taking small short-term risks as the best way to manage costs and avoid large risks.

I'd also like to see multiple redundancy in procurement. Instead of betting everything on one big contract with one supplier... have several suppliers produce prototypes, at least for the early stages. If one supplier fails to deliver -- it's not a problem. This would allow for lighter touch procurement -- opening the door to SME software development companies. Given the difference in costs, I believe this would actually lower the overall price. It also allows more ideas to be explored, and it introduces some post-tender competition -- and hence better software at the end.

10 May 2016

A Simple Intro to Gaussian Processes (a great data modelling tool)

A Gaussian Process, fitted using MatLab, showing most-likely-value & confidence interval. Note how the shape hugs the data, and how the uncertainty varies depending on the data - sometimes the model is confident, sometimes it isn't, and you know which is which.

Gaussian Processes (GPs) are a powerful technique for modelling and predicting numerical data. Being both relatively new and mathematically quite complex, they're not as well known as other techniques. They have some strong advantages:
  1. Flexible: Can be used to model many different patterns.
  2. You make fairly few assumptions about the model.
  3. Based on probability theory, so they have a solid mathematical grounding.
This article is an easy-to-read introduction. Instead of diving into the maths behind a Gaussian Process, let's start with a simpler algorithm.

K Nearest Neighbours (KNN)

K Nearest Neighbours is a classic AI algorithm, and very easy to understand. It also provides a surprisingly good basis for explaining Gaussian Processes (without any maths involved yet). Here's how KNN works:

The task: Given an item x, predict the category of x. E.g. you might ask, Is this email spam or not-spam?
The method:
  1. Let's pick k=5.
  2. Given an input item x... E.g. a fresh email has arrived, is it spam?
  3. Look through your training data, and find the 5 items most similar to the input item x. These are the nearest neighbours.
  4. Look at the categories those 5 items have, and predict the most common category from them.
  5. Done :)
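
The steps above can be sketched in a few lines of Python. This is only an illustration: the function names (knn_classify, shared_words), the toy emails and the word-overlap similarity are all made up for the example, and a real spam filter would need a much better similarity measure.

```python
from collections import Counter


def knn_classify(x, training_data, similarity, k=5):
    """Predict the category of x from its k most similar training items.

    training_data is a list of (item, category) pairs.
    similarity(a, b) returns a number; higher means more alike.
    """
    # Step 3: rank the training items by similarity to x, keep the top k.
    neighbours = sorted(
        training_data, key=lambda pair: similarity(x, pair[0]), reverse=True
    )[:k]
    # Step 4: predict the most common category among those neighbours.
    votes = Counter(category for _item, category in neighbours)
    return votes.most_common(1)[0][0]


def shared_words(a, b):
    """Toy similarity (an assumption): the number of distinct words in common."""
    return len(set(a.lower().split()) & set(b.lower().split()))


# Made-up training data, purely for illustration.
emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "not-spam"),
    ("lunch on monday?", "not-spam"),
    ("notes from the project meeting", "not-spam"),
]

print(knn_classify("free prize inside click now", emails, shared_words, k=3))
```

Note that nothing is trained: all the work happens inside the call, and swapping in a different similarity function changes the model.
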
A few things to note about the KNN algorithm:
  • It is a lazy algorithm. The model isn't trained in advance, as with say Linear Regression or Naive Bayes -- instead the work is done when an input item is presented.
  • The data is the model. If you have enough data, KNN can model a really wide range of patterns. Rather than being forced into a particular shape, the data can speak for itself. There is a cost to this though - it requires more training data.
  • The key part is judging when items are similar. How you do that will depend on the problem you're looking at.
These are also key properties of Gaussian Processes.

The classic KNN algorithm is for predicting categories (e.g. spam / not-spam), but we can modify it as follows to make numerical predictions (e.g. the price of fish):
   Step 4': Having found the k nearest neighbours, take the average value.
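
As a rough sketch of the numerical variant, assuming a distance function (lower means more similar) and some invented fish prices keyed by day number; the names knn_predict_value and days_apart are hypothetical:

```python
def knn_predict_value(x, training_data, distance, k=5):
    """Predict a number for x as the mean value of its k nearest neighbours.

    training_data is a list of (item, value) pairs.
    distance(a, b) returns a number; lower means more alike.
    """
    neighbours = sorted(training_data, key=lambda pair: distance(x, pair[0]))[:k]
    # Step 4': average the neighbours' values instead of taking a vote.
    values = [value for _item, value in neighbours]
    return sum(values) / len(values)


def days_apart(a, b):
    """Distance between two dates given as day numbers."""
    return abs(a - b)


# Made-up fish prices: (day number, price).
prices = [(1, 4.0), (2, 4.2), (3, 4.1), (8, 5.0), (9, 5.2), (10, 5.1)]

# Uses days 8, 9 and 10, so the prediction is about 5.1.
print(knn_predict_value(11, prices, days_apart, k=3))
```

Every neighbour counts equally here, and there is no measure of how sure the prediction is.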

From KNN to a Gaussian Process

So what is a Gaussian Process?
It deals in numerical data, e.g. the price of fish -- and for the rest of this article, let's say it is the price of fish as a function of date that we're modelling. We keep all the training data, and say that when predicting new data points from old ones, the uncertainty/noise will follow a multivariate Gaussian distribution. The relationship given by that distribution lets us make predictions from the training examples.

For non-Mathematicians, let me briefly cover the terminology from that paragraph. A distribution tells you how likely different values are. The Gaussian Distribution, aka the Normal Distribution, has a bell-shaped curve: the mean is the most likely point, and the probability drops off rapidly as you move away from the mean. The multivariate Gaussian (which is what we want) can specify correlations between multiple variables. The Gaussian Distribution naturally arises in lots of places, and is the default noise model in a lot of machine learning.

Similar to KNN, the Gaussian Process is a lazy algorithm: we keep the training data, and fit a model for a specific input. Also like KNN, the shape of the model will come from the data. And as with KNN, the key part is the relationship between examples, which we haven't defined yet...

Introducing the Kernel Function

A Gaussian Process assumes that the values at any finite set of points jointly follow a multivariate Gaussian distribution.
The "variables" here are the different values for the input x. In our price-of-fish example, the input is the date, and so every date gives a dimension! Yes, in principle, there are an infinite number of variables! However we only have a limited set of training data -- which gives us a finite set of "variables" in the multivariate Gaussian -- one for each training example, plus one for the input we're trying to predict.

A multivariate Gaussian is defined by its mean and covariance matrix, so these are the key things for us to calculate.

The kernel function of a GP says how similar two items are, and it is the core of a specific Gaussian Process. There are lots of possible kernel functions -- the data analyst (e.g. you) picks an appropriate one.

The kernel function takes in any two points, and outputs the covariance between them. That determines how strongly linked (correlated) these two points are.

A common choice is to have the covariance decrease as the distance between the points grows, so that the prediction is largely based on the near neighbours. E.g. the price of fish today is more closely linked with the price yesterday than the price last month. Alternatively, the kernel function could include a periodic part, such as a sine wave, to model e.g. seasonal ups and downs. The Wikipedia article lists some example kernel-function formulas.[1]
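
To make the two ideas concrete, here is a minimal sketch of a distance-based kernel and a periodic kernel, using the standard squared-exponential and periodic formulas. The function and parameter names are chosen for this example, and dates are assumed to be plain day numbers.

```python
import math


def squared_exponential(x1, x2, length=1.0, variance=1.0):
    """Covariance that decays smoothly as the two points move apart."""
    return variance * math.exp(-((x1 - x2) ** 2) / (2 * length**2))


def periodic(x1, x2, period=365.0, length=1.0, variance=1.0):
    """Covariance that repeats every `period`, e.g. for seasonal patterns."""
    s = math.sin(math.pi * abs(x1 - x2) / period)
    return variance * math.exp(-2 * s**2 / length**2)


# Yesterday is strongly linked to today; last month much less so.
print(squared_exponential(0, 1, length=7))
print(squared_exponential(0, 30, length=7))

# A year apart is as strongly linked as the same day; half a year is not.
print(periodic(0, 365))
print(periodic(0, 182.5))
```

The sum of two valid kernels is itself a valid kernel, so a trend-plus-seasons model can be built by adding these two together.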

The kernel function will often have some parameters -- for example, a length parameter that determines how quickly the covariance decreases with distance. These parameters are found using optimisation software -- we want parameters that optimise the likelihood of the observed data. We can write down the probability of the observed data (the likelihood) as a function of the kernel function parameters, and then pick kernel function parameter values to maximise the likelihood.
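
As an illustration of "pick the parameters that maximise the likelihood", the sketch below scores a handful of candidate length values and keeps the best one. Several things here are assumptions made for the example: the squared-exponential kernel, the small noise variance added to the diagonal, the invented data, and the crude grid search (real tools use proper optimisers). It is written in plain Python to stay dependency-free; a numerical library would be the usual choice.

```python
import math


def kernel(x1, x2, length):
    """Squared-exponential kernel with unit variance."""
    return math.exp(-((x1 - x2) ** 2) / (2 * length**2))


def cholesky(a):
    """Return lower-triangular low with low * low^T = a (a positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def log_likelihood(xs, ys, length, noise=0.1):
    """Log probability of ys under a zero-mean GP with the kernel above."""
    n = len(xs)
    # Covariance matrix of the training points, plus noise on the diagonal.
    cov = [
        [kernel(xs[i], xs[j], length) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    low = cholesky(cov)
    # Forward substitution: solve low * z = ys, so z.z equals ys^T cov^-1 ys.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][m] * z[m] for m in range(i))) / low[i][i])
    log_det = 2 * sum(math.log(low[i][i]) for i in range(n))
    fit = -0.5 * sum(v * v for v in z)
    return fit - 0.5 * log_det - 0.5 * n * math.log(2 * math.pi)


# Made-up, already-centred data: day numbers and price offsets.
days = [0, 1, 2, 3, 4, 5, 6, 7]
offsets = [-0.9, -0.6, -0.2, 0.3, 0.7, 0.8, 0.5, 0.0]

candidates = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
best = max(candidates, key=lambda length: log_likelihood(days, offsets, length))
print(best)
```

The likelihood balances two terms: how well the kernel explains the data, and a penalty (the log-determinant) for overly flexible models.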

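Given an input x (e.g. "next Thursday") and the training examples x1, x2, ... xn (e.g. each day over the last few months, with its price of fish), the GP model is that the values at (x, x1, x2, ... xn) have a multivariate Gaussian distribution with mean 0 and a covariance matrix defined by the kernel function. (A zero mean is less restrictive than it sounds: it is common to subtract the average from the data first.)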

From the covariance matrix, together with the known training values, you can then calculate the prediction for x. The prediction for x (or in probability theory terminology, the conditional distribution for x given the training data) will be a simple one-dimensional Gaussian. It has a mean value (which is the most likely value for x) and a standard-deviation for the uncertainty.
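
For the curious, that calculation can be sketched using the standard formulas for conditioning a multivariate Gaussian: mean = k*^T K^-1 y and variance = k(x, x) - k*^T K^-1 k*, where K is the covariance matrix of the training points and k* holds the covariances between x and each training point. The kernel, its fixed length of 2, the noise variance and the data are all assumptions for the example, and the tiny linear solver is only there to keep it self-contained.

```python
import math


def kernel(x1, x2, length=2.0):
    """Squared-exponential kernel with unit variance."""
    return math.exp(-((x1 - x2) ** 2) / (2 * length**2))


def solve(a, b):
    """Solve a * v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    v = [0.0] * n
    for i in reversed(range(n)):
        v[i] = (m[i][n] - sum(m[i][c] * v[c] for c in range(i + 1, n))) / m[i][i]
    return v


def gp_predict(x, xs, ys, noise=0.1):
    """Return (mean, standard deviation) for the underlying value at x."""
    n = len(xs)
    cov = [
        [kernel(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    k_star = [kernel(x, xi) for xi in xs]
    # weights = cov^-1 k_star, which is reused for both the mean and the variance.
    weights = solve(cov, k_star)
    mean = sum(w * y for w, y in zip(weights, ys))
    variance = kernel(x, x) - sum(w * k for w, k in zip(weights, k_star))
    return mean, math.sqrt(max(variance, 0.0))


# Made-up, already-centred data: day numbers and price offsets.
days = [0, 1, 2, 3, 4, 5, 6, 7]
offsets = [-0.9, -0.6, -0.2, 0.3, 0.7, 0.8, 0.5, 0.0]

print(gp_predict(3.5, days, offsets))  # inside the data: small uncertainty
print(gp_predict(30, days, offsets))   # far from the data: falls back to the prior
```

Far from any training data the prediction drifts back to the zero mean with the full prior uncertainty, which is the "you know which is which" behaviour described at the top of this article.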

Building a GP

This article has skipped over the technical details of how you carry out certain steps. I've blithely written about "optimising the likelihood" without saying how you do that. That's partly because there are multiple ways, and partly to keep this article simple. The short answer is: you'll be using software of course, and most likely software that someone has kindly already written for you.

I'm not going to recommend a particular software tool here, as the choice really depends on what you're familiar with and where you're using it. There are GP calculators available for many environments, e.g. Weka has one for Java [2], or you can talk to your local Winterwell office[3] :)

Going Deeper

[1] Commonly used kernel functions, in Wikipedia:
[2] Weka, a Java toolkit for machine learning.
[3] Winterwell, data science consultancy

So you want to know the mathematical details? Good for you!
Try reading these resources:

[4] "Gaussian Processes: A Quick Introduction" by Mark Ebden.
[5] "Gaussian Processes for Machine Learning" by Carl Rasmussen and (my boss once-upon-a-time) Chris Williams.
