This is the first of a series of blog posts about our cognitive pinball machine. In this post we will delve into why we gave a vintage machine a Machine Learning makeover. In following posts we will walk through the software and hardware we used so you too can build your own self-playing pinball wizard.
Over the past year, we have introduced thousands of developers to Cognitive Pinball at conferences and events across Australia. Cognitive Pinball is a vintage pinball machine in (mostly) working order with one key difference from games you might have seen or played before - we modified ours to play itself and improve over time.
Why make a self playing Pinball Machine?
Something we regularly heard from people at events was:
Why would you want a pinball machine to play itself?
Doesn’t that take away all the fun?
It’s true that a Pinball Machine that plays itself is probably not a lot of fun if you’re keen on Playing pinball.
However, it’s a great example of how combining software and relatively simple hardware can give analogue devices a new lease of life.
As software developers, typically we love shiny and new tech. The promise of File > New Project is almost always more exciting than working on someone else’s legacy; a chance to put your personal stamp on something fresh.
The reality is different. Not every project can be started from scratch. We need to realise value for the consumers of our software. As the amount of software running the world increases, so too does the need to integrate software with the physical world. Physical equipment, machines and tools like those used on manufacturing production lines or in mining sites are not necessarily natively digital, nor are they easily replaced. Some of them are complex, old, or both, but they do their job and they do it well.
The question is how we bring those physical assets into the digital world and apply the power of software to increase the benefits and value they offer.
The answer is augmentation.
Augmenting an aging, analogue asset
We wanted to prove that software could make a difference without major changes to the machine itself. Rather than digging in to the internals of the machine and drastically overhauling it to work with software, we kept the majority of changes external.
For example, instead of hacking the circuitry of the score display, we used an off-the-shelf web cam to read the dot-matrix screen and perform OCR (Optical Character Recognition) to track the score. The web cams also logged when the ball dropped through the out hole at the bottom of the play board. This provided the data for a ML reward/punishment algorithm that trained the machine to get better at keeping the ball in play.
The exception to our rule was allowing software to trigger the Start Game button, Launcher and Flippers. Internally, this amounted to 8 solder points, allowing the closure electrical circuits by an IoT device. Everything else is external to the machines casing; which has the added advantage of allowing the machine to still be played manually by person.
With the modifications above in-place, we deployed two Ml Algorithms.
- Curiosity based exploration, allows the software to discover the different parts of the physical table, and what impact they have.
- Proximal Policy Optimization based Deep Reinforcement Learning, rewards/punishes the software actions based on whether they have a positive (score increase) or negative (ball dropped) effect.
Our initial plan was that we would run training for 2 weeks, and wheel out a machine that was capable of beating any human player that stepped up to the plate. As with so many developer projects, that didn’t quite pan out.
This turned out to be a blessing in disguise as we realised people were much more interested in watching the machine learn and improve. This worked nicely because with each iteration of the software running we saw gradual improvement. Early in it’s lifecycle, the machine would seem to “thrash” about, seemingly at random.
As the Reinforcement Learning matured, the randomness would decrease, and we would start to see the machine look much more deterministic about its actions. We could start to track when it was reaching a level of maturity because it would exhibit behaviour such as, holding the ball on a flipper to enable it to make trick shots. This had 2 very exciting moments for us:
- During a 3 day event, the machine displaced one of the factory-set high scores, with a score of ~ 176 million. To date this score sits in 3rd place on the leader board; behind the factory-set #1 position (400M) and a #2 human competitor (barely scraping a lead with ~ 188M).
- During a 2 day event, we witnessed a level of deterministic play so precise the casual observer would swear that a human being was actually playing. Imagine walking up and thinking an invisible ghost was playing Pinball!
Part of the appeal of the augmentation solution is that it is relatively easily to replicate when compared with replacing everything. in fact, augmenting physical devices in this way is already happening in a range of industries and applications.
Australian resources company Newcrest Mining is using similar technology to augment its mining sites. The company has added sensors to its machinery to capture data, which is then fed through Machine Learning models. The ML provides insights to inform predictive maintenance, increased operational performance and improved safety outcomes. This is just one of many real-world applications of how ML can be a game-changer.
In our next post, we will cover the open source software powering the machine and you can get started on building your own software-based, self-playing machine in the cloud.
In the meantime, check out Microsoft AU Developer on Twitter, or sign up for our community newsletter to get access to more content like this. If you want to chat further, please feel free to reach out in the comments or on the socialz.