Mashgin: The Future of Computer Vision

About a year ago I invested in and joined a startup called Mashgin. In this post I want to talk a little about what we’re working on.

Mashgin is building a self-checkout kiosk that uses cameras to recognize multiple items at once without needing barcodes.

The current version of the kiosk is designed for cafeterias, where customers can slide their tray up and everything on it is recognized instantly. Excluding payment, the process takes around 2 seconds. No more waiting in a line held up by a single price check!

But retail checkout is just a package around Mashgin’s core technology. We believe there is an opportunity to apply recent technical advancements to many fields. Advancements such as:

  • Smartphone dividends — cheap sensors and ubiquitous, miniaturized electronic components
  • Cheap parallel processing power including low-cost GPUs
  • An explosion in collaborative, open-source software tools
  • Machine learning methods, in particular convolutional neural networks (a byproduct of the two preceding trends)
  • Cheap cloud infrastructure

Chris Dixon talks more about some of these trends in his post “What’s Next in Computing?”

So how is Mashgin applying this technology?

Adaptive Visual Automation

Face swap: billionaire edition

Computer vision uses software to transform images into usable data (descriptions). If cameras are the “eyes” of a machine, computer vision is the brain’s visual cortex, processing and making sense of what it sees.
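To make "images into usable data" concrete, here is a minimal sketch in Python. It assumes a tiny black-and-white image stored as a grid of 0s and 1s, and it counts the separate blobs of 1s as if they were items on a tray. The names count_blobs and tray are made up for illustration, and this is a classic textbook technique (connected-component counting), not a description of how Mashgin's software works.

```python
def count_blobs(image):
    """Count groups of touching 1-pixels (up/down/left/right) in a binary grid."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1 and not seen[r][c]:
                # Found a pixel of a blob we have not visited yet.
                blobs += 1
                seen[r][c] = True
                stack = [(r, c)]
                # Flood fill: mark every pixel connected to this one.
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return blobs


# Hypothetical overhead view of a tray: 1 = part of an item, 0 = empty tray.
tray = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
item_count = count_blobs(tray)
print(item_count)  # 3
```

The input is raw pixels and the output is a single number a checkout system could act on. Real images are far messier, which is where the machine learning discussed below comes in.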

When computers know what they’re looking at, it opens up a world of potential. You can see it in existing use cases from facial recognition in Facebook photos (…or face swap apps) to Google Image Search and OCR. Newer, much more sophisticated applications include driverless cars, autonomous drones, and augmented reality.

Gradient Descent
A visual example of using gradient descent (the reverse of hill climbing in a fitness landscape) as part of the learning process of a neural network
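For readers who prefer code to pictures, the idea in the caption above can be sketched in a few lines of Python. This is a toy illustration under simple assumptions: a one-dimensional "landscape" with a single valley, and hypothetical names (gradient_descent, loss, loss_gradient) chosen for this example. Training a real neural network applies the same downhill-stepping idea to many parameters at once.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly take a small step in the downhill direction (negative gradient)."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Toy landscape: loss(x) = (x - 3)^2, whose lowest point is at x = 3.
def loss(x):
    return (x - 3.0) ** 2


# Slope of the landscape at x; its sign tells us which way is uphill.
def loss_gradient(x):
    return 2.0 * (x - 3.0)


minimum = gradient_descent(loss_gradient, start=0.0)
print(round(minimum, 4))  # 3.0
```

Starting from x = 0, each step moves a little closer to the bottom of the valley, which is the "reverse of hill climbing" the caption describes.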

These recent applications tend to be more complex, and as a result use machine learning in addition to traditional image processing methods. Machine learning, and in particular deep learning through neural networks, has changed the game in many areas of computer science, and we are just beginning to see its potential. ML can distill a large amount of data into a single algorithm. As the name implies, it can learn and adapt to new information over time with little or no “teaching” from engineers.

Both CV and ML can be applied to many fields, but one of the biggest immediate needs is in automation. There is a surprising number of visual tasks that are simple for humans and ripe for automation. These include industrial use cases in manufacturing and distribution, and consumer use cases in household robotics and relief of everyday bottlenecks.

I call the above combination adaptive visual automation: using machine learning to automate vision-based tasks. Although relatively new, this combination covers a large and quickly growing class of real-world problems. Autonomous cars (and especially trucks) are a good up-and-coming example that will have huge ramifications.

Mashgin’s future

Mashgin uses adaptive visual automation to improve the speed, accuracy and cost of applications in recognition, measurement, and counting in a closed environment. That was a bit of a mouthful, so here’s the short version: Mashgin wants to make visual automation intelligent.

There’s a broader category of AI vision companies whose purpose is giving computers the ability to understand what they see. Mashgin is a subset of this group, focusing on automating well-defined real-world problems.

There are further subsets, such as eliminating bottlenecks in everyday circumstances; speeding up checkout lines is one example. In many of the activities you do on a daily basis, intelligent automation can save a huge amount of time and money.

Retail checkout is a big market (even for just cafeterias), but it only scratches the surface of the value Mashgin will eventually be able to deliver. We have already established a foundation for applying recent advancements to these problems, and it will only get better from here.