Much has been written and discussed about the merging of the physical, digital and biological worlds with machines.
Artificial intelligence has become a catch-all description for replicating something inherently human and embedding it in man-made systems that can perhaps function better, or at least more efficiently, than our bodies and brains.
Unquestionably, progress has been made.
And, for better or worse, there is an unstoppable march toward mimicking human characteristics and capabilities in the ones and zeroes, the silicon and substrates that form the vital systems of modern machines.
But, like the four-minute mile or conquering Mount Everest, there are certain inflection points – indeed, limitations bound by the laws of physics – that require a rethinking of computational paradigms.
Not just sight but vision
We have reached such a point in taking the next steps toward making machines that do not just see, but truly sense the environments in which they exist and interact.
Power, performance and predictability are all barriers that must be overcome to achieve reliable, practical, intelligent vision-enabled systems.
Taking inspiration from human vision, we can reshape computer vision and enable a new generation of vision-enhanced products and services.
The challenges are significant, but so too are the rewards to society.
New types of machine vision hold the potential to revolutionize how we move about, work and communicate more safely and efficiently, and even, in an ironic paradox, how our bodies themselves can overcome natural limitations or deficiencies in our own ability to see.
We believe biologically-inspired vision technology could result in:
● Seamless transportation with 100% accident-free autonomous vehicles
● High-speed drones and delivery vehicles operating without risk of collision
● Elimination of danger or risk through smart-device advanced warning systems
● Symbiotic collaboration between humans and robots
● The end of blindness and vision loss
Machine vision: time has stood still
Capturing images through machines is certainly not a new concept, and in fact the first photograph ever taken dates back almost two hundred years.
But, believe it or not, when it comes to how machines view and record, cameras have worked the same way for decades.
That is, all the pixels in an array measure the light they receive at the same time, and then report their measurements to the supporting hardware.
Do this once and you have a stills camera. Repeat it rapidly enough and you have a video camera – an approach that hasn’t changed much since Eadweard Muybridge accidentally created cinema while exploring animal motion in the 1880s.
Selfies for supercomputers
This approach made sense when cameras were mainly used to take pictures of people for people.
Today, computers are fast becoming the world’s largest consumers of images, and yet this is not reflected in the way that images are captured.
Essentially, we’re still building selfie-cams for supercomputers.
The route to machine-friendly imaging has been mapped out for us in a discipline known as neuromorphic engineering, which uses clues derived from the architecture and processing strategies of our brains to build a better, biologically inspired approach to computer vision.
For decades, this endeavour has been an exercise in pure research, but over the past 10 years or so, we and others have been pursuing this approach to build practical vision systems.
Product of evolution
The human vision system gives us a huge evolutionary advantage, at the cost of sustaining a brain powerful enough to interpret the vast amount of data available in the visual scene.
Evolution’s frugal nature led to the emergence of shortcuts to cope with this data deluge.
For example, the photoreceptors in our eyes only report back to the brain when they detect change in some feature of the visual scene, such as its contrast or luminance.
Evolutionarily, it is far more important to be able to concentrate on movement within a scene than to take repeated, indiscriminate inventories of its every detail.
This becomes especially relevant when we are talking about the vast amounts of data being captured and analyzed in certain applications and use models – autonomous cars, for example.
In controlled environments, sophisticated post-processing can deal with this limitation of traditional video imaging.
But this brute-force approach simply won’t work in real-time, in-the-field use cases with limited power, bandwidth and computing resources, such as mobile devices, drones and other kinds of small robots.
Less is more
As with humans, “less is more” is the key to more efficient vision in machines.
Rather than analyze images on a frame-by-frame basis (our eyes certainly do not do this), the new paradigm is based on selectively capturing visual information according to changes in the scene.
This approach optimizes the trade-off between speed and accuracy, and opens up a huge temporal space for computer vision and imaging applications.
By using an event-based model, we can deliver critical benefits that address current limitations of traditional cameras, such as low-latency event detection, low power consumption and lower bandwidth requirements, which means vision can be used effectively in a far wider range of applications and products.
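To make the event-based idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular sensor: the function name `frame_to_events`, the log-intensity comparison and the `threshold` value are all hypothetical choices made for this example. A real event-based sensor would detect change independently at each pixel in hardware rather than by comparing two stored frames; the comparison here only simulates that behavior.

```python
import math


def frame_to_events(prev_frame, new_frame, timestamp, threshold=0.15):
    """Simulate event generation by comparing two grayscale frames.

    Returns a list of (x, y, timestamp, polarity) tuples, one for each
    pixel whose log intensity changed by at least `threshold`.
    Polarity is +1 for a brighter pixel and -1 for a darker one.
    The threshold value is an assumption chosen for illustration.
    """
    events = []
    for y, (prev_row, new_row) in enumerate(zip(prev_frame, new_frame)):
        for x, (old, new) in enumerate(zip(prev_row, new_row)):
            # Adding 1.0 avoids log(0) for completely dark pixels.
            delta = math.log(new + 1.0) - math.log(old + 1.0)
            if abs(delta) >= threshold:
                events.append((x, y, timestamp, 1 if delta > 0 else -1))
    return events


# A static 4x4 scene in which a single pixel becomes brighter.
before = [[100] * 4 for _ in range(4)]
after = [row[:] for row in before]
after[2][1] = 180  # row y=2, column x=1

events = frame_to_events(before, after, timestamp=1)
print(events)  # [(1, 2, 1, 1)]
print(len(events), "event(s) versus", 16, "pixel readings for a full frame")
```

In this toy scene only one of sixteen pixels changes, so a single event is reported where a frame-based camera would report all sixteen values. That gap is the source of the bandwidth and power savings described above, and an unchanged scene produces no data at all.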
As one of our company’s founders likes to say about how we developed our strategy: “We didn’t invent it. We observed it. Humans capture the stuff of interest—spatial and temporal changes—and send that information to the brain very efficiently.”
If the strategy is good enough for humans, it should be good enough for a new generation of bio-inspired vision sensors, and the related artificial intelligence algorithms that underpin computer vision.
It’s now time to bring to market an approach that is similar to the way the human eye operates in order to enable new levels of convenience, safety and quality of life.