How the “Artificial Intelligence of Things” is Transforming Video Security

By Shawn Guan

26 June 2018

The creation of the Internet is rightfully seen as revolutionary for its wholesale transformation of how we use and share information around the world. Many now look to the Internet of Things (IoT) in a similar way. That's because the IoT, a vast cyber-scape of data from electronic sensors and other machine-generated sources, is now indexing our world with unprecedented granularity, enabling remarkable new levels of visibility, efficiency, and decision support.

IoT includes everything from environmental gauges and telemetry from industrial machines, to wearable fitness monitors and inventory sensors on grocery shelves. Impressive as these and other examples may be, however, the IoT is best understood less as a revolution in and of itself—and more as a transition toward an even greater revolution that we call the Artificial Intelligence of Things, or AIoT.

Indeed, this article will propose that AIoT, the infusion of AI and machine learning throughout the IoT ecosystem for new levels of automation and performance, is a revolution on a par with the Internet itself or the sequencing of the human genome. In short, AIoT is a paradigm shift in which digital capabilities develop a kind of consciousness and intelligent systems distributed across the IoT become self-learning and self-decisioning. Against this backdrop, the article will examine how AIoT is redefining what's possible for video security.

AI and Machine Learning

In the nearly two decades since the term "Internet of Things" was coined, sensors, actuators and networked intelligence have made their way into every corner of society, from homes and cities to industry, energy exploration and environmental monitoring. It's no surprise, then, that Gartner forecasts IoT growth at some 26 billion units, more than $300 billion in revenue and $1.9 trillion in global economic value by 2020. By the following year, Gartner predicts that new IoT devices will be sold at a rate of one million every hour. As another measure, IDC projects that sensor signals from embedded systems, a major IoT component, will make up 10 percent of the entire digital universe by the end of the decade.

The progress is nonstop, with eye-popping innovation examples from the early days (such as chip-enabled light bulbs for remote activation) quickly becoming eclipsed by more advanced IoT applications (such as modern, smart LED street lights networked together to let municipal managers see, hear, and sense conditions across an entire city).

Especially at scale, IoT-driven efficiency gains of even one percent can have major impacts over time. For example, big data generated by planes could help save $30 billion worth of jet fuel over 15 years, according to a March 2015 report by GE.

However, the vast majority of IoT applications remain focused on gathering data and decision support. What if the vast IoT network could be leveraged for more than that? What if advances in machine learning and AI could be overlaid onto the IoT for distributed systems that become more predictive, self-learning and even self-decisioning? That's the definition of AIoT, and it's already happening to some extent today.

For example, the global engineering giant Siemens employs an intelligent sensor network embedded in locomotive engines that can predict a part failure before it happens. Machine learning helps filter out false positives and give a clear prediction of actual part failures. All of this happens in real time and at scale (the sensor data from just one fleet of trains can fill 100 billion lines of code). Reliability is such that the system allows one rail line between Barcelona and Madrid to offer full refunds to any traveler delayed more than 15 minutes.

Sharp has also developed an AIoT-augmented kitchen that talks and consults with users about preferred cooking methods, and educates itself by learning a family's cooking routines and food preferences. Neither of these examples would be possible with AI alone or with IoT alone. It's when the intelligence and self-learning of AI are combined with the connectivity and sensory power of IoT that the transformational capabilities of AIoT emerge. These and other advances in AIoT are tailor-made for some of the most daunting challenges faced by the security industry.

The Challenge of Scale

More than anything, the security video sector is suffering from a crisis of scale, as an avalanche of content outpaces the human ability to monitor all the data. Unfortunately, the gold standard of one screen to one person is a fantasy for most cost-conscious security centers. And in fast-moving environments like casinos or nightclubs, a person can reach cognitive overload at just five screens. Set against those human limitations is the explosive growth of data: Today, billions of hours of security video are recorded each day. The stark reality is that much of that footage is essentially ignored until something, such as a disaster or accident, occurs.

The IoT has made various "intelligent video systems" possible, but the vast majority of such systems aren't intelligent enough to contend with the deluge of data or proactive enough to make a difference. Visual recognition, for instance, is a constant struggle. Part of the problem involves the complex and data-rich nature of security video. To get a sense of this, just convert the amount of information a human processes visually into digital terms. About 30 percent of cortex neurons in the brain are devoted to visual processing (compared with eight percent for touch and just three percent for hearing), according to Discover Magazine. When you consider how any security video system must approximate this level of performance, you begin to understand the challenge.

Particularly troublesome are the false alarms that might be triggered by something as simple as blowing wind or a camera tremor. Unfortunately, as a decision support tool for operators, these limited IoT-driven systems trigger so many false alarms that the most frequent decision made by the operator is to simply turn the system off.

Thankfully, just as AIoT and machine learning minimized the false alarms for Siemens, AIoT can weed out false alarms and add nuanced understanding to both real-time and historical video security. It's just one of many advantages AIoT can bring to our industry through the ability to learn from experience and constantly improve performance with little to no human intervention.
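To make the false-alarm problem concrete, here is a minimal sketch of one simple way a system might suppress momentary triggers such as wind or camera tremor: it raises an alert only when a detection score stays high for several consecutive frames. This is a plain persistence filter standing in for the learned filtering described above, not a machine learning model, and the function name, threshold and frame count are illustrative assumptions.

```python
from collections import deque


def persistent_alerts(frame_scores, threshold=0.8, min_frames=3):
    """Return indices of frames where the score has stayed at or above
    `threshold` for `min_frames` consecutive frames (hypothetical values)."""
    recent = deque(maxlen=min_frames)  # sliding window of pass/fail flags
    alerts = []
    for index, score in enumerate(frame_scores):
        recent.append(score >= threshold)
        # Alert only when the window is full and every frame in it passed.
        if len(recent) == min_frames and all(recent):
            alerts.append(index)
    return alerts


# A single spike (frame 1) is ignored; a sustained event (frames 3-5) alerts.
scores = [0.1, 0.9, 0.2, 0.85, 0.9, 0.95, 0.3]
print(persistent_alerts(scores))  # [5]
```

A learning system would go further by adjusting what counts as a real event based on operator feedback, but even this simple rule shows why a one-frame flicker need not reach the operator.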

Self-Learning AIoT Systems

Let's take a closer look at AIoT and how it can address some of the video security sector's toughest challenges. Given how humans quickly get overloaded watching multiple screens, AIoT can give a much-needed assist – detecting accurately, and in real time, suspicious activity like unauthorized entry, physical violence, loitering and wall-scaling.

AI derives its power from algorithms and processes that replicate human intelligence, judgment and learning; and perhaps the most powerful AI approach is machine learning. Machine learning techniques can be applied to security video through what's known as a convolutional neural network ("CNN") involving advanced deep learning algorithms that work with learning cameras to handle object detection, image classification, visual tracking and action recognition.

The simplest CNNs are good at answering straightforward yes-or-no questions, such as "Is there a person in the video?" But advanced machine learning can take the analysis even further, identifying everything in the frame and creating probability maps about behaviors. Such probability maps are what power advanced video capabilities like human intrusion alerts, fight detection, object recognition and people-counting.
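The difference between a yes-or-no answer and a probability map can be sketched in a few lines. The example below assumes a model has already produced its outputs; the numbers, thresholds and function names are made up for illustration, and no real CNN is involved.

```python
def person_present(probability, threshold=0.5):
    """Turn a single model score into a yes-or-no answer."""
    return probability >= threshold


def hot_cells(probability_map, threshold=0.7):
    """Return (row, column) positions in a grid of per-region scores
    where the probability of the behavior meets the threshold."""
    return [
        (row_index, col_index)
        for row_index, row in enumerate(probability_map)
        for col_index, probability in enumerate(row)
        if probability >= threshold
    ]


# Hypothetical intrusion probabilities for a frame split into a 3x3 grid.
intrusion_map = [
    [0.05, 0.10, 0.02],
    [0.20, 0.85, 0.90],
    [0.01, 0.30, 0.75],
]

print(person_present(0.92))      # True
print(hot_cells(intrusion_map))  # [(1, 1), (1, 2), (2, 2)]
```

The map tells the operator not just that something is happening but where in the scene it is happening, which is what makes alerts like intrusion detection and people-counting possible.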

Assuming IoT infrastructure, graphics, and cloud capabilities are powerful enough, machine learning can analyze large amounts of visual information to learn from examples, programmed configurations and historical data. This self-learning happens as the computer examines many examples of behavior and builds models to identify those behaviors more accurately and quickly in the future.

As new examples and data come in, the system gets better at recognizing nuances in those behaviors. Those nuances may come in the form of discerning a false alarm from a real threat (a cat scaling the wall instead of a cat burglar, for example). They can also improve established security video capabilities. For instance, consider object detection: Most commercial systems today are reasonably adept at detecting objects at a close range of about 20 feet. But what about at longer distances, say 150 feet away? If one has data from the highest resolution camera available, long-range object detection is made easier by applying machine learning to that data to pick out appearance and movement subtleties.
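The idea of a system that improves as examples arrive can be illustrated with a deliberately tiny model. The sketch below is a nearest-centroid classifier that updates a running average each time it sees a labeled example; it is a stand-in for the deep learning described in this article, not a CNN, and the two features (apparent height in meters and speed in meters per second) and all values are assumptions chosen for the cat-versus-burglar example.

```python
import math


class RunningCentroidClassifier:
    """Keeps one running-average feature vector per label and classifies
    new observations by the nearest average (illustrative only)."""

    def __init__(self):
        self.centroids = {}  # label -> (mean feature vector, example count)

    def learn(self, features, label):
        """Fold one labeled example into the running mean for its label."""
        if label not in self.centroids:
            self.centroids[label] = (list(features), 1)
            return
        mean, count = self.centroids[label]
        count += 1
        mean = [m + (x - m) / count for m, x in zip(mean, features)]
        self.centroids[label] = (mean, count)

    def predict(self, features):
        """Return the label whose running mean is closest to `features`.
        Requires at least one learned example."""
        return min(
            self.centroids,
            key=lambda label: math.dist(self.centroids[label][0], features),
        )


model = RunningCentroidClassifier()
# Hypothetical (height_m, speed_m_per_s) observations of wall-scaling events.
model.learn((0.30, 2.0), "cat")
model.learn((0.25, 2.5), "cat")
model.learn((1.80, 0.8), "person")
model.learn((1.70, 1.0), "person")

print(model.predict((0.35, 2.2)))  # cat
print(model.predict((1.60, 1.2)))  # person
```

Every call to `learn` nudges the model, so each confirmed or dismissed alarm can sharpen the next decision without anyone reprogramming the system.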

As another example, machine learning can apply experience and context to identify specific behaviors with remarkable accuracy. Consider the fight detection function mentioned earlier: It's one thing for a security video system to detect aggressive behavior in an otherwise sedate setting, such as a disruptive person at a church service or seated concert. But what if one is monitoring the mosh pit of a punk rock concert? How can you distinguish the dancing from something truly destructive or dangerous? AIoT and the right machine learning capabilities could make all the difference in such situations.

The Future of AIoT

The previous examples illustrate how AIoT can revolutionize video security performance to operate more proactively at scale. But whatever the AIoT application may be, success relies on several important factors:

Data quantity and quality are crucial. Machine learning models rely on the quantity and quality of the data flowing into them in order to deliver the fastest and most accurate performance. It helps if the system is optimized for compatibility from camera to cloud, so that results aren't degraded by confusing anomalies or differences in stream format, image size, or other parameters.

Processing power is another important ingredient. Consider the example of self-driving cars: Even if the key systems of perception, prediction and motion planning for autonomous driving are well-designed, processing power can make the difference between the car being able to drive itself at five miles per hour versus 50 miles per hour. Similarly, AIoT detection of events in real time and at scale relies on high-speed processing systems that can support that level of performance. Indeed, processing power is often the difference between a proof of concept in a research lab and commercially available systems for use in the real world.

Workforce expertise is a third ingredient to consider; and finding the right people to design and operate AIoT systems may be harder than you think. Some estimates put the number of people with adequate AI skills and training at fewer than 10,000 worldwide, according to startup Element AI Inc. That means top talent will be in high demand.

Connectivity between capabilities, systems and databases will have a transformative effect on what can be achieved in security video systems with AIoT. We mentioned object detection earlier. Imagine that object is a man, and imagine that your system is able to detect that person from far away. Now imagine connecting that image with facial recognition systems, which then could be tied to a missing persons database. You begin to see how AIoT-driven security video might be a life-saving, real-time tool to thwart an abduction.
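As a rough sketch of how such a connection might work, the example below compares a numeric "fingerprint" of a detected face against records in a database and reports a match only when the similarity is high. It assumes a separate recognition system has already turned each face into a list of numbers; the record IDs, vectors and the 0.9 cutoff are hypothetical, and a real deployment would involve far more safeguards.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two equal-length vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query, database, min_similarity=0.9):
    """Return the ID of the most similar record at or above the cutoff,
    or None when nothing in the database is close enough."""
    best_id, best_score = None, min_similarity
    for record_id, embedding in database.items():
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_id, best_score = record_id, score
    return best_id


# Hypothetical missing-persons records keyed by case ID.
missing_persons = {
    "case-001": [1.0, 0.0, 0.0],
    "case-002": [0.0, 1.0, 0.0],
}

print(best_match([0.9, 0.1, 0.0], missing_persons))  # case-001
print(best_match([0.5, 0.5, 0.7], missing_persons))  # None
```

Returning nothing when the best candidate falls below the cutoff matters as much as finding a match, since a false identification carries its own costs.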

The above examples demonstrate that AIoT is more than just one capability or innovation. It is instead the result of numerous innovations and resources that—taken together—are the building blocks for powerful systems that will transform the security video industry. It's also clear that AIoT security video is more than just a cost-saving measure to augment what humans can do. Instead, the technology is advancing our industry and will continue to unlock new capabilities and use cases that the world has yet to imagine.

Shawn Guan is CEO of Umbo Computer Vision.