Beyond the Bounding Box
A technical update regarding the next wave of AI-powered use cases in physical security.
In September, Brent announced our latest round of funding and wrote in detail about “the other side of hard,” discussing our journey to date, thanking those who have joined us to bring our plans to life, and previewing the future-forward technologies that will drive the next wave of AI-powered video analytics.
For my part, I am excited to lead such a sharp team of computer vision experts as we continue to push our solution well beyond the “bounding box” and into the next wave of AI-powered use cases in physical security.
Wave 1 was about getting deep learning-based object detection to work on fixed and mobile video, and then extracting descriptive attributes (men in blue shirts near white trucks), definitive attributes (face, person, and vehicle identities), and select events. This technology has to work across challenging scene conditions, on ever smaller objects, and when objects are only partially visible.
The technology also has to deliver lightning-fast search, accurate tracking, and robust re-identification for situational awareness and investigative use cases. All of this is built into the end-to-end solution our customers use today in our Vintra Prevent and Vintra Investigate products. Although not every video analytics solution on the market has fully embraced Wave 1, we are at a point where even the flourishing field of players offering Wave 1-like solutions will coalesce around these capabilities.
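To make the attribute-search idea concrete, here is a minimal sketch, in Python, of the kind of per-detection metadata a Wave 1 pipeline might emit and how a query such as “people in blue shirts” could filter it. The record structure, field names, and search function are illustrative assumptions on my part, not Vintra’s actual schema or API.

```python
from dataclasses import dataclass, field

# Purely illustrative detection record; field names are assumptions,
# not Vintra's actual metadata schema.
@dataclass
class Detection:
    camera_id: str
    timestamp: float          # seconds since start of recording
    category: str             # "person", "vehicle", ...
    attributes: dict = field(default_factory=dict)  # e.g. {"shirt_color": "blue"}

def search(detections, category, **required_attributes):
    """Return detections of a category whose attributes match every filter."""
    return [
        d for d in detections
        if d.category == category
        and all(d.attributes.get(k) == v for k, v in required_attributes.items())
    ]

# Example: find "people in blue shirts" among a handful of detections.
detections = [
    Detection("lobby-cam-01", 1.0, "person", {"shirt_color": "blue"}),
    Detection("lobby-cam-01", 2.0, "person", {"shirt_color": "red"}),
    Detection("garage-cam-03", 3.0, "vehicle", {"type": "truck", "color": "white"}),
]
print(search(detections, "person", shirt_color="blue"))
```

In a real deployment this kind of query would of course run over an indexed store rather than an in-memory list; making it lightning fast at the scale of thousands of cameras is the hard part.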
Wave 2 is about applying deep reasoning systems to the ever-growing corpus of machine learning-produced metadata captured from video sources. It means delegating more responsibility to the system: making it monitor its own accuracy, learn from its own outputs, and distinguish specific priorities. In short, where Wave 1’s main role is to point out a defined object in a time window, Wave 2 will deliver a system that works alongside human operators to help them make the right decision faster and more simply.
As we dive deeper into Wave 2, there are a few key aspects to consider:
- The ML metadata generation of Wave 1 will remain critical: At Vintra, one of the most exciting parts of our mission is that we prefer tackling large, tough challenges. Thousands of cameras, critical assets to protect, and multiple use cases in play: these are the types of clients that we pursue. And with these tougher challenges comes an ever-present need for our training data and models to be constantly pushed and expanded.
To that end, our ML team at Vintra is constantly improving its model architectures and enriching its training and validation datasets to ensure that the platform performs at state-of-the-art levels. We are also one of the leading proponents of using synthetic data and domain adaptation in the physical security space, and have been for more than three years.
To extract more information from video, Wave 2 will demand flexible model architectures that make it possible to add new detection categories quickly and accurately. From its inception, Vintra anticipated this need and designed its architecture with the ability to create tailored detectors and classifiers.
- Pattern recognition moves from science fiction to real science: In September, Vintra took its own first step into Wave 2 with the launch of Vintra IQ. Imagine trying to identify, across six months of video footage with millions of detections, the people who interact with a particular person of interest (POI). Where do they meet? On which days? At what time? With what objects do they interact? And whom else has that POI met? It is this “pattern of life” discovery that we are enabling, helping our users quickly answer these key questions and paint a more holistic security picture (a minimal sketch of this kind of co-occurrence query follows this list).
- A smarter system will know when and what to show you: As part of Wave 2, the system will control monitor selection to show only the camera feeds you need to focus on at that moment in time. I call this technology “Intelligent Monitoring”: with AI-powered video analytics, there is no need to define rules or simplistic heuristic conditions, because the system will already know, and be able to present to the user, only the things that truly matter.
- Contextual awareness will be key: In Wave 2, AI-powered video analytics solutions will know exactly what each camera is looking at, providing a level of detail and context never seen before. This contextual awareness, knowing the difference between a parking garage, a lobby, and an elevator, for example, will help frame a particular issue more clearly and raise situational awareness to an entirely new level.
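To illustrate the pattern-of-life idea from the “Pattern recognition” item above, here is a minimal sketch of a co-occurrence query: given sightings of identified people at cameras, it finds who appeared at the same camera within a few minutes of a person of interest, and where. The identity labels, the five-minute window, and the data layout are assumptions made purely for illustration; they do not describe how Vintra IQ is actually implemented.

```python
from collections import defaultdict
from datetime import datetime

# Illustrative sighting records: (identity, camera_id, timestamp).
# Labels and layout are assumptions for this sketch only.
sightings = [
    ("poi-17",    "lobby-cam-01",  datetime(2021, 3, 2, 9, 0)),
    ("person-42", "lobby-cam-01",  datetime(2021, 3, 2, 9, 2)),
    ("person-42", "garage-cam-03", datetime(2021, 3, 9, 18, 30)),
    ("poi-17",    "garage-cam-03", datetime(2021, 3, 9, 18, 33)),
    ("person-88", "lobby-cam-01",  datetime(2021, 3, 2, 14, 0)),
]

def co_occurrences(sightings, poi, window_minutes=5):
    """Group other identities seen at the same camera within
    `window_minutes` of a sighting of the person of interest."""
    poi_sightings = [(cam, ts) for ident, cam, ts in sightings if ident == poi]
    meetings = defaultdict(list)
    for ident, cam, ts in sightings:
        if ident == poi:
            continue
        for poi_cam, poi_ts in poi_sightings:
            if cam == poi_cam and abs((ts - poi_ts).total_seconds()) <= window_minutes * 60:
                meetings[ident].append((cam, poi_ts))
    return meetings

for ident, places in co_occurrences(sightings, "poi-17").items():
    print(ident, "met the POI at", places)
```

Over millions of detections, the interesting work is in making this kind of query fast and in deciding which co-occurrences are meaningful rather than coincidental; that is where the deep reasoning of Wave 2 comes in.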
At Vintra, we are excited by the challenge of Wave 2 as we continue to build upon our best-in-class video analytics solution. If you’d like to join us on this journey of securing environments so people can flourish, we’re hiring and would love to talk.
To our customers, we appreciate your support throughout this journey. The training data that is regularly provided by your real-world use cases continues to power our solution to new heights. To my team of incredibly sharp engineers and ML experts, thank you for your dedication to our mission and to our ever-growing customer base — for it is their large, tough challenges that drive our innovation engine every single day.