Q&A

On AI, biases in data models used in AI, and the future promise of computer vision in safety & security

AI for a Safer Future: Reducing Bias in Machine Learning

An interview with Vintra CTO, Dr. Ariel Amato

Dr. Ariel Amato, Chief Technology Officer with a PhD in Computer Vision & Artificial Intelligence, is leading Vintra’s mission to deliver AI-powered video analytics that transform any pre-recorded or live video into actionable, trusted and tailored intelligence. Dr. Amato has over ten years of video surveillance experience and has won countless awards and accolades. He’s an MIT award winner, and a globally recognized and published expert in computer vision research for video surveillance. Dr. Amato has led pioneering AI work for Formula One and Volkswagen and is Vintra’s visionary when it comes to addressing our customers’ most pressing use cases and risks that threaten human life. 

As a leader in AI, he brings a unique perspective on and deep familiarity with computer vision, video analytics, and data. As a well-respected thought leader in computer vision, he is well positioned to address the many concerns surrounding AI.

Luca Angeli: How are biases introduced into (AI) ML models?

Dr. Ariel Amato: Unlike conventional computer algorithms, supervised Machine Learning (ML) systems use training data to learn patterns, which are later used to perform tasks such as image detection, recognition, and classification. An ML system can be extremely accurate when the training data closely corresponds to the data it will analyze in the real world.

If quantity and quality are not distributed evenly across all the data groups used to train the ML system, it is highly probable that a shortage of training data for a given group will ultimately be reflected in the ML system’s accuracy for that group.
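To make this concrete, here is a minimal sketch (not Vintra’s tooling; the group names and counts are hypothetical) of auditing how training samples are distributed across data groups, so an under-represented group is caught before training:

```python
# Minimal sketch: audit the group balance of a labeled training set.
# Group names and counts below are hypothetical.
from collections import Counter

def audit_group_balance(group_labels, tolerance=0.10):
    """Print each group's share of the data and flag groups whose share
    falls more than `tolerance` below a uniform split."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    uniform_share = 1.0 / len(counts)
    for group, count in sorted(counts.items()):
        share = count / total
        flag = "UNDER-REPRESENTED" if share < uniform_share - tolerance else "ok"
        print(f"{group:>10}: {count:6d} samples ({share:5.1%})  {flag}")

# Hypothetical video-analytics training set: one capture condition is rare.
labels = ["daylight"] * 9000 + ["low_light"] * 8500 + ["infrared"] * 900
audit_group_balance(labels)
```

A group whose share falls far below a uniform split is exactly the kind of gap that tends to reappear later as lower accuracy for that group.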

Luca Angeli: What is a generally accepted definition of Bias in ML?

Dr. Ariel Amato: There is not a single definition of bias used by the research community, but for the sake of simplicity let’s agree that bias occurs when the accuracy of an ML model differs across the disparate data groups it is used to analyze. Furthermore, those differences are statistically significant, meaning that they are consistent and repeatable when the ML system is tested on a large validation set in which different conditions are represented.
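As a minimal sketch of that working definition (the validation counts are hypothetical, not a real benchmark), one can compute per-group accuracy and check whether the gap is statistically significant with a two-proportion z-test:

```python
# Minimal sketch: compare accuracy between two data groups and test whether
# the difference is statistically significant. Counts are hypothetical.
import math

def two_proportion_z_test(correct_a, total_a, correct_b, total_b):
    """Return (accuracy gap, z statistic, two-sided p-value) for the
    difference between two per-group accuracies treated as proportions."""
    p_a, p_b = correct_a / total_a, correct_b / total_b
    pooled = (correct_a + correct_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return p_a - p_b, z, p_value

# Hypothetical per-group results on a validation set.
gap, z, p = two_proportion_z_test(correct_a=4700, total_a=5000,   # group A: 94.0%
                                  correct_b=4450, total_b=5000)   # group B: 89.0%
print(f"accuracy gap: {gap:.1%}, z = {z:.2f}, p = {p:.3g}")
```

A gap that remains this consistent across a large, varied validation set is what the definition above calls bias.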

Luca Angeli: What are your chief concerns with bias in ML?

Dr. Ariel Amato: My chief concern is that we reduce bias as much as possible to minimize unnecessary negative impacts on the people affected by the technology. As we go on this journey, my second concern is, “What happens if the system doesn’t work?” By this I mean, what happens if the system produces false-positive or false-negative rates that differ across data sets? To put this in the terms popular culture often uses when thinking about ML bias, what happens when a face recognition system produces false positives for people of one ethnicity at a higher rate than for another? To counteract this, we make our products and services accountable to people. We design systems that provide appropriate opportunities for feedback, and our ML technologies are always subject to appropriate human direction and control. We also make all of our performance metrics fully available to current and potential customers. Said simply, the results from an ML system are the starting point, not the ending point, for a decision. The third concern is, “What happens if the system does work?”

Again, to make this practical, if the technology does work but contains bias, could it then be deployed to reinforce existing negative societal biases or make the world less safe and secure?  To counteract this, we recommend that organizations using our solutions clearly communicate to their stakeholders what data is being collected and how it is being used so that a roll-out can be done transparently. We also require that users do not weaponize our technology or use it to initiate a criminal justice process without human review.

Luca Angeli: With that in mind, should a biased ML model be used?

Dr. Ariel Amato: The answer to this question depends on the type of use. Imagine, for example, an ML model that was built to detect chairs in images. Over a relatively large and varied test set it performs poorly at detecting wooden chairs, while performing noticeably better on other types of chairs, for example those made of metal or plastic. One could then reasonably guess that the model was trained on a data set that contained fewer wooden chairs than plastic or metal ones. The end result is that the ML model is biased against wooden chairs.

Now, should this specific, hypothetical ML model be used or not? The answer might be yes, if the overall performance is acceptable and the lower accuracy on one category will not damage the final use case.
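As a rough illustration of that judgment call (the thresholds and accuracy numbers are hypothetical, not a Vintra policy), one might gate deployment on both overall accuracy and the worst per-category gap:

```python
# Minimal sketch: decide whether a biased model is acceptable for a use case.
# Thresholds and per-class accuracies are hypothetical.
def acceptable_for_use_case(per_class_accuracy, min_overall=0.90, max_gap=0.10):
    """Accept a model only if its macro-averaged accuracy is high enough and
    the gap between its best and worst categories is tolerable."""
    overall = sum(per_class_accuracy.values()) / len(per_class_accuracy)
    gap = max(per_class_accuracy.values()) - min(per_class_accuracy.values())
    return overall >= min_overall and gap <= max_gap

# Hypothetical chair detector from the example above.
chair_detector = {"metal": 0.96, "plastic": 0.95, "wood": 0.78}
print(acceptable_for_use_case(chair_detector))  # False: the wood-chair gap is too large
```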

However, when we use technology that directly affects people, the answer is not so simple. The answer should take the form of another question: Are we propagating the bias inherent in the ML model to its final use case? If yes, what are the potential implications of the bias?  

An example of this potential problem can be found in face recognition systems, where much has been written about the bias of some models yielding different accuracy depending on gender, skin color, or ethnicity. It used to be the case that the propagation of the bias to the final use case was unavoidable and could lead to negative, unwanted outcomes. For this reason, Vintra has a group of researchers devoted solely to understanding and mitigating the effect of bias in face recognition and re-identification systems.

Luca Angeli: How has Vintra worked to address biases in its data models used for facial recognition?

Dr. Ariel Amato: We have built our own data set, pulling from over 76 countries and more than 20k identities, to accurately represent Caucasian, African, Asian and Indian ethnicities and identities. Our use of the term “ethnicity” is intended to summarize the general appearance (skin color, eye shape, and other easily noticed characteristics) of individuals rather than to make a scientifically valid categorization by DNA or other such means. This work has resulted in a much fairer balance, with each group representing roughly 25% of the total population data, which provides a truer picture of what our world really looks like. With the goal of keeping the accuracy of our facial recognition results in the top 10% of solutions globally, the team set out to reduce the bias gap: the difference in accuracy between correctly identifying white faces and all other non-white identities. The results? We have cut the bias gap in half compared to well-known methods publicly available in the literature, based on the well-accepted Racial Faces in the Wild (RFW) validation dataset.

See below for our current results and those of other well-known face recognition systems:

*Bias is defined as the maximum accuracy delta between two classes. Vintra Face 2.0 results as of January 1, 2020.
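For readers who prefer code, the metric in the footnote reduces to a few lines; the per-group accuracies below are hypothetical placeholders, not Vintra’s or RFW’s published numbers:

```python
# Minimal sketch of the bias metric above: the maximum accuracy delta
# between any two demographic groups. Values are hypothetical.
def bias_gap(per_group_accuracy):
    """Maximum accuracy delta between the best- and worst-performing groups."""
    values = per_group_accuracy.values()
    return max(values) - min(values)

# Hypothetical per-group verification accuracies on an RFW-style split.
accuracies = {"Caucasian": 0.97, "African": 0.94, "Asian": 0.95, "Indian": 0.95}
print(f"bias gap: {bias_gap(accuracies):.1%}")  # 3.0%
```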

Luca Angeli: What is the future promise of computer vision in the safety and security industries?

Dr. Ariel Amato: The future is going to involve a few concepts becoming reality. First, we’ll be able to search just about any type of video and any type of camera feed, with low latency, for any object, event, or scene characteristic. This will give investigators and analysts force-multiplying power to solve crimes and resolve security incidents more quickly and accurately. It will also enable them to respond intelligently to an event in progress in a way that is simply not possible today. Second, and more importantly, we’ll be able to prevent certain events that threaten human life and safety from happening or escalating. To deliver on this, we’ll have to teach systems to build a holistic view of the scene by understanding relationships between objects and their actions, and by combining information from the past with the present to predict the future. There’s a lot of work to do, and we feel very fortunate to be pushing the world forward to a safer and more secure future.

What's your use case? The Analytics Foundry was made for you.