The Bias Risk in Health Index: Humans vs. Ronin AI

If artificial intelligence (AI) is to reach its potential in the transformer industry, it is imperative to minimize the risk of bias in decision making, particularly for asset management. A blog post from McKinsey suggests that one way to tackle the risk of bias is to invest in transparent, multidisciplinary research over larger data sets in multi-dimensional scenarios.

To address this challenge, Project: Human and Ronin was launched in 2022 to study the level of bias and risk in decision making by AI and by humans for transformer health indexing.

Read on for some of the critical, data-driven insights from this project.

Bias risk in the transformer industry

As transformers are the backbone of the electricity supply chain, it is imperative to follow comprehensive condition assessment practices to ensure sustainable and reliable asset performance. The need for such comprehensive assessment stems from customer requirements. Utilities, for instance, aim for highly efficient transformers, whereas industrial customers want to avoid production downtime and therefore look for asset reliability. Whatever the context, an asset manager is looking for a decision-support tool that is quick, reliable, and easy to use and manage.

Transformer health indexing is a popular tool for making asset decisions such as repair, refurbishment, and even replacement. It transforms qualitative data into a composite quantitative indicator that allows comprehensive asset management. A particular advantage for fleet management is the ability to rank assets on a high-priority, high-risk matrix. On the other hand, a lack of modelling transparency and a poorly balanced bias-variance trade-off can often lead to overfitting.
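To make the idea of a composite indicator concrete, here is a minimal sketch of one common way to build it: a weighted average of per-parameter condition scores. The parameter names, the 0 to 100 score scale, the weights, and the choice to skip missing scores are all assumptions for illustration; this is not how Ronin AI or any particular health index is computed.

```python
from typing import Dict, Optional


def composite_index(scores: Dict[str, Optional[float]],
                    weights: Dict[str, float]) -> float:
    """Weighted average of the available per-parameter scores.

    Parameters whose score is None (missing data) are skipped and the
    remaining weights are renormalized. This is an illustrative
    assumption, not a documented industry rule.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for name, weight in weights.items():
        score = scores.get(name)
        if score is None:
            continue  # no measurement for this parameter
        weighted_sum += weight * score
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no scored parameters available")
    return weighted_sum / weight_total


# Hypothetical scores and weights, made up for this example.
example_scores = {"dissolved_gas": 80.0, "water": 60.0, "age": None}
example_weights = {"dissolved_gas": 3.0, "water": 1.0, "age": 2.0}
example_index = composite_index(example_scores, example_weights)
```

Because the weights are chosen by whoever designs the index, they are one of the places where bias and a lack of transparency can enter the result.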

Project: Human and Ronin

This project is an open call to all asset managers and industry advisors to identify, evaluate, and mitigate the risk of bias in predicting the health index for predictive maintenance and asset management. It aims to explore the overlap in transformer-oriented decision making between human experts and Ronin AI.

For this purpose, global experts were invited to inspect three specific transformers using typical condition monitoring data such as:

- gas-in-oil concentration (in ppm) of hydrogen, methane, ethane, ethylene, acetylene, carbon monoxide and carbon dioxide
- water content (in ppm)
- acidity (in mg KOH/g)
- breakdown voltage (in kV)
- interfacial tension (in mN/m)
- tan delta (absolute value)
- 2-furfuraldehyde (in ppm)
- actual age of transformer (in years)
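The parameters listed above can be pictured as one record per transformer. The sketch below is only an illustration: the class name, field names, and the completeness helper are hypothetical, and the sample values are made up. A value of None stands for a measurement that was not available.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class OilSample:
    """Condition data for one transformer; None marks a missing value."""
    hydrogen_ppm: Optional[float] = None
    methane_ppm: Optional[float] = None
    ethane_ppm: Optional[float] = None
    ethylene_ppm: Optional[float] = None
    acetylene_ppm: Optional[float] = None
    carbon_monoxide_ppm: Optional[float] = None
    carbon_dioxide_ppm: Optional[float] = None
    water_ppm: Optional[float] = None
    acidity_mg_koh_per_g: Optional[float] = None
    breakdown_voltage_kv: Optional[float] = None
    interfacial_tension_mn_per_m: Optional[float] = None
    tan_delta: Optional[float] = None
    furfuraldehyde_ppm: Optional[float] = None
    age_years: Optional[float] = None


def missing_fields(sample: OilSample) -> List[str]:
    """Names of the parameters that have no measurement."""
    return [f.name for f in fields(sample) if getattr(sample, f.name) is None]


def completeness(sample: OilSample) -> float:
    """Fraction of parameters that were actually measured (0.0 to 1.0)."""
    total = len(fields(sample))
    return (total - len(missing_fields(sample))) / total


# Made-up values, for illustration only.
sparse_sample = OilSample(hydrogen_ppm=50.0, water_ppm=20.0, age_years=30.0)
```

Tracking completeness explicitly makes it easy to see how sparse a record is before anyone, human or AI, is asked to judge the asset.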

It was interesting to observe how the human experts judged the state of each transformer, in particular how critical the asset was in terms of normal ageing versus end-of-life. Some of the most striking insights from this research are now open to all.

Seetalabs’ Ronin AI supports reliability decisions on large transformer fleets by using transformer oil condition data to rank assets with a higher likelihood of failure and a reactive timeline. The Ronin health index rating ranges from 0% (Very poor) to 100% (Very good), and comes with additional comments and remarks for justification as per CIGRE recommendations.
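As a rough illustration of fleet ranking on a 0% to 100% scale, the sketch below sorts assets so that the lowest health index comes first. The function name and the fleet values are hypothetical, and sorting on the index alone is a simplifying assumption; it does not reproduce how Ronin AI ranks assets.

```python
from typing import Dict, List, Tuple


def rank_by_health_index(fleet: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (asset, health index) pairs, worst condition first.

    Health index values are expected on the 0 (Very poor) to
    100 (Very good) scale described in the text.
    """
    for asset, index in fleet.items():
        if not 0.0 <= index <= 100.0:
            raise ValueError(f"{asset}: health index {index} is outside 0-100")
    return sorted(fleet.items(), key=lambda item: item[1])


# Made-up fleet, for illustration only.
example_fleet = {"T1": 82.0, "T2": 35.5, "T3": 61.0}
example_ranking = rank_by_health_index(example_fleet)
```

In this made-up fleet, T2 would be reviewed first because it has the lowest index.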

Accuracy and Error

Ideally, a transformer health index requires all information about the asset, including its auxiliary components. This would mean conducting almost all prescribed thermal, electrical, mechanical, and chemical tests on the asset. In a real-world scenario, this data can be sparse and often faulty, leading to errors in judgement. The primary set of parameters to monitor has remained largely unchanged since its inception, making it imperative for the industry to embrace new paradigms. This is where artificial intelligence (AI) finds its scope: improving decision-support systems.

On average, Ronin AI outperforms humans in terms of accuracy (80% versus 64%), particularly when a lot of data is missing across a large dataset. The average error in human judgement, on the other hand, increases as data quality gets poorer (from 25% to 50%). This means that, with proper training, AI can handle missing data and provide more reliable output than humans.
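The post does not spell out how accuracy and error were scored, so the following is only one plausible way to compute such figures, under stated assumptions: each expert gives a health index estimate, error is the mean absolute difference from a reference value, and accuracy is the share of estimates within a chosen tolerance of that reference. The function names, the tolerance, and the numbers are all hypothetical.

```python
from typing import Sequence


def mean_absolute_error(estimates: Sequence[float], reference: float) -> float:
    """Average distance between the estimates and the reference value."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    return sum(abs(e - reference) for e in estimates) / len(estimates)


def accuracy_within(estimates: Sequence[float], reference: float,
                    tolerance: float) -> float:
    """Share of estimates that fall within +/- tolerance of the reference."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    hits = sum(1 for e in estimates if abs(e - reference) <= tolerance)
    return hits / len(estimates)


# Made-up estimates for a single asset, for illustration only.
example_estimates = [40.0, 55.0, 70.0, 45.0]
example_error = mean_absolute_error(example_estimates, reference=50.0)
example_accuracy = accuracy_within(example_estimates, reference=50.0,
                                   tolerance=10.0)
```

Computing the same two measures separately for complete and sparse records would show whether error grows as data quality drops.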

Asset-wise insights

We found some interesting insights on asset-wise accuracy.

- Asset 1 showed that humans consistently achieved high accuracy (average 77%) with moderate error rates (average 25%).
- Asset 2 showed mixed results for humans, with varying accuracy (65%) and error rates (35%).
- Asset 3 showed different performance patterns for humans and AI. While humans performed better at identifying issues with the asset (average 49%), AI’s performance varied due to its ability to compensate for missing data.

Another interesting aspect was the bias in decision making due to the job function of the human expert, their operational environment, years of experience, and even gender!

P.S.: women were more accurate than men (80% versus 59%) and were more observant of the overall data integrity.

Variability and Error discrepancies

It’s important to note that there is significant variability in both human and AI performance across different assets. This suggests that the difficulty of the task or the nature of the asset may affect performance. It was even more interesting to observe the problem of “masking of fault condition” by AI because of its unilateral thinking. Humans, for instance, were very quick to observe multiple issues with Asset 3 given its poor condition and age.

Depending on the circumstances, the survey yielded some further insights:

- Humans outperform AI if they have good and complete data available or the transformer state is more apparent.
- AI tends to make fewer mistakes if there are recognizable patterns in datasets or when the differences are subtle.
- Utilities focus more on data quality and monitoring technique.
- Finding correct oil temperature data and the extent of decarbonization is of more interest to OEMs.
- Chemists have a tendency to check for data integrity before proceeding to diagnosis.

Mitigating the risk of bias in health index

At Seetalabs, we understand the intricacies of changing industry landscapes. We believe that the synergy between power transformers, artificial intelligence, sustainability, and economic viability is vital. Our ultimate goal is clear – to strike the perfect balance between innovation, sustainability, and reliability.

However, when faced with identical AI inputs, humans tend to make entirely different choices based on their own decision-making styles. Such bias can have far-reaching implications on asset management strategies such as health indexing.

This research made it evident that bias risk exists in transformer health indexing. While the scale of this pilot test was small, it was enough to motivate us to invest heavily in AI bias research by gathering more data, exploring multiple dimensions, and being more inclusive.