Benchmarking Adversarial Robustness of AI Medical Assistants in Emergency Triage Scenarios

Authors

  • Otis Reed, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, USA.
  • Nikhil Ganguly, Department of Computer Science, University of North Texas, Denton, TX, USA.
  • Landon R. Gustafsson, Department of Computer Science, University of Houston, Houston, TX, USA.

Keywords

Adversarial robustness, AI medical assistants, emergency triage, benchmarking, system architecture, fairness, governance, socio-technical systems

Abstract

Artificial intelligence is increasingly deployed in emergency medicine, particularly for triage decision support, which raises critical questions about system resilience under adversarial manipulation. AI medical assistants promise to reduce diagnostic delays and improve resource allocation, but their reliance on deep learning models leaves them vulnerable to crafted input perturbations that can alter clinical recommendations. This paper proposes a benchmarking framework for evaluating the adversarial robustness of AI medical assistants in emergency triage scenarios. We examine the structural trade-offs in current system architectures: the tension between model accuracy and robustness, the heterogeneity of input modalities, and the deployment constraints of real-time clinical environments. The analysis extends beyond technical metrics to governance challenges, fairness implications, and policy requirements for trustworthy deployment. Drawing on adversarial machine learning, human factors engineering, and socio-technical systems theory, we identify critical failure modes that fall outside conventional evaluation practices. The framework incorporates multi-level stress testing, adaptive attack simulations, and measures that preserve clinical utility. We argue that robustness cannot be separated from operational context and that standards must evolve to account for adversarial dynamics in triage workflows. This work contributes a systems-oriented perspective to the literature on adversarial machine learning in healthcare and offers guidance for researchers, regulators, and clinical administrators seeking to integrate AI into emergency care responsibly.

Published

2026-05-15

How to Cite

Otis Reed, Nikhil Ganguly, & Landon R. Gustafsson. (2026). Benchmarking Adversarial Robustness of AI Medical Assistants in Emergency Triage Scenarios. International Journal of Artificial Intelligence Research, 1(2). Retrieved from https://isipress.org/index.php/IJAIR/article/view/191