Advancing Functional Genomic Interpretation through Large Language Model Empowered Agents Navigating Hierarchical Biological Knowledge Graphs and Regulatory Networks
Keywords:
Functional Genomics, Large Language Models, Multi-Agent Systems, Knowledge Graphs, Regulatory Networks, Socio-Technical Infrastructure, Genomic Governance.
Abstract
Interpreting functional genomic data remains one of the most significant challenges in modern precision medicine and molecular biology, owing to the overwhelming volume of multi-omic data and the complexity of regulatory interdependencies. Traditional computational methods often struggle to bridge the gap between statistical significance and biological meaning. This paper proposes a system-level shift toward deploying Large Language Model (LLM)-empowered agents designed to navigate hierarchical biological knowledge graphs and gene regulatory networks autonomously. By combining the reasoning capabilities of generative AI with the rigid structure of biological ontologies, these agents provide a multi-layered interpretive framework that integrates transcriptomic, proteomic, and epigenetic data. The architectural discussion focuses on the trade-offs between agentic autonomy and the deterministic constraints required for clinical validity. We examine the infrastructure necessary to support massive-scale graph traversal and the socio-technical implications of deploying autonomous agents within biological research ecosystems. We emphasize system robustness, the sustainability of high-compute genomic pipelines, and the ethical governance of AI-driven interpretation. We also address the need for fair data representation to avoid algorithmic bias in genomic medicine. We conclude that while LLM-empowered agents offer transformative potential for accelerating discovery, their deployment requires a rigorous policy framework and a resilient infrastructure to ensure scientific integrity and biosecurity in an increasingly automated research landscape.
License
Copyright (c) 2026 International Journal of Artificial Intelligence Research

This article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



