Improving Domain-Specific Text Understanding with Large Language Models via Hybrid Fine-Tuning Strategies
DOI: https://doi.org/10.66280/ijair.v1i1.4

Keywords: large language models; domain adaptation; continued pretraining; instruction tuning; parameter-efficient fine-tuning.

Abstract
Large language models (LLMs) often show strong general capabilities but can underperform on domain-specific text understanding when terminology, style, and label definitions differ from general web text. Fine-tuning is a natural remedy, yet a single strategy rarely satisfies all practical constraints: full fine-tuning is expensive and brittle, lightweight parameter-efficient tuning may underfit, and retrieval-only methods depend heavily on index coverage.
This paper presents a hybrid fine-tuning framework for domain-specific text understanding that combines (i) continued pretraining on domain corpora, (ii) parameter-efficient instruction tuning, and (iii) task-specific calibration and evaluation. We describe a training recipe that is modular, reproducible, and designed for realistic constraints such as limited labeled data and strict compute budgets.
We provide ablations that isolate the contributions of each component and a set of analysis tools for diagnosing failures related to terminology shift, long-context evidence, and label ambiguity.
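To make the three-stage recipe concrete, the Python sketch below shows one way such a pipeline might be assembled with the Hugging Face transformers and peft libraries. The base checkpoint, toy in-line data, and hyperparameters are illustrative assumptions, not the configuration used in the paper.

# Minimal sketch of the three-stage hybrid recipe described above, using the
# Hugging Face `transformers` and `peft` libraries. The base checkpoint, toy
# in-line data, and hyperparameters are illustrative assumptions only.
from torch.utils.data import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base = "gpt2"  # placeholder checkpoint; any causal LM can be substituted
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

class LMTextDataset(Dataset):
    """Wraps raw strings as causal-LM examples (labels = input ids)."""
    def __init__(self, texts, max_len=128):
        self.enc = tok(texts, truncation=True, padding="max_length",
                       max_length=max_len, return_tensors="pt")
    def __len__(self):
        return self.enc["input_ids"].size(0)
    def __getitem__(self, i):
        ids = self.enc["input_ids"][i]
        return {"input_ids": ids,
                "attention_mask": self.enc["attention_mask"][i],
                "labels": ids.clone()}

# Stage 1: continued pretraining on unlabeled domain text (toy corpus here).
domain_corpus = ["An in-domain passage with specialised terminology.",
                 "Another unlabeled document from the target domain."]
Trainer(model=model,
        args=TrainingArguments("cpt_out", num_train_epochs=1,
                               per_device_train_batch_size=2, report_to=[]),
        train_dataset=LMTextDataset(domain_corpus)).train()

# Stage 2: parameter-efficient instruction tuning with LoRA adapters on labeled prompts.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                                         task_type="CAUSAL_LM"))
instructions = ["Instruction: classify the document.\nInput: an in-domain passage.\nAnswer: label_A"]
Trainer(model=model,
        args=TrainingArguments("sft_out", num_train_epochs=1, learning_rate=2e-4,
                               per_device_train_batch_size=2, report_to=[]),
        train_dataset=LMTextDataset(instructions)).train()

# Stage 3: task-specific calibration, e.g. temperature scaling of label logits
# fitted on a held-out validation set (omitted here for brevity).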
License
Copyright (c) 2026 International Journal of Artificial Intelligence Research

This article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.