Towards Reliable Truth-Aligned Uncertainty Estimation in Large Language Models
arXiv AIArchived Apr 02, 2026✓ Full text saved
arXiv:2604.00445v1 Announce Type: new Abstract: Uncertainty estimation (UE) aims to detect hallucinated outputs of large language models (LLMs) to improve their reliability. However, UE metrics often exhibit unstable performance across configurations, which significantly limits their applicability. In this work, we formalise this phenomenon as proxy failure, since most UE metrics originate from model behaviour, rather than being explicitly grounded in the factual correctness of LLM outputs. With
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Artificial Intelligence
[Submitted on 1 Apr 2026]
Towards Reliable Truth-Aligned Uncertainty Estimation in Large Language Models
Ponhvoan Srey, Quang Minh Nguyen, Xiaobao Wu, Anh Tuan Luu
Uncertainty estimation (UE) aims to detect hallucinated outputs of large language models (LLMs) to improve their reliability. However, UE metrics often exhibit unstable performance across configurations, which significantly limits their applicability. In this work, we formalise this phenomenon as proxy failure, since most UE metrics originate from model behaviour, rather than being explicitly grounded in the factual correctness of LLM outputs. With this, we show that UE metrics become non-discriminative precisely in low-information regimes. To alleviate this, we propose Truth AnChoring (TAC), a post-hoc calibration method to remedy UE metrics, by mapping the raw scores to truth-aligned scores. Even with noisy and few-shot supervision, our TAC can support the learning of well-calibrated uncertainty estimates, and presents a practical calibration protocol. Our findings highlight the limitations of treating heuristic UE metrics as direct indicators of truth uncertainty, and position our TAC as a necessary step toward more reliable uncertainty estimation for LLMs. The code repository is available at this https URL.
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as: arXiv:2604.00445 [cs.AI]
(or arXiv:2604.00445v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2604.00445
Focus to learn more
Submission history
From: Ponhvoan Srey [view email]
[v1] Wed, 1 Apr 2026 03:42:16 UTC (6,423 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.AI
< prev | next >
new | recent | 2026-04
Change to browse by:
cs
cs.CL
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)