arXiv:2603.27338v1 Announce Type: new Abstract: Recent advancements in language model technology have significantly enhanced the ability to edit factual information. Yet, the modification of moral judgments, a crucial aspect of aligning models with human values, has garnered less attention. In this work, we introduce CounterMoral, a benchmark dataset crafted to assess how well current model editing techniques modify moral judgments across diverse ethical frameworks. We apply various editing tech
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Artificial Intelligence
[Submitted on 28 Mar 2026]
CounterMoral: Editing Morals in Language Models
Michael Ripa, Jim Davies
Recent advancements in language model technology have significantly enhanced the ability to edit factual information. Yet, the modification of moral judgments, a crucial aspect of aligning models with human values, has garnered less attention. In this work, we introduce CounterMoral, a benchmark dataset crafted to assess how well current model editing techniques modify moral judgments across diverse ethical frameworks. We apply various editing techniques to multiple language models and evaluate their performance. Our findings contribute to the evaluation of language models designed to be ethical.
Comments: 7 pages (10 + 1 reference + 6 appendix). Honors thesis completed in June 2024, write-up completed in 2025
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.27338 [cs.AI]
(or arXiv:2603.27338v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2603.27338
Focus to learn more
Submission history
From: Michael Ripa [view email]
[v1] Sat, 28 Mar 2026 17:13:30 UTC (192 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.AI
< prev | next >
new | recent | 2026-03
Change to browse by:
cs
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)