Experiment Finds LLM Bias in Drupal Core Contribution Credit Attribution

Why LLMs Fail Drupal’s Values on Contribution Credit

In a continuation of his earlier reflections on Claude Code and AI-assisted development, Theodore Biadala has published an experiment testing whether large language models can automate credit assignment in Drupal Core issues. The results point to consistent bias against non-code contributions, even when Drupal’s official guidelines are embedded directly into prompts.

Drupal’s credit system formally recognises discussion, governance, documentation, and coordination alongside code contributions, as outlined in the Drupal Core credit guidelines. Credit data dating back to 2003 is publicly available under Drupal.org’s content license, providing a substantial dataset for evaluation.

Biadala analysed more than 10,000 resolved core issues and fine-tuned several models using 4,000 recent examples. He then evaluated 400 issues across three categories—generous, balanced, and selective—measuring precision, recall, and F1 scores. While precision reached as high as 93.0% in some models, recall remained significantly weaker, especially for selective issues involving large contributor pools.

Across both commercial and open-weight models, a recurring pattern emerged: contributors who authored code were more likely to be credited than those who participated in triage, documentation, governance, or discussion. This persisted even after prompt engineering and fine-tuning efforts.

The experiment builds on Biadala’s earlier post about his month-long trial of Claude Code, previously summarised by The DropTimes, in which he examined both the acceleration and cognitive risks of AI-assisted workflows. In this latest iteration, the concern shifts from productivity to alignment with Drupal’s community values.

The relevant core issue that originally introduced structured credit tracking remains part of Drupal’s governance history (#1149078). Biadala argues that automated credit suggestions risk narrowing that broader definition of contribution to something more code-centric.

If successfully implemented we would spend time evaluating the LLM output, not thinking about people that helped improve Drupal.

Theodore Biadala

Although the technical experiment continues with larger models still being trained and evaluated, the conclusion is cautious. AI tools can assist with drafting, summarisation, and pattern extraction, but credit attribution remains rooted in human judgment. The broader question raised is not whether automation is possible, but whether it is desirable in a system designed to reflect community recognition rather than mere activity.

Note: The vision of this web portal is to help promote news and stories around the Drupal community and promote and celebrate the people and organizations in the community. We strive to create and distribute our content based on these content policy. If you see any omission/variation on this please reach out to us at #thedroptimes channel on Drupal Slack and we will try to address the issue as best we can.

Related Organizations

Upcoming Events