Implicit vs Explicit Behavioral Theory: Why Regulated AI Cannot Afford Black-Box Assumptions

Every machine-learning system for regulated decisioning encodes behavioral assumptions about the humans it acts on. In most stacks those assumptions are invisible. We argue that in high-stakes regulated contexts, invisibility is no longer an acceptable design choice.

regulated-ai · ebc-framework · explainability · debt-recovery

1. The assumption that nobody names

When a machine-learning system recommends a communication action in an institutional debt recovery context, it does something less visible than its marketing copy suggests: it encodes claims about human behavior. A classifier trained to predict payment probability implicitly assumes that certain features of the debtor — account history, response latency, debt amount, past contact patterns — are predictive of future behavior. When that classifier is then used to select a communication action, it implicitly assumes further that those same features are causally relevant to the effect of the communication. These are empirical claims about how people respond to institutional authority, about how financial obligation shapes social identity, about the conditions under which compliance follows legitimacy rather than coercion.

Nobody states them. Nobody labels them. Nobody signs off on them.

In a conventional training pipeline, those theoretical commitments are absorbed into model weights, encoded in training data, and hidden inside optimization objectives. They cannot be inspected, challenged, or corrected without retraining the model — and when retrained, the new assumptions are equally invisible. In a low-stakes setting — a movie recommender, a product carousel — this opacity is acceptable. The feedback loop is tight, the individual cost of a wrong recommendation is low, and nobody is legally entitled to an explanation.

In a regulated institutional context, that is not the setting.

2. What regulated decisioning actually requires

The EU Artificial Intelligence Act (Regulation (EU) 2024/1689) classifies consumer debt collection systems as high-risk AI applications. The classification carries specific documentation and oversight obligations: the technical basis of decisions must be recorded, risk management measures must be identified and reviewable, human oversight must be effective rather than nominal. The European Banking Authority Guidelines on Internal Governance (EBA/GL/2021/05) require that automated decision-making in credit processes be explainable to supervisors and, on request, to data subjects. GDPR Article 22 creates an individual right not to be subject to decisions based solely on automated processing where those decisions produce legal effects or similarly significant consequences — and where such processing is carried out, the controller must be able to provide meaningful information about the logic involved.

These requirements are not satisfied by post-hoc explanation techniques alone. Post-hoc methods such as LIME (Ribeiro et al., 2016) and SHAP (Lundberg & Lee, 2017) produce feature-importance approximations over a model’s output surface. They answer the question “which inputs contributed to this specific prediction?”. They do not answer the question “which behavioral assumption justified treating this debtor this way?”. Those are different questions. The first is computable over model internals. The second requires a named theory and a traceable chain from that theory to the action.

Rudin (2019) argues forcefully that for high-stakes decisions, the choice is not between black-box models with post-hoc explanations and interpretable models — it is between interpretable models and non-deployment. We extend the argument one layer further: for regulated behavioral decisioning, the choice is between systems whose behavioral assumptions are named and auditable and systems that will fail a serious supervisory review.

3. The Explicit Behavioral Constraint framework

The Explicit Behavioral Constraint (EBC) framework defines a design discipline for incorporating behavioral theory into regulated AI systems as explicit, auditable constraint modules rather than as learned weights. An EBC module is formally specified as a tuple:

M = (C, X, f, V, K)

where:

  - C is the named behavioral commitment: the theoretical claim the module encodes;
  - X is the set of observable features the module reads;
  - f is the operationalization, a scoring function f(X) intended to measure C;
  - V is the module's validation status (e.g. Conceptual vs. empirically tested in the domain);
  - K is the set of constraints the module may impose on the action space.

The central property of an EBC module is not that f(X) correctly measures C — that is an empirical question with its own validation status. The central property is that C, X, f, V, and K are all documented, reviewable, and auditable independently of the rest of the system. A compliance reviewer can examine the behavioral commitment without touching model internals. A data subject can be given a meaningful answer about the logic involved, because the logic is named.
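To make the tuple concrete, here is a minimal sketch of what an EBC module record could look like in code. All names (`EBCModule`, `ValidationStatus`, the `tyler_legitimacy` example and its features, threshold, and constraint strings) are illustrative assumptions, not part of the framework's published specification:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Mapping, Sequence

class ValidationStatus(Enum):
    """V: how far the operationalization f(X) has been empirically tested."""
    CONCEPTUAL = "conceptual"            # theory-derived, untested in the domain
    LAB_VALIDATED = "lab_validated"      # tested outside the deployment domain
    FIELD_VALIDATED = "field_validated"  # tested in the deployment domain

@dataclass(frozen=True)
class EBCModule:
    """One auditable unit M = (C, X, f, V, K)."""
    commitment: str                                    # C: the named behavioral claim
    features: Sequence[str]                            # X: observables that f reads
    score_fn: Callable[[Mapping[str, float]], float]   # f: X -> score for C
    validation_status: ValidationStatus                # V: the honesty label
    constraints: Sequence[str]                         # K: constraints the module can impose

# Hypothetical module instance for illustration only.
tyler_legitimacy = EBCModule(
    commitment="Compliance follows perceived procedural fairness (Tyler 1990)",
    features=["prior_contact_tone", "dispute_history"],
    score_fn=lambda x: min(1.0, x.get("dispute_history", 0.0) * 0.5),
    validation_status=ValidationStatus.CONCEPTUAL,
    constraints=["forbid:escalation_letter"],
)
```

The point of the sketch is structural: each of C, X, f, V, K is a plain field a compliance reviewer can read without touching any model internals.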

4. The consequence for system architecture

When the behavioral theory of a decisioning system is explicit, several architectural consequences follow that are difficult to achieve with implicit approaches.

Modules can be deactivated individually. Each EBC module is a named, bounded unit of behavioral theory. An ablation study — “what happens if we turn off the Tyler legitimacy module for this cohort?” — becomes a single configuration change, not a retraining exercise. This matters when a supervisor asks “are you relying on this assumption or not?”.
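The single-configuration-change claim can be sketched as follows; the registry layout and module names are hypothetical, not prescribed by the framework:

```python
# Hypothetical module registry keyed by module name.
MODULES = {
    "tyler_legitimacy": {"active": True},
    "goffman_face": {"active": True},
}

def ablate(registry: dict, name: str) -> dict:
    """Return a cohort configuration with one named module deactivated.

    This is the whole ablation: a config change, not a retraining exercise.
    """
    cfg = dict(registry)
    cfg[name] = {**cfg[name], "active": False}
    return cfg

# "What happens if we turn off the Tyler legitimacy module for this cohort?"
cohort_cfg = ablate(MODULES, "tyler_legitimacy")
```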

The validation status column is the honest differentiator. Most behavioral-AI pitches conflate “theoretical inspiration” with “empirical validation”. The EBC framework forces the distinction by making V a required field on every module. A module whose V is Conceptual can be used — but only with a documented acknowledgment that the specific operationalization has not been empirically tested in the domain. This is precisely the kind of honesty that a serious regulatory audit rewards.

The constraint trace becomes the audit primitive. A pure decision function decide(state, t) that takes a frozen state as input can write a complete constraint trace T — a record of which modules activated, what scores they computed, which constraints they imposed, and whether those constraints changed the top-ranked action. T becomes the unit of reviewability. A regulator who asks “why did this case receive this particular communication?” receives T as the answer — not a feature-importance plot.
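A minimal sketch of such a pure decision function, assuming modules are passed as (name, score_fn, threshold, constraint) tuples and that the action list arrives pre-ranked from the upstream model; every name below is illustrative:

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence

@dataclass
class TraceEntry:
    module: str                # which EBC module evaluated
    score: float               # what f(X) computed
    constraint: Optional[str]  # constraint imposed, or None if the module did not fire

def decide(state: Mapping[str, float], ranked_actions: Sequence[str],
           modules) -> tuple[str, list]:
    """Pure function: frozen state in, (chosen action, constraint trace T) out."""
    trace: list = []
    forbidden: set = set()
    for name, score_fn, threshold, constraint in modules:
        score = score_fn(state)
        fired = score >= threshold
        trace.append(TraceEntry(name, score, constraint if fired else None))
        if fired:
            forbidden.add(constraint.removeprefix("forbid:"))
    # The trace records every activation; the choice is the top surviving action.
    chosen = next(a for a in ranked_actions if a not in forbidden)
    return chosen, trace

# Hypothetical single-module example.
modules = [("tyler_legitimacy",
            lambda s: s["dispute_history"], 0.5, "forbid:escalation_letter")]
action, T = decide({"dispute_history": 0.8},
                   ["escalation_letter", "soft_reminder"], modules)
```

Here T records that the module fired, with what score, and that its constraint changed the top-ranked action from escalation to a soft reminder: exactly the record a reviewer would receive.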

The interaction graph is falsifiable. When two behavioral modules influence each other — “when legitimacy is violated, face threat intensifies” (Tyler 1990 → Goffman 1967) — that coupling is stated as an edge in an interaction graph with a documented theoretical justification. The edge is falsifiable: a domain expert can disagree with the coupling, propose an alternative, and the system can be evaluated under both. In an implicit-theory pipeline, the same coupling is buried in correlations between training features, and disagreement has no surface on which to land.
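The edge-as-data idea can be sketched like this; the edge schema (`trigger`, `weight`) and the propagation rule are illustrative assumptions about one possible encoding, not the framework's definition:

```python
# Each documented coupling is an explicit, contestable edge.
INTERACTION_EDGES = [
    {
        "source": "tyler_legitimacy",   # when the legitimacy-violation score ...
        "target": "goffman_face",       # ... is high, the face-threat score intensifies
        "trigger": 0.5,                 # edge fires only above this source score
        "weight": 1.5,                  # amplification factor (hypothetical)
        "justification": "Tyler 1990 -> Goffman 1967 coupling",
    },
]

def propagate(scores: dict, edges: list) -> dict:
    """Apply each documented edge to the raw module scores."""
    out = dict(scores)
    for e in edges:
        if out[e["source"]] >= e["trigger"]:
            out[e["target"]] = min(1.0, out[e["target"]] * e["weight"])
    return out
```

Because the coupling lives in data rather than in learned correlations, a domain expert who disagrees can replace the edge list and evaluate the system under both versions.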

5. The counterargument we take most seriously

The strongest objection to the EBC framework is that it trades predictive performance for auditability. If the seven EBC modules currently specified for debt recovery impose constraints that a well-trained learned model would not, the system may produce measurably worse outcomes — lower payment rates, longer resolution times, higher operational cost per recovered euro — in exchange for a stronger audit posture.

We take the objection seriously and we do not yet have the field evidence to dismiss it. Three points, however, reframe what the objection is actually claiming:

  1. The baseline comparison is the wrong one. The choice is not between EBC and “a well-trained learned model”. The choice is between EBC and a model that, when challenged by a regulator, will be deployed under heavy restrictions, retrained repeatedly, or withdrawn. The cost of explainability failure is not usually counted in the baseline.
  2. EBC does not exclude causal estimation. The framework explicitly wraps a causal component — Double Machine Learning (Chernozhukov et al., 2018) and causal forests (Wager & Athey, 2018) — that provides Individual Treatment Effect estimates as inputs to the EBC modules. The modules constrain the recommendation; they do not replace the causal estimation.
  3. Outcome evaluation is a future question. We have not run a pre-registered field trial. If we had, we could offer a numerical answer to the performance-tradeoff question. We have not, and we do not claim to. (See the next article in this series for why this distinction matters.)
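The division of labor in point 2 can be sketched in a few lines. The causal layer (DML or causal forests, not implemented here) is assumed to hand per-case ITE estimates to the EBC layer, which only filters; all action names and numbers below are invented for illustration:

```python
def recommend(ite_by_action: dict, forbidden: set) -> str:
    """Pick the highest-ITE action that survives the EBC constraints.

    The modules constrain the recommendation; they never re-estimate effects.
    """
    allowed = {a: t for a, t in ite_by_action.items() if a not in forbidden}
    return max(allowed, key=allowed.get)

# Hypothetical ITE estimates from the upstream causal component.
ite = {"escalation_letter": 0.12, "soft_reminder": 0.09, "call": 0.05}
choice = recommend(ite, forbidden={"escalation_letter"})
```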

6. What this article does not claim

7. Further reading