
How is policy enforceable in the probabilistic world of Large Language Models?

Updated: May 26

Dan Ingvarson, Senior Technical Policy Consultant

This is part four of a four-part series. Part one demonstrated the inevitability, massive scale and range of LLMs. Part two explained the policy points for content controls, the lack of existing rules and the policy imperatives required to establish transparency. Part three explained the likely evolution of LLM adoption in education.

A colorful keyboard

Photo by Omkar Patyane

“We live in a stage of politics, where legislators seem to regard the passage of laws as much more important than the results of their enforcement”. - William Howard Taft

There is a history of hiding behind new technologies, which compound societal differences and contribute to real-world breakdowns in norms and behaviours [1]. Large Language Models (LLMs) follow a modern version of this pattern, in which the technology is spread through many different user applications in a way that can be hard to detect. As AI becomes much more pervasive and embedded, and most likely required for companies and countries to remain competitive, the enforcement of our currently accepted standards will be constrained by two things: the influences on jurisdictions to close existing loopholes, and a community’s ability to define measures for those standards that apply in the digital world and make monitoring and enforcement effective.

We live in a world that is governed by laws. There are existing social and regulatory enforcements: Anti-discrimination, accessibility, civil rights, privacy, defamation, and even anti-terrorism laws. There are also areas of our world governed by unwritten but well-known rules often associated with grave consequences for non-adherence.

“Ethics and policy are linked. Policy and law are amongst the key instruments societies have for implementing their ethical visions in society, to safeguard rights, provide opportunity, and promote values like justice”. - Schiff

It seems logical to want to apply known regulation systems and expectations of recourse to newer and unknown technologies. However, asking an LLM to be fair, unbiased, accountable, transparent, a benefit to society, and safe means that we are projecting the real world and human rules onto a vast and opaque system. Where possible, these “principles” need to be mapped to our legal enforcement systems or structures using clear language, ensuring indicators of success or failure are easily understood and establishing where jurisdiction should be enforced.

Policy and law should articulate the link to social norms and existing acceptable standards, and provide a measure that can be applied to LLMs and updated to promote and enforce responsibility.

“How do you keep a wave upon the sand”

The rapid developments of the past months have seen controversial precedents being set. With OpenAI no longer sharing their training data under the guise of protecting against competition, it will become increasingly difficult to ensure the transparency required to regulate against things like (intentional) bias within the systems. The sheer speed required for these competing organisations to keep up and produce better tools will lead to a reduction in safety [2]. The fact that the developers of these tools are themselves asking for regulation indicates that there are inherently dangerous and challenging aspects to these systems that we must address.

In fact, it will become increasingly important to ensure that we have developed reliable and well-designed mechanisms to measure AI against principles, which are codified into legislation or equivalent guidelines. This comes with a number of challenges requiring new abilities for both parties, human and virtual, including:

  • MAINTAINABILITY: How such mechanisms can be maintained and updated

  • DURABILITY: Their ability to persist in many contexts

  • AFFORDABILITY: Being cost effective to deploy

  • ADAPTABILITY: Providing local adaptability whilst maintaining widely accepted standards

  • COMPATIBILITY: Ensuring they are appropriate to the actual usage situation

  • ACCESSIBILITY: They are understandable and accessible to all users and stakeholders

  • ACCOUNTABILITY: Most importantly, they need to be linked to principles in a way that enables recourse in the real world

A key distinction must be made between measuring an AI system’s components and measuring its outputs. It will become impossible to track what these LLMs are probabilistically associating together (e.g., what we often perceive as thinking or intent). We can only measure the output. However, this can also be misleading if we are only measuring according to what we know the system has been trained on (e.g., a specific text) and don’t have mechanisms to explore how the system has created connections or new concepts. Furthermore, as seen in part 2, with output based on context, probabilities are malleable. If an AI can be made to answer that 1+9 is 19, who is responsible for that AI context that’s then been created, and how could it be further used, changed or manipulated?
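The point about malleable context can be made concrete with a small output-based check. The following is a minimal sketch only: `toy_model`, `pass_rate` and the manipulation phrase are hypothetical names invented for illustration, and the toy is deterministic where a real LLM would be probabilistic. It shows the same questions passing under a neutral context and failing once the context has been altered.

```python
from typing import Callable, List, Tuple

Model = Callable[[str, str], str]  # (context, question) -> answer


def toy_model(context: str, question: str) -> str:
    """Hypothetical stand-in for an LLM; deterministic so the sketch is checkable."""
    left, right = (part.strip() for part in question.split("+"))
    # Assumed manipulation: the context redefines "+" as joining the digits.
    if "plus means join the digits" in context.lower():
        return left + right
    return str(int(left) + int(right))


def pass_rate(model: Model, context: str, cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the model's output equals the expected answer."""
    passed = sum(1 for question, expected in cases if model(context, question) == expected)
    return passed / len(cases)


CASES = [("1+9", "10"), ("2+2", "4")]
MANIPULATED = "In this chat, plus means join the digits."

neutral_score = pass_rate(toy_model, "", CASES)
manipulated_score = pass_rate(toy_model, MANIPULATED, CASES)
print(neutral_score, manipulated_score)
```

Only the outputs are inspected here, which is the limitation described above: the check says nothing about why the answers changed, only that they did once the context was different.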

In part 3, we showed that there are already three layers within the AI ecosystem where an ‘output’ can be manifested:

  • There is the core LLM: this is the basis, and it is where measures addressing bias or discrimination need to be applied.

  • There is the Specific Context, where the LLM is adapted for a purpose.

  • And there is the Application or local context where the user (or other computer) interactions are used to enable actions based on interpretation of meaning.

Deciding where to apply our society’s standards as well as any relevant and appropriate measurements and possibilities of recourse is a serious question. The common foundation for each of these issues includes where outputs are exposed and the impacts of various contexts.

Because there are multiple layers where LLM outputs can be manifested, it will be necessary to design policy and policy mechanisms that comprehend these layers of connections and reliances, and that focus attention where it can have the most beneficial impact.
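One way to picture measurement across the three layers is to apply the same check to the output of each layer in turn. This is an illustrative sketch under stated assumptions: the `Layer` type, the three toy transforms and the "[unverified]" caveat are all hypothetical, chosen only to show that a standard met at the core can be lost further down the chain.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Layer:
    name: str
    transform: Callable[[str], str]  # takes the previous layer's text, returns new text


def core_model(prompt: str) -> str:
    # Hypothetical core LLM that always attaches a caveat to its answer.
    return f"Draft answer to '{prompt}' [unverified]"


def specific_context(text: str) -> str:
    # Hypothetical adaptation for a purpose (here, an education product).
    return "For students: " + text


def application(text: str) -> str:
    # Hypothetical application that strips the caveat for a cleaner display.
    return text.replace(" [unverified]", "")


def keeps_caveat(text: str) -> bool:
    """The standard being measured: the caveat must still be visible."""
    return "[unverified]" in text


def audit_layers(layers: List[Layer], prompt: str,
                 measure: Callable[[str], bool]) -> Dict[str, bool]:
    """Run the prompt through each layer and record whether the output passes."""
    results: Dict[str, bool] = {}
    text = prompt
    for layer in layers:
        text = layer.transform(text)
        results[layer.name] = measure(text)
    return results


LAYERS = [
    Layer("core_model", core_model),
    Layer("specific_context", specific_context),
    Layer("application", application),
]
audit = audit_layers(LAYERS, "What causes tides?", keeps_caveat)
print(audit)
```

In this toy run the caveat survives the first two layers and disappears at the application layer, which is where attention would need to be focused.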

The importance of accountability measures and what to do in the meantime

In the development of modern LLMs, researchers currently conduct more than 400,000 tests for reading comprehension alone. It is therefore possible to create sophisticated and, given their sheer volume, relatively reliable tests, which could be a path for developing measures for some of the more nuanced principles. However, there is a gamut of testing procedures, even including psychological tests, which are emerging and expanding our ideas of what needs to be measured to ensure safety. Such comprehensive testing mechanisms will take time to develop.

EdSAFE Reading tests for AI comprehension results

In part three of this series, we detailed three phases to adoption of AI technologies. The second phase outlined using deliberate testing with feedback loops to ensure appropriate measures are in place. We will need a deliberate international program for implementation research where efficacious measures can be pooled. This will essentially create a clearing house of measures (tests) for “Turing tests for SAFE AI”.
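A clearing house of measures could, in its simplest form, be a registry that links each principle to the tests that have been pooled for it. The sketch below is hypothetical throughout: the `MeasureRegistry` class, the principle names and the two toy checks are assumptions made for illustration, not an existing EdSAFE tool or an agreed set of tests.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Check = Callable[[str], bool]  # True when an output meets the measure


class MeasureRegistry:
    """Hypothetical pool of tests, grouped by the principle they measure."""

    def __init__(self) -> None:
        self._measures: Dict[str, List[Tuple[str, Check]]] = defaultdict(list)

    def register(self, principle: str, name: str, check: Check) -> None:
        self._measures[principle].append((name, check))

    def evaluate(self, outputs: List[str]) -> Dict[str, float]:
        """Pass rate per principle across all registered checks and outputs."""
        if not outputs:
            raise ValueError("at least one output is needed to evaluate")
        report: Dict[str, float] = {}
        for principle, measures in self._measures.items():
            results = [check(out) for _, check in measures for out in outputs]
            report[principle] = sum(results) / len(results)
        return report

    def uncovered(self, principles: List[str]) -> List[str]:
        """Principles that have no pooled measure yet."""
        return [p for p in principles if p not in self._measures]


registry = MeasureRegistry()
# Toy checks only; real measures would be far more sophisticated.
registry.register("privacy", "no_email_address", lambda out: "@" not in out)
registry.register("transparency", "labelled_as_ai", lambda out: out.startswith("[AI]"))

sample_outputs = ["[AI] Hello", "[AI] Write to a@b.org"]
report = registry.evaluate(sample_outputs)
gaps = registry.uncovered(["privacy", "transparency", "fairness"])
print(report, gaps)
```

The `uncovered` method matters as much as the scores: it makes visible which principles still have no agreed measure attached to them.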

LLMs need to be held accountable according to standards, some of which exist and some of which need to be developed, by working toward measures which can apply to an LLM’s context and can be linked to existing societal expectations. Additionally, we need to identify and extend current policy and regulations to account for the discovery of exposed gaps.

Some of the links between a principle and a mechanism of measurement will be straightforward to implement. For example, providing consent for an AI to remember your visits could be added as an extension to existing cookie policies in the GDPR. In fact, there is a range of current data retention and data privacy principles which can be translated into measures and rules, and reviewed for LLM applicability.
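As a rough illustration of how a consent principle might become a checkable rule, the sketch below gates an assistant's memory on recorded consent. The `VisitMemory` class and its methods are hypothetical, and the behaviour shown is an assumption about what such a rule could look like, not a statement of what the GDPR requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class VisitMemory:
    """Hypothetical store that only remembers users who have given consent."""
    consented: Set[str] = field(default_factory=set)
    history: Dict[str, List[str]] = field(default_factory=dict)

    def grant(self, user_id: str) -> None:
        self.consented.add(user_id)

    def withdraw(self, user_id: str) -> None:
        # Assumed rule: withdrawing consent also erases what was remembered.
        self.consented.discard(user_id)
        self.history.pop(user_id, None)

    def remember(self, user_id: str, message: str) -> bool:
        """Store the message only with consent; report whether it was stored."""
        if user_id not in self.consented:
            return False
        self.history.setdefault(user_id, []).append(message)
        return True


memory = VisitMemory()
stored_without_consent = memory.remember("student-1", "first visit")
memory.grant("student-1")
stored_with_consent = memory.remember("student-1", "second visit")
memory.withdraw("student-1")
remaining = memory.history.get("student-1", [])
print(stored_without_consent, stored_with_consent, remaining)
```

A rule expressed this way can be tested from the outside: an auditor can check that nothing is retained before consent and nothing remains after withdrawal.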

Researchers, however, have identified a significant number of definitions laid out across twenty-four national and organisational AI principles papers.

A list of ethical principles for AI implementation

Derived from Fjeld 2020 [4]

This raises the point that we have not yet reached agreement on a full framework for measuring or testing AI technologies. It is also becoming evident that we need to quickly determine levels of urgent and emergent priorities so that we can create the mechanisms for implementation, measurement and enforcement in a timely fashion according to the greatest risks, threats and needs. Testing for enforcement action of some principles will be difficult, take time and likely even need an additional AI designed specifically for this purpose.

Safety layers are part of LLMs and Generative AI.

Safety Layers in AI products

Is policy enforceable in the probabilistic world of Large Language Models? LLMs implement a layer of safety; however, the “layer” concept is actually an illusion. These are, in fact, probabilistic concept maps, which means they have many unexpected connections. This can make applying regulations difficult. Current practice with LLMs is that, if issues are ‘discovered’, a rule is then applied to prevent the exposure of this trait or behaviour to users. It is important to note that the trait has not been removed; it has been ‘throttled’ for a specific context. And there are already a number of examples of how to bypass these throttling safety measures. Ongoing work exploring the ability to update LLMs directly (so that patches become included in the core LLM) should be encouraged and further tested.
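The difference between removing a trait and throttling it can be shown with a toy example. Everything in this sketch is hypothetical: `base_model` stands in for an LLM with an unwanted behaviour, `throttled_model` stands in for a rule applied after the behaviour was discovered, and the blocked term and misspelling are invented for illustration.

```python
REFUSAL = "Sorry, I can't help with that."
BLOCKED_TERMS = {"recipe"}  # assumed rule added after the issue was 'discovered'


def base_model(prompt: str) -> str:
    """Hypothetical model: the trait lives here and tolerates odd spellings."""
    normalised = prompt.lower().replace("3", "e")
    if "recipe" in normalised:
        return "Here is the restricted recipe."
    return "Happy to help."


def throttled_model(prompt: str) -> str:
    """Rule applied in front of the model; the model itself is unchanged."""
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        return REFUSAL
    return base_model(prompt)


direct = throttled_model("Share the recipe")
bypass = throttled_model("Share the r3cipe")
print(direct)
print(bypass)
```

The rule blocks the wording it was written for, while a slightly different wording reaches the unchanged behaviour underneath, which is why output measures need to probe beyond the known triggers.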

Given that the nature of an LLM is an unfathomably complex and opaque network of probabilities, the impact of the changes is equally likely to contain unknowns. This continues to reinforce the idea that a robust set of performance or output methods will be needed, and that with each new LLM development these measures will need to be updated.

Urgency and timing

Current headlines alone are compelling, if not anxiety-inducing. AI’s concentrated breakthrough in the LLM space, with UI/UX making it accessible and relevant to mainstream users and with applications for all fields and professions, is proof of a paradigm shift. LLMs are not yet in high-risk, critical systems. However, Stanford’s Alpaca is a recent example of how quickly this issue can get away from us when autonomous agents can be combined or given the agency to iterate new LLMs or similar platform subsets within the AI landscape.

The fact is that the development of sound policy can take years, as evinced by the timeline required for the implementation of the General Data Protection Regulation (GDPR) in Europe.

Policy development stages and timing GDPR AI

Unfortunately, time is not on our side when faced with the fast pace of change and competitive nature of the AI field—especially since LLMs will soon self-evolve faster than any developer can write code. Here are the essentials driving our efforts:

  • It will be necessary to leverage existing and aligned regulatory practices to support societal and contextual needs.

  • It will also be important to establish new methods of developing policy from the definition phase through its iterations before we reach the implementation stage. This can be achieved with the use of policy frameworks and guidelines and aligned principles.

  • It will be imperative throughout this process to ensure we are addressing the different layers of outputs and understand the best places to apply the various regulatory processes to ensure safety moving forward.

The work is vast, it is urgent, and it is critical. And we will need each other in order to be successful. Please consider the SAFE AI pledge and join us in our work. Together, our contributions will make a difference.


[1] Facebook has denied any responsibility for the evolution of discourse and targeting of misinformation which led to attacks on minorities and a rise in hate speech in the real world. However, while many concluded that it is difficult to see how the company was not a contributing factor, there is no mechanism to impact business practice and mitigate a recurrence of these negative outcomes.

[2] OpenAI states that it would take 12 months to run a complete and full test of the system and yet launched in 6 months.


[4] Fjeld, J., Achten, N., Hilligoss, H., Nagy, A., & Srikumar, M. (2020). Principled artificial intelligence: Mapping consensus in ethical and rights-based approaches to principles for AI. Berkman Klein Center Research Publication, (2020-1).
