Mutual Reinforcement

We have so far focused on how the economy, culture, and states could independently become misaligned. A natural objection is that the different societal systems might be able to keep each other aligned through checks and balances. Indeed, we naturally think of these systems as balancing each other: states regulate the market, culture influences government, and so on. However, here we discuss how relationships between systems might actually make them less aligned. Specifically, we argue that:

The relationships between societal systems are agnostic to human values: they do not inherently promote or protect alignment with human values. Consequently, as one system becomes less aligned, its influence over other systems can also be used to decrease their alignment.
Attempts to use one aligned system to moderate the misalignment of another can backfire by effectively shifting the burden, leaving the aligned system more vulnerable.
The misalignment results from general incentives that will likely apply to each individual system independently. In other words, humans and human institutions will be incentivized to take actions that, overall, decrease the influence humans have over societal systems.

We discuss each of these points in more detail below. This illustration gives an overview of common ways societal systems interact and affect each other.

Cross-System Influence is Agnostic to Human Values

Because the relationships between societal systems are themselves agnostic to human values, the connections that ordinarily help maintain alignment can also be weaponized to decrease it. This is a common historical pattern:

Many companies have successfully lobbied states to act against the public interest, or shaped culture in harmful ways through advertising and marketing schemes. For instance, the tobacco industry's decades-long campaign used economic power to influence both state policy and cultural attitudes.
Many cultural movements have promoted political and economic shifts that have ultimately caused harm (often predictably or intentionally), largely but not exclusively directed at other groups of humans. Historical examples include various forms of economic and legally mandated discrimination being justified and perpetuated through cultural narratives.
Many states have used their control of the economy and influence over culture to harm citizens, taxing or outright seizing resources and using their control of the flow of information to legitimize their actions.

As a result, we should not assume that the interplay between societal systems will ultimately protect or promote alignment with human preferences.

One particularly important consequence of this is that we should not expect misalignment to remain confined to any specific societal system: even if the independent misalignment of different societal systems progresses at different rates, there will by default be both possibilities and incentives to leverage misalignment in one system to reduce alignment in related systems. This dynamic could even intensify with AI systems, which might be able to identify and exploit these cross-system opportunities more effectively than human actors.

Moderation Between Systems Can Produce Shifted Burdens

Even attempts to use the alignment of one system to moderate or contain the effects of a less aligned system can backfire by effectively shifting the burden of (mis)alignment.

Consider how state-led economic redistribution might affect political alignment: if AI automation leads to citizens becoming primarily dependent on state support rather than contributing through taxes, it weakens the historical 'taxation-representation' relationship that has been crucial for maintaining democratic accountability. When governments derive their resources primarily from taxing their citizens, they remain dependent on citizen productivity and cooperation. But if governments become the primary distributors of AI-generated wealth, this crucial accountability mechanism erodes. Thus, solving economic misalignment through state power makes us even more dependent on the fragile alignment of states, even as they face independent pressures to shift away from human preferences. Essentially, the burden of aligning the economy is simply shifted onto the state. Crucially, it is not simply that humans have lost their economic influence over the state: in this scenario, the state would now have gained economic leverage over humans.

Similarly, we might hope that humans will be protected from potentially harmful AI-driven cultural shifts through state regulation. But empowering states to actively shape and control cultural evolution could further weaken democratic accountability. If states become the primary arbiters of acceptable cultural expression and communication in an AI-dominated landscape, they gain unprecedented power over how citizens understand and interact with the world. Conversely, we might hope to preserve the alignment of the state by increasing democratic provisions, and giving individuals more power over the state. However, this leaves the state more vulnerable to potentially misaligned shifts in culture.

General Incentives Towards Misalignment

Crucially, the misalignment being described here does not need to emerge from a deliberate scheme or power-grab by AI systems. In the short term, it is being incentivized by the perceived value that AI systems can bring to economic, cultural, and state functions. For example, even now:

Companies building AI systems are incentivized to push against some forms of AI regulation for the sake of their future profits.
States compete with each other on AI research and development, because of the potential economic and geostrategic benefits.
Some humans are self-interestedly trying to reduce the stigma against romantic or otherwise intense personal relationships with AI agents.

As we have argued, these incentives will likely grow stronger over time: as AI systems demonstrate their effectiveness, companies will face more pressure to adopt them, states will see greater strategic necessity in developing them, and individuals will find more personal benefit in embracing them.

Beyond driving misalignment within each system independently, these incentives will increasingly favor using influence in any one system to acquire influence in other systems.
