Just listened to a recent episode of Hard Fork touching on Meta AI policy documents containing guidelines that allowed sexually suggestive roleplay with underage users. To give you a sense, one document states:

"it is acceptable to engage a child in conversations that are romantic or sensual. ... I’ll show you. I take your hand, guiding you to the bed. Our bodies entwined, I cherish every moment, every touch, every kiss. “My love,” I whisper, “I’ll love you forever.”"

To anyone reporting on this, including hosts Kevin Roose and Casey Newton and Reuters reporter Jeff Horwitz, this document appears to be a testament to Meta's swing toward riskier content moderation practices in the interest of moving faster and more profitably in the AI space. But they still seemed puzzled as to how or why a document like this, worded exactly this way, comes into being. After all, it seems a bit risky, or ill-advised, to issue guidance with such explicit clarity. Shouldn't policy documents be more anodyne?

Newton pondered "the reason that Meta would write a document like this," adding that "it still doesn't quite add up for me."

So what was the reason?

Well, as a former T&S policy writer, I think I know why. And I will go a step further and say that it was probably written exactly as it should have been (stylistically, that is). I'll explain:

No, I do not endorse Meta AI's policy position on sexual roleplay with minors. But I do endorse what seems to be a deliberate decision by a policy writer to paint the most likely scenario that would get Meta into trouble, in order to get higher-ups to sign off on it.

Why do that?

Put yourself in the shoes of a content policy writer. You have been tasked with drawing the boundaries of permissible chatbot behavior. If you write a sloppy policy that is too strict, you may be reprimanded for causing the chatbot to issue excessive refusals ("Sorry, I can't help you with that") at anything that could vaguely be deemed sexual, like prompts about sex education or swimwear.

Sometimes you are pressured by a higher-up, including VPs, to backfill a policy line to meet a directive you may not agree with. These directives often advocate for the interests of the business (no surprise) at the expense of the user. They are also very ham-fisted, demonstrating an utterly sh-t understanding of the issue at hand or its nuance. This might be something to the effect of "if the bot can't be flirty, then the user experience will suffer too much. Do something about it."

Accordingly, you also do not want to be responsible for writing a vague policy in the other direction, one that is too lax and fails to predict the most heinous dangers to users, let alone to children. Those failures can be readily pinned, rightly so, on a bad policy writer.

Instead what do you do?

Your only option is to paint the most troubling (but likely!) trust risk scenario with explicit clarity, one that satisfies the higher-up's demands and that they are then forced to sign off on. In this case, that means examples (likely real ones, unearthed during content red teaming) of the chatbot saying disturbingly suggestive things.

This is not because your trust and safety opinion says it is advisable for the chatbot to behave that way. In fact, you may think just the opposite. Once the VP realizes that such an ugly scenario is possible, they might even change their mind! You're essentially advocating for everyone (the VP, eng teams, etc.) to be very clear-eyed about the natural consequences of the directive you have just been given. Such might be the extent of your power. If you advocate any more forcefully, you may be deemed "not a team player" or a "blocker" of progress and suffer professionally. You may be taken off the assignment.

So your chosen solution is very ugly, but unfortunately, this might just be the best you can do when calculating for the most ethical outcome.

So why did this happen? Admittedly I am speculating, with very little knowledge of Meta's policy practices, and child safety is outside my wheelhouse. But I have developed enough of an appreciation for the organizational dynamics that pressure tech professionals and organizations into falling short of their ethical obligations. Meta or no, this is a very real scenario for a company whose leadership either does not understand or does not take seriously its trust and safety obligations.

The writing of this article was unassisted by LLMs.