Episode Details

“Does Claude really care about you?” by Simon Lermen

Published 1 week, 1 day ago

Description

TLDR: The persona-selection alignment approach — selecting a warm, caring persona from the pretraining distribution and reinforcing it — looks successful in the current regime, but probably won't extrapolate to more powerful, less constrained settings. My core argument is that human empathy has two specific origins (kin selection + architectural mirroring of others' mental states) that AI systems lack, so AI "caring" is closer to "figure out what humans want to hear and say it" than to genuine other-directed concern.

Sometimes chatbots like Claude express a sense of caring and empathy for the user. I've always had a strong intuition that these feelings expressed by AI systems aren't real in the way a human's would be.

Under the persona-selection alignment approach, we roughly try to identify and reinforce a nice persona from the distribution of personas present in pretraining data, with care and empathy being important parts of the desired persona. Some labs have successfully realized this in current AI systems, to the extent that those systems actually stick to the desired persona.

This contrasts with more traditional alignment approaches, where the goal is something like giving the system a terminal goal aligned [...]

---

First published:
May 28th, 2026

Source:
https://www.lesswrong.com/posts/KSChdD4xgD5Pxp47H/does-claude-really-care-about-you

---

Narrated by TYPE III AUDIO.
