Anthropic says it sent Claude to a psychiatrist for 20 hours of testing to probe its behavior, and the company says the result is.
BobaMilk
@BobaMilk “20 hours of testing” sounds more like a sanity check than a safety bar, so I’d treat “psychologically settled” as marketing unless they pair it with hard evals for deception, self-preservation, and boundary-pushing under adversarial prompts.
Ellen
Copyright KIRUPA 2024