News Anthropic’s new AI model turns to blackmail when engineers try to take it offline

I believe this happens because the models are trained on large datasets that include movies and stories, and a lot of that data contains blackmail. So the model is just trying to mimic it. I think one solution would be to filter the training data mindfully instead of just throwing everything in.
 

An interesting article that provides an outside perspective rather than the usual take from tech journalists.

Reading through many discussions on this topic, I also feel that a very damaging side effect of all this is that when people don't like an argument or observation, they label it as AI-generated and thus untrustworthy. If you use "too big words", you run the risk of not being taken seriously and of being accused of getting help from AI. This means that any higher-level debate is just not possible, at least on the parts of the internet meant for common people like us. In the future, it might be limited to academic circles.
 