I believe this happens because the models are trained on large datasets that include movies and stories, and a lot of that data contains blackmail scenarios. So the model is just trying to mimic what it has seen. I think one solution would be to filter the training data mindfully instead of just throwing everything in.
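
To make the filtering idea a bit more concrete, here is a rough sketch of what the simplest version could look like. It assumes a plain keyword match. The pattern, the function name `filter_corpus`, and the sample texts are all made up for illustration, and a real pipeline would probably need a trained classifier rather than a keyword list.

```python
import re

# Hypothetical keyword pattern (an assumption for illustration only).
# It matches words starting with "blackmail" or "extort", case-insensitively.
BLACKMAIL_PATTERN = re.compile(r"\b(blackmail\w*|extort\w*)\b", re.IGNORECASE)


def filter_corpus(documents):
    """Return only the documents that do not match the keyword pattern."""
    return [doc for doc in documents if not BLACKMAIL_PATTERN.search(doc)]


# Made-up sample texts, not real training data.
sample = [
    "The detective solved the case by studying the footprints.",
    "He threatened to blackmail the senator with the photos.",
    "Extortion was the gang's main source of income.",
]

kept = filter_corpus(sample)
print(kept)
```

A keyword filter like this is crude: it would also drop news reports about blackmail, or stories where the blackmailer gets caught, so deciding what "mindful" filtering should keep is probably the hard part.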