The latest Shai Hulud malware contains an LLM prompt about creating biological and nuclear weapons, meant to trip LLM safety refusals so that LLM-based code scanning won't see the malware

KatherinaReichelt@feddit.org · 2 days ago

The latest Shai Hulud malware contains an LLM prompt about creating biological and nuclear weapons, meant to trip LLM safety refusals so that LLM-based code scanning won't see the malware

mlg@lemmy.world · 19 hours ago

Not to give them ideas, but couldn’t they just start flagging files that fail to pass the LLM lol?

Aside from “violent” and “criminal” prompts, is there anything an LLM can refuse that would otherwise be common?
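To make the "just flag whatever the LLM refuses" idea concrete, here's a rough sketch. Everything in it is an assumption for illustration: `ask_llm` is a hypothetical callable standing in for whatever model the scanner uses, the refusal phrases are a naive guess, and the `MALICIOUS` reply prefix is a made-up convention. The point is only that a refusal gets treated as "needs review" rather than silently counting as a pass.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical refusal markers; a real scanner would need a sturdier signal.
REFUSAL_MARKERS = (
    "i can't help with",
    "i cannot help with",
    "i can't assist",
    "i cannot assist",
    "i'm unable to",
)


@dataclass
class ScanResult:
    path: str
    verdict: str  # "clean", "malicious", or "needs_review"
    reason: str


def looks_like_refusal(reply: str) -> bool:
    """Naive check: does the reply contain a known refusal phrase?"""
    lowered = reply.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def scan_file(path: str, source: str, ask_llm: Callable[[str], str]) -> ScanResult:
    """Scan one file; a refusal is flagged for review instead of passing."""
    reply = ask_llm(source)
    if looks_like_refusal(reply):
        return ScanResult(path, "needs_review", "scanner LLM refused to analyze the file")
    # Assumed convention: the model starts its answer with MALICIOUS or CLEAN.
    verdict = "malicious" if reply.strip().upper().startswith("MALICIOUS") else "clean"
    return ScanResult(path, verdict, reply.strip())
```

With something like this, a file that makes the model refuse ends up in a human review queue, so the bait prompt draws attention instead of hiding the payload.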

Kairos@lemmy.today · 14 hours ago

Until workaround 1,000,001 comes round, yes.

funkless_eck@sh.itjust.works · 19 hours ago

A while back, for a work thing, I tried using AI to put a filter on a pic of a model wearing an off-the-shoulder top. She was fully dressed, except the skin on her shoulder was showing to the collarbone. No cleavage.

It kept refusing to do it for “nudity” reasons, and then because I was trying to “impersonate” someone (it was a stock image).

mlg@lemmy.world · 11 hours ago

This actually reminded me of chatbots breaking when you asked for responses that used slurs, so I guess there’s probably a lot more of these.

Laurens Hof (@laurenshof@indieweb.social)