source: simon willison: openai help: lockdown mode

level: technical

openai has started rolling out lockdown mode to eligible personal accounts, including free, go, plus, pro, and self-serve chatgpt business. the feature aims to stop the final stage of data exfiltration in prompt injection attacks by limiting outbound network requests that could send sensitive data to an attacker. it does not prevent prompt injections from appearing in content processed by chatgpt, such as cached web pages or uploaded files, which can still affect responses.
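
to make "limiting outbound network requests" concrete, here is a minimal sketch of a rule-based egress check. it is not openai's implementation, which the source does not describe: the allowlist contents and the names `ALLOWED_HOSTS` and `is_request_allowed` are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration only; the source does not say
# which destinations Lockdown Mode permits.
ALLOWED_HOSTS = frozenset({"api.example.com"})


def is_request_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for https requests to an exactly matching allowed host."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https" or host is None:
        return False
    # Exact match only: no suffix or substring matching, so
    # "api.example.com.attacker.example" does not slip through.
    return host in allowed_hosts
```

with this sketch, a request such as `https://attacker.example/leak?d=secret` is refused no matter what the injected instructions say, because the decision depends only on the parsed url and a fixed set of hosts.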

the approach targets what is known as the lethal trifecta: a language model system that combines access to private data, exposure to untrusted content, and a way to communicate externally that can be used to exfiltrate that data. lockdown mode cuts off the exfiltration vector, which is often the easiest leg to restrict without making the system less useful. the mechanisms are deterministic and do not rely on ai evaluation of whether a request is safe, which makes them harder to subvert with a carefully crafted injection.
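
the trifecta argument can be written down as a small check: an attack needs all three legs at once, so removing any one of them breaks the chain. the sketch below is illustrative only, and the names `AgentCapabilities` and `has_lethal_trifecta` are hypothetical, not part of any openai api.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what an LLM-based system can do."""
    private_data_access: bool
    untrusted_content_exposure: bool
    external_communication: bool


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three legs of the trifecta are present together."""
    return (
        caps.private_data_access
        and caps.untrusted_content_exposure
        and caps.external_communication
    )


# A default-style setup with all three legs enabled.
default_caps = AgentCapabilities(True, True, True)

# A lockdown-style setup: only the external communication leg is removed.
locked_caps = replace(default_caps, external_communication=False)
```

here `has_lethal_trifecta(default_caps)` is true while `has_lethal_trifecta(locked_caps)` is false, even though the locked-down system still reads private data and still sees untrusted content. that matches the point that injections can still influence responses while the exfiltration path is closed.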

the existence of lockdown mode suggests that chatgpt's default settings do not offer strong protection against determined data theft attempts. by providing this opt-in feature, openai gives users a way to harden their accounts against specific threats, though it does not eliminate all risks from prompt injection. the rollout is gradual, and users should check their account settings for availability.

why it matters: it gives chatgpt users a practical, opt-in tool to reduce the risk of sensitive data leaks from prompt injection attacks without crippling functionality.

