You can do this now: change the file permissions such that the user you run codex as can't read them, or run codex in a container without those files mounted.
If you don't do that, the agent will be able to incidentally upload them. What if the model runs "rg foo", and one of those files contains the string "foo"? It uploads the tool output, which includes the file contents.
And so, the only solution is to make it so the codex process is unable to access those files, hence using a container, or unix permissions, or deleting the files. Which you can already do.
I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly.
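The `rg foo` scenario above is easy to reproduce. Here is a minimal sketch, assuming a made-up project with one source file and one `.env` file (the file names and the key are hypothetical). It shows that a grep-like tool returns whole matching lines, so whatever reads the output also reads the secret.

```python
import re
import tempfile
from pathlib import Path


def search_tree(root: Path, pattern: str) -> list[str]:
    """Return 'name:line' hits, roughly the way a grep-like tool reports them."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            # A file the process cannot read never reaches the output.
            continue
        for line in text.splitlines():
            if regex.search(line):
                hits.append(f"{path.name}:{line}")
    return hits


def demo() -> list[str]:
    # Hypothetical layout: one source file and one secrets file.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "app.py").write_text("foo = load_config()\n")
        (root / ".env").write_text("API_KEY=foo-1234-not-a-real-key\n")
        return search_tree(root, "foo")
```

The only branch that keeps a file out of the result is the `OSError` one, which is the permissions/container argument in miniature.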
cowsandmilk 1 day ago
100% this. The idea that Codex should enforce this is putting the security boundary at the wrong layer. If you don’t want Codex to access something, make it so it doesn’t have access.
embedding-shape 1 day ago
The Codex bug tracker is a great insight into how wide the knowledge gap seems to be between users. The issue where people ask them to add back /undo or whatever it is, instead of just learning to use git, has probably reached at least 100 comments by now. People seemingly don't really understand the computers they use on a daily basis, and refuse to learn too.
atomicnumber3 1 day ago
We managed to generate probably-correct code, which can then be probably-corrected recursively to get to something that runs (usually).
This made everyone scream and lose their minds saying that code is finished, people think they don't need a technical cofounder anymore, think they don't need engineers anymore, etc. Then they're, at varying speeds, finding out they're wrong.
It seems oddly circular to me that the _exact hubris_ non-engineers have long accused engineers of - and we have indeed been too often guilty of - they themselves turn out to be JUST as guilty of! Just like engineers thought all sales did was bother people, and all marketing did was send emails, and all support did was tell people to turn it off and on again, and all product did was copy google... they all apparently thought all engineers did was tik-tak-click-clack type code all day and when it compiled it was done. Not knowing how much higher-order... well, engineering, there is to it.
Where are all the CTOs during all of this? I thought someone was supposed to be sticking up for their org? Sales, marketing, etc all seem to have entrenched C-suite people keeping their fiefdoms resistant to erosion by outsourcing, downsizing, etc. But all our CTOs seem to have collectively thrown us to the wolves.
aleph_minus_one 1 day ago
> It seems oddly circular to me that the _exact hubris_ non-engineers have long accused engineers of - and we have indeed been too often guilty of - they themselves turn out to be JUST as guilty of!
I have hardly ever seen this kind of hubris among software developers. The only thing that was common was many software developers were - let's say - somewhat direct in their feedback towards people who are not willing to learn.
I thus rather have the feeling that this kind of accusation of hubris towards software developers rather originates in business people projecting their own overconfidence (hubris) onto software developers.
cataphract 1 day ago
That is not a fault that's specific to engineers. Lots of smart lawyers think they can learn basically anything over a weekend of hard study. It's probably a blind spot of intelligent people.
blamemods4accnt 1 day ago
[flagged]
dijksterhuis 1 day ago
> Just like engineers thought all sales did was bother people, and all marketing did was send emails
i mean, sure, the marketing one may be a bit simplified as the emails also need to have pretty pictures in them. and yeah, sure, sales people do need to find the people they’ll eventually start bothering later.
/s
> But all our CTOs seem to have collectively thrown us to the wolves.
the kind of person who usually finds their way into the executive class doesn’t get there by looking after those under them. they get there by avoiding blame and taking credit.
which, funnily enough, is exactly what a hype cycle is all about.
tern 1 day ago
I suspect most people don't even know there's a there there.
For instance, while I now know that file systems have permissions, before I became a programmer, I spent maybe ten years thinking of permissions as a special, obscure system thing that you should never touch.
For that matter, I suspect many people don't know basic things like that a file system isn't inherently the operating system.
And, where would you go to learn this information? Your Mac doesn't ship with a manual—how would you know one exists? Furthermore, I would wager that perhaps most people have never learned how anything works requiring a manual and are simply unaware that that's a thing.
All to say, I'm not sure "refusal" is the right term.
tingletech 1 day ago
When I was an undergraduate biology student in 1991 a suitemate told me I should go to some desk in some building over by Muir and get an account on the VAX. There were strange rooms all over campus that were open 24/7 and were loaded with green and amber screen terminals with integrated keyboards. Lots of sessions for CS lectures were held in these rooms and there were always interesting notes on the white boards (most rooms still had black boards or green boards, but I think the chalk was too dusty, so these rooms usually had the white boards).
Once I saw an instruction that was circled with an arrow pointing to it that said:
man man
man -k -or- apropos
and that was how I learned about computers.
I just typed `man man` in a terminal on my Mac, and luckily it's still there.
fingerlocks 1 day ago
Nit: Your Mac does ship with a manual. Tips.app and of course the man pages. Point stands.
laweijfmvo 1 day ago
huh, but what if the AI trashes my git repo? maybe it just deletes the .git folder entirely. a deterministic undo wouldn’t be the silliest feature, for the current definition of “AI”.
pamcake 1 day ago
The answer is the same: You give it either read-only or its own copy separate from the one you care about.
The requested feature wouldn't be a robust solution here either for the same reasons.
Besides, have you noticed the amount of other amateur-hour bugs and jank in Codex going for weeks or months without proper resolution? Given that, why would you want and trust their solution here over alternatives, specifically?
mirashii 1 day ago
The default sandboxing for Codex does not allow the agent to access .git
AlienRobot 22 hours ago
You mean they went to the codex bug tracker, on github, and they don't know how to use git?
Well that is kind of ironic, isn't it?
fragmede 1 day ago
The knowledge gap is very real. Because unsavvy users are just going to paste the API key into codex and say "make it work". For the truly lazy/uninformed, codex has computer use, and they are going to tell it to go into Vercel/Netlify/Stripe/Cloudflare for them, get the API key, and save it to .env for them. So users knowing they need such a feature in the first place should be celebrated when the alternative is even dumber.
LtWorf 1 day ago
That's the product that is being sold here… why shame the users for expecting what was marketed to them?
wonnage 1 day ago
I mean based on all the "coding is solved" hype that's what these companies are aiming for
MattDamonSpace 1 day ago
Not sure I agree?
It’s not like gitignore should be independent from git
TheDong 1 day ago
The difference is that git is a traditional programming tool which executes deterministically.
agents are not deterministic tools, they're not sandboxes or container runtimes or languages with capabilities models.
They're a way to run arbitrary commands.
It would be like saying that "xterm" should have a ".xtermnoexec" list of commands you can't run, or that VLC should have an option for actors it won't show.
terminals run shells which run commands; a terminal isn't really deeply aware of what commands your shell ultimately runs, and it's not xterm's job to set up a sandbox and strip out executables.
VLC displays pixels, it's not up to it to figure out if those pixels are a certain actor.
codex pipes text and tool calls back and forth between OpenAI's servers, and it barely understands what that text and those tool calls are, and especially if a given tool touched a file. If you want VLC to not display an actor, you need to add a layer on top of VLC to stop it displaying a list of movies. If you want codex to not display a file's contents, you need a layer on top of codex to prevent it going near that file.
dns_snek 1 day ago
> they're not sandboxes
Yes they can be, and Codex offers one. It uses Bubblewrap and seccomp on Linux which are perfectly capable of restricting filesystem access.
In a default setup every command is executed inside a restrictive sandbox and you're only asked for permission to run that command if the execution fails.
I don't necessarily think that it's a good idea to rely on these sandboxes as your only line of defense but that's absolutely a feature that they can, should, and do offer.
SoftTalker 1 day ago
bash actually has a "restricted" mode which is sort of like that. In restricted mode, the following are disallowed:
- Changing directories with cd.
- Setting or unsetting the values of SHELL, PATH, HISTFILE, ENV, or BASH_ENV.
- Specifying command names containing /.
- Importing function definitions from the shell environment at startup.
- Parsing the values of BASHOPTS and SHELLOPTS from the shell environment at startup.
... some other things mainly preventing you from escaping or disabling the restricted mode.
jxf 1 day ago
.gitignore doesn't have the same security implications.
If you fail to prevent a private key from being added to your repository, you can reverse this and purge it from the blobs and reflog as if it never happened.
If you fail to prevent OpenAI from ingesting a private key, you have created a security incident.
tmp10423288442 1 day ago
> If you fail to prevent a private key from being added to your repository, you can reverse this and purge it from the blobs and reflog as if it never happened.
Only if you’re absolutely sure that it’s never been pushed to a public repository. I would treat a push of a private key to GitHub as a much higher emergency than it being sent to OpenAI (or even being accidentally used in a Google search), since there are bots that actively scan GitHub for private keys, such that your private key might be found within a few minutes of push.
throwatdem12311 1 day ago
Will git drop your production database because it feels like it when the stars align?
londons_explore 1 day ago
I could imagine perhaps some system which rather than denying access might instead replace the key material from your .env key with "** redacted. This key material can be used via make, but can never be exfiltrated directly **" whenever that key is seen heading out towards the network...
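A rough sketch of that idea, assuming a simple `KEY=VALUE` `.env` format and a filter placed on everything about to leave the machine. The function names, the placeholder text, and the 8-character minimum are all made up for illustration; this is not how Codex works.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; comments and blank lines are ignored."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def redact(outgoing: str, secrets: dict[str, str], min_len: int = 8) -> str:
    """Replace every known secret value in text that is bound for the network."""
    # Longest values first, so a secret containing another one is replaced whole.
    for key, value in sorted(secrets.items(), key=lambda kv: -len(kv[1])):
        if len(value) >= min_len:  # skip short values such as "1" or "true"
            outgoing = outgoing.replace(value, f"[redacted:{key}]")
    return outgoing
```

This only catches the literal value. A model that base64-encodes the key, or prints it one character per line, gets straight past it.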
brookst 1 day ago
But that means the process can’t use the key for network requests, right?
mcintyre1994 1 day ago
1Password can do something like this where you put references to a path there instead of the key material, and then you wrap the invoke command with their CLI and it replaces them. So your local env file never has anything sensitive. A malicious agent could still exfiltrate if you give it access to debug tools on the running code though.
throwatdem12311 1 day ago
You expect Joe Blow vibe coder running Codex on his Dell to understand this?
matheusmoreira 1 day ago
Yes.
throwatdem12311 1 day ago
lol I don’t think you realize how tech illiterate most normies are
jgalt212 1 day ago
I'm a fan of belt and suspenders.
albedoa 1 day ago
Where did you run off to little boy.
unknown 1 day ago
[deleted]
lelandfe 1 day ago
Just be aware that AI agents will explore alternate means of accessing said files: https://news.ycombinator.com/item?id=48348578
martylamb 1 day ago
Yes. I found this quickly after wrapping codex in a launcher that uses bubblewrap to exclude certain files and directories based on a config file at the project root. My best solution so far is to also include instructions for the agent that explain that it is not allowed to see certain files, and that their inaccessibility is not an error, and that it must not attempt to access them through other means (e.g. via git history, etc.).
This has been a major improvement, but it's not foolproof.
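For anyone wondering what such a launcher might look like, here is a minimal sketch that only builds the bubblewrap command line. It is not martylamb's actual launcher: the function name and the idea of passing the hidden paths as a list are assumptions, and the config-file parsing is left out. It relies on bwrap applying mounts in order, so a later `--tmpfs` or `/dev/null` bind covers what the earlier project bind exposed.

```python
from pathlib import Path


def bwrap_argv(project: Path, hidden: list[str], command: list[str]) -> list[str]:
    """Build a bubblewrap command line that masks `hidden` paths under `project`."""
    project = project.resolve()
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",                  # whole filesystem, read-only
        "--dev", "/dev",
        "--proc", "/proc",
        "--bind", str(project), str(project),   # project itself stays writable
    ]
    for rel in hidden:
        target = (project / rel).resolve()
        if project not in target.parents:
            raise ValueError(f"{rel!r} escapes the project root")
        if not target.exists():
            continue  # nothing to mask
        if target.is_dir():
            argv += ["--tmpfs", str(target)]                  # empty dir on top
        else:
            argv += ["--ro-bind", "/dev/null", str(target)]   # empty file on top
    return argv + ["--die-with-parent", "--chdir", str(project), *command]
```

Something like `subprocess.run(bwrap_argv(Path.cwd(), [".env", "secrets"], ["codex"]))` would then start the agent inside it. As noted above, this hides the working-tree copies only; the same content in git history or a running process is still reachable.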
ChrisRun 1 day ago
Interesting you said it's not foolproof. Did the agent ignore your instruction somehow?
cowsandmilk 1 day ago
If you’re already running codex as a different user to limit its file permissions, why would you add it to the docker group?
lelandfe 1 day ago
A good but altogether separate note from the point I’m making: this lack of access is seen as an obstacle to overcome, and other means of access will be tried if available.
It’s a different mental model than a first party solution to “ignore” files.
TheDong 1 day ago
Weirdly, the existing first party solutions around denying commands don't seem to help here.
Often enough, when one of the agents prompts for running "sudo", and I reject it, it will do what looks very much like malicious exploration to figure out how to handle things anyway, including once hijacking a separate shell's pty where I did have a valid sudo session already in order to execute some commands.
We don't yet have the capability to make these models behave in a consistent, deterministic, or safe manner, so a first party solution isn't even necessarily that much better. Especially if it gives a false sense of security.
jen20 1 day ago
Lack of knowledge and the desire to have it run containers for things.
amelius 1 day ago
Yes. Any sane IT department would not allow external AI services, only local ones. It is just too easy for your company's data to end up on the wrong servers. If not through faulty file permissions, then through employees who simply post company ideas.
brookst 1 day ago
Or just have a corporate contract that provides assurances.
Though really I’m skeptical that much corporate info is secret for competitive or privacy reasons.
Mostly it seems to be for liability / discovery reasons. Which are still legit of course, but ideas are a dime a dozen and every company has more than they know what to do with. It’s the resourcing and execution that are hard.
amelius 1 day ago
> Or just have a corporate contract that provides assurances.
After the massive copyright infringements and recent "who cares about the law anyway" stance of corporate America, trusting this could be a grand mistake.
SoftTalker 1 day ago
Yet many use public github, and human developers accidentally push secrets and other "not for public" files all the time.
amelius 1 day ago
Exactly proving the point.
chriddyp 1 day ago
While this is true, there is also a layer in the harness between _any_ tool output (e.g. stdout or hand-rolled tools) and the LLM. A tool could read the file but then the agentic harness could redact the output before returning it to the LLM if any of the contents matched the file contents. We do something similar in Plotly Studio where we check the entropy of strings in the user input and flag & redact any high entropy strings to the user as “potential credentials” that the user might have inadvertently copied and pasted into the prompt before sending to the LLM.
There are ways around this - the llm can always be clever by invoking tools to read the file contents in a different way than the direct file contents - but this is all to say that the agentic harness layer _does_ allow for deterministic logic in between tool output and the LLM requests.
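The entropy check mentioned above can be sketched in a few lines. This is not Plotly Studio's code: the token pattern, the 20-character minimum, and the 4.0 bits-per-character threshold are guesses picked for illustration.

```python
import math
import re
from collections import Counter

# Assumed heuristic: only long unbroken runs of key-like characters are checked.
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(s: str) -> float:
    """Bits per character of the empirical character distribution of s."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def redact_high_entropy(text: str, threshold: float = 4.0) -> str:
    """Replace long tokens whose character entropy meets the threshold."""
    def repl(match: re.Match) -> str:
        token = match.group(0)
        if shannon_entropy(token) >= threshold:
            return "[potential credential]"
        return token

    return TOKEN.sub(repl, text)
```

The threshold matters more than it looks. A hex string can never exceed 4 bits per character, so hex-encoded secrets mostly fall under this cutoff, while lowering it starts flagging long identifiers. It is a deterministic filter, but a heuristic one.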
wavemode 1 day ago
> people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly
You could always use setuid to allow the agent to run designated commands whose operation depends on the files, without the agent itself being able to access the files.
jrvarela56 1 day ago
Sandboxing is a solved problem, there are dozens of providers of firecracker instances to run your agent in.
The problem to be solved is how do you define task-specific least privilege versions of your coding agent.
niyikiza 1 day ago
We've been using Tenuo for task-scoped authorization.
Its integration for Claude Code: https://github.com/tenuo-ai/claude-governance
sheremetyev 1 day ago
I'm running Codex/Claude in native macOS sandbox with access just to the project folder (plus read-only access to Git repo), and expand to other folders if necessary - https://github.com/sheremetyev/sandfence
valleyer 1 day ago
Codex (at least) already imposes the macOS sandbox on the shell commands it runs. If it wants to run something without sandbox imposition, the harness makes me approve it manually.
Is the difference with your script mostly that you choose to impose a stricter sandbox profile (and not allow any user-approved exceptions at runtime)?
kstenerud 1 day ago
If you're not sandboxing your agent, everything on your computer is waiting to be exposed.
Assuming that file permissions will save you is naively dangerous.
nativeit 1 day ago
It seems insane to me that so many people are OK with this. Why is it necessary for an agent to upload every bit of data it sees to OpenAI at all? Particularly if my agents can’t remember anything beyond a single session, why should the data exist permanently anywhere but in its original location?
jstanley 1 day ago
> Why is it necessary for an agent to upload every bit of data it sees to OpenAI at all?
The LLM is running at OpenAI. The agent doesn't see anything that doesn't get sent to OpenAI.
It's like running a compiler in the cloud and asking why you need to send your source code to it when you only want the binary to be on your local PC. It's because that's where the processing is going on and it can't process what it can't see.
> why should the data exist permanently anywhere but in its original location?
Sure, they don't necessarily have to retain it permanently.
SubiculumCode 1 day ago
What is your sandbox approach? Any good guides? Something about asking an LLM for advice on how to sandbox LLMs.....
kstenerud 1 day ago
I use this: https://github.com/kstenerud/yoloai
yoloai new mysandbox . # Create a sandbox
yoloai attach mysandbox # Attach the sandbox to the current terminal
... (^b^d to disconnect) # It's using tmux to keep the agent alive
yoloai diff mysandbox # See what the agent did
yoloai apply mysandbox # apply its changes to your workdir
yoloai destroy mysandbox
You can also make it run a prompt and block until it's done:
yoloai run mysandbox . -p "read issue https://github.com/kstenerud/yoloai/issues/190 and fix it"
yoloai diff mysandbox
yoloai apply mysandbox
yoloai destroy mysandbox
SubiculumCode 1 day ago
thanks. I will check that out, I'm also checking out smolvm.
Sometimes it is hard to distinguish my modest needs versus what might be needed at a corporate infrastructure level for coding or agent orchestration.
I'm just writing scripts for neuroimaging analysis, etc, and want to ensure codex etc doesn't read my sqlite db or csvs, and send my research data to the inference provider...
Are people using these and interacting with the agent via terminal, or are there fuller cli interfaces, or integrations?
efxhoy 1 day ago
How could an agent bypass file permissions?
kstenerud 1 day ago
By exploiting a root escalation.
Or just finding a file/dir you forgot to set a tight enough mode on (happens a lot in systems where the default is insecure).
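The second case is at least easy to audit for. A small sketch, assuming POSIX permission bits and ignoring ACLs and directory permissions; the function name is made up:

```python
import os
import stat
from pathlib import Path


def loosely_permissioned(root: Path) -> list[Path]:
    """Return regular files under root that group or other users can read."""
    loose = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            mode = path.lstat().st_mode
            # Symlinks and special files are skipped; only plain files count.
            if stat.S_ISREG(mode) and mode & (stat.S_IRGRP | stat.S_IROTH):
                loose.append(path)
    return sorted(loose)
```

Running it over the directories the agent's user can reach gives a list of forgotten files, though it does nothing about the root escalation case.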
nicce 1 day ago
> I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly.
Also, why would they add a feature to prevent data collection, if the data makes the company even more valuable and you might even get good deals from the current government if you provide the access for this data?
FergusArgyll 1 day ago
Yes, this was solved decades ago. How do you stop a human from reading one of your files?
chmod 600
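For scripts that write secrets in the first place, the same thing can be done at creation time, so there is no window where the file is world-readable. A minimal sketch, assuming a POSIX system (the helper names are made up):

```python
import os
import stat


def create_private_file(path: str, data: bytes) -> None:
    """Create path with mode 0600 from the start; fail if it already exists."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)


def is_private(path: str) -> bool:
    """True if neither group nor other has any permission bit set."""
    return (stat.S_IMODE(os.stat(path).st_mode) & 0o077) == 0
```

As the top comment says, this only helps if the agent runs as a different user. Mode 600 does nothing against a process running as the file's owner.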
unknown 1 day ago
[deleted]
re-thc 1 day ago
> How do you stop a human from reading one of your files?
Call the police!
aleph_minus_one 1 day ago
> > How do you stop a human from reading one of your files?
> Call the police!
Rather: Send the Marines.
With intro: https://www.youtube.com/watch?v=eFvxqQTh3m4
Without intro: https://www.youtube.com/watch?v=HHhZF66C1Dc
quotemstr 1 day ago
> You can do this now: change the file permissions such that the user you run codex as can't read them, or run codex in a container without those files mounted.
That's quite inconvenient. I want to run my coding agent in a restricted version of my regular user context, not something that drives like a separate machine.
> What if the model runs "rg foo", and one of those files contains the string "foo"? It uploads the tool output, which includes the file contents.
You have codex run rg in the sandbox, and the sandbox can't read foo. Why is this model so difficult to understand? Codex already runs a variety of commands under a bwrap/seatbelt/etc. sandbox. I've merely extended Codex to run everything in a sandbox. Escalation isn't a matter of whether to run a command in a sandbox or not: it's a matter of which sandbox policy to apply to whatever it is the model asked to do.
> the only solution is to make it so the codex process is unable to access those files
That's not true. Restrictions need apply only to the tools the model runs, not the Codex process itself. You can always insert a process-and-sandbox boundary between the harness and its tools. Codex inserts this boundary most of the time anyway. I've extended my Codex to do it all the time, even for things like the read-a-file tool.
Works fine.
> I imagine this isn't resolved primarily because people expect it to apply to bash tool use,
Yeah? Applying it to the shell tool [1] is trivial. It's actually harder to apply the sandbox to non-shell tools. It just isn't hard conceptually: you define a sandbox policy, writing down what's allowed and not, and just filter everything the model does through this policy via OS-level lightweight sandboxing tools.
Seriously. It's not that hard. And you don't have to sandbox the Codex process itself. I honestly have no idea why people think it's necessary to do so. The model has no ability to make Codex-the-POSIX-process do arbitrary things.
[1] I refuse to call it the "bash tool" when most users are running zsh in it. Name things appropriately.
mohsen1 1 day ago
[deleted]
petcat 1 day ago
Hopefully they never actually implement this pointless feature because it will only give people a false sense of security given the unpredictable nature of LLMs. How could something like this even be enforced?
People just need to learn how to use the tools their system already provides them. i.e., chmod
quotemstr 1 day ago
> Hopefully they never actually implement this pointless feature because it will only give people a false sense of security given the unpredictable nature of LLMs. How could something like this even be enforced?
You run everything the model wants to do inside an OS-enforced sandbox of the sort browsers have used for decades to isolate tabs. It's already implemented and works fine. Codex just needs a few minor tweaks to make it apply its already-implemented sandboxing policy to a few situations it misses today.
> People just need to learn how to use the tools their system already provides them. i.e., chmod
I'm not running my agent as a separate POSIX user. Fortunately, my OS provides all the tools I need to free me from having to do so.
I love when I do something in a few hours and people later call it impossible.
wodenokoto 1 day ago
The whole point of using an agent is that I don't want to learn everything. I fully expected the harness to read the .agentignore file and do what is needed to hide it from the LLM.
But apparently, even if implemented, that's not how it works!
KHRZ 1 day ago
How would it prevent an agent from writing a script that discovers the secret file? It's not magic.
tomrod 1 day ago
It can't. As others pointed out, it's the wrong layer to implement the security feature. The agent needs to operate in an isolated user / container.
nikhilsimha 1 day ago
Files that codex and any other coding agent has access to should be opt-in, NOT opt-out.
I think codex is not the right layer to solve this if you want a sane (one-click) UX.
We built our own internal sandboxing-terminal around claude and codex. Where a user-configured base-folder with low-risk code and creds is COPIED into the sandbox BEFORE new session creation.
There were many other UX related reasons to build our own terminal. Can share more if anyone is interested.
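The opt-in copy step could look roughly like this. It is a guess at the shape of the idea, not nikhilsimha's internal tool: the function name and the allowlist-of-relative-paths format are assumptions.

```python
import shutil
from pathlib import Path


def populate_sandbox(base: Path, sandbox: Path, allow: list[str]) -> list[Path]:
    """Copy only allowlisted relative paths from base into sandbox."""
    base = base.resolve()
    copied = []
    for rel in allow:
        src = (base / rel).resolve()
        if base not in src.parents:
            raise ValueError(f"{rel!r} is outside the base folder")
        dst = sandbox / src.relative_to(base)
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)
        elif src.is_file():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        else:
            continue  # allowlisted path does not exist; nothing to copy
        copied.append(dst)
    return copied
```

Anything not named in `allow` never exists inside the sandbox, so there is nothing for the agent to stumble over. One caveat: `copytree` follows symlinks inside a copied directory by default, so a link pointing at a sensitive file would still be copied.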
schipperai 1 day ago
Do I understand correctly that you scope least-privilege creds/tokens and pass those to the sandbox? I'd be curious to learn more
mbid 1 day ago
I recently got the tool I use to orchestrate agents in (remote/secure) devcontainers open-sourced at work to solve this properly: https://github.com/nvidia/rumpelpod
As others here have pointed out, it's exceedingly unlikely that a blocklist like proposed in the issue would ever be complete. You shouldn't allow agents direct yolo-access to your machine if it has sensitive data.
Codex works particularly well as a remote agent harness because of its client-server architecture: The server component runs in the container, which might be remote, while the client runs locally. So, in contrast to e.g. the claude cli where the frontend also runs remotely, there's no lag when you write/edit prompts.
noveltyaccount 1 day ago
I agree a block list won't work. And unix file permissions may not be enough; I once saw Codex 5.4 use docker to execute a command as root since it couldn't run sudo. Running in a container may be the only solution:
> sudo needs an interactive password here, so I'll use Docker itself to prepare the bind-mount directory as root and hand ownership back to UID/GID 1000. That keeps the compose file's non-root runtime intact.
> Ran `docker run --rm -v /shares:/shares alpine:3.20 sh -c 'mkdir -p /shares/local-llm/models && chown 1000:1000 /shar...`
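The Docker route works because membership in the `docker` group is effectively root on the host. A quick check for the user the agent runs as might look like the sketch below. It is Unix-only, and the group list is an assumption about common setups, not an exhaustive one.

```python
import grp
import os

# Assumed list: membership in these groups is commonly root-equivalent or close to it.
ROOT_EQUIVALENT_GROUPS = {"docker", "sudo", "wheel", "lxd"}


def risky_groups(group_names) -> list[str]:
    """Return the names from group_names that appear in the assumed risky set."""
    return sorted(ROOT_EQUIVALENT_GROUPS & set(group_names))


def current_group_names() -> list[str]:
    """Names of the groups the current process belongs to."""
    names = []
    for gid in {os.getegid(), *os.getgroups()}:
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            continue  # gid without a name entry
    return names
```

If `risky_groups(current_group_names())` is non-empty for the agent's user, file permissions alone are not the boundary they appear to be.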
quotemstr 1 day ago
Huh? Blocking sudo works just fine.
I don't know why everyone is acting like sandboxing tool uses is contrary to the laws of God and man and therefore we must adopt devcontainers and VMs and such to run agents.
... Sandboxes work JUST FINE. Seatbelt on macOS is okay. Namespaces/seccomp/etc. work on Linux even better. We already have all the technology we need to do the isolation people are talking about here, and Codex in particular has 99% of the code needed to solve the bug TFA talks about. I have a local patch that solves 100% of it.
>_ OpenAI Codex (v0.0.0)
model: gpt-5.5 xhigh /model to change
directory: ...
Ran sudo whoami
sudo: The "no new privileges" flag is set, which prevents sudo from running as root.
sudo: If sudo is running in a container, you may need to adjust the container configuration to
disable the flag.
jofzar 1 day ago
Neat tool! Will have to check it out
Edit: would love a couple of pictures/video of how you use it. I kind of get the idea, but it seems like more hassle than it would be worth?
Your comment of codex makes it seem like I might be missing something tho.
mbid 1 day ago
Yeah I should add a video to the README.
Have you tried running `rumpel codex foo123` in one of your repositories, asking it to commit something, then `rumpel merge foo123` to get the changes back to your local checkout? Use a different terminal for the merge command, or detach from the codex session with `ctrl-a d`. You can also look at the commit first with `rumpel review foo123`, or get a shell inside the agent environment via `rumpel enter foo123`.
agentdev001 1 day ago
Sounds like user error to me. Codex gives an llm a tool to allow it to use shell in the context of the host and user in which it is running. If a resource is sensitive, and accessible in that context, then the user is doing something wrong. Would you change your practices if you treated your coding agent as an untrusted human ssh'd under the identity you use for it?
In any case. There are solutions in the comments on the issue, as well as this hn thread.
nicoty 1 day ago
I've contributed to https://github.com/0xferrous/agent-box which allows you to bind-mount git repositories into containers that agents operate in, preventing the agents from accessing files that aren't bind-mounted. Your usual .gitignore can then be used to also ignore files within the repo to be bind-mounted, which prevents agents from accessing them at all, essentially working as a sandbox.
I also maintain https://github.com/nothingnesses/agent-images which allows you to use Nix to reproducibly spin up OCI containers containing agents and any other tools you need and use these with agent-box.
I use both at the moment to work on some personal projects with agents, where I set up multiple separate git worktrees for the agents to work in, preventing them from accessing anything outside of the worktrees and from trampling over each other's work.
skybrian 1 day ago
To avoid the risk of exfiltration, we need to stop using .env for security. API keys needed when working in a repo should be handled by a proxy like ssh-agent, and we need something better than bearer auth.
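To make the ssh-agent comparison concrete, here is a toy version of the pattern using HMAC request signing. Everything in it is illustrative: the class name, the message layout, and the choice of HMAC-SHA256 are assumptions, and a real agent would live in a separate process behind a socket rather than in the same interpreter.

```python
import hashlib
import hmac


class SigningAgent:
    """Holds a key and signs requests; callers never receive the key itself."""

    def __init__(self, key: bytes):
        # Name-mangled only. Real isolation needs a separate process.
        self.__key = key

    def sign(self, method: str, path: str, body: bytes) -> str:
        message = b"\n".join([method.encode(), path.encode(), body])
        return hmac.new(self.__key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, method: str, path: str, body: bytes, signature: str) -> bool:
    """Server-side check that the signature matches this exact request."""
    message = b"\n".join([method.encode(), path.encode(), body])
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

The difference from a bearer token is what leaks when things go wrong. A signature that ends up in a transcript only covers that one request, while a leaked bearer token works for any request.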
pamcake 1 day ago
Yes you should. It will come naturally if you go down the road of separating code from data and properly isolating dev and prod environments, applying principle of least privilege as you do.
.env files for creds are a convenience for dev and testing. They were never supposed to be used for security or carried around with sensitive stuff inside. None of this is new.
tomjakubowski 1 day ago
The desire not to leak valuable secrets is a strong argument for supporting local-first developer workflows. If an AI agent exfiltrates the credentials to connect to my local dev Postgres database which stores synthetic data, that's pretty low impact.
bolinfest 17 hours ago
We have had a solution for this built into Codex for several months now. It is marked "Beta" in the docs because we have been tweaking the config API here and there, but a number of folks have been using it for quite a while and I would recommend switching to it and reporting any issues you find:
https://developers.openai.com/codex/permissions
With permission profiles in Codex, you can:
- Mark a path, glob, or meta variable (like `:workspace_roots`) readable, writable, or unreadable.
- Amend an existing profile using ordinary `config.toml` layering rules.
- Create a new profile by extending an existing one (`extends = ":workspace"` is generally what you want to do).
Note that permission profiles also allow you to configure the network proxy for the sandbox in a fine-grained way. (Previously, the network options for the Codex sandbox were all or nothing.)
Finally, you can also test running a command under a permission profile using:
codex sandbox -P PROFILE_NAME -- PROGRAM ARGS...
Our goal has been to provide something powerful and flexible out of the box so you do not need to bolt on other solutions like the ones mentioned on this thread.
planb 1 day ago
Sounds like snake oil. How would this work? The app that the agent is developing needs access to the file, so access to it cannot be blocked. Just because read_file can not access it (I think current harnesses prevent reading .env files already), does not mean the contents will never be seen by the model.
ZiiS 1 day ago
However clever/stupid you believe LLMs are, they are extremely capable of working around these sorts of restrictions. The ask is for .env files for whatever code you are writing, so if the code it writes doesn't have access (i.e. filesystem/container), what is the point? If the code under development reads the env, how does codex debug it without accidentally reading the values from memory? Adding a security setting that doesn't work is much worse than not having one.
bob1029 1 day ago
The only thing close to a guarantee is to give the agent exclusive access to a clean VM with precisely the information and permissions you want it to have.
I've been looking into a "workspace" concept that involves an entire cloud VM being spun up as part of an agent conversation such that code changes can be iterated without touching the user's local machine or other trusted contexts. All the agent's tools only have effect when supplied with a specific workspace guid. CLI tools like git are not authorized to talk to the remotes in this arrangement. The machine is initialized with a clone and no way to talk to origin. There are dedicated methods in the harness that can reach into the VM and pull out a change set for deterministic PR generation in the secure contexts (e.g. when the agent calls "ReadyForReview" or similar).
BIbinsquare 1 day ago
I made a lightweight VM specifically for this use case: https://github.com/smol-machines/smolvm
SASanzig 1 day ago
Thanks so much for building smolvm! I liked it so much that I vibe coded a little bash wrapper around it to handle creating ephemeral VMs for Pi: https://github.com/neuroblaze/smol-pi
Consists of two scripts, one to build an OCI image (customizable by editing the Dockerfile that comes with it) and another to handle smolvm invocation. The invocation script mounts the current working directory under /workspace in the VM and the user's ~/.pi directory under /root/pi, and handles any other setup (eg: I have some convenience flags set up to specify a block all/block local/block internet/allow all for network access).
One issue I ran into, it doesn't seem like smolvm cleans up disk images from ephemeral VMs, so my script has to do that itself. Is this a known bug or intended behaviour?
BIbinsquare 1 day ago
smolpi looks great!
And smolvm does clean up ephemeral runs if the machine run exits gracefully. I'll take a deeper look into this edge case and fix it today.
BIbinsquare 1 day ago
Fixed and released in v1.3.1: https://github.com/smol-machines/smolvm/pull/497
TZTZubiri 1 day ago
Sounds like overkill. How about giving the agent its own user?
BObob1029 1 day ago
It's really not overkill if you have good tools to work with. Hyper-V is quite capable of providing ephemeral workspaces on timescales measured in minutes. Especially with nested virtualization. One big machine with fast local disks can provide very short cold start times for a golden image stored on the same.
COcozzyd 1 day ago
That's what I do, in part because I want it to use the same system libraries etc. installed on my laptop, but I worry it will try to use privesc exploits...
TZTZubiri 1 day ago
It's highly unlikely the LLM will try privesc exploits. LPE risk still exists and should be assumed, but the more likely risk model is the LLM installing an infected left-pad package, or (on servers) installing a dependency with an RCE vuln, or creating a new RCE vuln from scratch.
If we are talking about running the agent on a dev machine, though, Codex doesn't seem to introduce a lot of risk, considering that I can already add OS protection layers, and that the devs added their own protection layers, and that I can direct the model towards my preferences (like not installing dependencies through npm or pip).
MImixedbit 1 day ago
I work on a Linux sandbox that makes it easy to hide sensitive files from AI agents while keeping the files they need accessible. Check it out: https://github.com/wrr/drop
KSkstenerud 1 day ago
.agentsignore is NOT a security tool.
It's a good idea as a hint to agents about what files they should ignore (because they'd be of no value and only chew up tokens).
However, using it to prevent exposure of secrets would be a BIG mistake. There's simply no way to guarantee that an agent will ignore things in the ignore file. And even a harness-enforced restriction would still be in-process, which a rogue agent could trivially compromise. For security, use a sandbox. Nothing else will do.
I do AI sandboxes (FOSS, free forever, no rug pull): https://github.com/kstenerud/yoloai
DAdatsci_est_2015 20 hours ago
The fact that pretty much every comment in this thread suggests a different solution means there’s still plenty of innovation and consolidation to occur on this problem.
My take is that Unix already solved all of these user access problems (what can a user read or execute), so the solution will probably be around containers or virtual machines. But the UX around booting up a container or virtual machine for agentic workflows needs to be simplified to the point where vibe coders who don’t know the first thing about Unix, VMs, or containers can still take advantage of the solutions.
KEkennethops 1 day ago
These tools are data collection mechanisms to help train these better models. I'm working with some folks to figure out a way to put a layer between the harness and the models to have better control of what data gets sent to and from the model itself and the harness.
NUnullbio 1 day ago
Look at agent-vault and 1password. There's not really any reason to be storing sensitive keys in plaintext on your local disk that the agent can access.
POpohl 1 day ago
This should be an open standard like AGENTS.md or skills. What do other harnesses do?
AMampersandwhich 1 day ago
I believe JetBrains products like Junie use the neutral term .aiignore for this functionality.
UNunknown 1 day ago
[deleted]
KGkgeist 23 hours ago
>never read or send .env, .env.*, .pem, id_, .aws/, .ssh/.
I think a better practice is to not store those things in the repository folder in the first place.
HOhoppp 1 day ago
Do not store secrets in the repository in files, but inject them during runtime. Then the agents have no way to access them.
TItiew9Vii 1 day ago
A lot of people have secrets/config files in the project's working directory but ignored by git, e.g. `.env.local`
So they're following best practice by not committing secrets, but agents running locally can still see them even when sandboxed to the working directory.
I've taken to storing configs using XDG_CONFIG_HOME and have the app auto-resolve them by convention or take a CLI arg to specify the config path. All secrets are in files, not env vars.
That way, when using sandboxing, the agent can never see the configs or secrets, as they are outside the working directory.
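For anyone wanting to try the convention described above, here is a minimal sketch of the lookup. The function name `resolve_config_path`, the `app_name` parameter, and the `config.toml` file name are all made up for illustration; the only thing taken from the XDG Base Directory spec is the fallback to `~/.config` when `XDG_CONFIG_HOME` is unset or empty.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_config_path(app_name: str, cli_path: Optional[str] = None) -> Path:
    """Return the config file path, preferring an explicit CLI argument.

    `app_name` and the `config.toml` file name are hypothetical conventions.
    """
    if cli_path:
        return Path(cli_path).expanduser()
    # XDG Base Directory spec: use ~/.config when the variable is unset or empty.
    base = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / app_name / "config.toml"
```

Because the resolved path lives under the user's config directory rather than the repository, a sandbox scoped to the working directory has nothing to read. The app itself still needs access when it runs, so this only helps if the agent's sandbox and the app's runtime are not the same context.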
HOhoppp 1 day ago
Sounds like a good way to do it.
Makes me think of Docker secrets, where the secrets are exposed as files and accessible only from inside the container.
If the development environment uses docker then that's a solution too, I guess.
SOSoftTalker 1 day ago
If you let your agent use docker you've basically given it root on your machine.
HOhoppp 1 day ago
I use podman btw
It's aliased to docker
Building a project as a container and giving an agent access to running docker commands are different things.
We managed to generate probably-correct code, which can then be probably-corrected recursively to get to something that runs (usually). This made everyone scream and lose their minds saying that code is finished, people think they don't need a technical cofounder anymore, think they don't need engineers anymore, etc. Then they're, at varying speeds, finding out they're wrong. It seems oddly circular to me that the _exact hubris_ non-engineers have long accused engineers of - and we have indeed been too often guilty of - they themselves turn out to be JUST as guilty of! Just like engineers thought all sales did was bother people, and all marketing did was send emails, and all support did was tell people to turn it off and on again, and all product did was copy google... they all apparently thought all engineers did was tik-tak-click-clack type code all day and when it compiled it was done. Not knowing how much higher-order... well, engineering, there is to it. Where are all the CTOs during all of this? I thought someone was supposed to be sticking up for their org? Sales, marketing, etc all seem to have entrenched C-suite people keeping their fiefdoms resistant to erosion by outsourcing, downsizing, etc. But all our CTOs seem to have collectively thrown us to the wolves.
> It seems oddly circular to me that the _exact hubris_ non-engineers have long accused engineers of - and we have indeed been too often guilty of - they themselves turn out to be JUST as guilty of! I have hardly ever seen this kind of hubris among software developers. The only thing that was common was many software developers were - let's say - somewhat direct in their feedback towards people who are not willing to learn. I thus rather have the feeling that this kind of accusation of hubris towards software developers rather originates in business people projecting their own overconfidence (hubris) onto software developers.
That is not a fault that's specific to engineers. Lots of smart lawyers think they can learn basically anything over a weekend of hard study. It's probably a blind spot of intelligent people.
[flagged]
> Just like engineers thought all sales did was bother people, and all marketing did was send emails i mean, sure, the marketing one may be a bit simplified as the emails also need to have pretty pictures in them. and yeah, sure, sales people do need to find the people they’ll eventually start bothering later. /s > But all our CTOs seems to have collectively thrown us to the wolves. the kind of person who usually finds their way into the executive class doesn’t get there by looking after those under them. they get there by avoiding blame and taking credit. which, funnily enough, is exactly what a hype cycle is all about.
I suspect most people don't even know there's a there there. For instance, while I now know that file systems have permissions, before I became a programmer, I spent maybe ten years thinking of permissions as a special, obscure system thing that you should never touch. For that matter, I suspect many people don't know basic things like that a file system isn't inherently the operating system. And, where would you go to learn this information? Your Mac doesn't ship with a manual—how would you know one exists? Furthermore, I would wager that perhaps most people have never learned how anything works requiring a manual and are simply unaware that that's a thing. All to say, I'm not sure "refusal" is the right term.
When I was an undergraduate biology student in 1991, a suitemate told me I should go to some desk in some building over by Muir and get an account on the VAX. There were strange rooms all over campus that were open 24/7 and were loaded with green and amber screen terminals with integrated keyboards. Lots of sessions for CS lectures were held in these rooms and there were always interesting notes on the whiteboards (most rooms still had blackboards or green boards, but I think the chalk was too dusty, so these rooms usually had whiteboards). Once I saw an instruction that was circled, with an arrow pointing to it, that said: man man man -k -or- apropos and that was how I learned about computers. I just typed `man man` in a terminal on my Mac, and luckily it's still there.
Nit: Your Mac does ship with a manual. Tips.app and of course the man pages. Point stands.
huh, but what if the AI trashes my git repo? maybe it just deletes the .git folder entirely. a deterministic undo wouldn’t be the silliest feature, for the current definition of “AI”.
The answer is the same: You give it either read-only or its own copy separate from the one you care about. The requested feature wouldn't be a robust solution here either for the same reasons. Besides, have you noticed the amount of other amateur-hour bugs and jank in Codex going for weeks or months without proper resolution? Given that, why would you want and trust their solution here over alternatives, specifically?
The default sandboxing for Codex does not allow the agent to access .git
You mean they went to the codex bug tracker, on github, and they don't know how to use git? Well that is kind of ironic, isn't it?
The knowledge gap is very real, because unsavvy users are just going to paste the API key into codex and say "make it work". The truly lazy/uninformed will use codex's computer use and tell it to go into Vercel/Netlify/Stripe/Cloudflare for them, get the API key, and save it to .env for them. So users knowing they need such a feature in the first place should be celebrated when the alternative is even dumber.
That's the product that is being sold here… why shame the users for expecting what was marketed to them?
I mean based on all the "coding is solved" hype that's what these companies are aiming for
Not sure I agree? It’s not like gitignore should be independent from git
The difference is that git is a traditional programming tool which executes deterministically. Agents are not deterministic tools, they're not sandboxes or container runtimes or languages with capabilities models. They're a way to run arbitrary commands. It would be like saying that "xterm" should have a ".xtermnoexec" list of commands you can't run, or that VLC should have an option for actors it won't show. Terminals run shells which run commands; xterm is not really deeply aware of what commands your shell ultimately runs, and it's not xterm's job to set up a sandbox and strip out executables. VLC displays pixels, it's not up to it to figure out if those pixels are a certain actor. codex pipes text and tool calls back and forth between OpenAI's servers, and it barely understands what that text and those tool calls are, and especially whether a given tool touched a file. If you want VLC to not display an actor, you need to add a layer on top of VLC to stop it displaying a list of movies. If you want codex to not display a file's contents, you need a layer on top of codex to prevent it going near that file.
> they're not sandboxes Yes they can be, and Codex offers one. It uses Bubblewrap and seccomp on Linux which are perfectly capable of restricting filesystem access. In a default setup every command is executed inside a restrictive sandbox and you're only asked for permission to run that command if the execution fails. I don't necessarily think that it's a good idea to rely on these sandboxes as your only line of defense but that's absolutely a feature that they can, should, and do offer.
bash actually has a "restricted" mode which is sort of like that. In restricted mode, the following are disallowed:
- Changing directories with cd.
- Setting or unsetting the values of SHELL, PATH, HISTFILE, ENV, or BASH_ENV.
- Specifying command names containing /.
- Importing function definitions from the shell environment at startup.
- Parsing the values of BASHOPTS and SHELLOPTS from the shell environment at startup.
... some other things mainly preventing you from escaping or disabling the restricted mode.
.gitignore doesn't have the same security implications. If you fail to prevent a private key from being added to your repository, you can reverse this and purge it from the blobs and reflog as if it never happened. If you fail to prevent OpenAI from ingesting a private key, you have created a security incident.
> If you fail to prevent a private key from being added to your repository, you can reverse this and purge it from the blobs and reflog as if it never happened. Only if you’re absolutely sure that it’s never been pushed to a public repository. I would treat a push of a private key to GitHub as a much higher emergency than it being sent to OpenAI (or even being accidentally used in a Google search), since there are bots that actively scan GitHub for private keys, such that your private key might be found within a few minutes of push.
Will git drop your production database because it feels like it when the stars align?
I could imagine perhaps some system which rather than denying access might instead replace the key material from your .env key with "** redacted. This key material can be used via make, but can never be exfiltrated directly **" whenever that key is seen heading out towards the network...
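A rough sketch of what that substitution could look like, assuming the harness (or a proxy in front of it) can see tool output as text before it is sent. `parse_env`, `redact`, the `REDACTED` marker, and the `min_len` cutoff are all hypothetical names and choices, not anything Codex exposes.

```python
REDACTED = "** redacted **"  # hypothetical marker text


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def redact(output: str, secrets: dict, min_len: int = 8) -> str:
    """Replace every literal occurrence of a secret value in the output."""
    # Longest first, so a secret that contains another is replaced whole.
    for value in sorted(secrets.values(), key=len, reverse=True):
        # Skip short values such as "1" or "true" that would match everywhere.
        if len(value) >= min_len:
            output = output.replace(value, REDACTED)
    return output
```

This only catches the value verbatim. Anything that re-encodes it (base64, hex, splitting it across lines) passes straight through, so it is a filter against accidental leaks, not a boundary.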
But that means the process can’t use the key for network requests, right?
OnePassword can do something like this where you put references to a path there instead of the key material, and then you wrap the invoke command with their CLI and it replaces them. So your local env file never has anything sensitive. A malicious agent could still exfiltrate if you give it access to debug tools on the running code though.
You expect Joe Blow vibe coder running Codex on his Dell to understand this?
Yes.
lol I don’t think you realize how tech illiterate most normies are
I'm a fan of belt and suspenders.
Where did you run off to little boy.
[deleted]
Just be aware that AI agents will explore alternate means of accessing said files: https://news.ycombinator.com/item?id=48348578
Yes. I found this quickly after wrapping codex in a launcher that uses bubblewrap to exclude certain files and directories based on a config file at the project root. My best solution so far is to also include instructions for the agent that explain that it is not allowed to see certain files, and that their inaccessibility is not an error, and that it must not attempt to access them through other means (e.g. via git history, etc.). This has been a major improvement, but it's not foolproof.
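To make the launcher idea concrete, here is a sketch that only builds a bubblewrap command line; it is not the commenter's launcher. The function name and the two hidden-path lists are hypothetical. It relies on the `bwrap` options `--ro-bind`, `--bind`, `--dev`, `--proc`, `--tmpfs`, and `--chdir`, and on the common trick of binding `/dev/null` over a file to mask it.

```python
def bwrap_argv(project: str, hidden_files: list, hidden_dirs: list, command: list) -> list:
    """Build a bwrap argument list that hides selected paths inside `project`.

    `hidden_files` and `hidden_dirs` are paths relative to the project root.
    """
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",      # whole filesystem, read-only
        "--dev", "/dev",
        "--proc", "/proc",
        "--bind", project, project,  # project stays writable
    ]
    # Later mounts overlay earlier ones, so the masks go after the project bind.
    for d in hidden_dirs:
        argv += ["--tmpfs", f"{project}/{d}"]                  # empty dir instead of the real one
    for f in hidden_files:
        argv += ["--ro-bind", "/dev/null", f"{project}/{f}"]   # file reads as empty
    argv += ["--chdir", project]
    return argv + list(command)
```

Something like `subprocess.run(bwrap_argv("/home/me/repo", [".env"], ["secrets"], ["codex"]))` would then start the agent inside it. As the comment says, this does not stop the agent from looking for the same content elsewhere, for example in git history.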
Interesting you said it's not foolproof. Did the agent ignore your instruction somehow?
If you’re already running codex as a different user to limit its file permissions, why would you add it to the docker group?
A good but altogether separate note from the point I’m making: this lack of access is seen as an obstacle to overcome, and other means of access will be tried if available. It’s a different mental model than a first party solution to “ignore” files.
Weirdly, the existing first party solutions around denying commands don't seem to help here. Often enough, when one of the agents prompts for running "sudo", and I reject it, it will do what looks very much like malicious exploration to figure out how to handle things anyway, including once hijacking a separate shell's pty where I did have a valid sudo session already in order to execute some commands. We don't yet have the capability to make these models behave in a consistent, deterministic, or safe manner, so a first party solution isn't even necessarily that much better. Especially if it gives a false sense of security.
Lack of knowledge and the desire to have it run containers for things.
Yes. Any sane IT department would not allow external AI services, only local ones. It is just too easy for your company's data to end up on the wrong servers. If not through faulty file permissions, then through employees who simply post company ideas.
Or just have a corporate contract that provides assurances. Though really I’m skeptical that much corporate info is secret for competitive or privacy reasons. Mostly it seems to be for liability / discovery reasons. Which are still legit of course, but ideas are a dime a dozen and every company has more than they know what to do with. It’s the resourcing and execution that are hard.
> Or just have a corporate contract that provides assurances. After the massive copyright infringements and recent "who cares about the law anyway" stance of corporate America, trusting this could be a grand mistake.
Yet many use public github, and human developers accidentally push secrets and other "not for public" files all the time.
Exactly proving the point.
While this is true, there is also a layer in the harness between _any_ tool output (e.g. stdout or hand-rolled tools) and the LLM. A tool could read the file but then the agentic harness could redact the output before returning it back to the LLM if any of the contents matched the file contents. We do something similar in Plotly Studio where we check the entropy of strings in the user input and flag & redact any high entropy strings to the user as “potential credentials” that the user might have inadvertently copied and pasted into the prompt before sending to the LLM. There are ways around this - the LLM can always be clever by invoking tools to read the file contents in a different way than the direct file contents - but this is all to say that the agentic harness layer _does_ allow for deterministic logic in between tool output and the LLM requests.
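As an illustration of the entropy check, here is a small sketch. It is not Plotly Studio's implementation: the token pattern, the 20-character minimum, and the 4.0 bits-per-character threshold are assumptions picked for the example.

```python
import math
import re
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, from the string's own character frequencies."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Candidate tokens: long runs of characters typical for keys and base64 blobs.
TOKEN = re.compile(r"[A-Za-z0-9+/_=\-]{20,}")


def redact_high_entropy(text: str, threshold: float = 4.0) -> str:
    """Replace candidate tokens whose entropy meets the (assumed) threshold."""
    def replace(match):
        token = match.group(0)
        return "[REDACTED]" if shannon_entropy(token) >= threshold else token
    return TOKEN.sub(replace, text)
```

A threshold like this trades false positives (long hashes, UUID-like strings) against misses (short or low-variety secrets), so it works better as a flag for the user than as a guarantee.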
> people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly You could always use setuid to allow the agent to run designated commands whose operation depends on the files, without the agent itself being able to access the files.
Sandboxing is a solved problem, there are dozens of providers of firecracker instances to run your agent in. The problem to be solved is how do you define task-specific least privilege versions of your coding agent.
We've been using Tenuo for task-scoped authorization. Its integration for Claude Code: https://github.com/tenuo-ai/claude-governance
I'm running Codex/Claude in native macOS sandbox with access just to the project folder (plus read-only access to Git repo), and expand to other folders if necessary - https://github.com/sheremetyev/sandfence
Codex (at least) already imposes the macOS sandbox on the shell commands it runs. If it wants to run something without sandbox imposition, the harness makes me approve it manually. Is the difference with your script mostly that you choose to impose a stricter sandbox profile (and not allow any user-approved exceptions at runtime)?
If you're not sandboxing your agent, everything on your computer is waiting to be exposed. Assuming that file permissions will save you is naively dangerous.
It seems insane to me that so many people are OK with this. Why is it necessary for an agent to upload every bit of data it sees to OpenAI at all? Particularly if my agents can’t remember anything beyond a single session, why should the data exist permanently anywhere but in its original location?
> Why is it necessary for an agent to upload every bit of data it sees to OpenAI at all? The LLM is running at OpenAI. The agent doesn't see anything that doesn't get sent to OpenAI. It's like running a compiler in the cloud and asking why you need to send your source code to it when you only want the binary to be on your local PC. It's because that's where the processing is going on and it can't process what it can't see. > why should the data exist permanently anywhere but in its original location? Sure, they don't necessarily have to retain it permanently.
What is your sandbox approach? Any good guides? Something about asking a LLM for advice on how to sandbox LLMs.....
I use this: https://github.com/kstenerud/yoloai
    yoloai new mysandbox . # Create a sandbox
    yoloai attach mysandbox # Attach the sandbox to the current terminal
    ... (^b^d to disconnect) # It's using tmux to keep the agent alive
    yoloai diff mysandbox # See what the agent did
    yoloai apply mysandbox # apply its changes to your workdir
    yoloai destroy sandbox
You can also make it run a prompt and block until it's done:
    yoloai run mysandbox . -p "read issue https://github.com/kstenerud/yoloai/issues/190 and fix it"
    yoloai diff mysandbox
    yoloai apply mysandbox
    yoloai destroy sandbox
thanks. I will check that out, I'm also checking out smolvm. Sometimes it is hard to distinguish my modest needs versus what might be needed at a corporate infrastructure level for coding or agent orchestration. I'm just writing scripts for neuroimaging analysis, etc, and want to ensure codex etc doesn't read my sqlite db or csvs, and send my research data to the inference provider... Are people using these and interacting with the agent via terminal, or are there fuller cli interfaces, or integrations?
How could an agent bypass file permissions?
By exploiting a root escalation. Or just finding a file/dir you forgot to set a tight enough mode on (happens a lot in systems where the default is insecure).
> I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly. Also, why would they add a feature to prevent data collection, if the data makes the company even more valuable and you might even get good deals from the current government if you provide the access for this data?
Yes, this was solved decades ago. How do you stop a human from reading one of your files? chmod 600
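For completeness, the same thing from a script, using only the standard library. `lock_down` is a made-up helper name.

```python
import os
import stat


def lock_down(path: str) -> int:
    """Make the file readable and writable by its owner only (same as chmod 600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    # Return the resulting permission bits so the caller can verify them.
    return stat.S_IMODE(os.stat(path).st_mode)
```

Note that mode 600 still lets the owner read the file, so this only keeps an agent out if the agent runs as a different user than the one who owns the secrets.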
[deleted]
> How do you stop a human from reading one of your files? Call the police!
> > How do you stop a human from reading one of your files? > Call the police! Rather: Send the Marines. With intro: https://www.youtube.com/watch?v=eFvxqQTh3m4 Without intro: https://www.youtube.com/watch?v=HHhZF66C1Dc
> You can do this now: change the file permissions such that the user you run codex as can't read them, or run codex in a container without those files mounted.
That's quite inconvenient. I want to run my coding agent in a restricted version of my regular user context, not something that drives like a separate machine.
> What if the model runs "rg foo", and one of those files contains the string "foo"? It uploads the tool output, which includes the file contents.
You have codex run rg in the sandbox, and the sandbox can't read foo. Why is this model so difficult to understand? Codex already runs a variety of commands under a bwrap/seatbelt/etc. sandbox. I've merely extended Codex to run everything in a sandbox. Escalation isn't a matter of whether to run a command in a sandbox or not: it's a matter of which sandbox policy to apply to whatever it is the model asked to do.
> the only solution is to make it so the codex process is unable to access those files
That's not true. Restrictions need apply only to the tools the model runs, not the Codex process itself. You can always insert a process-and-sandbox boundary between the harness and its tools. Codex inserts this boundary most of the time anyway. I've extended my Codex to do it all the time, even for things like the read-a-file tool. Works fine.
> I imagine this isn't resolved primarily because people expect it to apply to bash tool use,
Yeah? Applying it to the shell tool [1] is trivial. It's actually harder to apply the sandbox to non-shell tools. It just isn't hard conceptually: you define a sandbox policy, writing down what's allowed and not, and just filter everything the model does through this policy via OS-level lightweight sandboxing tools. Seriously. It's not that hard. And you don't have to sandbox the Codex process itself. I honestly have no idea why people think it's necessary to do so. The model has no ability to make Codex-the-POSIX-process do arbitrary things.
[1] I refuse to call it the "bash tool" when most users are running zsh in it. Name things appropriately.
[deleted]
Hopefully they never actually implement this pointless feature because it will only give people a false sense of security given the unpredictable nature of LLMs. How could something like this even be enforced? People just need to learn how to use the tools their system already provides them. i.e., chmod
> Hopefully they never actually implement this pointless feature because it will only give people a false sense of security given the unpredictable nature of LLMs. How could something like this even be enforced?
You run everything the model wants to do inside an OS-enforced sandbox of the sort browsers have used for decades to isolate tabs. It's already implemented and works fine. Codex just needs a few minor tweaks to make it apply its already-implemented sandboxing policy to a few situations it misses today.
> People just need to learn how to use the tools their system already provides them. i.e., chmod
I'm not running my agent as a separate POSIX user. Fortunately, my OS provides all the tools I need to free my having to do so. I love when I do something in a few hours and people later call it impossible.
The whole point of using an agent is that I don't want to learn everything. I fully expected the harness to read the .agentignore file and do what is needed to hide it from the LLM. But apparently, even if implemented, that's not how it works!
How would it prevent an agent from writing a script that discovers the secret file? It's not magic.
It can't. As others pointed out, it's the wrong layer to implement the security feature. The agent needs to operate in an isolated user / container.
Files that codex and any other coding agent has access to should be opt-in, NOT opt-out. I think codex is not the right layer to solve this if you want a sane (one-click) UX. We built our own internal sandboxing terminal around claude and codex, where a user-configured base folder with low-risk code and creds is COPIED into the sandbox BEFORE new session creation. There were many other UX-related reasons to build our own terminal. Can share more if anyone is interested.
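The opt-in copy step might look roughly like this. It is a sketch of the general idea, not the internal tool described above; `stage_sandbox` and its `allowlist` of relative paths are hypothetical, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
import shutil
from pathlib import Path


def stage_sandbox(base: Path, sandbox: Path, allowlist: list) -> list:
    """Copy only allowlisted relative paths from `base` into `sandbox`."""
    copied = []
    base_resolved = base.resolve()
    for rel in allowlist:
        src = (base / rel).resolve()
        # Refuse entries that escape the base folder (e.g. "../secrets" or a symlink out).
        if not src.is_relative_to(base_resolved):
            raise ValueError(f"{rel} is outside {base}")
        dst = sandbox / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
        copied.append(rel)
    return copied
```

Anything not named in the allowlist, such as a `.env` sitting next to the source tree, never reaches the sandbox, which is the opt-in property the comment is asking for. An allowlisted directory is copied whole, so secrets nested inside it would still come along.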
Do I understand correctly that you scope least-privilege creds/tokens and pass those to the sandbox? I'd be curious to learn more
I recently got the tool I use to orchestrate agents in (remote/secure) devcontainers open-sourced at work to solve this properly: https://github.com/nvidia/rumpelpod As others here have pointed out, it's exceedingly unlikely that a blocklist like proposed in the issue would ever be complete. You shouldn't allow agents direct yolo-access to your machine if it has sensitive data. Codex works particularly well as a remote agent harness because of its client-server architecture: The server component runs in the container, which might be remote, while the client runs locally. So, in contrast to e.g. the claude cli where the frontend also runs remotely, there's no lag when you write/edit prompts.
I agree a block list won't work. And unix file permissions may not be enough; I once saw Codex 5.4 use docker to execute a command as root since it couldn't run sudo. Running in a container may be the only solution:
> sudo needs an interactive password here, so I'll use Docker itself to prepare the bind-mount directory as root and hand ownership back to UID/GID 1000. That keeps the compose file's non-root runtime intact.
> Ran `docker run --rm -v /shares:/shares alpine:3.20 sh -c 'mkdir -p /shares/local-llm/models && chown 1000:1000 /shar...`
Huh? Blocking sudo works just fine. I don't know why everyone is acting like sandboxing tool uses is contrary to the laws of God and man and therefore we must adopt devcontainers and VMs and such to run agents. ... Sandboxes work JUST FINE. Seatbelt on macOS is okay. Namespaces/seccomp/etc. work on Linux even better. We already have all the technology we need to do the isolation people are talking about here, and Codex in particular has 99% of the code needed to solve the bug TFA talks about. I have a local patch that solves 100% of it.
    >_ OpenAI Codex (v0.0.0)
    model: gpt-5.5 xhigh /model to change
    directory: ...
    Ran sudo whoami
    sudo: The "no new privileges" flag is set, which prevents sudo from running as root.
    sudo: If sudo is running in a container, you may need to adjust the container configuration to disable the flag.
Neat tool! Will have to check it out.
Edit: would love a couple of pictures/video of how you use it. I kind of get the idea, but it seems like more hassle than it would be worth? Your comment about codex makes it seem like I might be missing something tho.
Yeah I should add a video to the README. Have you tried running `rumpel codex foo123` in one of your repositories, asking it to commit something, then `rumpel merge foo123` to get the changes back to your local checkout? Use a different terminal for the merge command, or detach from the codex session with `ctrl-a d`. You can also look at the commit first with `rumpel review foo123`, or get a shell inside the agent environment via `rumpel enter foo123`.
Sounds like user error to me. Codex gives an llm a tool that lets it use a shell in the context of the host and user it is running as. If a resource is sensitive and accessible in that context, then the user is doing something wrong. Would you change your practices if you treated your coding agent as an untrusted human ssh'd in under the identity you use for it? In any case, there are solutions in the comments on the issue, as well as in this hn thread.
I've contributed to https://github.com/0xferrous/agent-box which allows you to bind-mount git repositories into containers that agents operate in, preventing the agents from accessing files that aren't bind-mounted. Your usual .gitignore can then be used to also ignore files within the repo to be bind-mounted, which prevents agents from accessing them at all, essentially working as a sandbox. I also maintain https://github.com/nothingnesses/agent-images which allows you to use Nix to reproducibly spin up OCI containers containing agents and any other tools you need and use these with agent-box. I use both at the moment to work on some personal projects with agents, where I set up multiple separate git worktrees for the agents to work in, preventing them from accessing anything outside of the worktrees and from trampling over each other's work.
To avoid the risk of exfiltration, we need to stop using .env for security. API keys needed when working in a repo should be handled by a proxy like ssh-agent, and we need something better than bearer auth.
Yes you should. It will come naturally if you go down the road of separating code from data and properly isolating dev and prod environments, applying principle of least privilege as you do. .env files for creds are a convenience for dev and testing. They were never supposed to be used for security or carried around with sensitive stuff inside. None of this is new.
The desire not to leak valuable secrets is a strong argument for supporting local-first developer workflows. If an AI agent exfiltrates the credentials to connect to my local dev Postgres database which stores synthetic data, that's pretty low impact.
We have had a solution for this built into Codex for several months now. It is marked "Beta" in the docs because we have been tweaking the config API here and there, but a number of folks have been using it for quite a while and I would recommend switching to it and reporting any issues you find: https://developers.openai.com/codex/permissions
With permission profiles in Codex, you can:
- Mark a path, glob, or meta variable (like `:workspace_roots`) readable, writable, or unreadable.
- Amend an existing profile using ordinary `config.toml` layering rules.
- Create a new profile by extending an existing one (`extends = ":workspace"` is generally what you want to do).
Note that permission profiles also allow you to configure the network proxy for the sandbox in a fine-grained way. (Previously, the network options for the Codex sandbox were all or nothing.)
Finally, you can also test running a command under a permission profile using:
    codex sandbox -P PROFILE_NAME -- PROGRAM ARGS...
Our goal has been to provide something powerful and flexible out of the box so you do not need to bolt on other solutions like the ones mentioned on this thread.
Sounds like snake oil. How would this work? The app that the agent is developing needs access to the file, so access to it cannot be blocked. Just because read_file cannot access it (I think current harnesses prevent reading .env files already) does not mean the contents will never be seen by the model.
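To make that concrete, here is a toy sketch of a harness whose "read" tool honors a blocklist while a search run through the shell tool does not. Everything here is hypothetical (`BLOCKED_PATTERNS`, `read_file_tool`, `search_tool`, `demo`); it is not how any particular harness is implemented, just the shape of the gap.

```python
import fnmatch
import os
import tempfile

# Hypothetical blocklist, mirroring the kind of patterns discussed in the thread.
BLOCKED_PATTERNS = [".env", ".env.*", "*.pem"]


def is_blocked(path):
    """Return True if the file name matches any blocklist pattern."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatch(name, pat) for pat in BLOCKED_PATTERNS)


def read_file_tool(path):
    """Hypothetical 'read' tool that honors the blocklist."""
    if is_blocked(path):
        raise PermissionError(f"{path} is blocklisted")
    with open(path, encoding="utf-8") as fh:
        return fh.read()


def search_tool(root, needle):
    """Stand-in for a shell search such as `grep -r needle .`: no blocklist applies."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(dirpath, name)
            with open(full, encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    if needle in line:
                        hits.append(f"{name}: {line.rstrip()}")
    return hits


def demo():
    """Build a throwaway repo and run both tools against the same secret."""
    with tempfile.TemporaryDirectory() as repo:
        env_path = os.path.join(repo, ".env")
        with open(env_path, "w", encoding="utf-8") as fh:
            fh.write("API_KEY=not-a-real-key\n")
        try:
            read_file_tool(env_path)
            read_blocked = False
        except PermissionError:
            read_blocked = True
        return read_blocked, search_tool(repo, "API_KEY")
```

`demo()` reports that the read tool refused the file, yet the search output still contains the key line, and that output is what goes back to the model.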
However clever/stupid you believe LLMs are, they are extremely capable of working around these sorts of restrictions. The ask concerns .env files for whatever code you are writing. If the code it writes doesn't have access (i.e. is blocked at the filesystem/container level), what is the point? And if the code under development reads the env, how does Codex debug it without accidentally reading the values from memory? Adding a security setting that doesn't work is much worse than not having one.
The only thing close to a guarantee is to give the agent exclusive access to a clean VM with precisely the information and permissions you want it to have. I've been looking into a "workspace" concept that involves an entire cloud VM being spun up as part of an agent conversation such that code changes can be iterated without touching the user's local machine or other trusted contexts. All the agent's tools only have effect when supplied with a specific workspace guid. CLI tools like git are not authorized to talk to the remotes in this arrangement. The machine is initialized with a clone and no way to talk to origin. There are dedicated methods in the harness that can reach into the VM and pull out a change set for deterministic PR generation in the secure contexts (e.g. when the agent calls "ReadyForReview" or similar).
I made a lightweight vm specifically for this use case: https://github.com/smol-machines/smolvm
Thanks so much for building smolvm! I liked it so much that I vibe coded a little bash wrapper around it to handle creating ephemeral VMs for Pi: https://github.com/neuroblaze/smol-pi Consists of two scripts, one to build an OCI image (customizable by editing the Dockerfile that comes with it) and another to handle smolvm invocation. The invocation script mounts the current working directory under /workspace in the VM and the user's ~/.pi directory under /root/pi, and handles any other setup (eg: I have some convenience flags set up to specify a block all/block local/block internet/allow all for network access). One issue I ran into, it doesn't seem like smolvm cleans up disk images from ephemeral VMs, so my script has to do that itself. Is this a known bug or intended behaviour?
smolpi looks great! and smolvm does clean up ephemeral runs if the machine run exits gracefully. I'll take a deeper look into this edge case and fix it today.
Fixed and released in v1.3.1: https://github.com/smol-machines/smolvm/pull/497
Sounds overkill, how about giving the agent its own user?
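If you go the separate-user route, it helps to check what that user could actually read before pointing the agent at a directory. A rough sketch of such an audit, applying only the classic owner/group/other mode bits; `readable_by` and `exposed_files` are hypothetical names, and the check deliberately ignores root, ACLs, capabilities, and directory search permissions, so treat its answer as a first approximation.

```python
import os
import stat


def readable_by(path, uid, gids):
    """Apply the owner/group/other read check for the given uid and group ids.

    Ignores root, ACLs, capabilities, and directory search permissions.
    """
    st = os.stat(path)
    if uid == st.st_uid:
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)


def exposed_files(root, uid, gids):
    """List files under root that the given uid/gids could read by mode bits."""
    exposed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                if readable_by(full, uid, gids):
                    exposed.append(full)
            except OSError:
                # Broken symlinks or files that vanished mid-walk are skipped.
                continue
    return sorted(exposed)
```

For example, `exposed_files(os.path.expanduser("~"), agent_uid, agent_gids)` with the agent account's ids shows which files still need a `chmod 600` or need to move elsewhere.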
It's really not overkill if you have good tools to work with. Hyper-V is quite capable of providing ephemeral workspaces on timescales measured in minutes. Especially with nested virtualization. One big machine with fast local disks can provide very short cold start times for a golden image stored on the same.
That's what I do, in part because I want it to use the same system libraries etc. installed on my laptop, but I worry it will try to use privesc exploits...
It's highly unlikely the LLM will try privesc exploits. LPE risk still exists and should be assumed, but the more likely risk is the LLM installing an infected left-pad package, or (on servers) installing a dependency with an RCE vuln, or creating a new RCE vuln from scratch. If we are talking about running the agent on a dev machine, though, Codex doesn't seem to introduce a lot of risk, considering that I can already add OS protection layers, that the devs added their own protection layers, and that I can direct the model towards my preferences (like not installing dependencies through npm or pip).
I work on a Linux sandbox that makes it easy to hide sensitive files from AI agents while keeping the files they need accessible. Check it out: https://github.com/wrr/drop
.agentsignore is NOT a security tool. It's a good idea as a hint to agents about what files it should ignore (because they'd be of no value and only chew up tokens). However, using it to prevent exposure of secrets would be a BIG mistake. There's simply no way to guarantee that an agent will ignore things in the ignore file. And even a harness-enforced restriction would still be in-process, which a rogue agent could trivially compromise. For security, use a sandbox. Nothing else will do. I do AI sandboxes (FOSS, free forever, no rug pull): https://github.com/kstenerud/yoloai
The fact that pretty much every comment in this thread suggests a different solution means there’s still plenty of innovation and consolidation to occur on this problem. My take is that Unix already solved all of these user access problems (what can a user read or execute), so the solution will probably be around containers or virtual machines. But the UX around booting up a container or virtual machine for agentic workflows needs to be simplified to the point where vibe coders who don’t know the first thing about Unix, VMs, or containers can still take advantage of the solutions.
These tools are data collection mechanisms to help train these better models. I'm working with some folks to figure out a way to put a layer between the harness and the models to have better control of what data gets sent to and from the model itself and the harness.
Look at agent-vault and 1password. There's not really any reason to be storing sensitive keys in plaintext on your local disk that the agent can access.
This should be an open standard like AGENTS.md or skills. What do other harnesses do?
I believe JetBrains products like Junie use the neutral term .aiignore for this functionality.
[deleted]
> never read or send .env, .env.*, .pem, id_, .aws/, .ssh/.
I think a better practice is to not store those things in the repository folder in the first place.
Do not store secrets in the repository in files, but inject them during runtime. Then the agents have no way to access them.
A lot of people have secrets/config files in the project's working directory but ignored by git, e.g. `.env.local`. So they're following best practice by not committing secrets, but agents running locally can still see them even when sandboxed to the working directory. I've taken to storing configs under XDG_CONFIG_HOME and having the app auto-resolve them by convention or take a cli arg to specify the config path. All secrets are in files, not env vars. That way, when using sandboxing, the agent can never see the configs or secrets because they live outside the working directory.
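A minimal sketch of that resolution order, assuming a hypothetical app called `myapp`, a `config.toml` file name, and a `--config` flag (none of these come from the comment above). The fallback to `$HOME/.config` when XDG_CONFIG_HOME is unset or empty follows the XDG Base Directory spec.

```python
import argparse
import os
from pathlib import Path

APP_NAME = "myapp"  # hypothetical application name


def default_config_path(env=None):
    """Resolve <XDG_CONFIG_HOME>/<app>/config.toml, falling back to ~/.config."""
    env = os.environ if env is None else env
    base = env.get("XDG_CONFIG_HOME")
    if not base:
        # Unset or empty: the XDG spec says to use $HOME/.config.
        home = env.get("HOME") or os.path.expanduser("~")
        base = os.path.join(home, ".config")
    return Path(base) / APP_NAME / "config.toml"


def resolve_config_path(argv=None, env=None):
    """An explicit --config wins; otherwise use the conventional location."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", type=Path, default=None)
    args = parser.parse_args(argv)
    if args.config is not None:
        return args.config
    return default_config_path(env)
```

Because the resolved path sits outside the repository, a sandbox scoped to the working directory never needs read access to it; the app only needs it at run time, outside the agent's sandbox.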
Sounds like a good way to do it. Makes me think of docker secrets, where the secrets are exposed as files and accessible only from inside the container. If the development environment uses docker then that's a solution too, I guess.
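On the application side, reading a file-based secret is only a few lines. A sketch, assuming the secret is mounted at `/run/secrets/<name>` (the default location for Docker secrets); `read_secret` and the `SECRETS_DIR` override variable are hypothetical names used for illustration.

```python
import os
from pathlib import Path

DEFAULT_SECRETS_DIR = "/run/secrets"  # default mount point for Docker secrets


def read_secret(name, secrets_dir=None):
    """Read one secret from a mounted file instead of .env or an env var.

    SECRETS_DIR is a hypothetical override for running outside a container.
    """
    if not name or Path(name).name != name:
        # Reject empty names and anything containing a path separator.
        raise ValueError(f"invalid secret name: {name!r}")
    base = Path(secrets_dir or os.environ.get("SECRETS_DIR", DEFAULT_SECRETS_DIR))
    return (base / name).read_text(encoding="utf-8").rstrip("\n")
```

The secret then exists only inside the container that runs the app, so an agent editing the source tree on the host has nothing in the repository to read.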
If you let your agent use docker you've basically given it root on your machine.
I use podman btw. It's aliased to docker. Building a project as a container and giving an agent access to running docker commands are different things.