[#105] - add -o/--outfile option to allow writes to input file #2616

Closed
drm wants to merge 0 commits into jqlang:master from drm:master

Conversation

@drm

@drm drm commented Jun 16, 2023

This PR attempts to solve a very long-standing debate on "in-place" editing. The simple solution is to add an outfile option, which can be used to write the resulting output to. The implementation makes sure to delay the writing until after the first input is read.

echo '{"foo": "bar", "baz": "bat"}' > ./foo.json
./jq '.qux="waa"' -o foo.json < ./foo.json
# or:
./jq '.qux="waa"' -o foo.json  ./foo.json
cat foo.json

Will output:

{
  "foo": "bar",
  "baz": "bat",
  "qux": "waa"
}

In other (more complicated) situations, however, either all input must be buffered or copied, which is not desirable, or the output should be written to a temporary file, which has its own limitations. I think, though, for the common use case, this should suffice and is Good Enough(tm). Maybe the caveat on how this works should be more explicitly documented in the 'usage' output.

Happy to help if there's anything wrong with the implementation.

@nicowilliams
Contributor

Hmmm, can you test this:

echo '{"foo": "bar", "baz": "bat"}' > ./foo.json
# Do it again:
echo '{"foo": "bar", "baz": "bat"}' > ./foo.json
./jq '.qux="waa"' -o foo.json  ./foo.json
cat foo.json

What does that do?

Using fopen() with w+ means we don't truncate, but we'll write at offset zero. So I expect the second JSON text to be corrupted in this test.

What I'd expect is that either we first process all the inputs and then truncate and re-write the file, or that we write all the outputs to a temp file and then rename it into place, and of these two I generally prefer the latter because it's atomic.

@nicowilliams
Contributor

@itchyny @pkoppstein @owenthereal please do not merge this yet. See questions and comments above.

@drm
Author

drm commented Jun 16, 2023

You're right about "w+", I forgot about that, that was a leftover from earlier tries to do the opening earlier in the process.

Re tempfiles: Yeah that sounds good too. Would you have a preference in terms of retaining the inode? I mean, writing to a temp file and then moving it would replace the original file, but writing the contents of the tempfile to the target is double the I/O. I would need a pointer to an example of how to retain/inherit the file attributes, if the file would be overwritten.

And would you prefer including a writability test at the start or let it just fail at the end? I would guess the former. Is a fopen/fclose in append mode enough to do that? Or is there a more idiomatic way?

@nicowilliams
Contributor

nicowilliams commented Jun 16, 2023

> You're right about "w+", I forgot about that, that was a leftover from earlier tries to do the opening earlier in the process.

But if you open it with "w" then you will find it empty when you process the inputs. Only "w+" makes sense unless you're opening a temp file (in which case you should use mkstemp() or similar) and renaming it into place.
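
The "you will find it empty" problem has a familiar shell analogue. What follows is only a minimal sketch, assuming a POSIX shell, with tr standing in for jq and a hypothetical scratch file demo.txt: the output redirection truncates the file before the filter reads a single byte, which is the same effect as opening the output with fopen(path, "w") before the input has been consumed.

```shell
#!/bin/sh
# tr stands in for jq; demo.txt is a hypothetical scratch file.
workdir=$(mktemp -d)
f="$workdir/demo.txt"

printf 'aaa\n' > "$f"

# The shell opens $f for writing (truncating it) before tr starts,
# so tr reads an empty input and writes nothing back.
tr a b < "$f" > "$f"

size=$(wc -c < "$f" | tr -d ' ')
echo "size after in-place redirect: $size"
```

The file ends up with zero bytes, which is why a plain "w" open only makes sense on a separate temp file.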

> Re tempfiles: Yeah that sounds good too. Would you have a preference in terms of retaining the inode? I mean, writing to a temp file and then moving it would replace the original file, but writing the contents of the tempfile to the target is double the I/O. I would need a pointer to an example of how to retain/inherit the file attributes, if the file would be overwritten.

You can have atomicity XOR keep the inode. There's no way to keep the inode AND have atomicity. If you try to keep the file identity then any other process that has that file open for reading will get its toes stepped on (i.e., there will be races between reading and writing). On the other hand, changing the file identity is useful for other processes to detect that the file "has changed".
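
The trade-off can be observed from the shell. This is a rough sketch under stated assumptions: the file names are hypothetical, inode_of is an illustrative helper, and the inode number is taken from the first column of ls -i. It shows that overwriting keeps the inode while rename-into-place does not.

```shell
#!/bin/sh
# Hypothetical scratch files; inode_of is an illustrative helper.
workdir=$(mktemp -d)
target="$workdir/target.json"
inode_of() { ls -i "$1" | awk '{print $1}'; }

printf '{"a":1}\n' > "$target"
before=$(inode_of "$target")

# Overwrite in place: same inode, but a concurrent reader can see a
# truncated or half-written file.
printf '{"a":2}\n' > "$target"
after_overwrite=$(inode_of "$target")

# Rename a new file into place: readers see the old or the new file,
# never a mix, but the path now refers to a different inode.
printf '{"a":3}\n' > "$target.new"
mv "$target.new" "$target"
after_rename=$(inode_of "$target")

[ "$before" = "$after_overwrite" ] && echo "overwrite: inode kept"
[ "$before" != "$after_rename" ] && echo "rename: inode changed"
```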

That's the problem with "in-place" editing.

For something like SQLite3, say, where the file contents are structured in a way that allows racing reads and writes, you naturally want to edit in place. But for JSON I don't think this is a good idea.

That said, a file-identity-preserving in-place write mode that truncates after the last write would be racy but would leave the file not-corrupted at the end, and it's reasonable to believe that some will want that. But then we'd need to have two in-place update modes.

Another issue is that jq takes any number of input files, so an in-place option would have to... apply to each input file, no? Whereas I'd expect a -o FILE option to write all outputs to that file regardless of how many input files there are.

So if we really want to be serious about this we might want:

  • a --in-place option to edit each file named on the command-line
  • a --in-place-preserve-id option to edit each file named on the command-line in a file identity preserving manner (maybe)
  • a -o FILE option to send all output to a file that might also be one of the input files (but not necessarily)

Now, the -o FILE option is clearly not needed: one can just script it: jq ... > ${desired_output_file}.new && mv ${desired_output_file}.new ${desired_output_file}. So really, we need the first option above, or the first and the second.

> And would you prefer including a writability test at the start or let it just fail at the end? I would guess the former. Is a fopen/fclose in append mode enough to do that? Or is there a more idiomatic way?

With the rename approach I'd rather let it fail at the end.

Windows will need special consideration: we'll need to always open these files in ways that do not preclude renaming new ones into place, which means on Windows we need to use CreateFile() and then map that to an FD then fdopen() that. (Ugh, Windows.)

@drm
Author

drm commented Jun 16, 2023

> What does that do?

This actually works without problem, even with w+. So I guess it does truncate before writing. Note that the file will be opened after the first input is read, so w+ and w seem to have the same effect. But just to make it clearer, I changed it to w.

echo '{"x":"very long value, does some linger?"}' > ./foo.json
./jq '.x="?"' ./foo.json -o foo.json
cat foo.json
{
  "x": "?"
}

@drm
Author

drm commented Jun 16, 2023

Regarding the implementation, let me know what you decide on, I'd be happy to implement whatever you think is a good approach. I don't really mind too much about the internals.

@nicowilliams
Contributor

> What does that do?

> This actually works without problem, even with w+. So I guess it does truncate before writing. Note that the file will be opened after the first input is read, so w+ and w seem to have the same effect. But just to make it clearer, I changed it to w.

Ah, right, I brain-o'ed. Yes, "w+" truncates. But how can this work if we need to read the inputs after truncating the file? We can't simply defer until we've produced one output: we might still have more to read, and we don't buffer the whole file in-memory.

(Also, I updated my previous reply, FYI.)

@nicowilliams
Contributor

> Regarding the implementation, let me know what you decide on, I'd be happy to implement whatever you think is a good approach. I don't really mind too much about the internals.

As I've never needed this feature but you do, let's start with: what semantics do you think jq should implement? See above commentary about multiple input files.

@drm
Author

drm commented Jun 16, 2023

My use case semantics are: "Change the value in this file to 'x'", which I need quite often in a build/release process, for example, build-specific config files generated by a script.

Multiple inputs don't really need to work at all for that. I get that that would be a nice-to-have, for consistency's sake, but I'd be happy if jq only supported a single inplace option, which ignores stdin and stdout and file arguments entirely.

I am pretty sure that most people that want this option are looking at jq for the exact same use case (based on my all-seeing google attempts ;))

@nicowilliams
Contributor

> My use case semantics are: "Change the value in this file to 'x'", which I need quite often in a build/release process, for example, build-specific config files generated by a script.

Does https://github.com/nicowilliams/inplace not help with the case where you're editing a single file in place? Is it just ugly or icky, or hard to memorize to have to use a second program for this? I do think it's probably hard to memorize, especially when sed has a -i option for this. But in the "Unix philosophy", using a separate program for this is "better". On the other hand, if you're writing a shell script, who cares if it's easy or hard to memorize, you just make it a best practice to use this pattern at $WORK and you're good to go.

> Multiple inputs don't really need to work at all for that. I get that that would be a nice-to-have, for consistency's sake, but I'd be happy if jq only supported a single inplace option, which ignores stdin and stdout and file arguments entirely.

Right, but I think jq devs/maintainers probably want to keep things general. But now here's another issue: what if you really want to use jq -n -f prog.jq multiple files here but have one of those files be the one that's re-written with all the outputs of jq? So actually a -o option does sound appealing after all.

I'm trying to discover what the right semantics are -- I'm not trying to give you a hard time. It's just that in this particular case there's a lot to think about, and that's really why we've never been keen to do anything about this request, but since you're willing to do the work, maybe we should be willing to take it once we discover what semantics we want.

> I am pretty sure that most people that want this option are looking at jq for the exact same use case (based on my all-seeing google attempts ;))

I bet it would cover 90% of use cases, but be very surprising to 10% of users. Maybe that's good enough, but I'm not ready to reach that conclusion. Certainly if we go with this we'd have to make sure it's well-documented.

@nicowilliams
Contributor

nicowilliams commented Jun 16, 2023

> So actually a -o option does sound appealing after all.

Though, once again one can just script that case.

That's another thing: keeping jq as much as possible a "pure" filter makes it easy to say no to this sort of request on account of "you can just script it". But then too, sometimes you can't easily script something. And some extensions some of us have in mind would make jq less of a "pure" filter, so purity isn't a terribly good excuse.

@nicowilliams
Contributor

Another thing is that whatever we do I don't want us to be sad about later w/o being able to change it due to having to maintain backwards compatibility.

Now, sed -i does a) edit in place all named files, b) it does rename new files into place. So if what we're after is sed -i-like behavior, then let's copy sed. But maybe that's GNU sed and not BSD sed? Can someone check how BSD sed's -i option behaves?

@wader
Member

wader commented Jun 18, 2023

BSD sed man on macOS says this about -i and -I:

     -I extension
             Edit files in-place, saving backups with the specified extension.  If a zero-length extension is given, no backup will be saved.  It is not
             recommended to give a zero-length extension when in-place editing files, as you risk corruption or partial content in situations where disk
             space is exhausted, etc.

             Note that in-place editing with -I still takes place in a single continuous line address space covering all files, although each file
             preserves its individuality instead of forming one output stream.  The line counter is never reset between files, address ranges can span
             file boundaries, and the “$” address matches only the last line of the last file.  (See Sed Addresses.) That can lead to unexpected results
             in many cases of in-place editing, where using -i is desired.

     -i extension
             Edit files in-place similarly to -I, but treat each file independently from other files.  In particular, line numbers in each file start at
             1, the “$” address matches the last line of the current file, and address ranges are limited to the current file.  (See Sed Addresses.) The
             net result is as though each file were edited by a separate sed instance.

I guess this is the sed inplace implementation on FreeBSD and macOS https://cgit.freebsd.org/src/tree/usr.bin/sed/main.c#n361

Haven't fiddled around much with syscall tracing on macOS but I got dtruss to say this. Looks like it uses a temp file and does a rename:

$ echo aaa > file1
$ echo aaa > file2
$ sudo dtruss -f ./sed -I "" 's/aaa/bbb/' file1 file2
dtrace: system integrity protection is on, some features will not be available

	PID/THRD  SYSCALL(args) 		 = return
22671/0x30c08e:  fork()		 = 0 0
22671/0x30c08e:  munmap(0x11CFE6000, 0x9C000)		 = 0 0
22671/0x30c08e:  munmap(0x11D082000, 0x8000)		 = 0 0
22671/0x30c08e:  munmap(0x11D08A000, 0x4000)		 = 0 0
22671/0x30c08e:  munmap(0x11D08E000, 0x4000)		 = 0 0
22671/0x30c08e:  munmap(0x11D092000, 0x54000)		 = 0 0
22671/0x30c08e:  open(".\0", 0x100000, 0x0)		 = 3 0
22671/0x30c08e:  fcntl(0x3, 0x32, 0x7FF7B277C1E0)		 = 0 0
22671/0x30c08e:  close(0x3)		 = 0 0
...<cut>
22671/0x30c08e:  close_nocancel(0x3)		 = 0 0
22671/0x30c08e:  open_nocancel("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/LC_MESSAGES\0", 0x0, 0x0)		 = 3 0
22671/0x30c08e:  fstat64(0x3, 0x7FF7B277C0B8, 0x0)		 = 0 0
dtrace: error on enabled probe ID 1714 (ID 959: syscall::read_nocancel:return): invalid kernel access in action #13 at DIF offset 68
22671/0x30c08e:  close_nocancel(0x3)		 = 0 0
22671/0x30c08e:  lstat64("file1\0", 0x7FF7B277C450, 0x0)		 = 0 0
22671/0x30c08e:  unlink("./.!22671!file1\0", 0x0, 0x0)		 = -1 2
22671/0x30c08e:  open_nocancel("./.!22671!file1\0", 0x601, 0x1B6)		 = 3 0
22671/0x30c08e:  fchown(0x3, 0x1F5, 0x14)		 = 0 0
22671/0x30c08e:  fchmod(0x3, 0x1A4, 0x0)		 = 0 0
22671/0x30c08e:  open_nocancel("file1\0", 0x0, 0x0)		 = 4 0
22671/0x30c08e:  fstat64(0x4, 0x7FF7B277C2C8, 0x0)		 = 0 0
dtrace: error on enabled probe ID 1714 (ID 959: syscall::read_nocancel:return): invalid kernel access in action #13 at DIF offset 68
22671/0x30c08e:  fstat64(0x3, 0x7FF7B277C348, 0x0)		 = 0 0
dtrace: error on enabled probe ID 1714 (ID 959: syscall::read_nocancel:return): invalid kernel access in action #13 at DIF offset 68
22671/0x30c08e:  close_nocancel(0x4)		 = 0 0
dtrace: error on enabled probe ID 1712 (ID 961: syscall::write_nocancel:return): invalid kernel access in action #13 at DIF offset 68
22671/0x30c08e:  close_nocancel(0x3)		 = 0 0
22671/0x30c08e:  rename("./.!22671!file1\0", "file1\0")		 = 0 0
22671/0x30c08e:  lstat64("file2\0", 0x7FF7B277C450, 0x0)		 = 0 0
22671/0x30c08e:  unlink("./.!22671!file2\0", 0x0, 0x0)		 = -1 2
22671/0x30c08e:  open_nocancel("./.!22671!file2\0", 0x601, 0x1B6)		 = 3 0
22671/0x30c08e:  fchown(0x3, 0x1F5, 0x14)		 = 0 0
22671/0x30c08e:  fchmod(0x3, 0x1A4, 0x0)		 = 0 0
22671/0x30c08e:  open_nocancel("file2\0", 0x0, 0x0)		 = 4 0
22671/0x30c08e:  fstat64(0x4, 0x7FF7B277C2C8, 0x0)		 = 0 0
dtrace: error on enabled probe ID 1714 (ID 959: syscall::read_nocancel:return): invalid kernel access in action #13 at DIF offset 68
22671/0x30c08e:  fstat64(0x3, 0x7FF7B277C348, 0x0)		 = 0 0
dtrace: error on enabled probe ID 1714 (ID 959: syscall::read_nocancel:return): invalid kernel access in action #13 at DIF offset 68
22671/0x30c08e:  close_nocancel(0x4)		 = 0 0
dtrace: error on enabled probe ID 1712 (ID 961: syscall::write_nocancel:return): invalid kernel access in action #13 at DIF offset 68
22671/0x30c08e:  close_nocancel(0x3)		 = 0 0
22671/0x30c08e:  rename("./.!22671!file2\0", "file2\0")		 = 0 0
22671/0x30c08e:  close_nocancel(0x1)		 = 0 0

Hope that helps... and now i also think i understand where jq got its behaviour to treat all input files as one continuous stream :)

@nicowilliams
Contributor

NetBSD's sed -i renames an mkstemp()ed file into place.

@nicowilliams
Contributor

Argh, this page hadn't drawn your comment in when I commented about NetBSD's sed.

@emanuele6
Member

This approach of using jq '.hello' -o foo.json foo.json in place cannot work.

Even if you delay opening the output file until the first output write (lazy open/truncation), it will either only appear to work (while not actually being correct) for very small input files that jq is able to read entirely in one go, or work reliably only if jq is running in "slurp" mode.

If the file is too long, for example, jq will read one chunk of the file, process it, and then overwrite the file as soon as it emits output. If the output is smaller than the input chunk that was read, jq will then find itself at the end of the file and stop (assuming the chunk ended on a complete JSON value; if it didn't, jq will abort with a JSON syntax error on the input file). If the output is bigger, jq will read its own output as input and most likely abort with a JSON syntax error again.

There are UNIX programs with a -o option that actually lets you overwrite the input files, those programs are "slurp"-y programs like for example sort.

With sort, you can run sort -o file.txt file.txt to sort file.txt "in-place", because sort only knows what it should output after having read and sorted all of its input. Once the entire input has been processed and is in memory, sort opens the output file, truncates it, and writes the output to it.
Even if the output file is the input file, this works because the file is truncated only after the input has been fully read, when it is no longer needed.
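
The sort behaviour is easy to confirm. A small sketch, using a hypothetical words.txt; POSIX explicitly allows the -o target to be one of the input files:

```shell
#!/bin/sh
# words.txt is a hypothetical scratch file.
workdir=$(mktemp -d)
f="$workdir/words.txt"
printf 'pear\napple\nmango\n' > "$f"

# POSIX permits the -o target to be one of the input files, so this
# sorts the file "in place" without losing data.
sort -o "$f" "$f"

result=$(cat "$f")
echo "$result"
```

Contrast this with sort "$f" > "$f", where the shell truncates the file before sort ever reads it.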

If -o is implemented in a way that makes jq hold all of its output in memory until the script has finished running, and then dump all of the output to the output file only then, this may work. But of course, this is just horrible.

If you implement -o in a way that the fopen(path, "w") is delayed to the first write, then this will work, but only if you are using -s.

jq -o foo.json -s '.[] | .hello = "foo"' foo.json

But it is not a great idea since it will only work if you use -s and would be pretty much useless otherwise (in a way that misleads users into thinking it may work).

The only sensible way to do this type of "in-place editing" is to write the output to a temporary file, and then mv the temporary file to the path of the input file (on success; otherwise delete the tmp file and fail).

This is already easily possible using a bit of scripting:

if jq .something file.json > file.json.tmp
  then mv file.json.tmp file.json
  else rm file.json.tmp
fi

Or even using the sponge utility, which is provided by many packages and takes care of redirecting the output to a temporary file and mving it to file.json when the input is closed:

jq .something file.json | sponge file.json

But I understand we may want to implement a -i option that does this builtin because users seem to really want it.

This is just a matter of creating the temporary file, redirecting stdout to that file, and finally either deleting the file or moving the file to the path of the input file based on whether the jq script executed successfully or not.
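
The steps in the previous paragraph can be sketched as a shell helper. This is only an illustration, not a proposed jq implementation: edit_in_place is a hypothetical name, mktemp is assumed to be available (it is not part of POSIX, though GNU and BSD systems ship it), and the filter is assumed to read the file on stdin.

```shell
#!/bin/sh
# edit_in_place FILE CMD [ARGS...]  (hypothetical helper)
# Runs CMD with FILE on stdin, captures stdout in a temp file created in
# the same directory, and renames it over FILE only if CMD succeeded.
edit_in_place() {
    file=$1; shift
    tmp=$(mktemp "$(dirname "$file")/.inplace.XXXXXX") || return 1
    if "$@" < "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        status=$?          # exit status of the failed filter
        rm -f "$tmp"
        return "$status"
    fi
}

# Demo with tr standing in for jq.
workdir=$(mktemp -d)
printf 'aaa\n' > "$workdir/demo.txt"
edit_in_place "$workdir/demo.txt" tr a b   # succeeds, file is replaced
edit_in_place "$workdir/demo.txt" false    # fails, file is left untouched
cat "$workdir/demo.txt"
```

With jq the call would look like edit_in_place file.json jq '.qux="waa"'. Creating the temp file in the target's directory keeps the rename on one filesystem. Note that mktemp creates the file with restrictive permissions, so the original mode and owner are not carried over here.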

For those who want to try to implement this, I will point out that there are some subtle caveats of using a temporary file and renaming it to the path of the input that you may want to consider.

If, say, you are root and you don't own the file you are modifying in place, the temporary file you create will have a different user/group owner (and maybe different permissions) from the original file: it will be owned by root, not the original user, and it may end up not being executable even though the original file was.
GNU sed remembers the owner/permissions of the original file and will attempt to change the permissions and owner of the temporary file to the original values after moving it to the path of the input file.

If you are not root, but another user that doesn't own the input file yet has read permission on it and write permission on the directory the file is in, you may not have permission to change the ownership of that file to the original owner, and it will be owned by you instead of the original user; GNU sed simply ignores the error in that case, and lets it be that way.

@nicowilliams
Contributor

> [...]

Exactly. Thanks!

> But I understand we may want to implement a -i option that does this builtin because users seem to really want it.

> This is just a matter of creating the temporary file, redirecting stdout to that file, and finally either deleting the file or moving the file to the path of the input file based on whether the jq script executed successfully or not.

Yes. Exactly how sed -i does it.

> For those who want to try to implement this, I will point out that there are some subtle caveats of using a temporary file and renaming it to the path of the input that you may want to consider.

> If, say, you are root and you don't own the file you are modifying in place, the temporary file you create will have a different user/group owner (and maybe different permissions) from the original file: it will be owned by root, not the original user, and it may end up not being executable even though the original file was. GNU sed remembers the owner/permissions of the original file and will attempt to change the permissions and owner of the temporary file to the original values after moving it to the path of the input file.

And not just root. On some systems other users can have this sort of privilege, so don't just check if geteuid() returns 0 -- simply check if the temp file has the same ownership and permissions as the original and if not attempt to change them to the original (and if that fails, when to ignore the failure?).

And not just on Unix-y systems but also on Windows (though perhaps we wouldn't demand that).

> If you are not root, but another user that doesn't own the input file yet has read permission on it and write permission on the directory the file is in, you may not have permission to change the ownership of that file to the original owner, and it will be owned by you instead of the original user; GNU sed simply ignores the error in that case, and lets it be that way.

+1

@wader
Member

wader commented Jun 19, 2023

Had a look at how GNU sed does it. Has some chown error fallback logic and also copies ACLs https://git.savannah.gnu.org/cgit/sed.git/tree/sed/execute.c#n667

@drm
Author

drm commented Jun 19, 2023

@nicowilliams

> [...] I'm trying to discover what the right semantics are -- I'm not trying to give you a hard time. [...]

I get that, no worries. In return, I understand all the considerations that are raised, I am not trying to ignore them nor do I think they're invalid. I don't have any skin in the game besides making my own life easier. That's really all it is: convenience (not having to install multiple utilities on any machine that I need it on, not having to think about escaping/scripting when feeding the command to an SSH connection or docker exec input, even not having to script anything bash-wise when I'm on the command line myself, trying to set some configuration parameter somewhere debugging on stdout while writing the replacement code, and simply adding the -i flag to apply when I'm done, having such oneliner in my bash history to toggle the value back; it really piles up if I think about it :) So yeah, I'm really invested in just that: making my life easier.


Based on everything said, I am inclined to conclude that using temp files in the way sed or sponge does is the way to go. This PR was an attempt to keep it as simple as possible, but if we're in agreement that that is not a viable solution, I'm happy to replace it with one that we can all be happy with, UNIX principle hermeneutics aside. However, since that would take a little more work than what I did in this PR, I'd like to make sure that it has a high chance of getting merged, if not a guarantee.

So let's hash out the following first. I am basing this on the GNU sed implementation, not sure if BSD sed differs in any of the following:

  1. I don't think we should implement the backup SUFFIX. It seems sensible on the one hand, but it would also imply implementing the follow-symlinks flag, which rings a bit like scope creep to me, and also an option that could be implemented later if people seem to require it.
  2. In-place should only work for files supplied as arguments. I think it should not read stdin at all if -i is supplied. Since detecting input on stdin can be tricky (if it is possible to do reliably cross-platform at all, which I doubt), -i without file arguments should be either a no-op or an error, even if stdin is supplied. This may be confusing to users either way, but I don't think we should linger too much on such unintended usage, besides documentation.
  3. If any of the files is not writable by the current user, the entire process should fail. However, to avoid having a partial failure, I think it makes sense to check this at the start of the program.
  4. There are cases where a user may be able to write to a file, but not create files in the directory the file is in. I wonder if it makes sense to copy the source file to a temp file first, instead of copying the file at the end as sed and sponge do, eliminating the need for restoring ownership and permissions. For sponge it makes sense that they do it the way they do (you don't have an input file) and for sed the use case could involve huge files with tiny results. That is also, I reckon, the main reason not to consider this, but I wanted to bring it up anyway, as it does make some things easier.
  5. The requirement of atomicity is dropped, though we could consider implementing a warning or an error if the file was changed during the execution of the program. Also, we could consider adding a flag that will cause the execution to fail if an intermediate change was detected. Remember though, that in the vast, vast majority of cases, the source file being written to while doing an inplace edit is just a strange edge case, but more importantly simply unintended usage, which should be documented. For example, I would be surprised if sed actually bothers warning users about this...
  6. Implementing the inplace as such no longer warrants the -o as implemented in this branch, so we should just revert to master and start over.

Let me know your thoughts. And feel free to add considerations if you have others.

@nicowilliams
Contributor

One reason to want a -i is that one might want to have lots of --slurpfile and --arg/--argjson options to repeat for each file to be edited in place, so this:

for i in "${files[@]}"; do
    jq "${slurpfiles[@]}" "${args[@]}" "$i" > "${i}.tmp" &&
        mv "${i}.tmp" "$i"
done

can be replaced with this: jq -i "${slurpfiles[@]}" "${args[@]}" "${files[@]}", which is convenient, and more efficient. Also, it's hard to copy attributes/permissions/ACLs in the shell w/o copying the whole file just to then truncate it, but jq, like sed, could do it.
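
For reference, the "copy the whole file just to truncate it" workaround looks roughly like the sketch below. It is an illustration under assumptions, not a recommendation: edit_keep_attrs is a hypothetical helper, the temp name is predictable, and cp -p carries over mode, timestamps and (where permitted) ownership but, in general, not ACLs.

```shell
#!/bin/sh
# edit_keep_attrs FILE CMD [ARGS...]  (hypothetical helper)
# cp -p clones the file's attributes onto the temp file; the redirect
# then truncates the copy without resetting its mode.
edit_keep_attrs() {
    file=$1; shift
    tmp="$file.tmp.$$"
    cp -p "$file" "$tmp" || return 1
    if "$@" < "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        status=$?          # exit status of the failed filter
        rm -f "$tmp"
        return "$status"
    fi
}

# Demo: an executable file stays executable after the edit.
workdir=$(mktemp -d)
script="$workdir/hook.sh"
printf 'echo aaa\n' > "$script"
chmod 755 "$script"
edit_keep_attrs "$script" tr a b
[ -x "$script" ] && echo "still executable"
```

The cost is the extra full copy per file, which is the inefficiency a built-in option could avoid.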

So I think a sed-like -i is justified, though not essential.

[BTW, @stedolan has wanted to make it so no new command-line options are needed. If we ever finish my and @leonid-s-usov's FFI/co-routines branch we could just do all I/O and what not directly in jq-coded library functions using new C-coded built-ins. I think that's a worthy goal, though not directly relevant to this PR because that work is kinda stuck at the moment, but it's worth keeping in mind.]

@nicowilliams
Contributor

@drm @wader one thing to be careful of is that we don't want GPL code in jq, so don't just copy code from GNU sed.

@nicowilliams
Contributor

nicowilliams commented Jun 19, 2023

> So let's hash out the following first. I am basing this on the GNU sed implementation, not sure if BSD sed differs in any of the following:

  1. The SUFFIX thing is to avoid the risk of ending up with lots of garbage.
  2. Indeed, sed -i refuses to read stdin, and insists on files being named on the command-line.
  3. If we'd do this then we should probably open all the input files and all the temp files before we start processing any of them, then the only step at which failures can happen is at the final rename(2) step -- and, yes, renames can fail with ENOSPC, but that seems less likely at that point. At any rate, we can't guarantee that there won't be partial success/failure.
  4. Let's not do that. We should stick to sed -i semantics.
  5. Let's not do that. We can't detect that input files changed if they change inside of 1s after we stat(2)/fstat(2) them. More generally, it's not possible to make all of this atomic.
  6. Right.

NetBSD's sed also has a -I which differs from -i mainly in that the line counter doesn't reset for each input file -- I don't think we need that.

@nicowilliams
Contributor

nicowilliams commented Jun 19, 2023

An initial implementation could just refuse to edit-in-place any files not owned by geteuid(), and we can defer writing code to attempt to copy file owner/group/mode/ACL until later. On Windows it would just always go ahead.
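
A script-level approximation of that ownership guard, for illustration only: owned_by_me is a hypothetical helper, and it leans on find's -user primary with the numeric effective UID from id -u rather than on anything jq provides.

```shell
#!/bin/sh
# owned_by_me FILE  (hypothetical helper)
# Succeeds only if FILE exists and is owned by the effective user ID.
owned_by_me() {
    [ -n "$(find "$1" -prune -user "$(id -u)" 2>/dev/null)" ]
}

workdir=$(mktemp -d)
touch "$workdir/mine.json"

if owned_by_me "$workdir/mine.json"; then
    echo "ok to edit in place"
else
    echo "refusing: not owned by the current user"
fi
```

A C implementation would presumably compare st_uid from fstat(2) against geteuid() instead.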

@drm
Author

drm commented Jun 19, 2023

OK, I'll probably work on a preliminary version tomorrow or the day after. Let's pick it up from there.

@drm
Author

drm commented Jul 28, 2023

Just a little heads-up since I haven't updated on the topic at all: I got occupied with some other stuff and I've solved my immediate problem with a bit of bash scripting. I am still willing to pick this up, but priorities just have shifted a bit.

@nicowilliams nicowilliams added this to the 1.8 release milestone Jul 28, 2023
@mailsanchu

Is this still worked on?

@emanuele6
Member

It was never working.

@mailsanchu

Any plans to add this feature?

@emanuele6 emanuele6 removed this from the 1.8 release milestone Dec 11, 2023
@jubr

jubr commented Feb 27, 2026

Hi @drm, @nicowilliams, @wader, @emanuele6, @mailsanchu I went ahead and "just did it", see #3488.

I've also wanted this functionality for over a decade. #GoAi

Please leave any comments on my PR.


7 participants