
The usability of unixy shells generally falls down a cliff when you need to deal with more than just one input one output. The awkwardness in trying to shoehorn process substitution is just one of the examples of that.


Yeah. It's caused by the single standard input/single standard output model. Even using standard error is unergonomic and leads to magic incantations like 2>&1.

What if programs could have any number of input and output file descriptors and the numbers/names along with their contents and data types were documented in the manual? Could eliminate the need to parse data altogether. I remember Common Lisp had something similar to this.


My understanding is that they can. stdin, stdout and stderr are just pipes assigned to file descriptors 0,1 and 2 by convention. There's nothing stopping a program from having more file descriptors passed in, and some programs do. There's just no standard convention on how to do it.
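A minimal sketch of what that looks like (file names here are made up): the parent opens any descriptor number with a redirection and the child simply inherits it.

    #!/bin/sh
    # Hand a child process an extra input on fd 3.
    # The child is just `cat` reading fd 3 instead of stdin.
    printf 'hello from fd 3\n' > extra.txt
    sh -c 'cat <&3' 3< extra.txt
    rm extra.txt
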


> My understanding is that they can. stdin, stdout and stderr are just pipes assigned to file descriptors 0,1 and 2 by convention.

Yes. Standard input and output are so ubiquitous that shells were designed entirely around them, with syntax that lets you work with them implicitly.

> There's nothing stopping a program from having more file descriptors passed in, and some programs do.

Can you please cite examples? I've never seen any software do that.

> There's just no standard convention on how to do it.

Yeah. Such conventions would be nice. Perhaps a Plan 9 style virtual file system for all programs...

  program/
    input/
      x
      y
    output/
      x+y
      x-y


> Can you please cite examples? I've never seen any software do that.

With `gpg --verify` you can specify which file descriptor you want it to write its status output to. I've previously used it to ensure a file is verified by a key with ultimate trust, something that would otherwise require writing the status output to a file.

Something like this:

    statusfile="$(mktemp -t gpgverify.XXXXXXXXX)"
    gpg --status-fd 3 --batch --verify "$sigFile" "$file" 3>"$statusfile"
    grep -Eq '^\[GNUPG:] TRUST_(ULTIMATE|FULLY)' "$statusfile"; ok=$?
    rm -f "$statusfile"
    [ "$ok" -eq 0 ] || exit 1
GPG also has `--passphrase-fd`. Possibly other options too.


> Can you please cite examples? I've never seen any software do that.

Well, almost no one actually says “input is read from fd 0 (stdin) and 4”, for example. Generally you say “input is read from file1 and file2”, and then the user can pass “/dev/fd/0 /dev/fd/4” as arguments. This copes better when the parent process doesn’t want to close any of its inherited file descriptors.
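A sketch of that pattern, using `paste` as a stand-in for such a program (file names are made up):

    #!/bin/sh
    # The "program" (paste) only sees two filenames; the /dev/fd/N
    # paths route inherited descriptors 3 and 4 to it.
    printf 'left\n'  > a.txt
    printf 'right\n' > b.txt
    paste /dev/fd/3 /dev/fd/4 3< a.txt 4< b.txt
    rm a.txt b.txt
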


Here's an example of how you would allow reading from stdin by using a different descriptor (3) for the input you're iterating over. I knew this was possible mainly because I also recently needed to receive user input while iterating over file output in bash.

https://superuser.com/a/421713
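The gist of the trick (my paraphrased sketch, not the linked code verbatim): loop over fd 3 so that fd 0 stays free for the user.

    #!/bin/bash
    # Iterate over a file on fd 3; stdin (fd 0) stays available
    # for interactive `read` calls inside the loop body.
    printf 'a\nb\n' > lines.txt
    while read -r -u 3 line; do
        echo "processing: $line"
        # read -p "continue? " answer   # would read the terminal, not the file
    done 3< lines.txt
    rm lines.txt
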


> What if programs could have any number of input and output file descriptors and the numbers/names along with their contents and data types were documented in the manual?

What you’re describing is similar to one of the more common ways that programs are run on mainframes by using Job Control Language (JCL).


Do you know of alternatives to that? I assume PowerShell, but I don't know if there's anything beyond that.


You can always set up some named pipes. E.g.:

    mkfifo named_pipe

    echo "Hi" > named_pipe &

    cat named_pipe

I used to do this in bash scripts to keep them cleaner and the lines simpler.


Named pipes and tee (or gnu parallel, depending on the problem) make this semantically much clearer. It's so much better than bracket-and-sed-hell spread out over different lines.
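For instance, fanning one stream out to two consumers reads fairly linearly with fifos and tee (a sketch; the file names are made up):

    #!/bin/bash
    # Fan one stream out to two consumers, no nested substitutions.
    mkfifo upper.fifo count.fifo
    tr a-z A-Z < upper.fifo > upper.out &
    wc -l      < count.fifo > count.out &
    printf 'x\ny\n' | tee upper.fifo > count.fifo
    wait                      # both consumers have finished here
    cat upper.out count.out
    rm upper.fifo count.fifo upper.out count.out
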


Where does sed come into this?


It doesn't necessarily -- I just meant that if you have <(<(...) <(...)) type structures, then adding a punctuation-jumble of a command in the middle of each sub-subshell is a good way to murder readability quickly. Sed, and to a lesser extent awk, tend to be good examples of tools that (can) use a _lot_ of brackets and symbols...


How does error handling together with this work? Can pipefail catch this or does one explicitly need to ‘wait’ for the background processes and check them there?


Guessing that pipefail cares 0% about which file descriptors are used and only about the exit codes of the processes in the pipeline.
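That matches my understanding; a quick bash sketch of the difference (the `wait "$!"` form needs a reasonably recent bash, 4.4+):

    #!/bin/bash
    set -o pipefail
    # pipefail looks at the stages of a real | pipeline...
    false | true
    echo "pipeline status: $?"          # prints: pipeline status: 1
    # ...but a failing process substitution leaves $? untouched:
    cat < <(false)
    echo "substitution status: $?"      # prints: substitution status: 0
    # $! holds the last substitution's PID, so you can wait explicitly:
    cat < <(false)
    wait "$!" && echo "substitution ok" || echo "substitution failed"
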


I remember seeing some academic work on extending shell semantics to more complicated pipe networks, but nothing particularly promising. In industry, I think that's generally the point where people pick up a "real" programming language instead of trying to work in shell; off the top of my head, I imagine Go with its channels and goroutines would be particularly well suited to these sorts of problems. I can't say if there's something in Go that shells could adapt somehow.


But of course you can do the same thing in 30 seconds that Go would take 30 minutes for. Especially if you’re trying to process-substitute a shell pipeline, not just one command.



