It's a bandaid on a wider problem: the design of the Unix shell is bonkers and the whole thing should be deleted. Why? Because I haven't seen any other tool ever have so many pitfalls. Take n random languages and m random developers, tell them to loop over a string array and print its contents, and count how many correct programs you get on average per language. There will be easy languages, then difficult languages, then a huge gap, and then the Unix shell, because in your random sample you managed to get one guy who has a PhD in bash.
The main problem is using text as a common format between different applications.
First: text is not well defined. Is it ASCII? Is it UTF-8? Some programs can even spew UTF-32 with the proper locale configured; it's a mess.
Second: encoding and decoding of objects to text is not defined at all. Those problems with filenames are just one example. Using newline as a separator is a natural thing that is easy to implement, yet it is wrong.
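To make the second point concrete, a minimal sketch (assuming bash and GNU ls; other ls implementations may quote non-printable characters differently when writing to a pipe):

    touch $'bad\nname'      # a single file whose name contains a newline
    ls | while IFS= read -r f; do printf '<%s>\n' "$f"; done
    # prints <bad> and <name>: one file shows up as two "records"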
In my opinion two things should be done:
1. Standardise on UTF-8. No other encodings allowed.
2. Standardise on JSON. It is good enough to serve as a universal exchange format, and tools like `jq` have existed for some time now.
So any utility would have to read and write JSON objects when some standard env is set. And shells can be developed with better syntax to deal with JSON. This way you could write something like
`ps aux | while read row; do echo ${row.user} ${row.pid}; done`
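For what it's worth, something close is already possible with `jq` whenever a tool emits JSON; `ps-json` below is a hypothetical stand-in for a JSON-emitting `ps`:

    # hypothetical: ps-json prints one JSON object per process, e.g. {"user":"root","pid":1,...}
    ps-json | jq -r '"\(.user) \(.pid)"'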
> It is good enough to serve as a universal exchange format, and tools like `jq` have existed for some time now.
Please don't use that underdefined joke of a spec. Define "PosixJson" and use that instead. Right now it's not even clear what the result of parsing {"a": 1234678901234567890} is. Is this a parse error? A bigint? A float/double? Quiet wraparound? Something else? I've seen all these behaviors in real world JSON implementations across different languages.
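You can watch that ambiguity play out from the shell; what this prints depends on your jq version (as far as I know, older releases coerce every number to an IEEE double, while newer ones try to preserve unmodified integer literals):

    echo '{"a": 1234678901234567890}' | jq .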
> A file that contains characters organized into zero or more lines. The lines do not contain NUL characters and none can exceed {LINE_MAX} bytes in length, including the <newline> character.
So, if you have some non-printable characters like BEL/␇/ASCII 0x07, that's still a text file.
(and I believe which bytes count as a valid character depends on your `LC_CTYPE`).
But the moment you have a line longer than {LINE_MAX} bytes (which can depend on which POSIX environment you have), suddenly your text file is now a binary file.
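(You can check what your system uses with `getconf LINE_MAX`.)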
Kind of a weird definition indeed. One edge case: the definition states the file must contain characters, so presumably zero length files are out. But then how could you have zero lines?
Yes, obviously. But the POSIX specification for a "text file" as above is that it contains characters, which an empty file by definition does not. So an empty file cannot be a text file if you read that specification strictly, and therefore you cannot have zero lines in a text file. As soon as you have a single character there is at least one line, and the number of lines can only stay the same or grow from there.
The definition should read "one or more lines" instead or (probably better) specify that a text file contains "zero or more characters".
What cursed madness have you hit that spits out UTF-32 under normal conditions?! That can only be a bug - UTF-32/UCS-4 never saw external use, and has only ever been used for in-memory fixed-width character representation, e.g. runes in Go.
You never have to worry about whether you're dealing with ASCII vs. UTF-8, but rather if you're dealing with UTF-8 vs. ISO-8859-1, or worse, Shift JIS or similar.
I think a lot of tools should support JSON as well as plain text, probably the latter by default and the former with a "-o json" or similar option. I'm fine with wc giving me `5`; I'd prefer that to `{ "characters": 5 }`.
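Some tools already do this; iproute2, for instance, grew a JSON output flag a while back (assuming your version is recent enough):

    # -j asks ip for JSON output; jq then picks out the interface names
    ip -j addr show | jq -r '.[].ifname'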
There are exchange formats that are well-defined enough to be useful to many computers while also being readable enough to be traversed by human eyes. There's no reason to do everything ad hoc; you don't gain much from it. You also control the shell itself - there's no reason you can't display object representations in a pretty way.
JSON itself is bad for a streaming interface, as is common with CLI applications. You can't easily consume a JSON array without first reading it in its entirety. JSONL would be a better fit.
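A small sketch of the difference: with JSONL each record is a complete JSON value on its own line, so a consumer such as jq can handle records as they arrive instead of buffering a whole array:

    printf '%s\n' '{"pid":1,"user":"root"}' '{"pid":42,"user":"alice"}' \
        | jq -r '.user'
    # prints "root", then "alice", one record at a time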
But then, how well would it work for ad-hoc usage, which is probably one of the biggest uses of shells?
> I haven't seen any other tool ever have so many pitfalls.
I haven't seen any other tool with so much general utility and availability.
> to loop over a string array and print its contents
Is incredibly easy in bash and bash-like shells. As highlighted, the issue is that tools like 'ls' don't create "a string array"; they create one giant string that has to be parsed. The rules in the shell are different from those in other languages, but it /will/ do most of the parsing for you, or all of it, if you do it carefully.
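For example, a minimal sketch of the careful version, letting the shell's globbing build a real array instead of parsing `ls` output:

    files=(*)                      # globbing yields one array element per file, whitespace and all
    for f in "${files[@]}"; do     # quoting keeps each element intact
        printf '%s\n' "$f"
    done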
This is a fine tradeoff, as evidenced by its wide usage and the lack of convincing replacements.
Someone needs to come up with an interactive shell first, one that is comparable in usability. Then we can think about replacing the Unix shell.
I tried both Python and Lua interactively, but they are a pain when it comes to handling files. You have to type much more to get the same things done.
The bigger issue is the sheer momentum of the Unix shell. Even if you come up with an alternative that is better by every objectively measurable metric, it's still going to be a monumental task to have it packaged with commonly used distros. Kinda like the "why can't the US switch to the metric system" problem.
I'm sure you might get more than 5 people on HN replying to you that they are using fish right now. Say something discrediting about fish and they show up.
Heh, reminds me of how to get help with Linux back in the day. If you directly asked for help, you'd be told to RTFM. If you stated confidently that Windows could do something and that Linux sucks because it can't, you'd get users tripping over themselves with details and instructions, just to prove you wrong.
There's a direct cost in money, time and lives that has come from the US's adherence to their US Customary Units (which are often different to the old imperial units). People have literally died because of the confusion caused by having multiple systems of units in common use with ambiguous names (degrees, gallons, etc). Each year industry worldwide spends an enormous amount of money indirectly precisely because of this problem and it's still incredibly unlikely to be fixed within my lifetime.
Bash-alternatives that are not completely compatible frankly just don't have a chance.
OK let them add an explicit check to standard tools, and/or to open(), mkdir(), etc. with O_PORTABLECHARS. And an environment option to disable this check.
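In the meantime, a rough userland approximation of that check can be done in the shell itself; `name` here stands for whatever you are about to pass to mkdir or open:

    # reject anything outside the POSIX portable filename character set
    case $name in
        (*[!A-Za-z0-9._-]*) echo "non-portable filename: $name" >&2; exit 1 ;;
    esac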
If it isn't distributed out of the box with every *nix-like OS, it inherently isn't "better by every objectively measurable metric" - distribution of a common, stable standard is a huge benefit in and of itself.
Python may often be installed by default, but it's definitely not an essential/required package "out of the box" on every install.
Also, in a thread where one topic is how the POSIX shell handles whitespace in filenames, it's hilarious (not in a good way) that someone suggests a language that handles whitespace the wrong way in its own code. Yes, significant whitespace is objectively wrong.
What OS/distro is Lua included on out of the box? That doesn't mean "available in a package". I mean literally included in every single install, such that it cannot reasonably be omitted.
Regardless of the availability, the parent comment says
> better by every objectively measurable metric
Neither Python nor Lua are "better" than shell, at the types of things shell is commonly used for - they're objectively worse.
Lua gets onto every other Linux distro as a dependency of some base system component. For example, rpm and pipewire depend on Lua. Ubuntu and Debian ship with pipewire by default.
That isn't even close to "installed on every system". Best I can tell from the reverse dependencies, it's required for some GNOME Remote Desktop tool, and as far as I can tell it doesn't rely on Lua anyway (at least on Debian).
> You should use the word "objectively" less.
I specifically used the word "objectively" because the original comment I replied to said this:
> Pipewire being the Pulseaudio replacement from Redhat.
Right, so it's a desktop package that ultimately will be installed on about 1% of all Linux machines because the vast majority are servers without a desktop environment.
Also worth pointing out: liblua, on Debian at least, is the shared library. It's not the binary that executes standalone Lua scripts.
Is this like a game where you come up with bullshit and I have to come up with the facts to rectify it? RHEL/CentOS have more than 1% market share alone.
Check your own installs and tell me if you find some that don't have liblua or libluajit.
For the library thing: I said "Python and lua are pretty close to that" earlier. I did not say that they have interpreters ready everywhere. But if the language core is already installed on a large fraction of machines, then adding the interpreter is not a big cost.
> already installed on a large fraction of machines
So far you've presented no evidence of this though, just that it's used by a new desktop-focused package.
All Linux desktops over the last 30 years are not even a "large fraction" of total Linux installs, much less the ones that have already migrated to this new audio system.
> adding the interpreter is not a big cost
It's nothing to do with cost. It's about "how do I know this will absolutely 100% run on any POSIX machine I throw it on without any extra steps".
Remember the argument here is about something that is claimed to be "objectively better" than shell. The ubiquitous nature of the POSIX shell is a huge barrier for any possible competitor, and saying "well you just need to install it" just defeats the purpose. You might as well write it in fucking Java and say "well you just need to install a JVM".
Edit to Add:
A good number of systems I manage do have liblua installed... because HAProxy requires it, and those systems have HAProxy installed. Not because it was installed as part of the base OS or even a default group of packages.
Incidentally, HAProxy and thus liblua were installed on those systems by infrastructure management that's implemented as shell script. So what kind of chicken and egg argument do we need to have here about how exactly I can run a Lua script to install Lua?
PowerShell's designers could learn from decades of programming language progress and especially shell usage. They could indeed improve many aspects. This doesn't mean that the original design is "bonkers", only that it's not perfect.
The way PowerShell works is largely based on what the computing world was doing with shells outside Bell Labs (at IBM, Xerox, and other places) at around the same time UNIX was happening.
Modern programming language designers have a bad relationship with verbosity. I don't know why they do this.
It's a language for an interactive shell; the amount you have to type translates directly into developer speed. I understand the desire for clarity, and maybe that's nice in large scripts, but the main goal is to be a shell, so optimize for that. Also, you probably shouldn't be using PowerShell for large scripts anyway.
The only recent lang I've seen that has a handle on this is Rust. You can tell they put a lot of thought into having keywords be as short as possible while still being descriptive.
Those aliases are, I believe, only defined on Windows PowerShell (the closed-source version 5; not PowerShell 7). I wish those default aliases you mentioned weren’t a thing. Especially `curl` (people should use `iwr` instead), which is an alias of `Invoke-WebRequest`, because it makes the `curl.exe` shipped with Windows nearly undiscoverable.
This should not be as downvoted as it is. In a way, the shell is broken. The brokenness is that it requires each command to serialize and deserialize again, with all the weird things that can happen with the "all is a string" kind of approach, instead of having a proper data interchange format or even sending objects to the next steps in the pipeline. This behavior is what necessitates even thinking about the changes listed in the post. We wouldn't even have that problem if the design of the shell had been better thought out. Now we are dealing with decades of legacy built on these shaky foundations. I hate to admit it, but it seems at least this aspect PowerShell got right, whatever one may think about the rest of it.
Dear anal_reactor, what is a "string array"? I have used Unix shells for nearly 30 years and never heard of them. And I consider myself a script-fu master!
There are two array-like constructions in the shell: a list of words (separated by spaces) and a list of lines (separated by newlines). Both are implemented as a single string, and the shell makes it trivial to iterate through the components.
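A minimal illustration of both constructions:

    # list of words: the shell splits the unquoted expansion on $IFS
    words="alpha beta gamma"
    for w in $words; do printf '%s\n' "$w"; done

    # list of lines: read consumes one line per iteration
    printf 'first line\nsecond line\n' | while IFS= read -r line; do
        printf '%s\n' "$line"
    done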
That is exactly the problem many people have with it. Encoding "arrays" this way is foreign to everyone who comes from "normal" programming languages. Both variants lead to problems because either separator character can occur inside elements; in the worst case an element contains both at the same time. I can see why this leads to confusion and bugs.
It’s like people saying they won’t learn French because it has a different grammatical structure. There’s no “normal” natural language. If you’re used to the C-like syntax, learning C-like language will be easy. But that’s not an argument to say Lisp is confusing.
That's why I put normal in quotes. There is however more to it than having a different grammatical structure: it works differently from many commonly used languages, which have actual arrays/lists where elements can contain anything the type allows. If you come from any of the common modern programming languages (let's say Java, Kotlin, C#, JS/TS, Python, Swift, Go, Rust, etc.) and expect something similar (because many of them are very similar), you will be confused. Using spaces or newlines to encode elements in a single string is just not robust and leads to easy-to-make mistakes.
Most of these languages were created long after bash and the other shells. The fact is that shell scripting allows unquoted strings, and quoting is a specific operation, not syntax. Also, shell scripts were meant for automation, not for writing general programs. The basic units are commands, arguments, input, output, files, … so the design makes these easy to manipulate.
I’m not saying that we can’t improve, but I’m more in favor of making the tool more apt to solve a problem than making it easier to learn. Because the latter often wants to forego the requirement of understanding the problem space.
Yes, these are newer. I mainly wanted to make the point that it is confusing if you are new to bash and come from these newer languages with the wrong expectations. The concise nature and many subtle details make it very difficult for beginners and infrequent users.
Compare this to the newer programming languages, where you explicitly call methods with descriptive names like .Trim() and .EndsWith(), with support from the compiler and IDE.
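For comparison, here is one common way to express trimming and a suffix check in plain POSIX shell, which is exactly the kind of terseness at issue (a sketch; the trimming idiom relies on nested parameter expansion):

    var="  report.txt  "
    trimmed=${var#"${var%%[![:space:]]*}"}            # strip leading whitespace
    trimmed=${trimmed%"${trimmed##*[![:space:]]}"}    # strip trailing whitespace
    case $trimmed in (*.txt) echo "ends with .txt" ;; esac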
In my experience, automation and general programs often turn out to be the same thing once things get more complicated. Bash scripts usually grow rapidly and are a giant PITA to maintain or refactor. Throw in build systems and helper scripts and you quickly end up with a giant pile of spaghetti. Personally, I just switch to one of the mentioned programming languages once it goes beyond a simple sequence of operations.
Personally, I don't see how to improve it much without it becoming a full-blown programming language, at which point it would probably make more sense to just release a composable library for common automation tasks. Maybe I'm just not the right target audience.
The issue with your otherwise good reply is that someone is bringing expectations to an expert tool (programming languages, software, OS) and blindly assuming that everything will work the way they think it should. Familiarity helps with learning, but shouldn't replace it. Someone new to bash should probably start with a book.
And for bigger automation projects, there are lots of projects and programming languages that can help.
I agree it is an issue but it is how many people work and think. Most of the time they are not even wrong. "Hey, I have variables and loops, I know that!".
I would even make the case for expert tools being as unsurprising and familiar as possible unless there is a very good reason for them not to. Also they should be robust against misuse and guide the user towards good practices. There are always beginners, people that rarely need to use it, people that do programming as "just a job" and people that make mistakes because they are distracted, tired or just human. Something like "rm -r /" is a good reminder of that for many people.
Plus there are already a lot of tools required. Reading a book about every tool I have to use would be impractical for most projects. Maybe more expert tools should just be tools. The same way I can now just use Ubuntu and get a working desktop system including drivers for most common hardware. If I compare that to the past, where I installed a Linux distribution and then found out I lacked a driver for my network card and needed to download it from the internet... I still can modify my system if I need to, but it's nice that I don't have to. I think we can do similar things with many parts of development and free some capacity for other tasks.