> std::string is a terrible idea with regard to efficiency. If it works for you, chances are you should use Python instead.
Concrete examples of what you mean would help here.
> For anything complex, std::string is slow and has a huge memory footprint (each string has its own buffer, separately allocated from the heap).
Some implementations (e.g., GNU's) store small strings inline and so avoid a heap allocation. Python strings always heap-allocate. If you're referring to the fact that Python strings don't have to chase a pointer to the data (the string data is inlined in the Python object itself), that might have a marginal effect on speed, but in most cases it really isn't going to matter. I'm not seeing the memory comparison: `std::string` is 32 bytes on my system; Python's `str` reports as 49 (plus character data for each).
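To make the inlining point concrete, here is a minimal sketch that checks whether a string's characters live inside the `std::string` object itself. `stored_inline` is a hypothetical helper name, and the result is implementation-dependent: the standard does not require small-string optimization, so treat this as a probe rather than a guarantee.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: returns true if the character data sits inside the
// std::string object itself (small-string optimization) rather than in a
// separately allocated heap buffer.
inline bool stored_inline(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    // std::less gives a well-defined ordering even for unrelated pointers.
    std::less<const char*> before;
    return !before(s.data(), begin) && before(s.data(), end);
}
```

On a 64-bit libstdc++ build, I'd expect `stored_inline` to return true for strings of up to 15 characters and false beyond that; other standard libraries use different thresholds.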
What would you prefer std::string did to make it more amenable to "a few megabytes of data"? (A case that, honestly, I think it handles just as well as most languages' string types.)
(Now, if we want to harp on std::string's Unicode support, I won't stand in your way.)
> Concrete examples of what you mean would help here.
If it works for you, chances are you're doing something that Python is equally suited for, and much more convenient for. (Not necessarily, though.)
> I'm not seeing the memory comparison
I didn't make one. I was saying something similar to "AVL trees vs RB trees largely doesn't matter. If you care about efficiency, you don't use either since they're both slow. If you don't care, the small speed difference doesn't matter".
> What would you prefer std::string did to make it more amenable to "a few megabytes of data"? (A case that, honestly, I think it handles just as well as most languages' string types.)
Start by identifying strings that are immutable once they're constructed. These are usually the lion's share. Then start pooling these strings to avoid allocation overhead. This brings the overhead of a string down to a pointer or offset, and optionally a length, which costs anywhere from 4 to 16 bytes. Then, think about string interning to weed out duplicates.
This reduces memory consumption significantly: depending on average string size, savings of up to around 80%. And maybe more importantly, you can pass strings around freely without any overhead, since a string handle fits in a single register.
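A minimal sketch of the pooling-plus-interning idea, assuming C++17. Every name here (`StringPool`, `StringHandle`, `intern`, `get`, the 64 KiB chunk size) is made up for illustration; it is not a tuned implementation. Immutable strings are copied into large chunks that never move, duplicates are collapsed through a hash map, and callers hold only a 4-byte handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical handle type: a 4-byte index into the pool.
using StringHandle = std::uint32_t;

class StringPool {
public:
    // Returns the handle for `s`, copying it into the pool only if it is new.
    StringHandle intern(std::string_view s) {
        if (auto it = index_.find(s); it != index_.end())
            return it->second;  // duplicate: reuse the existing copy
        std::string_view stored = copy_into_chunk(s);
        auto handle = static_cast<StringHandle>(entries_.size());
        entries_.push_back(stored);
        index_.emplace(stored, handle);
        return handle;
    }

    // Returns the characters behind a handle; valid for the pool's lifetime.
    std::string_view get(StringHandle h) const {
        assert(h < entries_.size());
        return entries_[h];
    }

    // Number of distinct strings stored.
    std::size_t size() const { return entries_.size(); }

private:
    static constexpr std::size_t kChunkSize = 64 * 1024;  // arbitrary choice

    // Bump-allocates from the current chunk, starting a new one when full.
    // Chunks are never reallocated, so previously returned views stay valid.
    std::string_view copy_into_chunk(std::string_view s) {
        if (chunks_.empty() || s.size() > capacity_ - used_) {
            capacity_ = s.size() > kChunkSize ? s.size() : kChunkSize;
            chunks_.push_back(std::make_unique<char[]>(capacity_));
            used_ = 0;
        }
        char* dst = chunks_.back().get() + used_;
        if (!s.empty())
            std::memcpy(dst, s.data(), s.size());
        used_ += s.size();
        return std::string_view(dst, s.size());
    }

    std::vector<std::unique_ptr<char[]>> chunks_;  // owns the character data
    std::size_t capacity_ = 0;                     // size of the current chunk
    std::size_t used_ = 0;                         // bytes used in the current chunk
    std::vector<std::string_view> entries_;        // handle -> characters
    std::unordered_map<std::string_view, StringHandle> index_;  // characters -> handle
};
```

Because equal strings always get the same handle, comparing two interned strings is a single integer comparison. The trade-off in this sketch is that the pool itself keeps a 16-byte view plus a hash-map entry per distinct string; only the handles you pass around are 4 bytes.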
> (Now, if we want to harp on std::string's Unicode support, I won't stand in your way.)
Unicode is a mess. For everything I've ever done, byte arrays (UTF-8) were the right choice.