The problem is they did an exceptionally poor job at designing their language. A reasonably large Terraform codebase is almost universally hard to read for one of two reasons: it's either unexpressive (read: so verbose you lose the thread) or modularized to the point that it's fragmented into a bajillion reusable modules.
SQL is also declarative, but incredibly expressive. A thousand-character query contains enough complexity that it's hard to reason about. A thousand characters of Terraform will barely stand up a CRUD app on AWS.
Designing a language from first principles for this was a mistake. HCL is awful; they should have gone the Starlark route and made a stripped-down version of an existing language instead of making their own language from scratch. This feels like the worst of both worlds. The language is practically imperative, but it has its own syntax that isn't useful outside of this one single domain.
Anyway, you shouldn't have too many resources in a single Terraform workspace, for performance reasons. The real issues with Terraform come when you want to orchestrate different workspaces triggering each other, and end up trying to write that orchestration language, which would itself be declarative.
Terraform built a Stacks feature, but support is Terraform Cloud-only. OpenTofu has issues in the area that have been open for years: https://github.com/opentofu/opentofu/issues/931 and https://github.com/opentofu/opentofu/issues/2860. Progress is slow, in part (IMO) because a genuine solution requires server-side evaluation (i.e. triggering applies as Kubernetes Jobs), and the open-source implementation of Terraform Enterprise/Cloud, Terrakube, is a completely separate project with a completely different group of maintainers.
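To make the orchestration problem concrete, here's a rough Python sketch of what "workspaces triggering each other" boils down to: a dependency graph that needs a topological apply order, plus a way to work out which downstream workspaces to re-apply after a change. The workspace names and the `deps` map are made up for illustration; this is not how Stacks or any OpenTofu proposal actually works.

```python
from graphlib import TopologicalSorter

# Hypothetical workspaces: each name maps to the workspaces it depends on.
deps = {
    "networking": set(),
    "cluster": {"networking"},
    "database": {"networking"},
    "app": {"cluster", "database"},
}

def apply_order(deps):
    """Return an order in which every workspace comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())

def downstream(deps, changed):
    """Return the changed workspace plus everything that transitively depends on it."""
    order = apply_order(deps)
    dirty = {changed}
    for ws in order:
        # A workspace is dirty if any of its dependencies is dirty.
        if deps[ws] & dirty:
            dirty.add(ws)
    return [ws for ws in order if ws in dirty]

if __name__ == "__main__":
    print(apply_order(deps))
    print(downstream(deps, "cluster"))
```

The graph walk itself is trivial. The hard part is everything around it: where it runs, who holds the credentials, and how outputs get passed between workspaces.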
I'd argue the real issue with Terraform is that workspace orchestration is necessary in the first place. If they addressed the performance issues with large workspaces, then we wouldn't need to split up workspaces and Terraform could just orchestrate changes naturally.
The performance issues in large workspaces are due to needing to refresh status on all the resources in the large workspace before coming up with a plan. Actual apply time is either negligible or the inherently long amount of time it's supposed to take.
You split the workspace into smaller workspaces precisely to tell Terraform that you haven't made any changes to the networking layer, so it shouldn't bother refreshing the status of the networking layer to see if any changes are needed; that's not relevant when you're trying to scale up your Kubernetes cluster or whatever.
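A back-of-the-envelope Python model of that argument, assuming plan time is dominated by one refresh call per resource in the workspace. The layer names and resource counts are invented for illustration, and real refresh behaviour is more nuanced than a flat count.

```python
# Hypothetical resource counts per layer (not real measurements).
layers = {"networking": 400, "cluster": 150, "app": 50}

def refreshed_monolith(layers, changed_layer):
    """One big workspace: every resource is refreshed, whatever changed."""
    return sum(layers.values())

def refreshed_split(layers, changed_layer):
    """One workspace per layer: only the changed layer's resources are refreshed."""
    return layers[changed_layer]

if __name__ == "__main__":
    print(refreshed_monolith(layers, "cluster"))  # 600 resources refreshed
    print(refreshed_split(layers, "cluster"))     # 150 resources refreshed
```

Splitting is basically a manual hint about which parts of the graph can't have changed, which is why it also creates the orchestration problem.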