How does this compare to systems like https://coccinelle.gitlabpages.inria.fr/website/ ? What will happen if you try to refactor a large code base the size of the Linux kernel with this apparently Java-based tool?
Background for folks not familiar with Coccinelle: the Linux kernel has a fairly extensive Coccinelle setup. Coccinelle has been leveraged extensively for large tree-wide code transformations in the kernel.
OpenRewrite originally started at Netflix, where it operated on Netflix's entire codebase (tens of thousands of repositories). The Moderne platform runs OpenRewrite recipes on hundreds of millions of lines of code.
Very naive question: why does Netflix have so much code? I don't mean it in a "they could do it with way less" way, I'm genuinely curious about all of the things that they are doing.
One of our early partners has over 250 million lines of Java code alone. I don't think they're unusual. Every large-scale organization we've worked with has a similar situation, though the circumstances are different.
For Netflix, the easy answer is probably "microservices", but also there is more going on behind the scenes than it seems. They had to become at least partially their own content producer, so there is a great deal of tech around the studio itself. There's a rich data platform. There are 200+ repositories in Netflix OSS, 100+ in Netflix Skunkworks, and then there are the larger projects like Spinnaker (50+ repositories).
Software engineering has reached a level of industrialization; it's not a cottage industry anymore. And that's why I'm so fascinated by this project -- software supply chain activities are masquerading as technical debt, and engineers are left with the responsibility for it. It's time we start applying automation where we can.
Semgrep’s focus is on static analysis/search and is based on rules that developers need to write in a new DSL. Autofix is experimental and is one pattern replaced with another. https://semgrep.dev/docs/experiments/overview/
OpenRewrite originated to do transformations of code, specifically to remove a Netflix proprietary logging library and replace it with SLF4J. The predecessor of OpenRewrite was Gradle Lint (https://github.com/nebula-plugins/gradle-lint-plugin), commonly used to update Gradle build configuration. OpenRewrite added search after transformation, and search can be very flexible (search for all usages of a particular package/any method, not just a specific method invocation). Instead of being DSL based, OpenRewrite provides a set of building blocks called recipes that can be combined together to create more powerful recipes. When building blocks are not enough, you can write a custom recipe in the same language as what you are managing: Java for Java, and TypeScript for JavaScript/TypeScript (coming soon).
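To make the "building blocks combined into bigger recipes" idea concrete, here is a toy sketch in plain Java. To be clear, this is not the OpenRewrite API: the `Recipe` interface, the helper methods, and the `com.example.legacylog` library are all made up for illustration, and it does naive string replacement where a real tool would work on a type-aware syntax tree.

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class RecipeSketch {

    /** Hypothetical stand-in for a recipe: a function from source text to source text. */
    interface Recipe extends UnaryOperator<String> {}

    /** Building block: replace one fully qualified type name with another (naive text match). */
    static Recipe changeType(String oldName, String newName) {
        return source -> source.replace(oldName, newName);
    }

    /** Building block: rename a method call, matched naively as ".name(". */
    static Recipe renameMethod(String oldName, String newName) {
        return source -> source.replace("." + oldName + "(", "." + newName + "(");
    }

    /** Combine building blocks into one larger recipe that applies them in order. */
    static Recipe composite(List<Recipe> steps) {
        return source -> {
            String result = source;
            for (Recipe step : steps) {
                result = step.apply(result);
            }
            return result;
        };
    }

    /** Example composite: migrate from a made-up logging library to SLF4J. */
    static Recipe migrateLogging() {
        return composite(List.of(
                changeType("com.example.legacylog.AppLogger", "org.slf4j.Logger"),
                changeType("com.example.legacylog.Loggers", "org.slf4j.LoggerFactory"),
                renameMethod("obtain", "getLogger")));
    }

    public static void main(String[] args) {
        String before =
                "com.example.legacylog.AppLogger log = com.example.legacylog.Loggers.obtain(Foo.class);";
        System.out.println(migrateLogging().apply(before));
    }
}
```

The point is only the shape: small single-purpose steps, plus a composite that is itself a recipe and can be nested inside other composites.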
I wonder what compelled the authors to focus on Java first, given that it accounts for an ever smaller chunk of the code space and market share has been decreasing year on year for quite some time.
It is the most popular compiled, strongly typed language, so they went for both market share and ease of implementing refactoring (vs. a dynamic language like Python).
The majority of Java software I've seen is also in need of much refactoring, and tends to be overly complex, which makes the former more difficult to do.
It's also incredibly tedious when you do set about modernizing it. Fixing these things is more possible than you would initially think. Sometimes, we can get an app _almost_ all the way there, with just a little left to do: https://github.com/spring-cloud/spring-cloud-dataflow/pull/4...
A couple of us did workshops on monitoring and delivery automation around the world for a while, and kept seeing the same problems over and over again. We triangulated on those: many business apps are written in Java and are struggling to keep up with the same core set of OSS. Seemed like a reasonable place to start.
OpenRewrite also supports YAML-based IaC transformation, XML, Terraform HCL, etc. A TypeScript language binding is being built now.
We feel that to do this well, you really have to respect the idiosyncrasies of each language and build for them.