It’s time open source focused on usability
Open source is slouching towards fragmentation, as every new framework or open source architecture ships its own particular API, layers, or even wire protocol. In this article, Yaron Haviv explains why the open source community needs to work towards collaboration and standardization for the good of us all.
Public cloud adoption is growing rapidly. At the same time, cloud providers continue to announce a slew of products, all fully integrated and simple to use. Cloud services, which have traditionally been proprietary, are now moving up the stack. They’re not competing with infrastructure companies anymore – cloud services are now challenging established database vendors, open source big data and container frameworks, security software, and even developer and APM tools.
While open source is a key ingredient, the public cloud is proving that usability and integration matter more to many customers than access to an endless variety of disjointed open source projects. It’s quite clear to customers that while proprietary cloud solutions may lock them in, simplicity and time to market still outweigh long-term freedom.
A wake-up call for the tech and software industry
We must favor collaboration, standardization and full stack integration over hundreds of overlapping open source projects.
Back in the day, we used to focus on creating modular architectures. We had standard wire protocols such as NFS and RPC, and standard API layers such as BSD sockets and POSIX. Those were fun days: one could purchase products from different vendors, and they actually worked well together and were interchangeable. There were always open source implementations of a standard, but developers also built commercial variations that extended functionality or improved durability.
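That interchangeability is easy to demonstrate. The minimal sketch below uses Python's `os` module, which wraps the POSIX file API: because every conforming filesystem honors the same open/read/write/close contract, the same code runs unchanged against a local disk, an NFS mount, or tmpfs.

```python
import os
import tempfile


def write_and_read(path: str, payload: bytes) -> bytes:
    """Round-trip bytes through any POSIX-conforming filesystem."""
    # open/write/close and open/read/close are the same calls no
    # matter which vendor's filesystem backs the path.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


# Point the directory at an NFS mount instead and nothing else changes.
with tempfile.TemporaryDirectory() as d:
    assert write_and_read(os.path.join(d, "demo.txt"), b"hello") == b"hello"
```

The point is not the file I/O itself but the contract: vendors competed on the implementation underneath the API, not on the API's shape.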
The most successful open source project is Linux, and we tend to forget that it has very strict APIs and layers. New kernel implementations must often be backed by official standards (USB, SCSI, and so on), which lets open source and commercial implementations live happily side by side in Linux.
If we contrast Linux with the state of open source today, we see a wide range of overlapping implementations. Take the big data ecosystem as an example: in many cases there are no standard APIs or layers, let alone standard wire protocols. Each processing framework (Spark, Flink, Presto) has its own data source API. Projects are not interchangeable, causing far worse lock-in than commercial products that conform to a common standard.
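To make the alternative concrete, here is a purely illustrative sketch of what a shared data-source contract could look like. The `DataSource` interface and `InMemorySource` class are hypothetical, not any project's real API: the idea is that engines program against the interface while sources implement it, so either side can be swapped independently.

```python
from abc import ABC, abstractmethod
from typing import Iterator


class DataSource(ABC):
    """Hypothetical standard contract an engine could consume."""

    @abstractmethod
    def schema(self) -> list:
        """Return the column names this source exposes."""

    @abstractmethod
    def scan(self) -> Iterator[dict]:
        """Yield rows as plain dictionaries."""


class InMemorySource(DataSource):
    """One interchangeable implementation; a Parquet or Kafka
    source would implement the same two methods."""

    def __init__(self, rows):
        self._rows = list(rows)

    def schema(self) -> list:
        return sorted({k for row in self._rows for k in row})

    def scan(self) -> Iterator[dict]:
        return iter(self._rows)


def count_rows(source: DataSource) -> int:
    # An "engine" written against DataSource works with any implementation.
    return sum(1 for _ in source.scan())


src = InMemorySource([{"id": 1}, {"id": 2}])
assert count_rows(src) == 2
assert src.schema() == ["id"]
```

With a contract like this, switching from one framework's connector to another's would be a configuration change rather than a rewrite.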
How did we get here?!
The tech industry is going through monumental change driven by digital transformation. This affects the infrastructure and software stack dramatically. The old guard is in survival mode, and we seem to be missing responsible tech leadership to define and build a modular stack for the new age. We’ve got to exercise responsibility and work together with a focus on integration, not code. All cloud providers should join in, so that we can benefit from interchangeable, pluggable cloud services.
It’s good to see The Linux Foundation and its members trying to make sense of this mess. CNCF and projects like Kubernetes have special interest groups (SIGs) that define APIs and implementations, open communication channels on Slack, and agreed release schedules. The next step is greater collaboration across container frameworks as well.
Hundreds of participating companies make contributions, and it’s great when they’re able to generate revenue on parallel products or management tools that make users’ lives simpler, as that revenue finances their open source work. However, it is paramount that baseline architectures and the APIs between layers be standardized, so that customers have the freedom to use different components.
I’d like to see the Apache big data ecosystem and ODPi close ranks around the same approach of defining layered APIs. The recent common file-system abstraction (HCFS) is a good start, but there is a lot more work to be done in eliminating project overlap and API sprawl – and getting it done will help ODPi attract more active participants.
I suggest we also seek collaboration between CNCF and ODPi, because it doesn’t make sense for the big data and cloud efforts to each have their own way of handling scheduling, security, networking and data management. After all, most big data solutions are deployed in the cloud!
Collaboration and standardization are the only way to get back to a decent user experience, one in which we can easily build, secure and operate integrated stacks from independent components, with the ability to swap parts if need be and without getting locked into project-specific or cloud provider APIs. If we don’t do this, we will all lose and increasingly become technically enslaved to proprietary clouds.