OpenStack’s sweet spots seem to be SaaS providers and carriers. Public deployments will struggle; private clouds are difficult and may be ephemeral.
It’s two weeks after the OpenStack Summit in Hong Kong and one week after the AWS re:Invent event in Las Vegas, and social media is full of passionate debate about the state of OpenStack, the future of private clouds, the juggernaut that is AWS, and more.
For those less Twitter-obsessed than I, here are a few of the key pieces:
to whom it may concern
What I saw at the OpenStack Summit
Why vendors can’t sell OpenStack to enterprises
Not Everyone Believes That OpenStack Has Succeeded
Inside OpenStack: Gifted, troubled project that wants to clobber Amazon
OpenStack Wins Developers’ Hearts, But Not IT’s Minds
The last twelve months
The End of Private Cloud – 5 Stages of Loss and Grief
Most of the discussion is focussed on the Holy Grail of “enterprise”, and that was certainly the focus of re:Invent. But that’s not the only market for OpenStack; as I wrote in “A funny thing happened on the way to the cloud” we’ve had substantial “mission creep” since the days of the NIST taxonomy. Different members of the community are interested in addressing different kinds of use cases with OpenStack. How is this affecting the architecture and processes of OpenStack? Is it practical for OpenStack to serve all of these needs equally well, and what are the costs of doing so?
There are some pundits (@krishnan, for instance, and @cloudpundit) who argue that OpenStack’s role is to be a kit of parts from which different organizations – vendors and large users – will assemble a variety of solutions. On this view, it doesn’t particularly matter if the APIs for different OpenStack services are somewhat inconsistent, because the creator of the public cloud or distribution will do the necessary work on “fit and finish”; if necessary they may replace an unsuitable service with an alternative implementation. (At the extreme end of that camp we have people like @randybias who want to replace the entire API with an AWS workalike.) On the other hand, there is a movement afoot, led by @jmckenty, @zehicle and others, to develop a certification process to improve interoperability of OpenStack implementations in the service of hybrid deployments and to help to grow the developer ecosystem. Rather than asking which of these is the “right” position, it’s probably more instructive to see how the OpenStack community is actually behaving.
There seem to be five distinct areas where OpenStack is being used:
- Public IaaS cloud – Rackspace, HP, etc.
- SaaS provider – PayPal, Yahoo, Cisco WebEx
- Carrier infrastructure – AT&T, Verizon
- Private IaaS cloud (often hosted)
- Enterprise datacenter automation
Most of these are fairly self-explanatory, but the distinction between the last two is important. Both are typically enterprise or government customers. The first is usually a greenfield deployment with a “clean sheet” operational philosophy ; the second is an attempt to provide some automation and self-service to an existing enterprise data center, complete with heterogeneous infrastructure and traditional operational policies.
Let’s see how OpenStack is doing in each of these areas:
Public IaaS cloud
Public cloud service is all about the economics of operation at scale. Stable interfaces – both APIs and tools. Consistent abstractions, so that you can change the implementation without breaking the contract with your customers. Measuring everything. Automating the full lifecycle. Capacity planning is key.
OpenStack has been shortchanging this area. The API story is weak, with too many changes without adequate compatibility. The default networking framework doesn’t really scale, and alternatives like NSX, Nuage, OpenContrail and Midonet simply replace all of the Neutron mechanisms. (They don’t necessarily interoperate with all of the vendor-supplied Neutron plugins.) Mechanisms for large-scale application deployments, like availability zones and regions, are implemented inconsistently across the various services.
On the other hand, public clouds are typically (or ideally!) operated at a large enough scale that, as Werner Vogels put it, “software costs round to zero”. So they can afford to throw engineering resources at filling the gaps and fixing the issues.
The most difficult issue for public clouds based on OpenStack is around features. The main competitors are AWS, Google, and Microsoft, all of which can add new services, focussed on customer requirements, much more quickly than the OpenStack community. Rackspace, HP and others face a dilemma: do they wait for the OpenStack community to define and implement a new service, or do they create their own service offerings that are not part of the OpenStack code base? Waiting for the community cedes the market to the proprietary competition, and has other complications, such as the requirement that there has to be an open source reference implementation of every OpenStack service, and the potential for compromise to address the needs of different parts of the community. Proceeding independently may help to close the competitive feature gap, but it’s likely to lead to substantial “tech debt” and/or compatibility issues when the community finally gets round to delivering a comparable service.
A SaaS provider combines the operational scale of a public cloud with the captive tenant base of a private cloud. Large-scale networking issues dominate the architectural discussion. The dominant KPI is likely to be “code pushes per day”. API issues are less critical, since there is usually a comprehensive home-grown applications management framework in use. As with the public cloud, the SaaS provider has the expertise and engineering resources to do large scale customization and augmentation.
OpenStack is serving this constituency relatively well, although scalability remains a concern.
Wireless and wire-line carriers are looking forward to NFV, which will allow them to replace dedicated networking infrastructure with virtualized software components that can be deployed flexibly and efficiently. Is is therefore not surprising that they are interested in infrastructure automation technologies that will facilitate the deployment of VMs and the configuration of their networks. What distinguishes the carriers from other OpenStack users is that their applications often cut across the typical layers of abstraction, particularly with respect to networking. In a public IaaS, the tenant VMs interact with virtualized networking resources – ports, subnets, routers, and load balancers. They have no visibility into the underlying technologies used to construct these abstractions: virtual switches, encapsulation, tunnels, and physical and virtual network appliances. This opaque abstraction is important for portability and interoperability. For carriers, it is often irrelevant: their applications may perform direct packet encapsulation, and can manipulate the chain of NFV services.
There’s a lot of interest in these use cases within OpenStack today. One obvious concern relates to the status of the APIs involved. Public cloud providers probably won’t want their tenants diving in to manipulate service chaining, or getting access to the MPLS or VXLAN configuration of the overlay network. Today the only way of limiting access to specific OpenStack APIs is the Keystone RBAC mechanism, which doesn’t enforce any kind of semantic consistency. One solution might be to package up specific APIs into different OpenStack “Editions”.
It seems likely that the specific use cases for OpenStack in managing carrier infrastructure are sufficiently bounded that the lack of major application services will not be a problem.
Private IaaS cloud
There is a persistent belief that enterprise customers want – and need – private IaaS clouds. Not IaaS-like features bolted on to their existing infrastructure, but pure NIST-compliant IaaS clouds that just happen to be private, running on wholly-owned physical infrastructure. There are several arguments advanced for this. One – InfoSec – is probably unsustainable: public clouds invest far more in security and compliance than any enterprise could hope to, and the laws and regulations will soon reflect this. The second – cost – is occasionally valid, but widely abused: ROI analyses rarely take into account all costs over a reasonable period of time. In addition, the benefits of an IaaS cloud usually depend on the development of new, cloud-aware applications, and such applications can usually be designed to operate more cost-effectively in a public cloud.
So how’s OpenStack doing for private clouds? Not very well. The cost and complexity of deploying OpenStack is extremely high, even if you work with an OpenStack distribution vendor and take advantage of their consulting services. Yes, there are plenty of tools for doing an initial deployment (too many), but almost none for long-term maintenance. To achieve enterprise-grade operational readiness you’ll have to supplement OpenStack with at least a dozen additional open source or commercial tools*, and do the integration yourself; then you’ll be responsible for maintaining this (unique) system indefinitely.
Analyst surveys suggest that most enterprises are looking at private clouds as part of a hybrid cloud strategy. In this case, the lack of high-fidelity compatibility with most public clouds is going to be a problem. There are actually two issues: API interoperability (e.g. good support for the AWS APIs in OpenStack), and feature mismatch (AWS has more, richer features than OpenStack, and the gap is growing).
Once upon a time, the private cloud was seen as a radical alternative to the traditional enterprise datacenter: an opportunity to replace bespoke server and networking configurations with interchangeable pools of infrastructure, and to deliver automated self-service operations in place of bureaucratic human procedures. Great emphasis was placed on the need to design the cloud service from the top down, focussing on the requirements of the users, rather than viewing it as a layer on top of existing enterprise virtualization systems. It was (correctly) assumed that many traditional data center management practices would be incompatible with the kind of automation provided by cloud management platforms like OpenStack and CloudStack.
Unfortunately, many enterprises felt the need to try to cut corners: to deploy IaaS within their existing data center environment, leveraging existing infrastructure. Some literally treated the cloud as “just another large application cluster”. Many of these early experiments failed, because of the difficulty of making cloud operations conform to existing policies. The number of successful projects of this kind is a matter of debate.
The OpenStack project has been doing a lot to facilitate this kind of deployment. Brocade and its partners have integrated FC SAN support into the Cinder storage service, and we’ve proposed improvements to Neutron that will make it much easier to use heterogeneous network resources from different vendors. Mirantis has worked with VMware to allow OpenStack to be deployed on top of vSphere, and Nova now supports the use of several different hypervisors within a single cloud. (The latter is presumably to cater to applications which are sensitive to specific hypervisor features – something that no modern cloud-ready application should care about.)
This work to accommodate legacy infrastructure is obviously addressing a real need. It’s worth asking what the cost has been, particularly in complexity, stability, API governance, and opportunity cost. Could we have delivered a decent load-balancing solution earlier? Would we have a more scalable L3 capability? Hard to tell.
So where does this leave us? It seems to me that the sweet spot for OpenStack today (and for some time to come) is going to be with the SaaS provider, such as PayPal, Cisco WebEx, and Yahoo. (I wonder if the recent announcement by Salesforce.com and HP means that SFDC will be moving in that direction.) Carriers will happily do their own thing, with potentially awkward implications for networking APIs. Public clouds will face the challenge of back-porting their (many) changes to the trunk, and figuring out how to keep up with AWS. And enterprise use will continue to be challenged by the complexity and cost of setting up and then maintaining private clouds, whether green-field or add-in.
* E.g. API management, identity integration, guest OS images, DNS, SIEM, monitoring, log analysis, billing, capacity planning, load testing, asset management, ticket management, configuration management