I was interviewed by Niki and Jeff on the OpenStack Podcast this morning.
I had a blast.
This morning I took part in a panel discussion on the subject “Cable’s cloud forecast: More apps and infrastructure”. It was held at the annual cable industry engineering forum, the SCTE Expo, in Denver, which meant that the audience was very heterogeneous. (Far more so than most software conferences.) The moderator, Comcast CTO Tony Werner, mentioned that I was wearing my Google Glass, so of course I had to take a picture of the audience:
During the discussion I emphasized the fact that hybrid application patterns were going to be the norm, and that the biggest challenge would be adapting both business and operational decision making and governance to catch up with the speed of the cloud.
CED posted a summary of the panel discussion here. It seemed to go down pretty well.
Last December I joined Cisco, and over the last nine months I’ve frequently been asked what my role is here. I didn’t say much about it, mostly because I was still figuring things out. However at this point everything looks pretty stable, and I’m pretty happy about nailing my colors to the mast.
In one sentence: I’m the OpenStack architect in Cisco Cloud Services (CCS), which is a Federated, Multi-tenant, Intercloud Service. Let me unpack that mouthful, from right to left.
First, Service. I’m working in the Cisco Services organization, and CCS is first and foremost a service, built and operated by Cisco. Other parts of the company are working on cloud-related products, including our new joint initiative with Red Hat. Still others work on upstreaming OpenStack plugins and drivers for various Cisco networking products. Our group is laser-focussed on building and operating a service, not selling products.
Second, Intercloud. CCS is a cloud service, similar to that provided by other cloud service providers. It’s based on the OpenStack IaaS architecture, to which we are adding various capabilities and services to meet the Cisco Intercloud hybrid cloud vision described by Rob Lloyd and Faiyaz Shahpurwala earlier this year. We’re using Cisco’s UCS converged infrastructure together with the Application Centric Infrastructure fabric from Insieme. And we’re building a cloud application marketplace which will provide access to CCS, partner applications, and Cisco SaaS services for our partners.
Third, Multi-tenant. Originally CCS was developed to support Cisco SaaS applications such as WebEx and EnergyWise. This involved building out a private cloud service in several global data centers, with a shared backbone network, while leveraging Cisco IT services. In March, we pivoted, extending CCS to include a variety of non-Cisco partners. Some will use CCS to extend their own hybrid cloud operations; others plan to resell a “white label” CCS to their own customers. Some CCS regions will be deployed in Cisco data centers and others in the facilities of our partners, such as Telstra, but they will all be owned and operated by Cisco. Every region will be fully multi-tenant, hosting workloads from any of our partners. Virtual machines from Telstra and its customers will run side-by-side with VMs from WebEx, with full security and compliance.
Fourth, Federated. To make all of this work requires a deep integration with our partners. Hybrid operations are complex, especially in the areas of network integration, global scale, service assurance, capacity management, OSS/BSS and identity management. Cisco and its technology partners are investing heavily in delivering these capabilities, which go far beyond what a generic OpenStack cloud provides.
So we’re building a state-of-the-art cloud service. We’re using Cisco technologies, and collaborating with Cisco partners such as Red Hat and Citrix, but at the end of the day our goal is to deliver a world-class service as a “black box”. As the Cisco CTO, Padmasree Warrior, made clear 5 years ago, we are not going head to head with Amazon. You can’t go to cloud.cisco.com and sign up for public cloud services from Cisco. But almost everybody will wind up consuming CCS services through our partners, leveraging the global reach, federated integration, and network capabilities that we’ll bring to bear. And because of our business model, CCS has to deliver all of the capabilities of a public cloud, and then some.
Why are we doing this? Isn’t the global cloud business pretty much sewn up? I don’t believe so. True hybrid clouds – “Interclouds” – are challenging, and most of the complexity lies in the networking. I think that Cisco has a huge opportunity, because enterprises and service providers view us as a trusted partner who can help them to solve the problems of hybrid integration, and do so in ways that other cloud service providers cannot. In earlier blog posts, I came to the conclusion that the sweet spot for OpenStack was in supporting SaaS workloads. CCS starts with this and builds on it in a way that I think has compelling business value. So that’s what we’re doing. And it’s insanely exciting.
@geoffarnold: @hui_kenneth @brianmccallion That’s why the studied ambiguity in the HP presser was so interesting. Should generate many tweets….
I never expected HP to have the wits for this… that’s such a blinding move. Very impressed…This is a good play by HP – a lot better than relying on OpenStack… you know my opinion on OpenStack, it hasn’t changed – collective prisoner dilemma etc.
Others hailed this as a brilliant move by HP, even though there was absolutely no information provided on how (or if) HP was going to use the Eucalyptus technology. Some assumed that HP would offer both Helion and Eucalyptus; others such as Ben Kepes concluded that it was an acqui-hire that signalled the failure of Eucalyptus.
But will HP have the freedom to keep going with Eucalyptus, either as a parallel effort or as a source of AWS compatibility features for Helion? Barb Darrow explored that, and found varying opinions on whether the AWS API license that Amazon granted Eucalyptus would survive the takeover. Lydia Leong seems to think that HP has some latitude in this regard.
Personally, I think that this is likely to turn out as a pure acqui-hire. Marten is an excellent choice to lead HP’s cloud efforts, particularly after Biri’s departure and the reboot that we saw at the Atlanta Summit. Adding significant AWS compatibility to OpenStack is an idea whose time has passed. Readers of this blog will know that I was a strong supporter of this, but it would have required a community-wide commitment to limit semantic divergence from AWS. (Replicating the syntax of an API is easy; it’s the semantics that cause the problems.) I suppose it’s possible that HP might try to contribute a new Euca-based AWS compatibility project to OpenStack, but I doubt that the community would be very receptive…
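To see why semantics, not syntax, are the hard part, here’s a toy sketch (hypothetical classes, not real cloud code): two providers expose an identically-shaped delete_volume call, but disagree about what happens when the volume is still attached. A client that passes its tests against one will misbehave against the other, even though the “API” matches.

```python
class CloudA:
    """Refuses to delete a volume that is still attached."""
    def delete_volume(self, volume):
        if volume.get("attached"):
            raise RuntimeError("volume in use")
        volume["status"] = "deleted"

class CloudB:
    """Silently force-detaches the volume, then deletes it."""
    def delete_volume(self, volume):
        volume["attached"] = False
        volume["status"] = "deleted"

vol = {"attached": True, "status": "in-use"}
CloudB().delete_volume(vol)   # succeeds: same call, different contract
try:
    CloudA().delete_volume({"attached": True, "status": "in-use"})
except RuntimeError:
    print("same signature, different semantics")
```

Syntactic compatibility tests would pass both implementations; only a semantic conformance suite catches the divergence.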
PS For me, the biggest surprise is that it was HP that made this move. I half expected IBM to grab Eucalyptus and use it to transform SoftLayer into an AWS-compatible hybrid of Eucalyptus and CloudStack, rather than the current hybrid OpenStack-CloudStack. I guess I should stick to my day job.
UPDATE: Marten’s positioning Eucalyptus as a value-added contribution to OpenStack. And within a couple of minutes he (a) said that one of the values he brings is that he’s not afraid to point out the weaknesses in OpenStack, and (b) declined to express any criticism. Oh well. And now the other Martin (Fink) is hand-waving about AWS as a design pattern. Sigh.
Over at Information Week, Andrew Froehlich pleads “Don’t Give Up On OpenStack”. And I’m not. But as I commented, the change that is needed conflicts with one of the deepest impulses in any community-based activity, from parent-teacher organizations to open source software projects:
The hardest thing for an open source project to establish is a way of saying “no” to contributors, whether they be individual True Believers in Open Source, or vast commercial enterprises seeking an architectural advantage over their competitors. The impulse is always to do things which increase the number of participants, and it is assumed that saying “no” will have the opposite effect. In a consensus-driven community, this is a hard issue to resolve.
Last week I wrote the following on Facebook:
I need to write a blog post about open source: specifically about those people who respond to any criticism of a project by saying, “The community is open to all – if you want to influence the project, start contributing, write code.” Because, seriously, if you want your project to be really successful, that attitude simply won’t work. It doesn’t scale. You want to have more users than implementors, and those users need a voice. (And at scale, implementors are lousy proxies for users.)
Maybe this weekend.
This is that blog post. It has been helped by the many friends who added their comments to that Facebook post, and I’ll be quoting from some of them.
Over the last few months, I’ve been trying to figure out why I’m uncomfortable with the present state of OpenStack, and whether the problem is with me or with OpenStack. Various issues have surfaced – the status of AWS APIs, the “OpenStack Core” project, the state of Neutron, the debate over Solum – and they all seem to come down to a pair of related questions: what is OpenStack trying to do, and who’s making the decisions? And my discomfort arises from my fear that, right now, the two answers are mutually incompatible.
First, the what. That’s pretty simple on the surface. We can look at the language on the OpenStack website, and listen to the Summit keynotes, and (since actions speak truer, if not louder, than words) see the kind of projects which are being accepted. OpenStack is all about building a ubiquitous cloud operating system. There is to be one code base, with an open source version of every single function. And although the system will be delivered through many channels to a wide variety of customers, the intent is that all deployed OpenStack clouds will be fundamentally interoperable; that to use the name OpenStack one must pass a black-box compatibility test. (That’s the easy bit; the more fuzzy requirement is that your implementation must be based on some minimum set of the open source code.) This interoperability is motivated by two goals: first, to avoid the emergence of (probably closed) forks, and second to enable the creation of a strong hybrid cloud marketplace, with brokers, load-bursting, and so forth.
This means that the OpenStack APIs (and, in some less well defined sense, the code) are intended to become a de facto standard. In the public cloud space, while Rackspace, HP, IBM and others will compete on price, support, and added-value services, they are all expected to offer services with binary API compatibility. This means that their customers, presumably numbering in the tens or hundreds of thousands, are all users of the OpenStack APIs. They have a huge stake in the governance and quality of those APIs, and will have their opinions about how they should evolve.
How are their voices heard? What role do they play in making the decisions?
Today, the answers are “they’re not” and “none”. Because that’s not how open source projects work. As Simon Phipps wrote, “…open source communities are not there to serve end users. They are there to serve the needs of the people who show up to collaborate.” And he went on to say,
What has never worked anywhere I have watched has been users dictating function to developers. They get the response Geoff gave in his original remarks, but what that response really means is “the people who are entitled to tell me what to do are the people who pay me or who are the target of my volunteerism — not you with your sense of entitlement who don’t even help with documentation, bug reports or FAQ editing let alone coding.”
As I see it, there are two basic problems with this thinking. First, it doesn’t scale. If you have hundreds of thousands of users, of whom a small percentage want to help, and a couple of dozen core committers, the logistics simply won’t work. (And developers are often very bad at understanding the requirements of real users.) Maybe you can use the big cloud operators and distribution vendors as proxies for the users, but open source communities can be just as intolerant of corporate muscle as they are of users who don’t get involved. Some of that can be managed — many of the resources that Simon terms “volunteerism” actually come from corporations — but not all.
But the second problem is that the expectation of collaborator primacy is incompatible with the goal of creating a standard. The users of OpenStack have a choice, and the standard will be what they choose. If the OpenStack community wants their technology to become that standard, they must find a way to respect the needs and expectations of the users. And that, quid pro quo, means giving up some control, which will affect what the community gets to do — or, more often, not do.
This is a much bigger issue for OpenStack than for other open source projects, for a couple of reasons. Several of the most successful projects chose to implement existing de facto or de jure standards — POSIX, x86, X Windows, SQL, SMB, TCP/IP. For these projects, the users knew what to expect, there were established non-FOSS alternatives, and the open source communities accepted the restrictions involved in not compromising the standards. Other projects were structured around a relatively compact functional idea — Hadoop, MongoDB, Xen, MemCache, etc. — and the target audience was relatively small: like-minded developers. OpenStack is neither compact, nor (by choice) standards-based. It has a large number of components, with a very large number of pluggable interfaces. The only comparable open source project is OpenOffice, which had a similarly ambitious mission statement and has also had its share of governance issues.
One obvious step towards addressing the problem would be to distinguish between the “what” and the “how” of OpenStack. Users are interested in the APIs that they use to launch and kill VMs, but they are unlikely to know or care about the way in which Nova uses Neutron services to set up and tear down the network interfaces for those VMs. Good software engineering principles call for a clear distinction between different kinds of interfaces in a distributed system, together with clear policies about evolution, backward compatibility, and deprecation for each type. This is not simply a matter of governance. For example, a user-facing interface may have specific requirements in areas such as security, load balancing, metering, and DNS visibility that are not applicable to intrasystem interfaces. There is also the matter of consistency. Large systems typically have multiple APIs — process management, storage, networking, security, and so forth — and it is important that the various APIs exhibit consistent semantics in matters that cut across the different domains.
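To make the “what” concrete: the contract users actually depend on can be as simple as the JSON body of Nova’s “boot a server” call. A minimal sketch (the field names follow the Nova v2 API; the endpoint path and the image and flavor identifiers are hypothetical):

```python
import json

def boot_server_body(name, image_ref, flavor_ref):
    """Build the request body for POST /v2/{tenant_id}/servers.

    Tenants write code against these field names, so they form the
    stable, user-facing contract ("what"). How Nova then calls Neutron
    to plumb the VM's network ports is internal ("how"), and should be
    free to change without breaking anyone.
    """
    return {"server": {"name": name,
                       "imageRef": image_ref,
                       "flavorRef": flavor_ref}}

print(json.dumps(boot_server_body("web-01", "image-uuid", "flavor-1")))
```

Governance for the outer contract (stability, deprecation, consistent semantics) can then be strict, while the internal service-to-service interfaces evolve at whatever pace the developers need.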
The adoption of some kind of interface taxonomy and governance model seems necessary for OpenStack, so that even if community members have to relinquish some control over the user-facing interfaces (the “what”), they still have unfettered freedom with the internal implementation (the “how”). Today, however, we are a long way from that. OpenStack consists of a collection of services, each with its own API and complex interdependencies. There is no clear distinction between internal and external interfaces, and there are significant inconsistencies between the different service APIs. At the last OpenStack Summit I was appalled to read several BluePrints (project proposals) which described changes to user-facing APIs without providing any kind of external use-case justification.
The present situation seems untenable. If OpenStack wants to become an industry standard for cloud computing, it will have to accept that there are multiple stakeholders involved in such a process, and that the “people who show up to collaborate” can’t simply do whatever they want. At a minimum, this will affect the governance — consistency, stability — of the user-facing interfaces; in practice it will also drive functional requirements. Without this, traditional enterprise software vendors delivering OpenStack-based products and services will have a hard time reconciling their customers’ expectations of stability with the volatility of a governance-free project. The bottom line: either the community will evolve to meet these new realities, or OpenStack will fail to meet its ambitious goals.
UPDATE: Rob Hirschfeld seems to want it both ways in his latest piece. Resolving the tension between ubiquitous success and participatory primacy doesn’t necessarily require a “benevolent dictator”, but that’s one way of getting the community to agree on a durable governance model.
My blog post yesterday, “Whither OpenStack”, was already too long, but I wish I’d included some of the points from this outstanding piece by Bernard Golden on AWS vs. CSPs: Hardware Infrastructure. You should read the whole thing, but the key message is this:
Amazon, however, appears to hold the view that it is not operating an extension to a well-established industry, in which effort is best expended in marketing and sales activities pursued in an attempt to build a customer base that can be defended through brand, selective financial incentives, and personal attention. Instead, it seems to view cloud computing as a new technology platform with unique characteristics — and, in turn, has decided that leveraging established, standardized designs is inappropriate. This decision, in turn, has begotten a decision that Amazon will create its own integrated cloud environment incorporating hardware components designed to its own specifications.
As you read Bernard’s piece, think about the architecture of the software that transforms Amazon’s custom hardware into the set of services which AWS users experience. It has about as much in common with, say, OpenStack’s DevStack (or, to be fair, Eucalyptus FastStart – sorry, Mårten!) as a supertanker has with a powerboat.
In this world, you can’t start small and then “add scale”; the characteristics needed to operate at extreme scale become fundamental technical (and business) requirements that drive the systems architecture.
This is the challenge that Rackspace, HP and others face. Their core software was designed – and is continually being evolved – by a community that doesn’t have those architectural requirements. This is not a criticism; it’s simply the reality of working in a world defined by things like:
- Tempest should be able to run against any OpenStack cloud, be it a one node devstack install, a 20 node lxc cloud, or a 1000 node kvm cloud.
and this piece on OpenStack HA (read through to see all of the caveats). I know the guys at HP and Rackspace: they are smart, creative engineers, and I’m sure they can build software as good as AWS. The question is, can they do so while remaining coupled to an open source community that doesn’t have the same requirements?
OpenStack’s sweet spots seem to be SaaS providers and carriers. Public deployments will struggle; private clouds are difficult and may be ephemeral.
It’s two weeks after the OpenStack Summit in Hong Kong and one week after the AWS re:Invent event in Las Vegas, and social media is full of passionate debate about the state of OpenStack, the future of private clouds, the juggernaut that is AWS, and more.
For those less Twitter-obsessed than I, here are a few of the key pieces:
to whom it may concern
What I saw at the OpenStack Summit
Why vendors can’t sell OpenStack to enterprises
Not Everyone Believes That OpenStack Has Succeeded
Inside OpenStack: Gifted, troubled project that wants to clobber Amazon
OpenStack Wins Developers’ Hearts, But Not IT’s Minds
The last twelve months
The End of Private Cloud – 5 Stages of Loss and Grief
Most of the discussion is focussed on the Holy Grail of “enterprise”, and that was certainly the focus of re:Invent. But that’s not the only market for OpenStack; as I wrote in “A funny thing happened on the way to the cloud” we’ve had substantial “mission creep” since the days of the NIST taxonomy. Different members of the community are interested in addressing different kinds of use cases with OpenStack. How is this affecting the architecture and processes of OpenStack? Is it practical for OpenStack to serve all of these needs equally well, and what are the costs of doing so?
There are some pundits (@krishnan, for instance, and @cloudpundit) who argue that OpenStack’s role is to be a kit of parts from which different organizations – vendors and large users – will assemble a variety of solutions. On this view, it doesn’t particularly matter if the APIs for different OpenStack services are somewhat inconsistent, because the creator of the public cloud or distribution will do the necessary work on “fit and finish”; if necessary they may replace an unsuitable service with an alternative implementation. (At the extreme end of that camp we have people like @randybias who want to replace the entire API with an AWS workalike.) On the other hand, there is a movement afoot, led by @jmckenty, @zehicle and others, to develop a certification process to improve interoperability of OpenStack implementations in the service of hybrid deployments and to help to grow the developer ecosystem. Rather than asking which of these is the “right” position, it’s probably more instructive to see how the OpenStack community is actually behaving.
There seem to be five distinct areas where OpenStack is being used:
- Public IaaS cloud – Rackspace, HP, etc.
- SaaS provider – PayPal, Yahoo, Cisco WebEx
- Carrier infrastructure – AT&T, Verizon
- Private IaaS cloud (often hosted)
- Enterprise datacenter automation
Most of these are fairly self-explanatory, but the distinction between the last two is important. Both are typically enterprise or government customers. The first is usually a greenfield deployment with a “clean sheet” operational philosophy; the second is an attempt to provide some automation and self-service to an existing enterprise data center, complete with heterogeneous infrastructure and traditional operational policies.
Let’s see how OpenStack is doing in each of these areas:
Public IaaS cloud
Public cloud service is all about the economics of operation at scale. Stable interfaces – both APIs and tools. Consistent abstractions, so that you can change the implementation without breaking the contract with your customers. Measuring everything. Automating the full lifecycle. Capacity planning is key.
OpenStack has been shortchanging this area. The API story is weak, with too many changes without adequate compatibility. The default networking framework doesn’t really scale, and alternatives like NSX, Nuage, OpenContrail and Midonet simply replace all of the Neutron mechanisms. (They don’t necessarily interoperate with all of the vendor-supplied Neutron plugins.) Mechanisms for large-scale application deployments, like availability zones and regions, are implemented inconsistently across the various services.
On the other hand, public clouds are typically (or ideally!) operated at a large enough scale that, as Werner Vogels put it, “software costs round to zero”. So they can afford to throw engineering resources at filling the gaps and fixing the issues.
The most difficult issue for public clouds based on OpenStack is around features. The main competitors are AWS, Google, and Microsoft, all of which can add new services, focussed on customer requirements, much more quickly than the OpenStack community. Rackspace, HP and others face a dilemma: do they wait for the OpenStack community to define and implement a new service, or do they create their own service offerings that are not part of the OpenStack code base? Waiting for the community cedes the market to the proprietary competition, and has other complications, such as the requirement that there has to be an open source reference implementation of every OpenStack service, and the potential for compromise to address the needs of different parts of the community. Proceeding independently may help to close the competitive feature gap, but it’s likely to lead to substantial “tech debt” and/or compatibility issues when the community finally gets round to delivering a comparable service.
SaaS provider
A SaaS provider combines the operational scale of a public cloud with the captive tenant base of a private cloud. Large-scale networking issues dominate the architectural discussion. The dominant KPI is likely to be “code pushes per day”. API issues are less critical, since there is usually a comprehensive home-grown applications management framework in use. As with the public cloud, the SaaS provider has the expertise and engineering resources to do large scale customization and augmentation.
OpenStack is serving this constituency relatively well, although scalability remains a concern.
Carrier infrastructure
Wireless and wire-line carriers are looking forward to NFV, which will allow them to replace dedicated networking infrastructure with virtualized software components that can be deployed flexibly and efficiently. It is therefore not surprising that they are interested in infrastructure automation technologies that will facilitate the deployment of VMs and the configuration of their networks. What distinguishes the carriers from other OpenStack users is that their applications often cut across the typical layers of abstraction, particularly with respect to networking. In a public IaaS, the tenant VMs interact with virtualized networking resources – ports, subnets, routers, and load balancers. They have no visibility into the underlying technologies used to construct these abstractions: virtual switches, encapsulation, tunnels, and physical and virtual network appliances. This opaque abstraction is important for portability and interoperability. For carriers, it is often irrelevant: their applications may perform direct packet encapsulation, and can manipulate the chain of NFV services.
There’s a lot of interest in these use cases within OpenStack today. One obvious concern relates to the status of the APIs involved. Public cloud providers probably won’t want their tenants diving in to manipulate service chaining, or getting access to the MPLS or VXLAN configuration of the overlay network. Today the only way of limiting access to specific OpenStack APIs is the Keystone RBAC mechanism, which doesn’t enforce any kind of semantic consistency. One solution might be to package up specific APIs into different OpenStack “Editions”.
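To sketch what such gating might look like, here is a toy role-based policy table in the spirit of the policy.json files used by OpenStack services. This is an illustration only: the action and role names are hypothetical, and real deployments would use the Keystone/oslo policy machinery rather than anything this simple.

```python
# Toy policy table: which roles may invoke which API actions.
# An empty set means any authenticated tenant may call the action.
# "nfv:service_chain" stands in for a hypothetical carrier "Edition".
POLICY = {
    "create_port":        set(),
    "update_port:binding": {"admin"},
    "nfv:service_chain":   {"admin", "carrier"},
}

def is_allowed(action, roles):
    """Return True if any of the caller's roles satisfies the policy."""
    required = POLICY.get(action)
    if required is None:
        return False  # unknown actions are denied by default
    return not required or bool(required & set(roles))

print(is_allowed("create_port", {"member"}))        # ordinary tenant: allowed
print(is_allowed("nfv:service_chain", {"member"}))  # tenant poking at NFV: denied
```

The hard part, of course, is not the lookup table but deciding which actions belong in which “Edition” and keeping the semantics consistent across services.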
It seems likely that the specific use cases for OpenStack in managing carrier infrastructure are sufficiently bounded that the lack of major application services will not be a problem.
Private IaaS cloud
There is a persistent belief that enterprise customers want – and need – private IaaS clouds. Not IaaS-like features bolted on to their existing infrastructure, but pure NIST-compliant IaaS clouds that just happen to be private, running on wholly-owned physical infrastructure. There are several arguments advanced for this. One – InfoSec – is probably unsustainable: public clouds invest far more in security and compliance than any enterprise could hope to, and the laws and regulations will soon reflect this. The second – cost – is occasionally valid, but widely abused: ROI analyses rarely take into account all costs over a reasonable period of time. In addition, the benefits of an IaaS cloud usually depend on the development of new, cloud-aware applications, and such applications can usually be designed to operate more cost-effectively in a public cloud.
So how’s OpenStack doing for private clouds? Not very well. The cost and complexity of deploying OpenStack is extremely high, even if you work with an OpenStack distribution vendor and take advantage of their consulting services. Yes, there are plenty of tools for doing an initial deployment (too many), but almost none for long-term maintenance. To achieve enterprise-grade operational readiness you’ll have to supplement OpenStack with at least a dozen additional open source or commercial tools*, and do the integration yourself; then you’ll be responsible for maintaining this (unique) system indefinitely.
Analyst surveys suggest that most enterprises are looking at private clouds as part of a hybrid cloud strategy. In this case, the lack of high-fidelity compatibility with most public clouds is going to be a problem. There are actually two issues: API interoperability (e.g. good support for the AWS APIs in OpenStack), and feature mismatch (AWS has more, richer features than OpenStack, and the gap is growing).
Once upon a time, the private cloud was seen as a radical alternative to the traditional enterprise datacenter: an opportunity to replace bespoke server and networking configurations with interchangeable pools of infrastructure, and to deliver automated self-service operations in place of bureaucratic human procedures. Great emphasis was placed on the need to design the cloud service from the top down, focussing on the requirements of the users, rather than viewing it as a layer on top of existing enterprise virtualization systems. It was (correctly) assumed that many traditional data center management practices would be incompatible with the kind of automation provided by cloud management platforms like OpenStack and CloudStack.
Enterprise datacenter automation
Unfortunately, many enterprises felt the need to try to cut corners: to deploy IaaS within their existing data center environment, leveraging existing infrastructure. Some literally treated the cloud as “just another large application cluster”. Many of these early experiments failed, because of the difficulty of making cloud operations conform to existing policies. The number of successful projects of this kind is a matter of debate.
The OpenStack project has been doing a lot to facilitate this kind of deployment. Brocade and its partners have integrated FC SAN support into the Cinder storage service, and we’ve proposed improvements to Neutron that will make it much easier to use heterogeneous network resources from different vendors. Mirantis has worked with VMware to allow OpenStack to be deployed on top of vSphere, and Nova now supports the use of several different hypervisors within a single cloud. (The latter is presumably to cater to applications which are sensitive to specific hypervisor features – something that no modern cloud-ready application should care about.)
This work to accommodate legacy infrastructure is obviously addressing a real need. It’s worth asking what the cost has been, particularly in complexity, stability, API governance, and opportunity cost. Could we have delivered a decent load-balancing solution earlier? Would we have a more scalable L3 capability? Hard to tell.
So where does this leave us? It seems to me that the sweet spot for OpenStack today (and for some time to come) is going to be with the SaaS provider, such as PayPal, Cisco WebEx, and Yahoo. (I wonder if the recent announcement by Salesforce.com and HP means that SFDC will be moving in that direction.) Carriers will happily do their own thing, with potentially awkward implications for networking APIs. Public clouds will face the challenge of back-porting their (many) changes to the trunk, and figuring out how to keep up with AWS. And enterprise use will continue to be challenged by the complexity and cost of setting up and then maintaining private clouds, whether green-field or add-in.
* E.g. API management, identity integration, guest OS images, DNS, SIEM, monitoring, log analysis, billing, capacity planning, load testing, asset management, ticket management, configuration management
Following up on my unexpectedly viral piece about Seagate’s Ethernet-attached disk architecture (40,000 hits so far), here are a couple of pictures from the OpenStack Design Summit session on Swift + Kinetic:
Updated to add one more picture. (When I was at Sun, I used to say that the truth was not in press releases or Powerpoint decks but in t-shirts.)
And yes, I want one of those t-shirts!
It’s a beautiful Sunday afternoon here in Silicon Valley. It’s been a good weekend for sports: Manchester United won (finally!), Sebastian Vettel claimed his fourth F1 Driver’s Championship, and the Patriots came from behind to thrash Miami. And then there’s the Red Sox; oh well, three out of four isn’t bad. But all of those events are sitting on my DVR, because for the next week I’m focussed on one thing: preparing for the upcoming OpenStack Summit.
This time next week I’ll be in Hong Kong as part of the Brocade team, joining thousands of cloud computing technologists, users, salespeople, and writers for a week of business and technical sessions. My main focus will be on the Design Summit sessions for Neutron, the OpenStack networking subsystem formerly known as Quantum. My colleagues will be involved in a variety of areas, including FC SAN features for the Cinder storage service, load balancers, and integration of our VCS fabric. Several of them are presenting at the main Summit. And we’ll all be talking to customers and partners.
OpenStack networking is complicated, mostly because data center networking is going through a period of massive disruption on several fronts at once, leading to a combinatorial explosion of complexity. Overlay architectures, different kinds of tunneled underlay, the replacement of dedicated network equipment by software running in VMs, the emergence of controller-based SDN such as the OpenDaylight project, and the spectacular performance improvements in merchant silicon and x86 processors: all of these have produced innovative products from startups and established vendors alike, and every one of them is keen to participate in OpenStack. The complexity is also partly due to the OpenStack mission itself, which has expanded from a simple EC2-style IaaS to include legacy data center automation and carrier NFV. Public clouds emphasize abstraction and multi-tenant isolation, features which are less relevant to other users of the technology, and it’s challenging to develop abstractions and APIs which address all of these use cases. There is still a lively debate about which parts of OpenStack are “core” elements of every OpenStack system. (Indeed the original Nova networking system is still the default; deprecation is planned for the upcoming Icehouse cycle.)
In this exciting and unpredictable environment, my team has been working on a project to manage some of the diversity. In our Dynamic Network Resource Manager (DNRM) Blueprint, we’re proposing a framework for managing the pool of physical and virtual network resources from multiple vendors. It borrows an idea from the OpenStack Nova scheduler: the use of a policy-based resource allocator that abstracts away the complexity of resource management, and allows each cloud operator to choose the resource allocation policy which fits their environment.
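To make the idea concrete, here's a minimal sketch of what a pluggable, policy-based allocator might look like. All of the names here (classes, fields, the example policy) are invented for illustration; they are not taken from the DNRM Blueprint or the Nova scheduler code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical resource record; field names are illustrative only.
@dataclass
class NetworkResource:
    resource_id: str
    vendor: str
    kind: str          # e.g. "virtual" or "physical"
    capacity: int      # remaining units of some capacity metric
    in_use: bool = False

# A policy is just a function that picks from the candidate resources,
# so each operator can plug in the policy that fits their environment.
Policy = Callable[[List[NetworkResource]], Optional[NetworkResource]]

def prefer_virtual(candidates: List[NetworkResource]) -> Optional[NetworkResource]:
    """Example policy: pick a free virtual appliance if one exists,
    falling back to physical; prefer higher remaining capacity."""
    free = [r for r in candidates if not r.in_use]
    free.sort(key=lambda r: (r.kind != "virtual", -r.capacity))
    return free[0] if free else None

class ResourcePool:
    """Tracks resources from multiple vendors; delegates choice to the policy."""
    def __init__(self, policy: Policy):
        self.policy = policy
        self.resources: List[NetworkResource] = []

    def register(self, resource: NetworkResource) -> None:
        self.resources.append(resource)

    def allocate(self, kind_filter: Optional[str] = None) -> Optional[NetworkResource]:
        candidates = [r for r in self.resources
                      if kind_filter is None or r.kind == kind_filter]
        chosen = self.policy(candidates)
        if chosen:
            chosen.in_use = True
        return chosen

pool = ResourcePool(policy=prefer_virtual)
pool.register(NetworkResource("r1", "brocade", "virtual", capacity=10))
pool.register(NetworkResource("r2", "acme", "physical", capacity=100))
router = pool.allocate()
print(router.resource_id)  # the virtual appliance is chosen first
```

The point of the abstraction is that the client asking for a router never sees any of this: swapping `prefer_virtual` for, say, a compliance-driven policy changes the allocation behavior without touching the API handler.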
We’re demonstrating a proof-of-concept implementation of DNRM that uses the Brocade Vyatta vRouter, probably the most widely used virtual networking appliance. The DNRM resource manager uses Nova to provision a number of Vyatta virtual machines. Then a modified API handler in Neutron intercepts each client request to create an L3 Router, calls the policy-based DNRM allocator to find the best resource instance, examines the type of resource, and calls the appropriate driver (in this case the Vyatta driver) which talks to the VM to configure the vRouter. All of this can be viewed in the OpenStack Horizon dashboard; we’ve added a new panel which displays the state of the resource pool.
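The request path described above can be sketched roughly as follows. This is a toy stand-in, not the actual Neutron or DNRM code: the class names, the driver registry, and the dictionary shapes are all invented for the example.

```python
# Illustrative sketch of the flow: intercepted API handler -> allocator ->
# vendor-specific driver. Names are hypothetical, not from the code base.

class VyattaDriver:
    """Stands in for a driver that would talk to a Vyatta vRouter VM."""
    def configure_router(self, resource, request):
        # A real driver would configure the appliance over its management API.
        return {"router_id": request["name"], "backend": resource["id"]}

DRIVERS = {"vyatta": VyattaDriver()}

class Allocator:
    """Stands in for the policy-based DNRM allocator."""
    def __init__(self, pool):
        self.pool = pool

    def allocate(self):
        return self.pool.pop(0) if self.pool else None

def create_router(request, allocator):
    """Modified API handler: allocate a resource, then dispatch by type."""
    resource = allocator.allocate()
    if resource is None:
        raise RuntimeError("no L3 resources available")
    driver = DRIVERS[resource["type"]]   # examine the type, pick the driver
    return driver.configure_router(resource, request)

allocator = Allocator(pool=[{"id": "vyatta-vm-1", "type": "vyatta"}])
result = create_router({"name": "tenant-router"}, allocator)
print(result["backend"])  # vyatta-vm-1
```

Because the handler dispatches on the resource type it gets back, the same code path could hand a request to a physical-appliance driver instead, which is what makes the virtual/physical substitution in the next section possible without code changes.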
The Blueprint explores a range of use cases that are supported by the DNRM framework. Several of Brocade’s customers are particularly interested in the ability to allocate virtual appliances for dev/test networks and physical systems for production traffic, without changing any code. Others focus on the way it supports resources from multiple vendors, or the ability to choose specific resources to meet compliance requirements.
Inevitably, a mechanism as comprehensive as DNRM overlaps with several projects within Neutron, including the FWaaS, LBaaS, and VPNaaS work. In recent weeks we’ve been meeting with many of the other contributors to OpenStack to thrash out the details of what a final architecture should look like. I’m looking forward to the Design Summit sessions in Hong Kong, which should lead to agreement on a program of work for the upcoming Icehouse release of OpenStack. It’s going to be complicated, for the reasons I’ve already mentioned, but I think this increasing complexity only emphasizes the need to provide cloud operators with policy-based automation tools.
And when I get back from Hong Kong on the 10th, I’ll see which of those sporting events I still want to watch!