Archive for the 'Governance' Category
The 10 Commandments of Web Service Change Management
Web service governance covers many topics, one of which is change management. How do you change and enhance existing web services without breaking the code of your existing clients? After more than five years of experience delivering web services to some pretty mission-critical applications, Xignite has learned a thing or two about good business practices in that domain. We use those rules internally when managing our services and we communicate them to our clients as well. These rules have worked well in helping prevent problems. We compiled them into the Ten Commandments below.
But first, let’s recall what those Commandments seek to achieve:
- Never break existing code. It’s pretty clear that clients that have already integrated their apps with your web service don’t like it when things stop working. They will be quite upset when they find out it is because you made a change without telling them (this happened to us with code we wrote against the famed API of a just-as-famous SaaS CRM vendor).
- Eliminate the need to update existing code. Ideally, once clients have built something in, they should never have to change it. If that is impossible, the frequency and complexity of those changes should be kept to a minimum, and clients should be given a time-frame to make those changes and multiple environments to test them against.
Ten Commandments of Web Service Change Management
- Thou shalt never remove or rename a web service. This is pretty basic. If you change the location of the web service, all hell breaks loose. If you are going to rename a service, make sure you leave a proxy to the new one where the old one was.
- Thou shalt never remove or rename a web service operation. With SOAP, the operation defines the SOAP action. With REST, the operation is part of the resource url. In either case, if you remove or rename the operation, the code will break. Because of the nature of web services, keeping “old” operations around for compatibility purposes is typically not a huge issue since only those clients using those operations are affected. We found it is a good practice to do so.
- Thou shalt never rename a web service operation parameter. With SOAP, the operation parameters are part of the SOAP message. With REST, the operation parameters make up the resource url. In either case, the client code will break.
- Thou shalt never add a web service operation parameter. Adding a new parameter to an operation will break your client’s code since that code does not currently pass that parameter. If the client uses REST, you may be able to add logic inside your web service to provide a default value for that new parameter. If the client uses the more tightly controlled SOAP, the code will break right away, since the request will never get past the server and will not even enter your code.
- Thou shalt never remove a web service operation parameter. Removing an operation parameter is one of the lesser sins on this list. If the client uses REST, their code will most likely continue to work since the extra parameter will be ignored by your new service. If the client uses SOAP, the result will depend on the behavior of the client’s SOAP toolkit. Some may fail, some may ignore the additional parameters (for instance, client code written over .NET would continue working).
- Thou shalt never remove an enumeration value on an operation parameter. Some values used as operation parameters are enumerations (i.e. a list of valid string values like ‘Daily’, ‘Monthly’, and ‘Annually’ could be the enumeration values of a member called PeriodType). Whether the client uses SOAP or REST, if the client code passes one of those values after you removed it from the valid list, their code will break.
- Thou shalt never remove or rename a returned object member. Just like for #2 and #5, this is a very evil sin. If the client uses SOAP, their parser will break when not finding the member it is looking for. If the client uses REST, their code will only break if they use that member in their code. Either way, this one sends you straight to customer service hell.
- Thou shalt never change the type of a returned object member. Although all data is exchanged over the wire as strings, changing the type of an object member could have bad consequences based on the logic the client built on its side. For example, if you change the type of a field from Integer to Double, the code on the client side will probably choke on those decimals.
- When adding new enumeration values on a web service member, make sure they are not returned by old operations. This is one of the trickiest rules on the list (and we found it out the hard way). Note that adding new enumeration values is not in itself a sin on this list. The reason is that new enumeration values are typically not returned by old operations. For example, if an enumeration value helps you identify a given type of interest rate or currency and your client’s “old code” does not know about this value, it will never request it and never expose the problem. But if an “old operation” can return “new enumeration values”, the client’s SOAP parser will fail when trying to interpret those new unknown values. If this is functionality you need to have, you’d better turn those enumerations into strings.
- Thou shalt pray a lot before adding a new web service member. This is the only Commandment I wish did not make the list. The other ones make sense in a technical kind of way. This one really should not be there. You should be able to add new web service members to your heart’s delight without anguishing over the consequences. After all, if a web service returns a new data element, the code on the other side does not know about it and should be able to simply ignore it. But the cold hard truth is that some parsers (Axis to be specific) will throw an exception when parsing new values returned by a service. The good news is that Axis plans to address this problem. The bad news is that until then you will have to pray and notify your Axis users.
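To make a few of these rules concrete, here is a minimal Python sketch of how a REST-style service and a client might stay compatible with each other. It is an illustration under stated assumptions, not how our services are actually built: PeriodType and its values come from the example above, while the operation names, the Currency parameter, its default, and the helper functions are all hypothetical.

```python
from enum import Enum


class PeriodType(Enum):
    # Values are only ever added here, never removed (Commandment 6).
    DAILY = "Daily"
    MONTHLY = "Monthly"
    ANNUALLY = "Annually"


# Hypothetical rename: the old operation name keeps working (Commandments 1 and 2).
OPERATION_ALIASES = {"GetHistoricalQuotes": "GetQuotes"}

# Hypothetical parameter added after the first release; it gets a default so
# that old REST clients which never send it keep working (Commandment 4).
DEFAULTS = {"Currency": "USD"}

# Parameters the current version of the service understands.
KNOWN_PARAMS = {"Symbol", "PeriodType", "Currency"}


def resolve_operation(name: str) -> str:
    """Map an old operation name to its current one; unknown names pass through."""
    return OPERATION_ALIASES.get(name, name)


def normalize_params(query: dict) -> dict:
    """Server side: fill defaults for newer parameters, ignore retired ones."""
    params = {**DEFAULTS, **query}
    # Extra parameters sent by old clients are silently dropped (Commandment 5).
    params = {k: v for k, v in params.items() if k in KNOWN_PARAMS}
    # Raises ValueError if the client passes a value outside the enumeration.
    params["PeriodType"] = PeriodType(params["PeriodType"]).value
    return params


def read_quote(payload: dict) -> dict:
    """Client side: read only the members you need and ignore any new ones."""
    return {"Symbol": payload["Symbol"], "Last": float(payload["Last"])}
```

A client written this way would not notice a new member appearing in the response, which is the behavior the last Commandment wishes every parser had.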
Twitter For Transparency
A few weeks ago, I was part of a discussion with Frank Kenney from the Gartner Group. He was talking about the need to provide greater governance transparency with web services. He recalled the enormous backlash SalesForce.com experienced last year when their system went down for a long period of time. As a remedy, and as a way to regain lost client confidence, SalesForce created a new web site (trust.salesforce.com) focused on providing its clients with complete transparency on the state of its systems.
Frank commented that this approach not only enabled SalesForce to yet again set the standard on how SaaS should be done, but also built significant goodwill with its clients. SalesForce clearly showed it had nothing to hide.
This issue is equally significant for Xignite as more and more clients run more and more mission-critical business processes on our services. We have historically achieved great levels of availability. We have nothing to hide. This is why we are busy at work providing similar functionality to our enterprise clients.
Meanwhile, I was wondering if we could not do something simple that would help our clients easily answer the “are you having problems right now?” question, which we get once in a while when a client has a problem accessing our services. 99% of the time the problem has to do with configuration changes on their side, but I don’t blame them for first wondering if the problem is with us.
Here comes Twitter. Can Twitter do for web service governance what it has done for individuals: bring total transparency on the availability of our systems?
You be the judge. Our support pages now boast a Twitter-powered System Status widget that provides reports on system availability every 5 minutes. If our primary web service farm runs into a problem, the status is instantly updated.
The great thing about Twitter is of course that it lets any of our clients subscribe to our Twitter feed in case they want to be notified in real time of any changes to our system. Now that’s transparency.
Web Services Availability – What’s Right
In a recent post, Dan Farber and Larry Dignan talked about the fact that there is still much ground to cover before all major Internet properties will provide the "five nines" (99.999% availability). Their key point was that today’s Internet services are not "ready" for the massive wave of utility computing ahead and that more money and intelligence need to be thrown at those problems.
Their post talks about "web services" in a more general consumer sense than ours (where we mean XML web services), but the comment certainly applies. With more than 300 clients in 25 countries and upwards of 500 million web service requests a month, our web services are powering more and more mission-critical business processes buried deep inside the core fabric of our clients’ businesses. This of course raises the question of the availability of those services and what’s right given the maturity of the market.
Web services as a "computing utility platform" for business are at least 10 years behind the web as a utility platform for consumers. That wave started in 2005 while the web goes back to 1995. So it’s normal to expect a lower level of maturity. Our experience to date has been that clients and prospective users have been more and more sensitive to availability issues but that guaranteeing "three nines" (99.9% availability) meets all current business needs. I expect that the market will be requesting another "nine" every year for the next 4 years (2008->99.99%, 2009->99.999%, 2010->99.9999% and maybe more).
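To put those "nines" in perspective, the short Python calculation below converts an availability percentage into the downtime it allows per year. It is only back-of-the-envelope arithmetic; the function name is made up for this illustration and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be down and still meet the target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} minutes/year")
```

Three nines leaves close to nine hours of downtime a year, four nines leaves under an hour, and five nines leaves barely five minutes.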
There is one key reason for that. Few of our clients’ business processes themselves run in a "five nines" mode. Many processes are mission-critical, but they are mission-critical once or several times a day, giving plenty of time for retries and recovery. For instance, one of the world’s largest corporate lenders gets some interest rate information from us. This info is used to feed some homegrown systems that decide how much interest corporate clients get charged for tens of billions of dollars in corporate loans. That’s mission-critical. But the process runs once a day (replacing an old unreliable manual process). As long as our system is up within a half-hour timeframe every day, all is well. In truth, we achieve more than 99.99% availability, with many monthly periods at 100% (we have been publishing availability data on our site since 2004; you can find it here). But it’s almost irrelevant to this client.
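A once-a-day process like that one can absorb a short outage simply by retrying until its window closes. The Python sketch below shows one way a client might do it. The function and its parameters are hypothetical, the half-hour window mirrors the example above, and the one-minute pause is an arbitrary choice.

```python
import time
from typing import Callable


def fetch_with_retries(
    fetch: Callable[[], object],
    window_seconds: float = 30 * 60,
    pause_seconds: float = 60,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call fetch() until it succeeds or the window closes, then re-raise."""
    deadline = clock() + window_seconds
    while True:
        try:
            return fetch()
        except Exception:
            # Give up only when another pause would overrun the window.
            if clock() + pause_seconds > deadline:
                raise
            sleep(pause_seconds)
```

The clock and sleep arguments are there only so the loop can be exercised without actually waiting; a real client would leave them at their defaults and pass in whatever function calls the service.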
The key is to map the service level to the needs. Why go to the trouble of providing 99.999% availability if your clients don’t care at all or don’t care yet? Running redundant data centers is extremely expensive and difficult to do for an emerging company. Your clients probably want you to spend your money doing other things first. But once they begin asking for more "nines", you’d better get moving.
And so we are planning for our redundant data center, which we will need to get to the "five nines".
