An Overview of BEEP

by Marshall Rose, Dover Beach Consulting

The Blocks Extensible Exchange Protocol (BEEP) is something like "the missing link between the application layer and the Transmission Control Protocol (TCP)."

This statement is a horrific analogy because TCP is a transport protocol that provides reliable connections, and it makes no sense to compare a protocol to a layer. TCP is a highly-evolved protocol; many talented engineers have, over the last 20 years, built an impressive theory and practice around TCP. In fact, TCP is so good at what it does that when it came to survival of the fittest, it obliterated the competition. Even today, any serious talk about the transport protocol revolves around minor tweaks to TCP. (Or, if you prefer, the intersection between people talking about doing an "entirely new" transport protocol and people who are clueful is the empty set.)

Unfortunately, most application protocol design has not enjoyed as excellent a history as TCP. Engineers design protocols the way monkeys try to get to the moon; that is, by climbing a tree, looking around, and finding another tree to climb. Perhaps this is because there are more distractions at the application layer. For example, as far as TCP is concerned, its sole reason for being is to provide a full-duplex octet-aligned pipe in a robust and network-friendly fashion. The natural result is that while TCP's philosophy is built around "reliability through retransmission," there isn't a common mantra at the application layer.

Historically, when different engineers work on application protocols, they come up with different solutions to common problems. Sometimes the solutions reflect differing perspectives on inevitable tradeoffs; sometimes the solutions reflect different skill and experience levels. Regardless, the result is that the wheel is continuously reinvented, but rarely improved.

So, what is BEEP and how does it relate to all this?
BEEP integrates the best practices for common, basic mechanisms that are needed when designing an application protocol over TCP. For example, it handles things like peer-to-peer, client/server, and server/client interactions. Depending on how you count, there are about a dozen or so issues that arise time and time again, and BEEP just deals with them. This means that you get to focus on the "interesting stuff."

BEEP has three things going for it:

- It's been standardized by the Internet Engineering Task Force (IETF), the so-called "governing body" for Internet protocols.
- There are open source implementations available in different languages.
- There's a community of developers who are clueful.

The standardization part is important, because BEEP has undergone a lot of technical review. The implementation part is important, because BEEP is probably available on a platform you're familiar with. The community part is important, because BEEP has a lot of resources available for you.

Application Protocols

An application protocol is a set of rules that says how your application talks to the network. Over the last few years, the Hypertext Transfer Protocol (HTTP) has been pressed into service as a general-purpose application protocol for many different kinds of applications, ranging from the Internet Printing Protocol (IPP) [1] to the Simple Object Access Protocol (SOAP) [2]. This is great for application designers: it saves them the trouble of having to design a new protocol and allows them to reuse a lot of ideas and code.

HTTP has become the reuse platform of choice, largely because:

- It is familiar.
- It is ubiquitous.
- It has a simple request/response model.
- It usually works through firewalls.

These are all good reasons, and, if HTTP meets your communications requirements, you should use it. The problem is that the widespread availability of HTTP has become an excuse for not bothering to understand what the requirements really are.
It's easier to use HTTP, even if it's not a good fit, than to understand your requirements and design a protocol that does what you really need.

That's where BEEP comes in. It's a toolkit that you can use for building application protocols. It works well in a wide range of application domains, many of which weren't of interest when HTTP was being designed. BEEP's goal is simple: you, the protocol designer, focus on the protocol details for your problem domain, and BEEP takes care of the other details. It turns out that the vast majority of application protocols have more similarities than differences. The similarities primarily deal with "administrative overhead": things you need for a working system, but that aren't specific to the problem at hand. BEEP mechanizes the similar parts, and lets you focus on the interesting stuff.

Application Protocol Design

Let's assume, for the moment, that you don't see a good fit between the protocol functions you need and either the e-mail or the Web infrastructures. (We'll talk more about this later on in the section "The Problem Space".) It's time to make something new.

First, you decide that your protocol needs ordered, reliable delivery. This is a common requirement for most application protocols, including HTTP and the Simple Mail Transfer Protocol (SMTP) [3]. The easiest way to get this is to layer the protocol over TCP. So, you decide to use TCP as the underlying transport for your protocol.

Of course, TCP sends data as an octet stream: there aren't any delimiters that TCP uses to indicate where one of your application's messages ends and another one begins. This means you have to design a framing mechanism that your application uses with TCP. That's pretty simple to do: HTTP uses an octet count and SMTP uses a delimiter with quoting.

Since TCP is just sending bytes for you, you need to not only frame messages, but have a way of marking what's in each message. (For example, a data structure, an image, some text, and so on.)
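
The two framing styles mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the counted format (a decimal length on its own line) is a simplified stand-in rather than HTTP's actual wire format. The delimited format follows the familiar mail convention of ending a message with a lone "." line and doubling any leading "." in the content.

```python
def frame_counted(payload: bytes) -> bytes:
    """Octet-count framing: a decimal length, CRLF, then exactly that many octets."""
    return str(len(payload)).encode("ascii") + b"\r\n" + payload


def read_counted(stream: bytes) -> tuple[bytes, bytes]:
    """Return (message, rest_of_stream) for one counted frame."""
    header, _, rest = stream.partition(b"\r\n")
    size = int(header)
    if len(rest) < size:
        raise ValueError("incomplete frame")
    return rest[:size], rest[size:]


def frame_delimited(lines: list[bytes]) -> bytes:
    """Delimiter framing with quoting: a lone '.' line ends the message,
    so any content line that starts with '.' gets an extra '.' prepended."""
    out = []
    for line in lines:
        out.append((b"." + line if line.startswith(b".") else line) + b"\r\n")
    out.append(b".\r\n")
    return b"".join(out)


def read_delimited(stream: bytes) -> tuple[list[bytes], bytes]:
    """Return (lines, rest_of_stream) for one delimited message."""
    lines = []
    rest = stream
    while True:
        line, sep, rest = rest.partition(b"\r\n")
        if not sep:
            raise ValueError("incomplete message")
        if line == b".":
            return lines, rest
        # Undo the quoting applied by frame_delimited
        lines.append(line[1:] if line.startswith(b".") else line)
```

Either style tells the receiver where a message ends. Neither says anything about what the bytes inside the message are, which is the labeling problem just raised.
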
This means you have to design an encoding mechanism that your application uses with the framing mechanism. That's also pretty simple to do—HTTP and SMTP both use Multipurpose Internet Mail Extensions (MIME).[4] Back in the early 1980s, when I was a young (but exceptionally cynical) computer scientist, my advisor told me that protocols have two parts: data and control. It looks like the data part is taken care of with MIME, so it's onto the control part. If you are fortunate enough to know ahead of time every operation and option that your protocol will ever support, there's no need for any kind of capabilities negotiation. In other words, your protocol doesn't need anything that lets the participants tell each other which operations and options are supported. (Of course, if this is the case, you have total recall of future events, and really ought to be making the big money in another, more speculative, field.) The purpose of negotiation is to find common ground between two different implementations of a protocol (or two different versions of the same implementation). There are lots of different ways of doing this and, unfortunately, most of them don't work very well. SMTP is a really long-lived, well-deployed protocol, and it seems to do a pretty good job of negotiations. The basic idea is for the server to tell the client what capabilities it supports when a connection is established, and then for the client to use a subset of that. Well, that's just the first control issue. The next deals with when it's time for the connection to be released. Sometimes this is initiated by the protocol, and sometimes it's required by TCP because the network is unresponsive. To further complicate things, if the release is initiated by the protocol, maybe one of the computers hasn't finished working on something, so it doesn't want to release the connection just yet. 
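
A negotiated release of the kind just described might look like the following Python sketch. The `Peer` class, its `pending` set, and the "accept"/"decline" answers are all hypothetical names chosen for illustration; they are not taken from any particular protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """One end of a connection; 'pending' holds work not yet finished."""
    name: str
    pending: set[str] = field(default_factory=set)

    def on_release_request(self) -> str:
        # Decline while work is outstanding, so the other side knows
        # the connection is still needed rather than having to guess.
        if self.pending:
            return "decline"
        return "accept"


def request_release(asker: Peer, other: Peer) -> bool:
    """Return True only when both sides agree that the connection can go away."""
    if asker.pending:
        return False
    return other.on_release_request() == "accept"
```

The point of the explicit answer is that the asking side learns why the connection is staying open instead of inferring it from silence.
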
Some application protocols don't do any negotiation on connection release, and just rely on TCP to indicate that it's time to go away—even though this is inherently ambiguous. Is ambiguity a good thing in a protocol? Computers lack subtlety and nuance, so in protocols between computers, ambiguity is a bad thing. For example, in HTTP 1.0 (and earlier), you often didn't know whether a response was truncated or not. For a more concrete example, interested readers will be amused by page 2 of RFC 962.[5] The final control issue deals with what happens between connection establishment and release. Most application protocols tend to be client/server in nature: one computer establishes a connection, sends some requests, gets back responses, and then releases the connection. But, are the requests and responses handled one at a time (in lock-step), or can multiple requests be outstanding, either in transit or being processed, at the same time (asynchronously)? In the original SMTP, the lock-step model was implicitly assumed by most implementors; later on, SMTP introduced a capability to allow limited pipelining. Regardless, as soon as we move away from lock-stepping, it looks as though we'll need some way of correlating requests and responses. Although this is a step in the right direction, some application protocols need even more support for asynchrony. The reasoning is a little convoluted, but it all comes down to performance. There's a lot of overhead involved in terms of establishing a connection and getting the right user state, so it makes sense to maximize the number of transactions that get done in a single connection. While this helps in terms of overall efficiency, if the transactions are handled serially, then transactional latency—the time it takes to transit the network, process the transaction, and then transit back—isn't reduced (and may even be increased); a transaction might be blocked while waiting for another to complete. 
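
To make the correlation requirement concrete, here is a minimal Python sketch that tags each request with a message number and matches each reply against it. The `Correlator` class and its method names are assumptions made for this example only.

```python
import itertools


class Correlator:
    """Tags each request with a message number and matches replies to it."""

    def __init__(self) -> None:
        self._numbers = itertools.count(1)
        self._outstanding: dict[int, str] = {}

    def send(self, request: str) -> int:
        """Record a request and return the message number it was given."""
        msgno = next(self._numbers)
        self._outstanding[msgno] = request
        return msgno

    def receive(self, msgno: int, reply: str) -> tuple[str, str]:
        """Pair a reply with its request; replies may arrive in any order."""
        # pop() raises KeyError for a reply that matches no outstanding request
        request = self._outstanding.pop(msgno)
        return request, reply

    def outstanding(self) -> list[int]:
        """Message numbers still waiting for a reply."""
        return sorted(self._outstanding)
```

Correlation tells you which reply belongs to which request, but on its own it does nothing for a transaction that is stuck waiting behind a slower one.
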
The solution is to be able to handle transactions in parallel.

Earlier I mentioned how, back in the 1980s, protocols had two parts, data and control. Today, things have changed. First of all, I'm still cynical, but more comfortable with it, and, perhaps as important, many might argue that protocols now have a third part, namely security. The really unfortunate part is that security is a moving target on two fronts:

- When you deploy your protocol in different environments, you may have different security requirements.
- Even in the same environment, security requirements change over time.

This introduces something of a paradox: modern thinking is that security must be tightly integrated with your protocol, but at the same time, you have to take a modular approach to the actual technology to allow for easy upgrades. Worse, it's very easy to get security very wrong. (Just ask any major computer vendor!) Few applications folks are also expert in protocol security, and obtaining that expertise is a time-consuming, thankless task, so there's a lot of benefit in having a security mechanism menu, developed by security experts, that applications folk can pick from.

Now the good news: there's already something around designed to meet just those requirements. It's called the Simple Authentication and Security Layer (SASL), and a lot of existing application protocols have been retrofitted over the last four years to make use of it.

Well, let's see what all this means.
Without ever having talked about what your application protocol is going to do to earn a living, we have to develop solutions for:

- Framing messages
- Encoding data
- Negotiating capabilities (versions and options)
- Negotiating connection release
- Correlating requests and responses
- Handling multiple outstanding requests (pipelining)
- Handling multiple asynchronous requests (multiplexing)
- Providing integrated and modular security
- Integrating all these things together into a single, coherent framework

So, going back to the question "Why use BEEP?", the answer is pretty simple: if you use BEEP, you simply don't have to think about any of these things. They automatically get taken care of.

Now maybe you're the kind of hardcore engineer that really wants to solve these problems yourself. Okay, go right ahead! But first, I'll let you in on a little secret: engineers have been solving these problems since 1972. In fact, they keep solving them over and over again. For each problem, there are usually two or three good solutions, and while individual tastes may vary, the sad fact is that you can make any of them work great if you're willing to put in the hours. But why put in the hours if they have nothing to do with the primary reason for writing the application protocol to begin with? Isn't there something more productive that you'd care to do with your life than design yet another framing protocol?

So, what's really new about BEEP? The short answer is: not much. The innovative part is that some folks sat down, did an analysis of the problems and solutions, and came up with an integrated framework that put it all together. That's not really innovation, but it's really good news if you're already familiar with the building blocks that BEEP uses.

Doesn't all this stuff add a lot of overhead? The short answer is: nope. The reason is a little more complex. BEEP is fairly minimalistic: it provides a simple mechanism for negotiating things on an à la carte basis.
If you don't want privacy, no problem; don't turn it on. If you don't want parallelism, that's easy; just say "no" if the other computer asks for it. The trick here is two-fold:

- BEEP's inner mechanisms (for example, framing) are pretty light-weight, so you don't incur a lot of overhead using them (even if you don't use all the functionality they provide).
- BEEP's outer mechanisms (for example, encryption) are all controlled via bilateral negotiation, so you can decide exactly what you want to get and pay for.

There's no free lunch, but if you want to start with something "lean and mean," BEEP doesn't slow you down, and when you want to bulk up (say, by adding privacy), BEEP lets you negotiate it. You incur only the overhead you need. (This overhead will show up, regardless of whether you use BEEP or grow your own mechanisms.)

It turns out that this philosophy can yield some interesting results. For example, take a look at this high-level scripting fragment:

    ::init -server example.com -port 10288 -privacy strong

This fragment is invoking a procedure to establish a BEEP session. With the exception of the last two terms, it looks pretty conventional. The last two terms tell the procedure to "tune" the session by looking at the security protocols supported in common, selecting one that supports "strong privacy," and then negotiating its use.

What's interesting here is that neither the person who designed the application protocol nor the person who wrote the application making the procedure call has to be a security expert. The choice to use strong privacy, and how it gets transparently used, is all an issue of provisioning. Of course, the application protocol designer may still provide security guidelines to the implementor; naturally, the implementor may bundle a wide range of security protocols with the code. However, and this is key, everyone got to focus on what they do best (even the security guys), and it still comes together into a working system.
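
The selection step behind that "tuning" can be sketched in Python as follows. Everything here is hypothetical: the privacy levels, the profile names, and the rule of preferring the weakest profile that still qualifies are assumptions for illustration, not how any BEEP library actually chooses.

```python
from typing import Optional

# Hypothetical privacy levels, for illustration only.
LEVELS = {"none": 0, "weak": 1, "strong": 2}


def choose_profile(local: dict[str, str], remote: set[str], wanted: str) -> Optional[str]:
    """Pick a profile both sides support that gives at least the wanted privacy.

    local  maps each locally installed profile to the privacy level it offers.
    remote is the set of profile names the other side advertised.
    """
    candidates = [
        name for name, level in local.items()
        if name in remote and LEVELS[level] >= LEVELS[wanted]
    ]
    # Prefer the weakest profile that still qualifies; sort by name to break ties.
    candidates.sort(key=lambda name: (LEVELS[local[name]], name))
    return candidates[0] if candidates else None
```

Which profiles appear in `local` is a provisioning decision, so the caller only states the level it wants and never names a mechanism.
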
The cool part here is how easily this all integrates into an evolving protocol. Back in the good ol' days (say the mid-1980s) when the Post Office Protocol (POP) [6] was defined, this kind of flexibility wasn't available. Whenever someone wanted to add a new security mechanism for authentication or privacy, you had to muck with the entire protocol. With BEEP's framework, you just add a module that works seamlessly with the rest of the protocol. This means less work for everyone, and presumably fewer mistakes getting the work done.

Now we've come full circle: the reason for using BEEP is that it makes it a lot easier to specify, develop, maintain, and evolve new application protocols.

The Problem Space

BEEP works for a large class of application protocols. However, you should always use the right tool for the right job. Before you start using BEEP for a project, you should ask yourself whether your application protocol is a good fit for either the e-mail or Web models.

Dave Crocker, one of the Internet's progenitors, suggests that network applications can be broadly distinguished by five operational characteristics:

- Server push or client pull
- Synchronous (interactive) or asynchronous (batch)
- Time-assured or time-insensitive
- Best-effort or reliable
- Stateful or stateless

For example:

- The World Wide Web is a pull, synchronous, time-insensitive, reliable, stateless service.
- Internet mail is a push, asynchronous, time-insensitive, best-effort, stateless service.

This is a pretty useful taxonomy. So, your first step is to see whether either of these existing infrastructures meets your requirements. It's easiest to start by asking if your application can reside on top of e-mail. Typically, the unpredictable latency of the Internet mail infrastructure raises the largest issues; however, in some cases it's a non-issue. For example, in the early 1990s, some of the earliest business-to-business exchanges were operated over e-mail (for example, USC/ISI's FAST project).
If you can find a good fit between your application and Internet e-mail, use it!

More likely, though, you'll be tempted to use the Web infrastructure, and there are a lot of awfully good reasons to do so. After all, when you use HTTP:

- There are lots of tools (libraries, servers, etc.) to choose from.
- It's easy to prototype stuff.
- There's already a security model.
- You can traverse firewalls pretty easily.

All of this boils down to one simple fact: it is pretty easy to deploy things in the Web infrastructure. The real issue is whether you can make good use of this infrastructure.

HTTP was originally developed for retrieving documents in a LAN environment, so HTTP's interaction model is optimized for that application. Accordingly, in HTTP:

- Each session consists of a single request/response exchange.
- The computer that initiates the session is also the one that initiates the request.

What needs to be emphasized here is that this is a perfectly fine interaction model for HTTP's target application, as well as many other application domains.

The problem arises when the behavior of your application protocol doesn't match this interaction model. In this case, there are two choices: make use of HTTP's extensibility features, or simply make do. Obviously, each choice has some drawbacks. The problem with using HTTP's extensibility features is that it pretty much negates the ability to use the existing HTTP infrastructure; the problem with "just making do" is that you end up crippling your protocol. For example, if your application protocol needs asynchronous notifications, you're out of luck.

A second problem arises due to "the law of codepaths." The HTTP 1.1 specification, RFC 2616 [10], is fairly rigorous. Even so, few implementors take the time to think out many of the nuances of the protocol. For example, the typical HTTP transaction consists of a small request, which results in a (much) larger response.
Talk to any engineer who's worked on a browser and they'll tell you this is "obvious." So, what happens when the "obvious" doesn't happen?

Some time ago, folks wanted a standardized protocol for talking to networked printers. The result was something called the Internet Printing Protocol (IPP) [1]. IPP sits on top of HTTP. At this point, the old "obvious" thing (small request, big response) gets replaced with the new "obvious" thing: the request contains an arbitrarily large file to be printed, and the response contains this tiny little status indication. A surprising amount of HTTP software doesn't handle this situation particularly gracefully (that is, long requests get silently truncated). The moral is that even though HTTP's interaction model doesn't play favorites with respect to lengthy requests or responses, many HTTP implementors inadvertently make unfortunate assumptions.

A third problem deals with the unitary relationship between sessions and exchanges. If a single transaction needs to consist of more than one exchange, it has to be spread out over multiple sessions. This introduces two issues:

- In terms of stateful behavior, the server computer has to be able to keep track of session state across multiple connections, imposing a significant burden both on the correctness and implementation of the protocol (for example, to properly handle time-outs).
- In terms of performance, TCP isn't designed for dealing with back-to-back connections: there's a fair amount of overhead and latency involved in establishing a connection. This is also true for the security protocols that layer on top of TCP.

HTTP 1.1 begins to address these issues by introducing persistent connections that allow multiple exchanges to occur serially over a single connection, but still the protocol lacks a session concept.
In practice, implementors try to bridge this gap by using "cookies" to manage session state, which introduces ad-hoc (in)security models that often result in security breakdowns (as a certain Web-based e-mail service provider found out).

This brings us to a more general fourth problem: although HTTP has a security model, it predates SASL. From a practical perspective, what this means is that it's very difficult to add new security protocols to HTTP. Of course, that may not be an issue for you.

If you can find a good fit between your application and the Web infrastructure, use it! (For those interested in a more architectural perspective on the reuse of the Web infrastructure for new application protocols, consider RFC 3205 [7].)

Okay, so we've talked about both the e-mail and Web infrastructures, and we've talked about what properties your application protocol needs to have in order to work well with them. So, if there isn't a good fit between either of them and your application protocol, what about BEEP?

BEEP's interaction model is pretty simple, with the following three properties:

- Each session consists of one or more request/response exchanges.
- Either computer can initiate requests or notifications.
- It's connection-oriented.

By using BEEP, you get an amortization effect with respect to the cost of connection establishment and state management. This is largely derived from the first property. Similarly, the second property gives BEEP its ability to support either peer-to-peer or client-server interactions. What we really need to explain is the connection-oriented part.

To begin, all three of the interaction models we've looked at (BEEP, e-mail, and the Web) are connection-oriented. (Although e-mail may get delivered out of order, the commands sent over each e-mail "hop" are processed in an ordered, reliable fashion.) The connection-oriented model is the most commonly used for application protocols, but it does introduce some restrictions.
A connection-oriented interaction model means that data is delivered reliably and in the same order as it was sent. If you don't require ordered, reliable delivery, you don't need a connection-oriented interaction model. For example, Internet telephony applications don't fit this model, nor do traditional multicast applications.

So, BEEP is suitable for unicast application protocols (two computers are talking to each other). However, not all unicast applications need a connection-oriented model; for example, the Domain Name System (DNS) manages name-to-address resolutions just fine without it. In fact, if your protocol is able to limit each session to exactly one request/response exchange with minimalist reliability requirements, and also limit the size of each message to around 65K octets, then it's probably a good candidate for using the User Datagram Protocol (UDP) instead.

The IETF and BEEP

BEEP is an emerging standard from the Internet Engineering Task Force (IETF). The IETF is a voluntary professional organization that develops many of the protocols running in the Internet. (Of course, anyone is free to develop their own protocols to run in their own little part of the Internet, but if you want multi-vendor support, you need an organization like the IETF.)

So why does the IETF care about BEEP? The answer is that the largest area in the IETF deals with application protocols. There are usually over two dozen working groups developing different application protocols. And, the IETF has been doing this for a long, long time.

It turns out that even though there are well-engineered solutions to the different overhead issues, BEEP is the first time that the IETF decided to develop a standard approach that integrates the best practices for each issue. Before BEEP, each working group would spend endless hours arguing about different solutions, and then, if any time was remaining, they might sit down and look at the actual problem domain.
(Okay, this is an exaggeration... but not by much!)

So, here's the process by which BEEP got designed:

1. Identify the common domain-independent problems.
2. Determine the best solution for each problem.
3. Integrate the solutions into a consistent framework.
4. Declare victory.

Now, the obvious question is: how do you determine what's "best?" The truth is that in some cases, the answer is obvious, and in other cases, the answer is arbitrary. (Protocol experts hate to admit this, but in some cases, there is no clear winner, and it's simply better to pick one and order another drink.) Since most of what BEEP does is hidden from the application designer and implementor, there's really not a lot of mileage in going through it here.

beepcore.org

Where can you find out more about BEEP? To start, you can always consult the two RFCs: the BEEP core framework [8] and BEEP's mapping onto TCP [9]. However, it's probably better to start with the BEEP community Web site http://beepcore.org where you'll find:

- News about BEEP meetings and events
- Information about BEEP projects, programmers, and consultants
- Information about beepcore (open source) and commercial software
- BEEP-related RFCs, Internet-Drafts, and whitepapers

[This article is adapted from BEEP: The Definitive Guide, by Marshall T. Rose, ISBN 0-596-00244-0, O'Reilly & Associates, 2002. Used with permission. http://www.oreilly.com/catalog/beep/]

References

[1] Herriot, R., Ed., Butler, S., Moore, P., Turner, R., "Internet Printing Protocol/1.0: Encoding and Transport," RFC 2565, April 1999.
[2] http://www.w3.org/TR/SOAP/
[3] Postel, J., "Simple Mail Transfer Protocol," RFC 821, August 1982.
[4] Freed, N., Borenstein, N., "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies," RFC 2045, November 1996.
[5] Padlipsky, M. A., "TCP-4 prime," RFC 962, November 1985.
[6] Rose, M. T., "Post Office Protocol: Version 3," RFC 1081, November 1988.
[7] Moore, K., "On the use of HTTP as a Substrate," RFC 3205, February 2002.
[8] Rose, M., "The Blocks Extensible Exchange Protocol Core," RFC 3080, March 2001.
[9] Rose, M., "Mapping the BEEP Core onto TCP," RFC 3081, March 2001.
[10] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., Berners-Lee, T., "Hypertext Transfer Protocol -- HTTP/1.1," RFC 2616, June 1999.

MARSHALL T. ROSE is the prime mover of the BEEP Protocol. In his former position as the Internet Engineering Task Force (IETF) area director for network management, he was one of a dozen individuals who oversaw the Internet's standardization process. Rose was responsible for the design, specification, and implementation of several Internet-standard technologies, and wrote more than 60 of the Internet's Requests For Comments (RFCs). With a Ph.D. in information and computer science from the University of California, Irvine, Rose is the author of several professional texts.
E-mail: mrose@dbc.mtview.ca.us