WebSockets represent a long-awaited evolution in client/server web technology. They allow a single, long-lived TCP connection to be established between the client and the server, over which bi-directional, full-duplex messages can be sent immediately with little overhead, resulting in a very low-latency connection.
WebSockets represent a big step in the evolution of the web. Just as AJAX changed the game in the mid-2000s, the ability to open bidirectional, low-latency connections enables a whole new generation of real-time web applications, including what I hope will be some pretty awesome games!
A Brief Web History | Developer’s Perspective
The Web wasn’t built to be all that dynamic. It was conceived as a collection of HyperText Markup Language (HTML) pages linking to one another to form a conceptual web of information. Over time, the static resources increased in number, and richer items, such as images, began to be part of the web fabric.
Over the following years, we saw cross-frame communication used in an attempt to avoid page reloads, followed by HTTP polling within frames. Things started to get interesting with the introduction of LiveConnect, then the forever frame technique, and finally, thanks to Microsoft, XMLHttpRequest, the foundation of AJAX.
The WebSockets Protocol
The WebSocket Protocol enables two-way communication between a client running untrusted code in a controlled environment and a remote host that has opted in to communications from that code. The security model used for this is the Origin-based security model commonly used by Web browsers. The protocol consists of an opening handshake followed by basic message framing, layered over TCP.
In theory, any transport protocol could be used so long as it provides reliable transport, is byte clean, and supports relatively large message sizes. However, this article considers only TCP. The goal of this technology is to provide a mechanism for browser-based applications that need two-way communication with servers, one that does not rely on opening multiple HTTP connections (e.g. using XMLHttpRequest or <iframe>s and long polling).
Historically, creating an instant messenger chat client as a Web application has required an abuse of HTTP: polling the server for updates while sending upstream notifications as distinct HTTP calls [RFC6202].
This results in a variety of problems:
- The server is forced to use a number of different underlying TCP connections for each client: one for sending information to the client, and a new one for each incoming message.
- The wire protocol has a high overhead, with each client-to-server message having an HTTP header.
- The client-side script is forced to maintain a mapping from the outgoing connections to the incoming connection to track replies.
A simpler solution would be to use a single TCP connection for traffic in both directions. This is what the WebSocket protocol provides. Combined with the WebSocket API, it provides an alternative to HTTP polling for two-way communication from a Web page to a remote server [WSAPI].
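To give a feel for how small the per-message overhead is compared with an HTTP header, here is a minimal sketch of the framing for a short text message. It assumes an unmasked server-to-client frame (frames sent by a client must additionally be masked) and a payload under 126 bytes; encodeTextFrame is a hypothetical helper name, not part of any API.

```javascript
// Sketch: encode a short, unmasked text frame as a server would send it.
// encodeTextFrame is a hypothetical helper. Payloads of 126 bytes or more
// need the extended length fields, which this sketch does not cover.
function encodeTextFrame(text) {
  const payload = new TextEncoder().encode(text);
  if (payload.length > 125) {
    throw new RangeError('payload too long for this sketch');
  }
  const frame = new Uint8Array(2 + payload.length);
  frame[0] = 0x81;            // FIN bit set, opcode 0x1 (text frame)
  frame[1] = payload.length;  // MASK bit clear, 7-bit payload length
  frame.set(payload, 2);
  return frame;
}

const helloFrame = encodeTextFrame('Hello');
// Two header bytes in front of the five payload bytes.
console.log(helloFrame.length - 5);
```

For a message this size, the envelope is two bytes, where an HTTP request would carry a full set of headers.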
The same technique can be used for a variety of Web applications: games, stock tickers, multiuser applications with simultaneous editing, user interfaces exposing server-side services in real-time, etc. The protocol has two parts: a handshake, and then the data transfer.
The handshake from the client looks as follows:
    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13
The handshake from the server looks as follows:
    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    Sec-WebSocket-Protocol: chat
The leading line from the client follows the Request-Line format. The leading line from the server follows the Status-Line format. The Request-Line and Status-Line productions are defined in [RFC2616].
After the leading line in both cases comes an unordered set of header fields. The meaning of these header fields is specified in the WebSocket protocol specification (RFC 6455). Additional header fields may also be present, such as cookies [I-D.ietf-httpstate-cookie] required to identify the user. The format and parsing of headers are as defined in [RFC2616].
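The Sec-WebSocket-Accept value in the server's reply is derived from the client's Sec-WebSocket-Key: the server appends a fixed GUID defined by RFC 6455, takes the SHA-1 hash, and base64-encodes it. The sketch below assumes Node.js and its built-in crypto module; computeAcceptValue is a hypothetical helper name.

```javascript
const { createHash } = require('node:crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// computeAcceptValue is a hypothetical helper name, not a library function.
function computeAcceptValue(secWebSocketKey) {
  return createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Using the key from the sample handshake above:
console.log(computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ=='));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client compares this value with what it expects, which lets it confirm that the server really understood the WebSocket handshake rather than echoing back a cached HTTP response.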
The Kind of Apps You Can Build with WebSockets
So why would you want to use WebSockets (or something like them)? It’s not really about WebSockets. Rather, it is about what you are trying to do at the TCP layer, such as sending and receiving small data packages and making them reliable and available across a number of devices.
The ability to push a signal to a device as quickly as possible makes WebSockets one of many solutions for pushing data between two devices. They are the blueprint for creating real-time applications on both web and mobile (pretty much anything with a server and a client).
The WebSocket protocol and API have both been standardized, the protocol by the IETF (RFC 6455) and the API by the W3C, and have established themselves as a standard for real-time functionality in web, desktop, and mobile apps.
Some advantages of WebSockets include:
- Cross-origin communication (however this poses security risks)
- Cross-platform compatibility (web, desktop, mobile)
- Lightweight envelope when passing messages
However, describing WebSockets as the standard for data push and real-time communication, as is common around the web today, is somewhat of a misnomer. Some open-source solutions aside, WebSockets are just one part of the puzzle when developing real-time applications.
There is a slew of operational issues a developer may run into when using WebSockets as their real-time solution, especially as the app scales and the user base grows.
Consider the following:
- Network topology
- Kernel configs
- Load testing
- Scaling, redundancy, load balancing, replication
Overall, WebSockets are a powerful tool for adding real-time functionality to a web or mobile application. You can also check out in detail how to take WebSockets to the next level with PubNub.
WebSockets don’t make AJAX obsolete, but they do supersede Comet (HTTP long-polling/HTTP streaming) as the solution of choice for true real-time functionality. AJAX should still be used for making short-lived web service calls, and if we eventually see good uptake of CORS-supporting web services, it will become even more useful.
How WebSockets Work
WebSockets should now be the go-to standard for real-time functionality, since they offer low-latency, bi-directional communication over a single connection. Even if a web browser doesn’t natively support the WebSocket object, there are polyfill fallback options which all but guarantee any web browser can actually establish a WebSocket connection.
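Before reaching for a fallback, you would typically check whether the native object exists. A minimal sketch, assuming the check is done against the global object (window in a browser); supportsWebSocket is a hypothetical helper name.

```javascript
// supportsWebSocket is a hypothetical helper; pass `window` in a browser.
function supportsWebSocket(globalObject) {
  return typeof globalObject.WebSocket === 'function';
}

if (!supportsWebSocket(globalThis)) {
  // This is where a polyfill or fallback transport would be loaded.
  console.log('No native WebSocket support detected.');
}
```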
WebSockets provide a new protocol between client and server that runs over a persistent TCP connection. Because it is an independent TCP-based protocol, it doesn’t strictly require HTTP tunneling (similar to Netflix and other streaming services), which simplifies communication when sending messages.
WebSockets come after many other technologies that allow servers to send information to the client. Web applications that use Comet/Ajax, push/pull, and long polling all do this over HTTP. Apart from the handshake, which uses the HTTP Upgrade header, WebSockets are independent of HTTP.
The client establishes a WebSocket connection through a process known as the WebSocket handshake. This process starts with the client sending a regular HTTP request to the server. An Upgrade header is included in this request that informs the server that the client wishes to establish a WebSocket connection.
Here is a simplified example of the initial request headers.
    GET ws://websocket.example.com/ HTTP/1.1
    Origin: http://example.com
    Connection: Upgrade
    Host: websocket.example.com
    Upgrade: websocket
Note: WebSocket URLs use the ws scheme. There is also wss for secure WebSocket connections, which is the equivalent of HTTPS.
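Because the two schemes map one-to-one onto their HTTP counterparts, a page can derive its WebSocket URL from its own address. A small sketch, assuming the input is an absolute http or https URL; toWebSocketUrl is a hypothetical helper name.

```javascript
// toWebSocketUrl is a hypothetical helper: http -> ws, https -> wss.
function toWebSocketUrl(pageUrl) {
  if (!/^https?:\/\//i.test(pageUrl)) {
    throw new TypeError('expected an absolute http(s) URL');
  }
  // Replacing the leading "http" keeps the trailing "s" of https intact.
  return pageUrl.replace(/^http/i, 'ws');
}

console.log(toWebSocketUrl('https://example.com/chat')); // wss://example.com/chat
```

Using wss from pages served over HTTPS also avoids mixed-content blocking in browsers.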
If the server supports the WebSocket protocol, it agrees to the upgrade and communicates this through an Upgrade header in the response.
    HTTP/1.1 101 WebSocket Protocol Handshake
    Date: Wed, 16 Oct 2013 10:07:34 GMT
    Connection: Upgrade
    Upgrade: WebSocket
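On the server side, deciding whether to agree to the upgrade comes down to inspecting the request headers. The sketch below is one way to do that check; it assumes header names have already been lower-cased into a plain object (as Node's http module does) and that a full request also carries a Sec-WebSocket-Key header, which the simplified example above omits. isWebSocketUpgrade is a hypothetical helper name.

```javascript
// isWebSocketUpgrade is a hypothetical helper. `headers` is assumed to be a
// plain object keyed by lower-cased header names.
function isWebSocketUpgrade(method, headers) {
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  // Connection may list several tokens, e.g. "keep-alive, Upgrade".
  const connectionTokens = (headers['connection'] || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());
  return (
    method === 'GET' &&
    upgrade === 'websocket' &&
    connectionTokens.includes('upgrade') &&
    typeof headers['sec-websocket-key'] === 'string'
  );
}

console.log(isWebSocketUpgrade('GET', {
  upgrade: 'websocket',
  connection: 'keep-alive, Upgrade',
  'sec-websocket-key': 'dGhlIHNhbXBsZSBub25jZQ==',
}));
```

Header values are compared case-insensitively here, which is why both "websocket" and "WebSocket" are accepted.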
The closing handshake is far simpler than the opening handshake. Either peer can send a control frame with data containing a specified control sequence to begin the closing handshake (detailed in Section 5.5.1 of RFC 6455). Upon receiving such a frame, the other peer sends a close frame in response, if it hasn’t already sent one.
Upon receiving that control frame, the first peer then closes the connection, safe in the knowledge that no further data is forthcoming. After sending a control frame indicating the connection should be closed, a peer does not send any further data; after receiving a control frame indicating the connection should be closed, a peer discards any further data received.
It is safe for both peers to initiate this handshake simultaneously. The closing handshake is intended to complement the TCP closing handshake (FIN/ACK), on the basis that the TCP closing handshake is not always reliable end-to-end, especially in the presence of man-in-the-middle proxies and other intermediaries.
By sending a close frame and waiting for a close frame in response, certain cases are avoided where data may be unnecessarily lost. For instance, on some platforms, if a socket is closed with data in the receive queue, an RST packet is sent, which will then cause recv() to fail for the party that received the RST, even if there was data waiting to be read.
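The data carried by a close frame is small: in RFC 6455 it is a two-byte status code in network byte order, optionally followed by a UTF-8 reason. Here is a rough sketch of encoding and decoding that payload; both function names are hypothetical, and the frame header itself is left out.

```javascript
// Hypothetical helpers for the body of a close frame (header not included).
function encodeClosePayload(code, reason = '') {
  const reasonBytes = new TextEncoder().encode(reason);
  // Control frame payloads are limited to 125 bytes, 2 of which hold the code.
  if (reasonBytes.length > 123) {
    throw new RangeError('close reason too long');
  }
  const payload = new Uint8Array(2 + reasonBytes.length);
  new DataView(payload.buffer).setUint16(0, code); // big-endian by default
  payload.set(reasonBytes, 2);
  return payload;
}

function decodeClosePayload(payload) {
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  return {
    code: view.getUint16(0),
    reason: new TextDecoder().decode(payload.subarray(2)),
  };
}

// 1000 is the status code for a normal closure.
const closeBody = encodeClosePayload(1000, 'done');
console.log(decodeClosePayload(closeBody));
```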
Creating And Opening Connections
At this point, either party can start sending data. With WebSockets, you can transfer as much data as you like without incurring the overhead associated with traditional HTTP requests. A connection can be made to a port that is shared by an HTTP server (a situation that is quite likely to occur with traffic to ports 80 and 443).
As such, the connection will appear to the HTTP server to be a regular GET request with an Upgrade offer. In relatively simple setups with just one IP address and a single server for all traffic to a single hostname, this might allow a practical way for systems based on the WebSocket protocol to be deployed.
In more elaborate setups (e.g. with load balancers and multiple servers), a dedicated set of hosts for WebSocket connections, separate from the HTTP servers, is probably easier to manage.
The WebSocket specification notes that, at the time it was written, connections on ports 80 and 443 had significantly different success rates, with connections on port 443 significantly more likely to succeed, though this may change with time.
Creating WebSocket connections is really simple. All you have to do is call the WebSocket constructor and pass in the URL of your server. Copy the following code into your app.js file to create a new WebSocket connection.
    // Create a new WebSocket.
    var socket = new WebSocket('ws://echo.websocket.org');
Once the connection has been established, the open event will be fired on your WebSocket instance. You can handle any errors that occur by listening out for the error event.
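As a rough sketch, the two listeners could be attached like this. To keep the example runnable without a server, it uses a plain EventTarget as a stand-in for the socket; in the browser you would pass your WebSocket instance instead. attachLifecycleLogging and the log messages are hypothetical.

```javascript
// attachLifecycleLogging is a hypothetical helper. `socket` can be any
// EventTarget, such as a browser WebSocket instance.
function attachLifecycleLogging(socket, log) {
  socket.addEventListener('open', () => log('connection open'));
  socket.addEventListener('error', () => log('WebSocket error'));
}

// Stand-in for a real socket so the sketch runs without a server.
const standInSocket = new EventTarget();
const statusLines = [];
attachLifecycleLogging(standInSocket, (line) => statusLines.push(line));

// Simulate the browser firing the open event.
standInSocket.dispatchEvent(new Event('open'));
console.log(statusLines);
```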
To send a message through the WebSocket connection, you call the send() method on your WebSocket instance, passing in the data you want to transfer.
You can send both text and binary data through a WebSocket. When the form is submitted, your code should retrieve the message from the messageField and send it through the WebSocket.
The message is then added to the messagesList and displayed on the screen. To finish up, the value of messageField is reset, ready for the user to type in a new message.
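A minimal sketch of the sending step follows. It assumes you only want to send when the connection is open (readyState 1) and the message is not blank; sendIfOpen and the fake socket used to exercise it are hypothetical, and the form wiring is shown only as a comment because it needs a page to run in.

```javascript
const OPEN = 1; // same value as WebSocket.OPEN

// sendIfOpen is a hypothetical helper. It returns true when the message
// was handed to the socket, false otherwise.
function sendIfOpen(socket, message) {
  const text = message.trim();
  if (socket.readyState !== OPEN || text === '') {
    return false;
  }
  socket.send(text);
  return true;
}

// In the page this might be wired to the form (element names are assumed):
//   form.onsubmit = function (e) {
//     e.preventDefault();
//     if (sendIfOpen(socket, messageField.value)) messageField.value = '';
//   };

// Fake socket so the sketch runs without a server.
const fakeSocket = { readyState: OPEN, sent: [], send(m) { this.sent.push(m); } };
console.log(sendIfOpen(fakeSocket, ' Hello! '), fakeSocket.sent);
```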
When a message is received, the message event is fired. This event includes a property called data that can be used to access the contents of the message. Your code should then retrieve the message from the event and display it in the messagesList. Once you’re done with your WebSocket, you can terminate the connection using the close() method.
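Receiving and closing can be sketched along the same lines. Here an array stands in for the messagesList element, and the event and socket are plain objects so the example runs outside a browser; onMessage and closeNormally are hypothetical names.

```javascript
// Hypothetical handlers. `messages` stands in for the messagesList element.
function onMessage(event, messages) {
  messages.push('Received: ' + event.data);
}

function closeNormally(socket) {
  socket.close(1000, 'done'); // 1000 is the normal-closure status code
}

// Browser wiring (assumes `socket` is an open WebSocket):
//   socket.addEventListener('message', function (e) { onMessage(e, received); });
//   closeNormally(socket);

const received = [];
onMessage({ data: 'hello' }, received);
console.log(received);
```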
After the connection has been closed, the browser will fire the close event. Attaching an event listener to the close event allows you to perform any clean-up that you might need to do.
The developer tools in Google Chrome include a feature for monitoring traffic through a WebSocket. You can access this tool by following these steps:
- Open up the Developer Tools.
- Switch to the Network tab.
- Click on the entry for your WebSocket connection.
- Switch to the Messages tab (called Frames in older versions of Chrome).
These tools will show you a summary of all the data sent through the connection.
WebSockets on the Server
In this article, I have mainly focused on how to use WebSockets from a client-side perspective. If you’re looking to build your own WebSocket server, there are plenty of tools and libraries out there that can help you out. One of the most popular is socket.io, a Node.js library that provides cross-browser fallbacks so you can confidently use WebSockets in your apps today.
Some other libraries include:
- C++: libwebsockets
- Erlang: Shirasu.ws
- Java: Jetty
- Node.JS: ws
- Ruby: em-websocket
- Python: Tornado, pywebsocket
- PHP: Ratchet, phpws
Overall, WebSockets represent a standard for bi-directional real-time communication between servers and clients, firstly in web browsers, but ultimately between any server and any client. The standards-first approach means that, as developers, we can finally create functionality that works consistently across multiple platforms. Connection limitations are no longer a problem, since a WebSocket uses a single TCP socket connection.
Cross-domain communication has been considered from day one, and it is dealt with within the connection handshake. This means that services such as Pusher can easily use WebSockets when offering a massively scalable real-time platform that can be used by any website, or by any web, desktop, or mobile application.
- High-Performance Browser Networking: WebSocket
- IETF WebSocket Protocol
- WebSockets (MDN)
- WebSocket Specification (WHATWG)
- Can I Use: Web Sockets