How Do WebSockets Work?

WebSockets represent a long-awaited evolution in client/server web technology. They allow a single, long-lived TCP socket connection to be established between the client and server, over which bi-directional, full-duplex messages can be distributed instantly with little overhead, resulting in a very low latency connection.

In addition to native WebSocket support in browsers such as Google Chrome, Firefox, and Opera, and a prototype Silverlight to JavaScript bridge implementation for Internet Explorer, there are now WebSocket library implementations in Objective-C, .NET, Ruby, Java, Node.js, ActionScript, and many other languages.

WebSockets represent a big step in the evolution of the internet. Just as AJAX changed the game in the mid-2000s, the ability to open bidirectional, low latency connections enables a whole new generation of real-time web applications, including what I hope will be some pretty awesome games!

A Brief Web History | Developer’s Perspective

The Web wasn’t built to be all that dynamic. It was conceived to be a collection of HyperText Markup Language (HTML) pages linking to one another to form a conceptual web of information. Over time the static resources increased in number and richer items, such as images, began to be part of the web fabric.

Server technologies advanced allowing for dynamic server pages – pages whose content was generated based on a query. Soon the requirement to have more dynamic web pages led to the availability of Dynamic HyperText Markup Language (DHTML) all thanks to JavaScript (let’s pretend VBScript never existed).

Over the following years, we saw cross-frame communication in an attempt to avoid page reloads, followed by HTTP polling within frames. Things started to get interesting with the introduction of LiveConnect, then the forever frame technique, and finally, thanks to Microsoft, we ended up with the XMLHttpRequest object and therefore Asynchronous JavaScript and XML (AJAX).

In turn, AJAX made XHR Long-Polling and XHR Streaming possible. But none of these solutions offered a truly standardized cross-browser solution to real-time bi-directional communication between a server and a client.

The WebSockets Protocol

The WebSocket Protocol enables two-way communication between a client running untrusted code in a controlled environment and a remote host that has opted in to communications from that code. The security model used for this is the Origin-based security model commonly used by Web browsers. The protocol consists of an opening handshake followed by basic message framing, layered over TCP.

In theory, any transport protocol could be used so long as it provides reliable transport, is byte clean, and supports relatively large message sizes. However, the protocol as specified considers only TCP. The goal of this technology is to provide a mechanism for browser-based applications that need two-way communication with servers without relying on opening multiple HTTP connections (e.g. using XMLHttpRequest or <iframe>s and long polling).

Historically, creating an instant messenger chat client as a Web application has required an abuse of HTTP: polling the server for updates while sending upstream notifications as distinct HTTP calls [RFC6202].

This results in a variety of problems:

The server is forced to use a number of different underlying TCP connections for each client: one for sending information to the client, and a new one for each incoming message.
The wire protocol has a high overhead, with each client-to-server message having an HTTP header.
The client-side script is forced to maintain a mapping from the outgoing connections to the incoming connection to track replies.

A simpler solution would be to use a single TCP connection for traffic in both directions. This is what the WebSocket protocol provides. Combined with the WebSocket API, it provides an alternative to HTTP polling for two-way communication from a Web page to a remote server [WSAPI].

The same technique can be used for a variety of Web applications: games, stock tickers, multiuser applications with simultaneous editing, user interfaces exposing server-side services in real-time, etc. The protocol has two parts: a handshake, and then the data transfer.

The handshake from the client looks as follows:

     GET /chat HTTP/1.1
     Host: server.example.com
     Upgrade: websocket
     Connection: Upgrade
     Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
     Origin: http://example.com
     Sec-WebSocket-Protocol: chat, superchat
     Sec-WebSocket-Version: 13

The handshake from the server looks as follows:

     HTTP/1.1 101 Switching Protocols
     Upgrade: websocket
     Connection: Upgrade
     Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
     Sec-WebSocket-Protocol: chat

The leading line from the client follows the Request-Line format. The leading line from the server follows the Status-Line format. The Request-Line and Status-Line productions are defined in [RFC2616].

After the leading line in both cases comes an unordered set of header fields. The meaning of these header fields is specified in the WebSocket protocol specification. Additional header fields may also be present, such as cookies [I-D.ietf-httpstate-cookie] required to identify the user. The format and parsing of headers are as defined in [RFC2616].
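The Sec-WebSocket-Key and Sec-WebSocket-Accept values in the handshakes above are related. As a minimal sketch, assuming the derivation given in RFC 6455 (the server appends a fixed GUID to the client's key, hashes the result with SHA-1, and base64-encodes the digest), the accept value could be computed in Node.js as follows. The helper name computeAcceptValue is made up for this illustration.

```javascript
const { createHash } = require('node:crypto');

// Fixed GUID that RFC 6455 defines for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// computeAcceptValue is a hypothetical helper name used only in this sketch.
function computeAcceptValue(secWebSocketKey) {
  return createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID) // key concatenated with the GUID
    .digest('base64');                        // base64 of the raw SHA-1 digest
}

// The key from the client handshake shown above.
console.log(computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The printed value matches the Sec-WebSocket-Accept header in the server handshake, which is how the client can tell that the server actually understood the WebSocket request rather than echoing a cached HTTP response.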

The Kind of Apps You can Build with WebSockets

So why would you want to use WebSockets (or something like it)? It’s not really about WebSockets. Rather, it is about what you are trying to do at the TCP layer: send and receive small data packages reliably, and make them available across a number of devices.

The ability to push a signal to a device as quickly as possible makes WebSockets one of the many solutions to push data between two devices. It’s the blueprint for creating real-time applications on both web and mobile (pretty much anything with a server and a client).

The WebSocket protocol has been standardized by the IETF (as RFC 6455) and the WebSocket API by the W3C, and together they have established themselves as a standard for real-time functionality in web, desktop, and mobile apps.

Some advantages of WebSockets include:

Cross-origin communication (however this poses security risks)
Cross-platform compatibility (web, desktop, mobile)
Lightweight envelope when passing messages

However, describing WebSockets as the standard for data push and real-time communication, as is common around the web today, is somewhat of a misnomer. Independent of some open-source solutions, WebSockets are just one part of the puzzle when developing real-time applications.

There is a slew of operational issues a developer may run into when using WebSockets as their real-time solution, especially as the app scales and the user base grows.

Consider the following:

Network topology
Firewalls
Kernel configs
Load testing
Security
Monitoring
Scaling, redundancy, load balancing, replication

Overall, WebSockets are a powerful tool for adding real-time functionality to a web or mobile application. You can check out how you can take WebSockets to the next level with PubNub in detail.

WebSockets don’t make AJAX obsolete but they do supersede Comet (HTTP Long-polling/HTTP Streaming) as the solution of choice for true real-time functionality. AJAX should still be used for making short-lived web service calls, and if we eventually see a good uptake in CORS supporting web services, it will get even more useful.

How WebSockets Work

WebSockets should now be the go-to standard for real-time functionality, since they offer low latency bi-directional communication over a single connection. Even if a web browser doesn’t natively support the WebSocket object, there are polyfill fallback options which all but guarantee any web browser can actually establish a WebSocket connection.

WebSockets provide a new protocol between client and server that runs over a persistent TCP connection. Because it is an independent TCP-based protocol, it doesn’t ideally require HTTP tunneling (similar to Netflix and other streaming services), allowing for simplified communication when sending messages.

WebSockets come after many other technologies that allow servers to send information to the client. Web applications that use Comet/Ajax, push/pull, and long polling all do this over HTTP. Other than the opening handshake, which uses the HTTP Upgrade header, WebSockets are independent of HTTP.

The client establishes a WebSocket connection through a process known as the WebSocket handshake. This process starts with the client sending a regular HTTP request to the server. An Upgrade header is included in this request to inform the server that the client wishes to establish a WebSocket connection.

Here is a simplified example of the initial request headers.

GET ws://websocket.example.com/ HTTP/1.1
Origin: http://example.com
Connection: Upgrade
Host: websocket.example.com
Upgrade: websocket

Note: WebSocket URLs use the ws scheme. There is also wss for secure WebSocket connections, which is the equivalent of HTTPS.
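A common way to apply this is to pick the scheme that matches the page: wss for pages served over HTTPS and ws otherwise. The following is only a sketch; the helper name toWebSocketUrl and the /socket path are made up for illustration.

```javascript
// toWebSocketUrl is a hypothetical helper; "/socket" in the usage below is an assumed path.
function toWebSocketUrl(pageUrl, path) {
  const { protocol, host } = new URL(pageUrl);
  // Secure pages get the secure scheme, everything else falls back to ws.
  const scheme = protocol === 'https:' ? 'wss:' : 'ws:';
  return `${scheme}//${host}${path}`;
}

console.log(toWebSocketUrl('https://example.com/index.html', '/socket'));
// prints "wss://example.com/socket"
console.log(toWebSocketUrl('http://example.com:8080/', '/socket'));
// prints "ws://example.com:8080/socket"
```

In a browser, the first argument would typically be window.location.href.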

If the server supports the WebSocket protocol, it agrees to the upgrade and communicates this through an Upgrade header in the response.

HTTP/1.1 101 WebSocket Protocol Handshake
Date: Wed, 16 Oct 2013 10:07:34 GMT
Connection: Upgrade
Upgrade: WebSocket
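
To illustrate what a client looks for in this response, here is a minimal sketch that checks the status code and the Upgrade and Connection headers of a raw response head. The helper name isUpgradeAccepted is hypothetical, and a real client would also verify the Sec-WebSocket-Accept header, which this simplified example omits.

```javascript
// isUpgradeAccepted is a hypothetical helper that inspects a raw HTTP response head.
function isUpgradeAccepted(rawResponse) {
  const [statusLine, ...headerLines] = rawResponse.trim().split(/\r?\n/);
  const statusCode = Number(statusLine.split(' ')[1]);

  // Collect headers with lowercased names and values for case-insensitive checks.
  const headers = {};
  for (const line of headerLines) {
    const separator = line.indexOf(':');
    if (separator === -1) continue;
    const name = line.slice(0, separator).trim().toLowerCase();
    headers[name] = line.slice(separator + 1).trim().toLowerCase();
  }

  const connectionTokens = (headers['connection'] || '')
    .split(',')
    .map((token) => token.trim());

  return (
    statusCode === 101 &&
    headers['upgrade'] === 'websocket' &&
    connectionTokens.includes('upgrade')
  );
}

// The simplified response from the example above.
const handshakeResponse = [
  'HTTP/1.1 101 WebSocket Protocol Handshake',
  'Date: Wed, 16 Oct 2013 10:07:34 GMT',
  'Connection: Upgrade',
  'Upgrade: WebSocket',
].join('\r\n');

console.log(isUpgradeAccepted(handshakeResponse)); // prints true
```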

Closing Handshake

The closing handshake is far simpler than the opening handshake. Either peer can send a control frame with data containing a specified control sequence to begin the closing handshake (the details are given in the protocol specification). Upon receiving such a frame, the other peer sends a close frame in response, if it hasn’t already sent one.

Upon receiving that control frame, the first peer then closes the connection, safe in the knowledge that no further data is forthcoming. After sending a control frame indicating the connection should be closed, a peer does not send any further data; after receiving a control frame indicating the connection should be closed, a peer discards any further data received.

It is safe for both peers to initiate this handshake simultaneously. The closing handshake is intended to complement the TCP closing handshake (FIN/ACK), on the basis that the TCP closing handshake is not always reliable end-to-end, especially in the presence of man-in-the-middle proxies and other intermediaries.

By sending a close frame and waiting for a close frame in response, certain cases are avoided where data may be unnecessarily lost. For instance, on some platforms, if a socket is closed with data in the receive queue, an RST packet is sent, which will then cause recv() to fail for the party that received the RST, even if there was data waiting to be read.
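
As a rough illustration of what such a close frame looks like on the wire, the sketch below builds one in Node.js, assuming the framing defined in RFC 6455: opcode 0x8, with a payload made of a two-byte status code followed by an optional UTF-8 reason. The helper name buildCloseFrame is made up, and the frame is unmasked, as a server would send it; frames sent by a client must additionally be masked.

```javascript
// buildCloseFrame is a hypothetical helper for an unmasked (server-to-client) close frame.
function buildCloseFrame(statusCode, reason = '') {
  const reasonBytes = Buffer.from(reason, 'utf8');
  const payloadLength = 2 + reasonBytes.length;
  if (payloadLength > 125) {
    throw new RangeError('Control frame payloads are limited to 125 bytes');
  }
  const frame = Buffer.alloc(2 + payloadLength);
  frame[0] = 0x88;                    // FIN bit set, opcode 0x8 (close)
  frame[1] = payloadLength;           // mask bit clear, 7-bit payload length
  frame.writeUInt16BE(statusCode, 2); // status code in network byte order
  reasonBytes.copy(frame, 4);         // optional human-readable reason
  return frame;
}

// 1000 is the status code for a normal closure.
console.log(buildCloseFrame(1000).toString('hex')); // prints "880203e8"
```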

Creating And Opening Connections

At this point, either party can start sending data. With WebSockets, you can transfer as much data as you like without incurring the overhead associated with traditional HTTP requests. A connection can be made to a port that is shared by an HTTP server (a situation that is quite likely to occur with traffic to ports 80 and 443).

When the port is shared, the connection will appear to the HTTP server to be a regular GET request with an Upgrade offer. In relatively simple setups with just one IP address and a single server for all traffic to a single hostname, this might allow a practical way for systems based on the WebSocket protocol to be deployed.

In more elaborate setups (e.g. with load balancers and multiple servers), a dedicated set of hosts for WebSocket connections separate from the HTTP servers is probably easier to manage. Note that, at the time the specification was written, connections on ports 80 and 443 had significantly different success rates, with connections on port 443 being significantly more likely to succeed, though this may change with time.
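
To make the shared-port idea concrete, here is a minimal Node.js sketch in which one HTTP server answers ordinary requests and separately receives Upgrade offers through its 'upgrade' event. The names isWebSocketUpgrade and createSharedServer are hypothetical, and the sketch deliberately stops short of completing the handshake.

```javascript
const http = require('node:http');

// isWebSocketUpgrade is a hypothetical helper: true for a GET request carrying an Upgrade offer.
// Node.js lowercases header names in req.headers, so lowercase keys are used here.
function isWebSocketUpgrade(method, headers) {
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  const connectionTokens = (headers['connection'] || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());
  return method === 'GET' && upgrade === 'websocket' && connectionTokens.includes('upgrade');
}

// One server, one port: ordinary requests get a page, Upgrade offers reach the 'upgrade' event.
function createSharedServer() {
  const server = http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Regular HTTP response\n');
  });
  server.on('upgrade', (req, socket) => {
    if (!isWebSocketUpgrade(req.method, req.headers)) {
      socket.destroy();
      return;
    }
    // A real server would complete the opening handshake here; this sketch just declines.
    socket.end('HTTP/1.1 501 Not Implemented\r\n\r\n');
  });
  return server;
}

console.log(isWebSocketUpgrade('GET', { upgrade: 'websocket', connection: 'Upgrade' })); // prints true
```

Calling createSharedServer().listen(80) would start it; in practice the upgrade handling is usually delegated to a WebSocket library.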


Creating WebSocket connections is really simple. All you have to do is call the WebSocket constructor and pass in the URL of your server. Copy the following code into your app.js file to create a new WebSocket connection.

// Create a new WebSocket.
var socket = new WebSocket('ws://echo.websocket.org');

Once the connection has been established, the open event will be fired on your WebSocket instance. You can handle any errors that occur by listening for the error event.
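
A minimal sketch of listening for these two events is shown below. The helper name watchConnection and the log messages are made up for illustration; the function accepts any event target, so it works with a browser WebSocket instance.

```javascript
// watchConnection is a hypothetical helper; `log` is any function that accepts a string.
function watchConnection(socket, log) {
  socket.addEventListener('open', () => {
    log('Connected'); // fired once the handshake has completed
  });
  socket.addEventListener('error', () => {
    log('WebSocket error'); // fired when the connection fails
  });
  return socket;
}

// Browser usage, with the URL from the example above:
// watchConnection(new WebSocket('ws://echo.websocket.org'), console.log);
```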

Sending Messages

To send a message through the WebSocket connection, you call the send() method on your WebSocket instance, passing in the data you want to transfer.

socket.send(data);

You can send both text and binary data through a WebSocket. In the demo app, when the form is submitted, the code retrieves the message from the messageField and sends it through the WebSocket.

The message is then added to the messagesList and displayed on the screen. To finish up, the value of messageField is reset, ready for the user to type in a new message.
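
The submit handling described above is not reproduced in this article, so the following is only a sketch of what it might look like. The helper name sendFromField, the showMessage callback, and the message-form id are assumptions; messageField and messagesList stand for the page elements mentioned in the text.

```javascript
// sendFromField is a hypothetical helper. `messageField` is any object with a `value`
// property (such as a text input) and `showMessage` renders one line of text.
function sendFromField(socket, messageField, showMessage) {
  const message = messageField.value;
  socket.send(message);            // transfer the text over the open connection
  showMessage(`Sent: ${message}`); // add it to the on-screen list
  messageField.value = '';         // reset the field for the next message
  return message;
}

// Browser wiring, assuming a <form id="message-form"> on the page (hypothetical id):
// document.getElementById('message-form').addEventListener('submit', (event) => {
//   event.preventDefault();
//   sendFromField(socket, messageField, (text) => {
//     const item = document.createElement('li');
//     item.textContent = text;
//     messagesList.appendChild(item);
//   });
// });
```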

Receiving Messages

When a message is received, the message event is fired. This event includes a property called data that can be used to access the contents of the message. Your code should then retrieve the message from the event and display it in the messagesList. Once you’re done with your WebSocket, you can terminate the connection using the close() method.

socket.close();
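
As a minimal sketch of the receiving side, the helper below reads the data property of each message event and runs a clean-up callback when the close event fires. The names handleIncoming, showMessage, and cleanUp are made up for illustration.

```javascript
// handleIncoming is a hypothetical helper; `showMessage` and `cleanUp` are caller-supplied callbacks.
function handleIncoming(socket, showMessage, cleanUp) {
  socket.addEventListener('message', (event) => {
    showMessage(`Received: ${event.data}`); // event.data holds the message contents
  });
  socket.addEventListener('close', () => {
    cleanUp(); // runs after the connection has been closed
  });
}

// Browser usage, assuming `socket` is the WebSocket created earlier:
// handleIncoming(socket, console.log, () => console.log('Disconnected'));
```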

After the connection has been closed, the browser will fire the close event. Attaching an event listener to the close event allows you to perform any clean-up that you might need to do.

The developer tools in Google Chrome include a feature for monitoring traffic through a WebSocket. You can access this tool by following these steps:

Open up the Developer Tools.
Switch to the Network tab.
Click on the entry for your WebSocket connection.
Switch to the Frames tab (labeled Messages in recent versions of Chrome).

These tools will show you a summary of all the data sent through the connection.

WebSockets on the Server

In this article, I have mainly focused on how to use WebSockets from a client-side perspective. If you’re looking to build your own WebSocket server, there are plenty of tools and libraries out there that can help you out. One of the most popular is socket.io, a Node.js library that provides cross-browser fallbacks so you can confidently use WebSockets in your apps today.

Some other libraries include:

C++: libwebsockets
Erlang: Shirasu.ws
Java: Jetty
Node.JS: ws
Ruby: em-websocket
Python: Tornado, pywebsocket
PHP: Ratchet, phpws

Overall, WebSockets represent a standard for bi-directional real-time communication between servers and clients: firstly in web browsers, but ultimately between any server and any client. The standards-first approach means that as developers we can finally create functionality that works consistently across multiple platforms. Connection limitations are no longer a problem since WebSockets represent a single TCP socket connection.

Cross-domain communication has been considered from day one, and it is dealt with within the connection handshake. This means that services such as Pusher can easily use WebSockets when offering a massively scalable real-time platform that can be used by any website or any web, desktop, or mobile application.
