A later version of the API, though, might want to offload all the crypto work onto subworkers. This could be done as follows:

function handleMessage(e) {
  if (e.data == "genkeys")
    genkeys(e.ports[0]);
  else if (e.data == "encrypt")
    encrypt(e.ports[0]);
  else if (e.data == "decrypt")
    decrypt(e.ports[0]);
}

function genkeys(p) {
  var generator = new Worker('libcrypto-v2-generator.js');
  generator.postMessage('', [p]);
}

function encrypt(p) {
  p.onmessage = function (e) {
    var key = e.data;
    var encryptor = new Worker('libcrypto-v2-encryptor.js');
    encryptor.postMessage(key, [p]);
  };
}

function decrypt(p) {
  p.onmessage = function (e) {
    var key = e.data;
    var decryptor = new Worker('libcrypto-v2-decryptor.js');
    decryptor.postMessage(key, [p]);
  };
}

// support being used as a shared worker as well as a dedicated worker
if ('onmessage' in this) // dedicated worker
  onmessage = handleMessage;
else // shared worker
  onconnect = function (e) { e.ports[0].onmessage = handleMessage };

The little subworkers would then be as follows.

For generating key pairs:

onmessage = function (e) {
  var k = _generateKeyPair();
  e.ports[0].postMessage(k[0]);
  e.ports[0].postMessage(k[1]);
  close();
}

function _generateKeyPair() {
  return [Math.random(), Math.random()];
}

For encrypting:

onmessage = function (e) {
  var key = e.data;
  var port = e.ports[0]; // reply on the port the request arrives on
  port.onmessage = function (e) {
    var s = e.data;
    port.postMessage(_encrypt(key, s));
  }
}

function _encrypt(k, s) {
  return 'encrypted-' + k + ' ' + s;
}

For decrypting:

onmessage = function (e) {
  var key = e.data;
  var port = e.ports[0]; // reply on the port the request arrives on
  port.onmessage = function (e) {
    var s = e.data;
    port.postMessage(_decrypt(key, s));
  }
}

function _decrypt(k, s) {
  return s.substr(s.indexOf(' ')+1);
}

Notice how the users of the API don't have to even know that this is happening — the API hasn't changed; the library can delegate to subworkers without changing its API, even though it is accepting data using message channels.

9.2 Infrastructure

There are two kinds of workers: dedicated workers and shared workers. Dedicated workers, once created, are linked to their creator, but message ports can be used to communicate from a dedicated worker to multiple other browsing contexts or workers. Shared workers, on the other hand, are named, and once created any script running in the same origin can obtain a reference to that worker and communicate with it.

9.2.1 The global scope

The global scope is the "inside" of a worker.

9.2.1.1 The WorkerGlobalScope abstract interface
interface WorkerGlobalScope {
  readonly attribute WorkerGlobalScope self;
  readonly attribute WorkerLocation location;

  void close();
           attribute Function onerror;
};
WorkerGlobalScope implements WorkerUtils;
WorkerGlobalScope implements EventTarget;

The self attribute must return the WorkerGlobalScope object itself.

The location attribute must return the WorkerLocation object created for the WorkerGlobalScope object when the worker was created. It represents the absolute URL of the script that was used to initialize the worker, after any redirects.


When a script invokes the close() method on a WorkerGlobalScope object, the user agent must run the following steps (atomically):

  1. Discard any tasks that have been added to the event loop's task queues.

  2. Set the worker's WorkerGlobalScope object's closing flag to true. (This prevents any further tasks from being queued.)

  3. Disentangle all the ports in the list of the worker's ports.

The following are the event handlers (and their corresponding event handler event types) that must be supported, as IDL attributes, by objects implementing the WorkerGlobalScope interface:

Event handler    Event handler event type
onerror          error

The WorkerGlobalScope interface must not exist if the interface's relevant namespace object is a Window object. [WEBIDL]

9.2.1.2 Dedicated workers and the DedicatedWorkerGlobalScope interface
[Supplemental, NoInterfaceObject]
interface DedicatedWorkerGlobalScope : WorkerGlobalScope {
  void postMessage(in any message, in optional MessagePortArray ports);
           attribute Function onmessage;
};

DedicatedWorkerGlobalScope objects act as if they had an implicit MessagePort associated with them. This port is part of a channel that is set up when the worker is created, but it is not exposed. This object must never be garbage collected before the DedicatedWorkerGlobalScope object.

All messages received by that port must immediately be retargeted at the DedicatedWorkerGlobalScope object.

The postMessage() method on DedicatedWorkerGlobalScope objects must act as if, when invoked, it immediately invoked the method of the same name on the port, with the same arguments, and returned the same return value.

The following are the event handlers (and their corresponding event handler event types) that must be supported, as IDL attributes, by objects implementing the DedicatedWorkerGlobalScope interface:

Event handler    Event handler event type
onmessage        message

For the purposes of the application cache networking model, a dedicated worker is an extension of the cache host from which it was created.

9.2.1.3 Shared workers and the SharedWorkerGlobalScope interface
[Supplemental, NoInterfaceObject]
interface SharedWorkerGlobalScope : WorkerGlobalScope {
  readonly attribute DOMString name;
  readonly attribute ApplicationCache applicationCache;
           attribute Function onconnect;
};

Shared workers receive message ports through connect events on their global object for each connection.

The name attribute must return the value it was assigned when the SharedWorkerGlobalScope object was created by the "run a worker" algorithm. Its value represents the name that can be used to obtain a reference to the worker using the SharedWorker constructor.

The following are the event handlers (and their corresponding event handler event types) that must be supported, as IDL attributes, by objects implementing the SharedWorkerGlobalScope interface:

Event handler    Event handler event type
onconnect        connect

For the purposes of the application cache networking model, a shared worker is its own cache host. The run a worker algorithm takes care of associating the worker with an application cache.

The applicationCache attribute returns the ApplicationCache object for the worker.

9.2.2 Origins of workers

Both the origin and effective script origin of scripts running in workers are the origin of the absolute URL that the worker's location attribute represents.

9.2.3 The event loop

Each WorkerGlobalScope object has an event loop distinct from those defined for units of related similar-origin browsing contexts. This event loop has no associated browsing context, and its task queues only have events, callbacks, and networking activity as tasks. The processing model of these event loops is defined below in the run a worker algorithm.

Each WorkerGlobalScope object also has a closing flag, which must initially be false, but which can get set to true by the algorithms in the processing model section below.

Once the WorkerGlobalScope's closing flag is set to true, the event loop's task queues must discard any further tasks that would be added to them (tasks already on the queue are unaffected except where otherwise specified). Effectively, once the closing flag is true, timers stop firing, notifications for all pending asynchronous operations are dropped, etc.
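The following non-normative sketch illustrates the effect of the closing flag on a task queue. WorkerScopeModel, queueTask, and the single tasks array are hypothetical names used only for illustration; a real user agent has several task queues, and close() also disentangles the worker's ports, which is not modeled here.

```javascript
// Hypothetical model of a worker's closing flag and task queue (not a real API).
class WorkerScopeModel {
  constructor() {
    this.closing = false; // the closing flag, initially false
    this.tasks = [];      // one task queue, for simplicity
  }

  // Queue a task. Once the closing flag is true, further tasks are discarded.
  queueTask(task) {
    if (this.closing) return false;
    this.tasks.push(task);
    return true;
  }

  // Models close(): discard the queued tasks, then set the closing flag.
  close() {
    this.tasks.length = 0;
    this.closing = true;
  }
}

const scope = new WorkerScopeModel();
const queuedBefore = scope.queueTask(() => "timer callback");
scope.close();
const queuedAfter = scope.queueTask(() => "late timer callback");
console.log(queuedBefore, queuedAfter, scope.tasks.length); // true false 0
```

In this model a timer that fires after close() simply never reaches the queue, which corresponds to the statement that timers stop firing once the flag is true.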

9.2.4 The worker's lifetime

Workers communicate with other workers and with browsing contexts through message channels and their MessagePort objects.

Each WorkerGlobalScope worker global scope has a list of the worker's ports, which consists of all the MessagePort objects that are entangled with another port and that have one (but only one) port owned by worker global scope. This list includes the implicit MessagePort in the case of dedicated workers.

Each WorkerGlobalScope also has a list of the worker's workers. Initially this list is empty; it is populated when the worker creates or obtains further workers.

Finally, each WorkerGlobalScope also has a list of the worker's Documents. Initially this list is empty; it is populated when the worker is created.

Whenever a Document d is added to the worker's Documents, the user agent must, for each worker q in the list of the worker's workers whose list of the worker's Documents does not contain d, add d to q's WorkerGlobalScope owner's list of the worker's Documents.

Whenever a Document object is discarded, it must be removed from the list of the worker's Documents of each worker whose list contains that Document.

Given a script's global object o when creating or obtaining a worker, the list of relevant Document objects to add depends on the type of o. If o is a WorkerGlobalScope object (i.e. if we are creating a nested worker), then the relevant Documents are the Documents that are in o's own list of the worker's Documents. Otherwise, o is a Window object, and the relevant Document is just the Document that is the active document of the Window object o.
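As a non-normative illustration, the rule above can be sketched as a small function. The object shapes used here (a kind field, a documents array, an activeDocument field) are assumptions made for the example and are not part of any interface defined in this specification.

```javascript
// Hypothetical stand-ins: { kind: "worker", documents: [...] } models a
// WorkerGlobalScope; { kind: "window", activeDocument: ... } models a Window.
function relevantDocuments(o) {
  if (o.kind === "worker") {
    // Nested worker: the relevant Documents are those of the creating worker.
    return [...o.documents];
  }
  // Otherwise o models a Window: only its active document is relevant.
  return [o.activeDocument];
}

const page = { kind: "window", activeDocument: "doc-A" };
const parentWorker = { kind: "worker", documents: ["doc-A", "doc-B"] };
console.log(relevantDocuments(page));         // [ 'doc-A' ]
console.log(relevantDocuments(parentWorker)); // [ 'doc-A', 'doc-B' ]
```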


A worker is said to be a permissible worker if its list of the worker's Documents is not empty.

A worker is said to be a protected worker if it is a permissible worker and either it has outstanding timers, database transactions, or network connections, or its list of the worker's ports is not empty, or its WorkerGlobalScope is actually a SharedWorkerGlobalScope object (i.e. the worker is a shared worker).

A worker is said to be an active needed worker if any of the Document objects in the worker's Documents are fully active.

A worker is said to be a suspendable worker if it is not an active needed worker but it is a permissible worker.
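The four definitions above can be read as predicates over a worker's state. The sketch below is non-normative; the field names (documents, ports, isShared, hasOutstandingWork, fullyActive) are hypothetical and stand in for the lists and conditions named in the definitions.

```javascript
// Permissible: the list of the worker's Documents is not empty.
function isPermissible(w) {
  return w.documents.length > 0;
}

// Protected: permissible, and has outstanding timers, transactions, or
// connections, or a non-empty port list, or is a shared worker.
function isProtected(w) {
  return isPermissible(w) &&
    (w.hasOutstandingWork || w.ports.length > 0 || w.isShared);
}

// Active needed: at least one of the worker's Documents is fully active.
function isActiveNeeded(w) {
  return w.documents.some((d) => d.fullyActive);
}

// Suspendable: permissible but not active needed.
function isSuspendable(w) {
  return isPermissible(w) && !isActiveNeeded(w);
}

// A dedicated worker whose only Document is in a background (not fully
// active) state and which has no ports or outstanding work.
const idleWorker = {
  documents: [{ fullyActive: false }],
  ports: [],
  isShared: false,
  hasOutstandingWork: false,
};
console.log(isPermissible(idleWorker), isProtected(idleWorker),
            isActiveNeeded(idleWorker), isSuspendable(idleWorker));
// true false false true
```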

9.2.5 Processing model

When a user agent is to run a worker for a script with URL url, a browsing context owner browsing context, a Document owner document, an origin owner origin, and with global scope worker global scope, it must run the following steps:

  1. Create a completely separate and parallel execution environment (i.e. a separate thread or process or equivalent construct), and run the rest of these steps asynchronously in that context.

  2. If worker global scope is actually a SharedWorkerGlobalScope object (i.e. the worker is a shared worker), and there are any relevant application caches that are identified by a manifest URL with the same origin as url and that have url as one of their entries, not excluding entries marked as foreign, then associate the worker global scope with the most appropriate application cache of those that match.

  3. Attempt to fetch the resource identified by url, from the owner origin.

    If the attempt fails, or if the attempt involves any redirects to URIs that do not have the same origin as url (even if the final URI is at the same origin as the original url), then for each Worker or SharedWorker object associated with worker global scope, queue a task to fire a simple event named error at that object. Abort these steps.

    If the attempt succeeds, then convert the script resource to Unicode by assuming it was encoded as UTF-8, to obtain its source.

    Let language be JavaScript.

    As with script elements, the MIME type of the script is ignored. Unlike with script elements, there is no way to override the type. It's always assumed to be JavaScript.

  4. A new script is now created, as follows.

    Create a new script execution environment set up as appropriate for the scripting language language.

    Parse/compile/initialize source using that script execution environment, as appropriate for language, and thus obtain a list of code entry-points; set the initial code entry-point to the entry-point for any executable code to be immediately run.

    Set the script's global object to worker global scope.

    Set the script's browsing context to owner browsing context.

    Set the script's document to owner document.

    Set the script's URL character encoding to UTF-8. (This is just used for encoding non-ASCII characters in the query component of URLs.)

    Set the script's base URL to url.

  5. Closing orphan workers: Start monitoring the worker such that no sooner than it stops being either a protected worker or a suspendable worker, and no later than it stops being a permissible worker, worker global scope's closing flag is set to true.

  6. Suspending workers: Start monitoring the worker, such that whenever worker global scope's closing flag is false and the worker is a suspendable worker, the user agent suspends execution of script in that worker until such time as either the closing flag switches to true or the worker stops being a suspendable worker.

  7. Jump to the script's initial code entry-point, and let that run until it either returns, fails to catch an exception, or gets prematurely aborted by the "kill a worker" or "terminate a worker" algorithms defined below.

  8. If worker global scope is actually a DedicatedWorkerGlobalScope object (i.e. the worker is a dedicated worker), then enable the port message queue of the worker's implicit port.

  9. Event loop: Wait until either there is a task in one of the event loop's task queues or worker global scope's closing flag is set to true.

  10. Run the oldest task on one of the event loop's task queues, if any. The user agent may pick any task queue.

    The handling of events or the execution of callbacks might get prematurely aborted by the "kill a worker" or "terminate a worker" algorithms defined below.

  11. Remove the task run in the previous step, if any, from its task queue.

  12. If there are any more events in the event loop's task queues or if worker global scope's closing flag is set to false, then jump back to the step above labeled event loop.

  13. If there are any outstanding transactions that have callbacks that involve scripts whose global object is the worker global scope, roll them back (without invoking any of the callbacks).

  14. Empty the worker global scope's list of active timeouts and its list of active intervals.
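Steps 9 to 12 above can be approximated by the following synchronous, non-normative sketch. runEventLoop and the scope object shape are hypothetical. Because a real worker blocks in step 9 while waiting for tasks, the sketch instead returns the state it would be in; steps 13 and 14 are not modeled.

```javascript
// Runs queued tasks oldest-first. Returns the values the tasks produced and
// whether the loop would keep waiting (closing flag false) or has finished.
function runEventLoop(scope) {
  const log = [];
  for (;;) {
    if (scope.tasks.length === 0) {
      // Step 9 would wait here; step 12 only exits once the flag is true.
      return { log, state: scope.closing ? "finished" : "waiting" };
    }
    const task = scope.tasks.shift(); // steps 10 and 11: run oldest, remove it
    log.push(task(scope));
  }
}

const loopScope = {
  closing: false,
  tasks: [
    () => "first",
    // Models a task that calls close(): discard the queue, set the flag.
    (s) => { s.tasks.length = 0; s.closing = true; return "close"; },
    () => "never runs",
  ],
};
const result = runEventLoop(loopScope);
console.log(result.log, result.state); // [ 'first', 'close' ] finished
```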


When a user agent is to kill a worker it must run the following steps in parallel with the worker's main loop (the "run a worker" processing model defined above):

  1. Set the worker's WorkerGlobalScope object's closing flag to true.

  2. If there are any tasks queued in the event loop's task queues, discard them without processing them.

  3. Wait a user-agent-defined amount of time.

  4. Abort the script currently running in the worker.

User agents may invoke the "kill a worker" processing model on a worker at any time, e.g. in response to user requests, in response to CPU quota management, or when a worker stops being an active needed worker if the worker continues executing even after its closing flag was set to true.


When a user agent is to terminate a worker it must run the following steps in parallel with the worker's main loop (the "run a worker" processing model defined above):

  1. Set the worker's WorkerGlobalScope object's closing flag to true.

  2. If there are any tasks queued in the event loop's task queues, discard them without processing them.

  3. Abort the script currently running in the worker.

  4. If the worker's WorkerGlobalScope object is actually a DedicatedWorkerGlobalScope object (i.e. the worker is a dedicated worker), then empty the port message queue of the port that the worker's implicit port is entangled with.


The task source for the tasks mentioned above is the DOM manipulation task source.

9.2.6 Runtime script errors

Whenever an uncaught runtime script error occurs in one of the worker's scripts, if the error did not occur while handling a previous script error, the user agent must report the error using the WorkerGlobalScope object's onerror attribute.

For shared workers, if the error is still not handled afterwards, or if the error occurred while handling a previous script error, the error may be reported to the user.

For dedicated workers, if the error is still not handled afterwards, or if the error occurred while handling a previous script error, the user agent must queue a task to fire a worker error event at the Worker object associated with the worker.

When the user agent is to fire a worker error event at a Worker object, it must dispatch an event that uses the ErrorEvent interface, with the name error, that doesn't bubble and is cancelable, with its message, filename, and lineno attributes set appropriately. The default action of this event depends on whether the Worker object is itself in a worker. If it is, and that worker is also a dedicated worker, then the user agent must again queue a task to fire a worker error event at the Worker object associated with that worker. Otherwise, the error may be reported to the user.

The task source for the tasks mentioned above is the DOM manipulation task source.
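The propagation of an unhandled error through nested dedicated workers can be pictured with the following non-normative sketch. The fields name, isDedicated, parent, scopeHandles, and outerCancels are hypothetical: scopeHandles models the worker's own onerror handling the error, and outerCancels models a handler on the corresponding Worker object canceling the event.

```javascript
// Returns the sequence of places an uncaught error is delivered to.
function propagateError(worker) {
  const trail = [worker.name + ": WorkerGlobalScope onerror"];
  if (worker.scopeHandles) return trail;

  let current = worker;
  // A worker error event is fired only for dedicated workers; its default
  // action repeats the process in the worker that holds the Worker object.
  while (current && current.isDedicated) {
    trail.push(current.name + ": Worker object onerror");
    if (current.outerCancels) return trail; // event canceled, stop here
    current = current.parent;               // null when created by a page
  }
  trail.push("may be reported to the user");
  return trail;
}

const outer = { name: "outer", isDedicated: true, parent: null,
                scopeHandles: false, outerCancels: false };
const inner = { name: "inner", isDedicated: true, parent: outer,
                scopeHandles: false, outerCancels: false };
const trail = propagateError(inner);
console.log(trail.join(" -> "));
```

With neither handler intervening, the error raised in inner is delivered to inner's own scope, then to the Worker object held by outer, then to the Worker object held by the page, and only then may it be reported to the user.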


interface ErrorEvent : Event {
  readonly attribute DOMString message;
  readonly attribute DOMString filename;
  readonly attribute unsigned long lineno;
  void initErrorEvent(in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in DOMString messageArg, in DOMString filenameArg, in unsigned long linenoArg);
};

The initErrorEvent() method must initialize the event in a manner analogous to the similarly-named method in the DOM Events interfaces. [DOMEVENTS]

The message attribute represents the error message.

The filename attribute represents the absolute URL of the script in which the error originally occurred.

The lineno attribute represents the line number where the error occurred in the script.

9.2.7 Creating workers

9.2.7.1 The AbstractWorker abstract interface
[Supplemental, NoInterfaceObject]
interface AbstractWorker {
           attribute Function onerror;

};
AbstractWorker implements EventTarget;

The following are the event handlers (and their corresponding event handler event types) that must be supported, as IDL attributes, by objects implementing the AbstractWorker interface:

Event handler    Event handler event type
onerror          error

9.2.7.2 Dedicated workers and the Worker interface
[Constructor(in DOMString scriptURL)]
interface Worker : AbstractWorker {
  void terminate();

  void postMessage(in any message, in optional MessagePortArray ports);
           attribute Function onmessage;
};

The terminate() method, when invoked, must cause the "terminate a worker" algorithm to be run on the worker with which the object is associated.

Worker objects act as if they had an implicit MessagePort associated with them. This port is part of a channel that is set up when the worker is created, but it is not exposed. This object must never be garbage collected before the Worker object.

All messages received by that port must immediately be retargeted at the Worker object.

The postMessage() method on Worker objects must act as if, when invoked, it immediately invoked the method of the same name on the port, with the same arguments, and returned the same return value.

The following are the event handlers (and their corresponding event handler event types) that must be supported, as IDL attributes, by objects implementing the Worker interface:

Event handler    Event handler event type
onmessage        message

When the Worker(scriptURL) constructor is invoked, the user agent must run the following steps:

  1. Resolve the scriptURL argument relative to the entry script's base URL, when the method is invoked.

  2. If this fails, throw a SYNTAX_ERR exception.

  3. If the origin of the resulting absolute URL is not the same as the origin of the entry script, then throw a SECURITY_ERR exception.

    Thus, scripts must be external files with the same scheme as the original page: you can't load a script from a data: URL or javascript: URL, and a https: page couldn't start workers using scripts with http: URLs.

  4. Create a new DedicatedWorkerGlobalScope object. Let worker global scope be this new object.

  5. Create a new Worker object, associated with worker global scope. Let worker be this new object.

  6. Create a new MessagePort object owned by the global object of the script that invoked the constructor. Let this be the outside port.

  7. Associate the outside port with worker.

  8. Create a new MessagePort object owned by worker global scope. Let inside port be this new object.

  9. Associate inside port with worker global scope.

  10. Entangle outside port and inside port.

  11. Return worker, and run the following steps asynchronously.

  12. Enable outside port's port message queue.

  13. Let docs be the list of relevant Document objects to add given the global object of the script that invoked the constructor.

  14. Add to worker global scope's list of the worker's Documents the Document objects in docs.

  15. If the global object of the script that invoked the constructor is a WorkerGlobalScope object (i.e. we are creating a nested worker), add worker global scope to the list of the worker's workers of the WorkerGlobalScope object that is the global object of the script that invoked the constructor.

  16. Run a worker for the resulting absolute URL, with the script's browsing context of the script that invoked the method as the owner browsing context, with the script's document of the script that invoked the method as the owner document, with the origin of the entry script as the owner origin, and with worker global scope as the global scope.

This constructor must be visible when the script's global object is either a Window object or an object implementing the WorkerUtils interface.
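Steps 1 to 3 of the constructor algorithm (resolving the URL and checking its origin) can be sketched as follows. This is a non-normative illustration: resolveWorkerScript is a hypothetical helper, the standard URL class is used for resolution, and plain Error objects carrying the exception names stand in for the actual exceptions.

```javascript
// Resolves scriptURL against the entry script's base URL and enforces the
// same-origin requirement. Returns the resulting absolute URL.
function resolveWorkerScript(scriptURL, entryBaseURL, entryOrigin) {
  let absolute;
  try {
    absolute = new URL(scriptURL, entryBaseURL); // step 1
  } catch (err) {
    throw new Error("SYNTAX_ERR");               // step 2
  }
  if (absolute.origin !== entryOrigin) {
    throw new Error("SECURITY_ERR");             // step 3
  }
  return absolute.href;
}

const base = "https://example.com/app/index.html";
const origin = "https://example.com";
console.log(resolveWorkerScript("worker.js", base, origin));
// https://example.com/app/worker.js

// Each of these is rejected: a data: URL has an opaque origin, and an
// http: URL differs in scheme from the https: page.
for (const bad of ["data:text/javascript,1", "http://example.com/worker.js"]) {
  try {
    resolveWorkerScript(bad, base, origin);
  } catch (err) {
    console.log(bad, "->", err.message); // SECURITY_ERR in both cases
  }
}
```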

9.2.7.3 Shared workers and the SharedWorker interface
[Constructor(in DOMString scriptURL, in optional DOMString name)]
interface SharedWorker : AbstractWorker {
  readonly attribute MessagePort port;
};

The port attribute must return the value it was assigned by the object's constructor. It represents the MessagePort for communicating with the shared worker.

When the SharedWorker(scriptURL, name) constructor is invoked, the user agent must run the following steps:

  1. Resolve the scriptURL argument.

  2. If this fails, throw a SYNTAX_ERR exception.

  3. Otherwise, let scriptURL be the resulting absolute URL.

  4. Let name be the value of the second argument, or the empty string if the second argument was omitted.

  5. If the origin of scriptURL is not the same as the origin of the entry script, then throw a SECURITY_ERR exception.

    Thus, scripts must be external files with the same scheme as the original page: you can't load a script from a data: URL or javascript: URL, and a https: page couldn't start workers using scripts with http: URLs.

  6. Let docs be the list of relevant Document objects to add given the global object of the script that invoked the constructor.

  7. Execute the following substeps atomically:

    1. Create a new SharedWorker object, which will shortly be associated with a SharedWorkerGlobalScope object. Let this SharedWorker object be worker.

    2. Create a new MessagePort object owned by the global object of the script that invoked the method. Let this be the outside port.

    3. Assign outside port to the port attribute of worker.

    4. Let worker global scope be null.

    5. If name is not the empty string and there exists a SharedWorkerGlobalScope object whose closing flag is false, whose name attribute is exactly equal to name, and whose location attribute represents an absolute URL with the same origin as scriptURL, then let worker global scope be that SharedWorkerGlobalScope object.

    6. Otherwise, if name is the empty string and there exists a SharedWorkerGlobalScope object whose closing flag is false, and whose location attribute represents an absolute URL that is exactly equal to scriptURL, then let worker global scope be that SharedWorkerGlobalScope object.

    7. If worker global scope is not null, then run these steps:

      1. If worker global scope's location attribute represents an absolute URL that is not exactly equal to scriptURL, then throw a URL_MISMATCH_ERR exception and abort all these steps.

      2. Associate worker with worker global scope.

      3. Create a new MessagePort object owned by worker global scope. Let this be the inside port.

      4. Entangle outside port and inside port.

      5. Return worker and perform the next step asynchronously.

      6. Create an event that uses the MessageEvent interface, with the name connect, which does not bubble, is not cancelable, has no default action, has a data attribute whose value is the empty string and has a ports attribute whose value is an array containing only the newly created port, and queue a task to dispatch the event at worker global scope.

      7. Add to worker global scope's list of the worker's Documents the Document objects in docs.

      8. If the global object of the script that invoked the constructor is a WorkerGlobalScope object, add worker global scope to the list of the worker's workers of the WorkerGlobalScope object that is the global object of the script that invoked the constructor.

      9. Abort all these steps.

    8. Create a new SharedWorkerGlobalScope object. Let worker global scope be this new object.

    9. Associate worker with worker global scope.

    10. Set the name attribute of worker global scope to name.

    11. Create a new MessagePort object owned by worker global scope. Let inside port be this new object.

    12. Entangle outside port and inside port.

  8. Return worker and perform the remaining steps asynchronously.

  9. Create an event that uses the MessageEvent interface, with the name connect, which does not bubble, is not cancelable, has no default action, has a data attribute whose value is the empty string and has a ports attribute whose value is an array containing only the newly created port, and queue a task to dispatch the event at worker global scope.

  10. Add to worker global scope's list of the worker's Documents the Document objects in docs.

  11. If the global object of the script that invoked the constructor is a WorkerGlobalScope object, add worker global scope to the list of the worker's workers of the WorkerGlobalScope object that is the global object of the script that invoked the constructor.

  12. Run a worker for scriptURL, with the script's browsing context of the script that invoked the method as the owner browsing context, with the script's document of the script that invoked the method as the owner document, with the origin of the entry script as the owner origin, and with worker global scope as the global scope.

This constructor must be visible when the script's global object is either a Window object or an object implementing the WorkerUtils interface.

The task source for the tasks mentioned above is the DOM manipulation task source.
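The lookup performed in substeps 5 to 7 of the SharedWorker constructor (finding an existing shared worker by name or by URL) can be sketched as below. The sketch is non-normative; findSharedWorkerScope and the scope records with name, location, and closing fields are hypothetical, and a plain Error stands in for the URL_MISMATCH_ERR exception.

```javascript
// Returns the existing scope to connect to, or null if a new
// SharedWorkerGlobalScope would have to be created.
function findSharedWorkerScope(scopes, name, scriptURL) {
  const wanted = new URL(scriptURL);
  let match;
  if (name !== "") {
    // Named lookup: same name and same origin as scriptURL.
    match = scopes.find((s) => !s.closing && s.name === name &&
                               new URL(s.location).origin === wanted.origin);
  } else {
    // Unnamed lookup: the location must equal scriptURL exactly.
    match = scopes.find((s) => !s.closing &&
                               new URL(s.location).href === wanted.href);
  }
  if (match && new URL(match.location).href !== wanted.href) {
    // A named worker exists, but it was started from a different script.
    throw new Error("URL_MISMATCH_ERR");
  }
  return match || null;
}

const running = [
  { name: "chat", location: "https://example.com/chat.js", closing: false },
];
console.log(findSharedWorkerScope(running, "chat", "https://example.com/chat.js") === running[0]); // true
console.log(findSharedWorkerScope(running, "news", "https://example.com/chat.js")); // null
```

Requesting the name "chat" with a different script URL on the same origin raises the mismatch error in this sketch, rather than silently connecting to the existing worker.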

9.3 APIs available to workers

[Supplemental, NoInterfaceObject]
interface WorkerUtils {
  void importScripts(in DOMString... urls);
  readonly attribute WorkerNavigator navigator;
};
WorkerUtils implements WindowTimers;

The DOM APIs (Node objects, Document objects, etc) are not available to workers in this version of this specification.

9.3.1 Importing scripts and libraries

When a script invokes the importScripts(urls) method on a WorkerGlobalScope object, the user agent must run the following steps:

  1. If there are no arguments, return without doing anything. Abort these steps.

  2. Resolve each argument.

  3. If any fail, throw a SYNTAX_ERR exception.

  4. Attempt to fetch each resource identified by the resulting absolute URLs, from the entry script's origin.

  5. For each argument in turn, in the order given, starting with the first one, run these substeps:

    1. Wait for the fetching attempt for the corresponding resource to complete.

      If the fetching attempt failed, throw a NETWORK_ERR exception and abort all these steps.

      If the attempt succeeds, then convert the script resource to Unicode by assuming it was encoded as UTF-8, to obtain its source.

      Let language be JavaScript.

      As with the worker's script, the script here is always assumed to be JavaScript, regardless of the MIME type.

    2. Create a script, using source as the script source and language as the scripting language, using the same global object, browsing context, URL character encoding, base URL, and script group as the script that was created by the worker's run a worker algorithm.

      Let the newly created script run until it either returns, fails to parse, fails to catch an exception, or gets prematurely aborted by the "kill a worker" or "terminate a worker" algorithms defined above.

      If it failed to parse, then throw an ECMAScript SyntaxError exception and abort all these steps. [ECMA262]

      If an exception was raised or if the script was prematurely aborted, then abort all these steps, letting the exception or aborting continue to be processed by the script that called the importScripts() method.

      If the "kill a worker" or "terminate a worker" algorithms abort the script then abort all these steps.

9.3.2 The WorkerNavigator object

The navigator attribute of the WorkerUtils interface must return an instance of the WorkerNavigator interface, which represents the identity and state of the user agent (the client):

interface WorkerNavigator {};
WorkerNavigator implements NavigatorID;
WorkerNavigator implements NavigatorOnLine;

Objects implementing the WorkerNavigator interface also implement the NavigatorID and NavigatorOnLine interfaces.

This WorkerNavigator interface must not exist if the interface's relevant namespace object is a Window object. [WEBIDL]

9.3.3 APIs defined in other specifications

The openDatabase() and openDatabaseSync() methods are defined in the Web SQL Database specification. [WEBSQL]

9.3.4 Interface objects and constructors

There must be no interface objects and constructors available in the global scope of scripts whose script's global object is a WorkerGlobalScope object except for the following:

These requirements do not override the requirements defined by the Web IDL specification, in particular concerning the visibility of interfaces annotated with the [NoInterfaceObject] extended attribute.

9.3.5 Worker locations

interface WorkerLocation {
  readonly attribute DOMString href;
  readonly attribute DOMString protocol;
  readonly attribute DOMString host;
  readonly attribute DOMString hostname;
  readonly attribute DOMString port;
  readonly attribute DOMString pathname;
  readonly attribute DOMString search;
  readonly attribute DOMString hash;
};

A WorkerLocation object represents an absolute URL set at its creation.

The href attribute must return the absolute URL that the object represents.

The WorkerLocation interface also has the complement of URL decomposition IDL attributes, protocol, host, port, hostname, pathname, search, and hash. These must follow the rules given for URL decomposition IDL attributes, with the input being the absolute URL that the object represents (same as the href attribute), and the common setter action being a no-op, since the attributes are defined to be readonly.
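As a non-normative illustration of the decomposition attributes, the sketch below derives the same eight values from an absolute URL using the standard URL class. makeWorkerLocation is a hypothetical helper, the example URL is made up, and freezing the object only approximates the readonly attributes.

```javascript
// Builds a read-only record with the WorkerLocation attribute names.
function makeWorkerLocation(absoluteURL) {
  const u = new URL(absoluteURL);
  return Object.freeze({
    href: u.href,
    protocol: u.protocol,
    host: u.host,
    hostname: u.hostname,
    port: u.port,
    pathname: u.pathname,
    search: u.search,
    hash: u.hash,
  });
}

const loc = makeWorkerLocation("https://example.com:8080/js/worker.js?v=2#top");
console.log(loc.protocol); // https:
console.log(loc.host);     // example.com:8080
console.log(loc.hostname); // example.com
console.log(loc.port);     // 8080
console.log(loc.pathname); // /js/worker.js
console.log(loc.search);   // ?v=2
console.log(loc.hash);     // #top
```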

The WorkerLocation interface must not exist if the interface's relevant namespace object is a Window object. [WEBIDL]

10 Communication

10.1 Event definitions

Messages in server-sent events, Web sockets, cross-document messaging, and channel messaging use the message event.

The following interface is defined for this event:

interface MessageEvent : Event {
  readonly attribute any data;
  readonly attribute DOMString origin;
  readonly attribute DOMString lastEventId;
  readonly attribute WindowProxy source;
  readonly attribute MessagePortArray ports;
  void initMessageEvent(in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in any dataArg, in DOMString originArg, in DOMString lastEventIdArg, in WindowProxy sourceArg, in MessagePortArray portsArg);
};

event . data

Returns the data of the message.

event . origin

Returns the origin of the message, for server-sent events and cross-document messaging.

event . lastEventId

Returns the last event ID, for server-sent events.

event . source

Returns the WindowProxy of the source window, for cross-document messaging.

event . ports

Returns the MessagePortArray sent with the message, for cross-document messaging and channel messaging.

The initMessageEvent() method must initialize the event in a manner analogous to the similarly-named method in the DOM Events interfaces. [DOMEVENTS]

The data attribute represents the message being sent.

The origin attribute represents, in server-sent events and cross-document messaging, the origin of the document that sent the message (typically the scheme, hostname, and port of the document, but not its path or fragment identifier).

The lastEventId attribute represents, in server-sent events, the last event ID string of the event source.

The source attribute represents, in cross-document messaging, the WindowProxy of the browsing context of the Window object from which the message came.

The ports attribute represents, in cross-document messaging and channel messaging, the MessagePortArray being sent, if any.

Except where otherwise specified, when the user agent creates and dispatches a message event in the algorithms described in the following sections, the lastEventId attribute must be the empty string, the origin attribute must be the empty string, the source attribute must be null, and the ports attribute must be null.
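
The default values described above can be summarised in a small sketch. The helper name defaultMessageEventValues is hypothetical; it returns a plain object rather than a real MessageEvent, purely to show which attributes are set to the empty string and which are null when an algorithm specifies nothing beyond the data.

```javascript
// Hypothetical helper: the attribute values a message event carries
// when an algorithm specifies nothing beyond the data itself.
function defaultMessageEventValues(data) {
  return {
    type: 'message',
    data: data,
    origin: '',       // empty string unless otherwise specified
    lastEventId: '',  // empty string unless otherwise specified
    source: null,     // null unless otherwise specified
    ports: null       // null unless otherwise specified
  };
}

var values = defaultMessageEventValues('hello');
console.log(values.data);
```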

10.2 Server-sent events

10.2.1 Introduction

This section is non-normative.

To enable servers to push data to Web pages over HTTP or using dedicated server-push protocols, this specification introduces the EventSource interface.

Using this API consists of creating an EventSource object and registering an event listener.

var source = new EventSource('updates.cgi');
source.onmessage = function (event) {
  alert(event.data);
};

On the server-side, the script ("updates.cgi" in this case) sends messages in the following form, with the text/event-stream MIME type:

data: This is the first message.

data: This is the second message, it
data: has two lines.

data: This is the third message.

Using this API rather than emulating it using XMLHttpRequest or an iframe allows the user agent to make better use of network resources in cases where the user agent implementor and the network operator are able to coordinate in advance. Amongst other benefits, this can result in significant savings in battery life on portable devices. This is discussed further in the section below on connectionless push.
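
The message form shown above could be produced on the server by a helper along the following lines. This is only a sketch: the function name formatMessage is hypothetical, and how the resulting string is written to the HTTP response depends on the server environment, which is not shown.

```javascript
// Hypothetical server-side helper: frame one message for a
// text/event-stream response. Each line of the payload gets its own
// "data:" field, and a blank line ends the event.
function formatMessage(text) {
  var lines = String(text).split(/\r\n|\r|\n/);
  var out = '';
  for (var i = 0; i < lines.length; i++) {
    out += 'data: ' + lines[i] + '\n';
  }
  return out + '\n';
}

var body = formatMessage('This is the first message.') +
           formatMessage('This is the second message, it\nhas two lines.') +
           formatMessage('This is the third message.');
console.log(body);
```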

10.2.2 The EventSource interface

[Constructor(in DOMString url)]
interface EventSource {
  readonly attribute DOMString URL;

  // ready state
  const unsigned short CONNECTING = 0;
  const unsigned short OPEN = 1;
  const unsigned short CLOSED = 2;
  readonly attribute unsigned short readyState;

  // networking
           attribute Function onopen;
           attribute Function onmessage;
           attribute Function onerror;
  void close();
};
EventSource implements EventTarget;

The EventSource(url) constructor takes one argument, url, which specifies the URL to which to connect. When the EventSource() constructor is invoked, the UA must run these steps:

  1. Resolve the URL specified in url, relative to the entry script's base URL.

  2. If the previous step failed, then throw a SYNTAX_ERR exception.

  3. Return a new EventSource object, and continue these steps in the background (without blocking scripts).

  4. Fetch the resource identified by the resulting absolute URL, from the entry script's origin, and process it as described below.

    The definition of the fetching algorithm is such that if the browser is already fetching the resource identified by the given absolute URL, that connection can be reused, instead of a new connection being established. All messages received up to this point are dispatched immediately, in this case.

This constructor must be visible when the script's global object is either a Window object or an object implementing the WorkerUtils interface.


The URL attribute must return the absolute URL that resulted from resolving the value that was passed to the constructor.

The readyState attribute represents the state of the connection. It can have the following values:

CONNECTING (numeric value 0)
The connection has not yet been established, or it was closed and the user agent is reconnecting.
OPEN (numeric value 1)
The user agent has an open connection and is dispatching events as it receives them.
CLOSED (numeric value 2)
The connection is not open, and the user agent is not trying to reconnect. Either there was a fatal error or the close() method was invoked.

When the object is created its readyState must be set to CONNECTING (0). The rules given below for handling the connection define when the value changes.

The close() method must close the connection, if any; must abort any reconnection attempt, if any; and must set the readyState attribute to CLOSED. If the connection is already closed, the method must do nothing.

The following are the event handlers (and their corresponding event handler event types) that must be supported, as IDL attributes, by all objects implementing the EventSource interface:

Event handler Event handler event type
onopen open
onmessage message
onerror error

In addition to the above, each EventSource object has a reconnection time and a last event ID string associated with it.

These values are not currently exposed on the interface.

10.2.3 Processing model

The resource indicated in the argument to the EventSource constructor is fetched when the constructor is run.

For HTTP connections, the Accept header may be included; if included, it must contain only formats of event framing that are supported by the user agent (one of which must be text/event-stream, as described below).

If the event source's last event ID string is not the empty string, then a Last-Event-ID HTTP header must be included with the request, whose value is the value of the event source's last event ID string.

User agents should use the Cache-Control: no-cache header in requests to bypass any caches for requests of event sources. User agents should ignore HTTP cache headers in the response, never caching event sources.
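
The header requirements above might be sketched as follows. The function name buildEventSourceHeaders is hypothetical, and the sketch assumes a user agent that supports only text/event-stream and chooses to send the optional Accept header.

```javascript
// Hypothetical helper: request headers for fetching an event source.
function buildEventSourceHeaders(lastEventId) {
  var headers = {
    'Accept': 'text/event-stream',  // optional; lists supported framings only
    'Cache-Control': 'no-cache'     // bypass caches for event sources
  };
  // Last-Event-ID is sent only when the last event ID string is non-empty.
  if (lastEventId !== '') {
    headers['Last-Event-ID'] = lastEventId;
  }
  return headers;
}

var firstRequest = buildEventSourceHeaders('');
var reconnectRequest = buildEventSourceHeaders('42');
console.log(reconnectRequest['Last-Event-ID']);
```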

User agents must act as if the connection had failed due to a network error if the origin of the URL of the resource to be fetched is not the same origin as that of the entry script when the EventSource() constructor is invoked.


As data is received, the tasks queued by the networking task source to handle the data must act as follows.

HTTP 200 OK responses with a Content-Type header specifying the type text/event-stream must be processed line by line as described below.

When a successful response with a supported MIME type is received, such that the user agent begins parsing the contents of the stream, the user agent must announce the connection.

The task that the networking task source places on the task queue once the fetching algorithm for such a resource (with the correct MIME type) has completed must reestablish the connection. This applies whether the connection is closed gracefully or unexpectedly. It doesn't apply for the error conditions listed below.

HTTP 200 OK responses that have a Content-Type other than text/event-stream (or some other supported type) must cause the user agent to fail the connection.

HTTP 204 No Content, and 205 Reset Content responses are equivalent to 200 OK responses with the right MIME type but no content, and thus must reestablish the connection.

Other HTTP response codes in the 2xx range must similarly reestablish the connection. They are, however, likely to indicate an error has occurred somewhere and may cause the user agent to emit a warning.

HTTP 301 Moved Permanently responses must cause the user agent to reconnect using the new server specified URL instead of the previously specified URL for all subsequent requests for this event source. (It doesn't affect other EventSource objects with the same URL unless they also receive 301 responses, and it doesn't affect future sessions, e.g. if the page is reloaded.)

HTTP 302 Found, 303 See Other, and 307 Temporary Redirect responses must cause the user agent to connect to the new server-specified URL, but if the user agent needs to again request the resource at a later point, it must return to the previously specified URL for this event source.

The Origin specification also introduces some relevant requirements when dealing with redirects. [ORIGIN]

HTTP 305 Use Proxy, HTTP 401 Unauthorized, and 407 Proxy Authentication Required should be treated transparently as for any other subresource.

Any other HTTP response code not listed here, and any network error that prevents the HTTP connection from being established in the first place (e.g. DNS errors), must cause the user agent to fail the connection.

For non-HTTP protocols, UAs should act in equivalent ways.
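
The HTTP response rules above can be read as a decision table. The following sketch encodes them under the assumption that text/event-stream is the only supported framing; the function name classifyResponse and the returned labels are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: map an HTTP response to the action described above.
function classifyResponse(status, contentType) {
  // Ignore any MIME type parameters such as "; charset=utf-8".
  var mime = String(contentType || '').split(';')[0].trim().toLowerCase();
  if (status === 200) {
    // Right MIME type: announce the connection and parse the stream.
    return mime === 'text/event-stream' ? 'announce' : 'fail';
  }
  // 204, 205, and all other 2xx codes reestablish the connection.
  if (status >= 201 && status <= 299) return 'reestablish';
  if (status === 301) return 'redirect-permanent';
  if (status === 302 || status === 303 || status === 307) return 'redirect-temporary';
  // Handled transparently, as for any other subresource.
  if (status === 305 || status === 401 || status === 407) return 'transparent';
  return 'fail';
}

console.log(classifyResponse(200, 'text/event-stream; charset=utf-8'));
```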


When a user agent is to announce the connection, the user agent must set the readyState attribute to OPEN and queue a task to fire a simple event named open at the EventSource object.

When a user agent is to reestablish the connection, the user agent must set the readyState attribute to CONNECTING, queue a task to fire a simple event named error at the EventSource object, and then fetch the event source resource again after a delay equal to the reconnection time of the event source, from the same origin as the original request triggered by the EventSource() constructor. Only if the user agent reestablishes the connection does the connection get opened anew!

When a user agent is to fail the connection, the user agent must set the readyState attribute to CLOSED and queue a task to fire a simple event named error at the EventSource object. Once the user agent has failed the connection, it does not attempt to reconnect!
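
The three operations above each set readyState and fire one simple event. A minimal model of those transitions, with hypothetical names throughout and with the delayed refetch deliberately left out, might look like this:

```javascript
var CONNECTING = 0, OPEN = 1, CLOSED = 2;

// Hypothetical model: records the readyState and the names of the
// simple events that would be fired at the EventSource object.
function createConnectionModel() {
  return {
    readyState: CONNECTING,
    fired: [],
    announce: function () {
      this.readyState = OPEN;
      this.fired.push('open');
    },
    reestablish: function () {
      this.readyState = CONNECTING;
      this.fired.push('error');
      // The refetch after the reconnection time is not modelled here.
    },
    fail: function () {
      this.readyState = CLOSED;
      this.fired.push('error');
      // No reconnection is attempted after a failure.
    }
  };
}

var model = createConnectionModel();
model.announce();
model.reestablish();
model.announce();
model.fail();
console.log(model.readyState, model.fired.join(','));
```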


The task source for any tasks that are queued by EventSource objects is the remote event task source.

10.2.4 Parsing an event stream

This event stream format's MIME type is text/event-stream.

The event stream format is as described by the stream production of the following ABNF, the character set for which is Unicode. [ABNF]

stream        = [ bom ] *event
event         = *( comment / field ) end-of-line
comment       = colon *any-char end-of-line
field         = 1*name-char [ colon [ space ] *any-char ] end-of-line
end-of-line   = ( cr lf / cr / lf / eof )
eof           = < matches repeatedly at the end of the stream >

; characters
lf            = %x000A ; U+000A LINE FEED (LF)
cr            = %x000D ; U+000D CARRIAGE RETURN (CR)
space         = %x0020 ; U+0020 SPACE
colon         = %x003A ; U+003A COLON (:)
bom           = %xFEFF ; U+FEFF BYTE ORDER MARK
name-char     = %x0000-0009 / %x000B-000C / %x000E-0039 / %x003B-10FFFF
                ; a Unicode character other than U+000A LINE FEED (LF), U+000D CARRIAGE RETURN (CR), or U+003A COLON (:)
any-char      = %x0000-0009 / %x000B-000C / %x000E-10FFFF
                ; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)

Event streams in this format must always be encoded as UTF-8.

Lines must be separated by either a U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF) character, or a single U+000D CARRIAGE RETURN (CR) character.

Since connections established to remote servers for such resources are expected to be long-lived, UAs should ensure that appropriate buffering is used. In particular, while line buffering with lines defined to end with a single U+000A LINE FEED (LF) character is safe, block buffering or line buffering with different expected line endings can cause delays in event dispatch.

10.2.5 Interpreting an event stream

Bytes or sequences of bytes that are not valid UTF-8 sequences must be interpreted as the U+FFFD REPLACEMENT CHARACTER.

One leading U+FEFF BYTE ORDER MARK character must be ignored if any are present.

The stream must then be parsed by reading everything line by line, with a U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF) character not preceded by a U+000D CARRIAGE RETURN (CR) character, a single U+000D CARRIAGE RETURN (CR) character not followed by a U+000A LINE FEED (LF) character, and the end of the file being the four ways in which a line can end.
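
The line-splitting rule can be sketched with a single regular expression, provided the CRLF alternative is tried first so that the pair counts as one line ending. The function name splitStreamLines is hypothetical, and the sketch assumes the whole stream has already been decoded into a string, whereas a real user agent processes data incrementally.

```javascript
// Hypothetical helper: split decoded stream text into lines.
function splitStreamLines(text) {
  // Ignore one leading U+FEFF BYTE ORDER MARK, if present.
  if (text.charAt(0) === '\uFEFF') text = text.slice(1);
  // CRLF must come first in the alternation so it is not split twice.
  // If the text ends with a line ending, the last element is ''.
  return text.split(/\r\n|\n|\r/);
}

var lines = splitStreamLines('\uFEFFdata: a\r\ndata: b\rdata: c\n\ndata: d');
console.log(lines);
```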

When a stream is parsed, a data buffer and an event name buffer must be associated with it. They must be initialized to the empty string.

Lines must be processed, in the order they are received, as follows:

If the line is empty (a blank line)

Dispatch the event, as defined below.

If the line starts with a U+003A COLON character (:)

Ignore the line.

If the line contains a U+003A COLON character (:)

Collect the characters on the line before the first U+003A COLON character (:), and let field be that string.

Collect the characters on the line after the first U+003A COLON character (:), and let value be that string. If value starts with a U+0020 SPACE character, remove it from value.

Process the field using the steps described below, using field as the field name and value as the field value.

Otherwise, the string is not empty but does not contain a U+003A COLON character (:)

Process the field using the steps described below, using the whole line as the field name, and the empty string as the field value.

Once the end of the file is reached, the user agent must dispatch the event one final time, as defined below.

The steps to process the field given a field name and a field value depend on the field name, as given in the following list. Field names must be compared literally, with no case folding performed.

If the field name is "event"

Set the event name buffer to field value.

If the field name is "data"

Append the field value to the data buffer, then append a single U+000A LINE FEED (LF) character to the data buffer.

If the field name is "id"

Set the event stream's last event ID to the field value.

If the field name is "retry"

If the field value consists of only characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), then interpret the field value as an integer in base ten, and set the event stream's reconnection time to that integer. Otherwise, ignore the field.

Otherwise

The field is ignored.

When the user agent is required to dispatch the event, then the user agent must act as follows:

  1. If the data buffer is an empty string, set the data buffer and the event name buffer to the empty string and abort these steps.

  2. If the data buffer's last character is a U+000A LINE FEED (LF) character, then remove the last character from the data buffer.

  3. If the event name buffer is not the empty string but is also not a valid event type name, as defined by the DOM Events specification, set the data buffer and the event name buffer to the empty string and abort these steps. [DOMEVENTS]

  4. Otherwise, create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is not cancelable, and has no default action. The data attribute must be set to the value of the data buffer, the origin attribute must be set to the Unicode serialization of the origin of the event stream's URL, and the lastEventId attribute must be set to the last event ID string of the event source.

  5. If the event name buffer has a value other than the empty string, change the type of the newly created event to equal the value of the event name buffer.

  6. Set the data buffer and the event name buffer to the empty string.

  7. Queue a task to dispatch the newly created event at the EventSource object.

If an event doesn't have an "id" field, but an earlier event did set the event source's last event ID string, then the event's lastEventId field will be set to the value of whatever the last seen "id" field was.

The following event stream, once followed by a blank line:

data: YHOO
data: +2
data: 10

...would cause an event message with the interface MessageEvent to be dispatched on the EventSource object. The event's data attribute would contain the string YHOO\n+2\n10 (where \n represents a newline).

This could be used as follows:

var stocks = new EventSource("http://stocks.example.com/ticker.php");
stocks.onmessage = function (event) {
  var data = event.data.split('\n');
  updateStocks(data[0], data[1], data[2]);
};

...where updateStocks() is a function defined as:

function updateStocks(symbol, delta, value) { ... }

...or some such.

The following stream contains four blocks. The first block has just a comment, and will fire nothing. The second block has two fields with names "data" and "id" respectively; an event will be fired for this block, with the data "first event", and will then set the last event ID to "1" so that if the connection died between this block and the next, the server would be sent a Last-Event-ID header with the value "1". The third block fires an event with data "second event", and also has an "id" field, this time with no value, which resets the last event ID to the empty string (meaning no Last-Event-ID header will now be sent in the event of a reconnection being attempted). Finally, the last block just fires an event with the data " third event" (with a single leading space character). Note that the last block doesn't have to end with a blank line, the end of the stream is enough to trigger the dispatch of the last event.

: test stream

data: first event
id: 1

data:second event
id

data:  third event

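The following stream fires three events:

data

data
data

data:

The first block fires an event with the data set to the empty string: its "data" field appends a single newline character to the data buffer, and that trailing newline is removed when the event is dispatched. The middle block fires an event with the data set to a single newline character. The last block also fires an event with the data set to the empty string, since the end of the stream is enough to trigger the dispatch of the last event.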

The following stream fires two identical events:

data:test

data: test

This is because the space after the colon is ignored if present.
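
The line processing, field processing, and dispatch steps in this section can be combined into one small parser. This is a sketch under stated assumptions, not a conforming implementation: the name parseEventStream is hypothetical, the whole stream is taken as an already-decoded string, the check that the event name is a valid event type name is skipped, an empty "retry" value is ignored, and events are collected in an array rather than dispatched at an EventSource object.

```javascript
// Hypothetical helper: apply the steps above to a complete stream.
function parseEventStream(text) {
  var events = [];
  var dataBuffer = '';
  var eventNameBuffer = '';
  var lastEventId = '';
  var reconnectionTime = null; // null: not set by the stream

  function dispatch() {
    if (dataBuffer === '') { eventNameBuffer = ''; return; }
    // Remove the single trailing LF added by the last "data" field.
    if (dataBuffer.charAt(dataBuffer.length - 1) === '\n') {
      dataBuffer = dataBuffer.slice(0, -1);
    }
    events.push({
      type: eventNameBuffer !== '' ? eventNameBuffer : 'message',
      data: dataBuffer,
      lastEventId: lastEventId
    });
    dataBuffer = '';
    eventNameBuffer = '';
  }

  function processField(name, value) {
    if (name === 'event') eventNameBuffer = value;
    else if (name === 'data') dataBuffer += value + '\n';
    else if (name === 'id') lastEventId = value;
    else if (name === 'retry' && /^[0-9]+$/.test(value)) {
      reconnectionTime = parseInt(value, 10);
    }
    // Any other field name is ignored.
  }

  if (text.charAt(0) === '\uFEFF') text = text.slice(1);
  var lines = text.split(/\r\n|\n|\r/);
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    var colon = line.indexOf(':');
    if (line === '') {
      dispatch();                       // blank line
    } else if (colon === 0) {
      continue;                         // comment line
    } else if (colon > 0) {
      var value = line.slice(colon + 1);
      if (value.charAt(0) === ' ') value = value.slice(1);
      processField(line.slice(0, colon), value);
    } else {
      processField(line, '');           // no colon: whole line is the name
    }
  }
  dispatch(); // the end of the stream triggers one final dispatch
  return { events: events, reconnectionTime: reconnectionTime };
}

var result = parseEventStream('data: YHOO\ndata: +2\ndata: 10\n\n');
console.log(JSON.stringify(result.events[0].data));
```

Feeding it the stock ticker stream shown earlier yields a single message event whose data is the three values joined by newlines.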

10.2.6 Notes

Legacy proxy servers are known to, in certain cases, drop HTTP connections after a short timeout. To protect against such proxy servers, authors can include a comment line (one starting with a ':' character) every 15 seconds or so.

Authors wishing to relate event source connections to each other or to specific documents previously served might find that relying on IP addresses doesn't work, as individual clients can have multiple IP addresses (due to having multiple proxy servers) and individual IP addresses can have multiple clients (due to sharing a proxy server). It is better to include a unique identifier in the document when it is served and then pass that identifier as part of the URL when the connection is established.

Authors are also cautioned that HTTP chunking can have unexpected negative effects on the reliability of this protocol. Where possible, chunking should be disabled for serving event streams unless the rate of messages is high enough for this not to matter.

Clients that support HTTP's per-server connection limitation might run into trouble when opening multiple pages from a site if each page has an EventSource to the same domain. Authors can avoid this using the relatively complex mechanism of using unique domain names per connection, or by allowing the user to enable or disable the EventSource functionality on a per-page basis, or by sharing a single EventSource object using a shared worker.

10.2.7 Connectionless push and other features

User agents running in controlled environments, e.g. browsers on mobile handsets tied to specific carriers, may offload the management of the connection to a proxy on the network. In such a situation, the user agent for the purposes of conformance is considered to include both the handset software and the network proxy.

For example, a browser on a mobile device, after having established a connection, might detect that it is on a supporting network and request that a proxy server on the network take over the management of the connection. The timeline for such a situation might be as follows:

  1. Browser connects to a remote HTTP server and requests the resource specified by the author in the EventSource constructor.
  2. The server sends occasional messages.
  3. In between two messages, the browser detects that it is idle except for the network activity involved in keeping the TCP connection alive, and decides to switch to sleep mode to save power.
  4. The browser disconnects from the server.
  5. The browser contacts a service on the network, and requests that that service, a "push proxy", maintain the connection instead.
  6. The "push proxy" service contacts the remote HTTP server and requests the resource specified by the author in the EventSource constructor (possibly including a Last-Event-ID HTTP header, etc).
  7. The browser allows the mobile device to go to sleep.
  8. The server sends another message.
  9. The "push proxy" service uses a technology such as OMA push to convey the event to the mobile device, which wakes only enough to process the event and then returns to sleep.

This can reduce the total data usage, and can therefore result in considerable power savings.

As well as implementing the existing API and text/event-stream wire format as defined by this specification and in more distributed ways as described above, formats of event framing defined by other applicable specifications may be supported. This specification does not define how they are to be parsed or processed.

10.2.8 Garbage collection

While an EventSource object's readyState is not CLOSED, and the object has one or more event listeners registered for message events, there must be a strong reference from the Window or WorkerUtils object that the EventSource object's constructor was invoked from to the EventSource object itself.

If an EventSource object is garbage collected while its connection is still open, the connection must be closed.

10.2.9 IANA considerations

10.2.9.1 text/event-stream

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:
text
Subtype name:
event-stream
Required parameters:
No parameters
Optional parameters:
No parameters
Encoding considerations:
Always UTF-8.
Security considerations:

An event stream from an origin distinct from the origin of the content consuming the event stream can result in information leakage. To avoid this, user agents are required to block all cross-origin loads.

Event streams can overwhelm a user agent; a user agent is expected to apply suitable restrictions to avoid depleting local resources because of an overabundance of information from an event stream.

Servers can be overwhelmed if a situation develops in which the server is causing clients to reconnect rapidly. Servers should use a 5xx status code to indicate capacity problems, as this will prevent conforming clients from reconnecting automatically.

Interoperability considerations:
Rules for processing both conforming and non-conforming content are defined in this specification.
Published specification:
This document is the relevant specification.
Applications that use this media type:
Web browsers and tools using Web services.
Additional information:
Magic number(s):
No sequence of bytes can uniquely identify an event stream.
File extension(s):
No specific file extensions are recommended for this type.
Macintosh file type code(s):
No specific Macintosh file type codes are recommended for this type.
Person & email address to contact for further information:
Ian Hickson <ian@hixie.ch>
Intended usage:
Common
Restrictions on usage:
This format is only expected to be used by dynamic open-ended streams served using HTTP or a similar protocol. Finite resources are not expected to be labeled with this type.
Author:
Ian Hickson <ian@hixie.ch>
Change controller:
W3C

Fragment identifiers have no meaning with text/event-stream resources.

10.2.9.2 Last-Event-ID

This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]

Header field name
Last-Event-ID
Applicable protocol
http
Status
standard
Author/Change controller
W3C
Specification document(s)
This document is the relevant specification.
Related information
None.

10.3 Web sockets

10.3.1 Introduction

This section is non-normative.

To enable Web applications to maintain bidirectional communications with server-side processes, this specification introduces the WebSocket interface.

This interface does not allow for raw access to the underlying network. For example, this interface could not be used to implement an IRC client without proxying messages through a custom server.

10.3.2 The WebSocket interface

[Constructor(in DOMString url, in optional DOMString protocol)]
interface WebSocket {
  readonly attribute DOMString URL;

  // ready state
  const unsigned short CONNECTING = 0;
  const unsigned short OPEN = 1;
  const unsigned short CLOSING = 2;
  const unsigned short CLOSED = 3;
  readonly attribute unsigned short readyState;
  readonly attribute unsigned long bufferedAmount;

  // networking
           attribute Function onopen;
           attribute Function onmessage;
           attribute Function onerror;
           attribute Function onclose;
  boolean send(in DOMString data);
  void close();
};
WebSocket implements EventTarget;

The WebSocket(url, protocol) constructor takes one or two arguments. The first argument, url, specifies the URL to which to connect. The second, protocol, if present, specifies a sub-protocol that the server must support for the connection to be successful. The sub-protocol name must be a non-empty ASCII string with no control characters in it (i.e. only characters in the range U+0020 to U+007E).

When the WebSocket() constructor is invoked, the UA must run these steps:

  1. Parse a WebSocket URL's components from the url argument, to obtain host, port, resource name, and secure. If this fails, throw a SYNTAX_ERR exception and abort these steps.

  2. If port is a port to which the user agent is configured to block access, then throw a SECURITY_ERR exception. (User agents typically block access to well-known ports like SMTP.)

  3. If protocol is present but is either the empty string or contains characters with Unicode code points less than U+0020 or greater than U+007E (i.e. any characters that are not printable ASCII characters), then throw a SYNTAX_ERR exception and abort these steps.

  4. Let origin be the ASCII serialization of the origin of the script that invoked the WebSocket() constructor, converted to ASCII lowercase.

  5. Return a new WebSocket object, and continue these steps in the background (without blocking scripts).

  6. Establish a WebSocket connection to a host host, on port port (if one was specified), from origin, with the flag secure, with resource name as the resource name, and with protocol as the protocol (if it is present).

    If the "establish a WebSocket connection" algorithm fails, it triggers the "fail the WebSocket connection" algorithm, which then invokes the "close the WebSocket connection" algorithm, which then establishes that the "WebSocket connection is closed", which fires the close event as described below.
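
The sub-protocol check in step 3 amounts to a test for a non-empty string of printable ASCII characters. A minimal sketch, using the hypothetical name isValidSubProtocol:

```javascript
// Hypothetical helper: true if the string is non-empty and every
// character is in the range U+0020 to U+007E.
function isValidSubProtocol(protocol) {
  if (protocol === '') return false;
  for (var i = 0; i < protocol.length; i++) {
    var code = protocol.charCodeAt(i);
    if (code < 0x20 || code > 0x7E) return false;
  }
  return true;
}

console.log(isValidSubProtocol('chat'), isValidSubProtocol('bad\tname'));
```

A constructor following the steps above would throw a SYNTAX_ERR exception whenever a supplied protocol fails this check.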

This constructor must be visible when the script's global object is either a Window object or an object implementing the WorkerUtils interface.


The URL attribute must return the result of resolving the URL that was passed to the constructor. (It doesn't matter what it is resolved relative to, since we already know it is an absolute URL.)

The readyState attribute represents the state of the connection. It can have the following values:

CONNECTING (numeric value 0)
The connection has not yet been established.
OPEN (numeric value 1)
The WebSocket connection is established and communication is possible.
CLOSING (numeric value 2)
The connection is going through the closing handshake.
CLOSED (numeric value 3)
The connection has been closed or could not be opened.

When the object is created its readyState must be set to CONNECTING (0).

The send(data) method transmits data using the connection. If the readyState attribute is CONNECTING, it must raise an INVALID_STATE_ERR exception. If the data argument has any unpaired surrogates, then it must raise SYNTAX_ERR. If the connection is established, and the string has no unpaired surrogates, and the WebSocket closing handshake has not yet started, then the user agent must send data using the WebSocket. If the data cannot be sent, e.g. because it would need to be buffered but the buffer is full, the user agent must close the WebSocket connection. The method must then return true if the connection is still established (and the data was queued or sent successfully), or false if the connection is closing or closed (e.g. because the user agent just had a buffer overflow and failed to send the data, or because the WebSocket closing handshake has started).
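
The unpaired-surrogate condition can be checked by walking the string's UTF-16 code units. The sketch below uses the hypothetical name hasUnpairedSurrogate and covers only that check, not the rest of the send() behaviour.

```javascript
// Hypothetical helper: true if the string contains a high surrogate not
// followed by a low surrogate, or a low surrogate not preceded by a high one.
function hasUnpairedSurrogate(s) {
  for (var i = 0; i < s.length; i++) {
    var c = s.charCodeAt(i);
    if (c >= 0xD800 && c <= 0xDBFF) {
      // High surrogate: the next code unit must be a low surrogate.
      var next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low half of a well-formed pair
        continue;
      }
      return true;
    }
    if (c >= 0xDC00 && c <= 0xDFFF) return true; // stray low surrogate
  }
  return false;
}

console.log(hasUnpairedSurrogate('ok \uD83D\uDE00'), hasUnpairedSurrogate('bad \uD83D'));
```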

The close() method must run the first matching steps from the following list:

If the readyState attribute is in the CLOSING (2) or CLOSED (3) state

Do nothing.

The connection is already closing or is already closed. If it has not already, a close event will eventually fire as described below.

If the WebSocket connection is not yet established

Fail the WebSocket connection and set the readyState attribute's value to CLOSING (2).

The "fail the WebSocket connection" algorithm invokes the "close the WebSocket connection" algorithm, which then establishes that the "WebSocket connection is closed", which fires the close event as described below.

If the WebSocket closing handshake has not yet been started

Start the WebSocket closing handshake and set the readyState attribute's value to CLOSING (2).

The "start the WebSocket closing handshake" algorithm eventually invokes the "close the WebSocket connection" algorithm, which then establishes that the "WebSocket connection is closed", which fires the close event as described below.

Otherwise

Set the readyState attribute's value to CLOSING (2).

The WebSocket closing handshake has started, and will eventually invoke the "close the WebSocket connection" algorithm, which will establish that the "WebSocket connection is closed", and thus the close event will fire, as described below.
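
The "first matching steps" structure of close() can be sketched as a chain of conditions. The function name closeSteps, the established and closingHandshakeStarted flags, and the returned labels are all hypothetical; they stand in for protocol state that a real user agent tracks internally.

```javascript
var CLOSING = 2, CLOSED = 3;

// Hypothetical model of close(): updates readyState and returns a label
// for the protocol-level action the first matching branch calls for.
function closeSteps(socket) {
  if (socket.readyState === CLOSING || socket.readyState === CLOSED) {
    return 'do nothing';
  }
  if (!socket.established) {
    socket.readyState = CLOSING;
    return 'fail the WebSocket connection';
  }
  if (!socket.closingHandshakeStarted) {
    socket.readyState = CLOSING;
    return 'start the WebSocket closing handshake';
  }
  // The closing handshake was already started by the other side.
  socket.readyState = CLOSING;
  return 'no protocol action';
}

var pending = { readyState: 0, established: false, closingHandshakeStarted: false };
console.log(closeSteps(pending), pending.readyState);
```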


The bufferedAmount attribute must return the number of bytes that have been queued but not yet sent. This does not include framing overhead incurred by the protocol. If the connection is closed, this attribute's value will only increase with each call to the send() method (the number does not reset to zero once the connection closes).

In this simple example, the bufferedAmount attribute is used to ensure that updates are sent either at the rate of one update every 50ms, if the network can handle that rate, or at whatever rate the network can handle, if that is too fast.

var socket = new WebSocket('ws://game.example.com:12010/updates');
socket.onopen = function () {
  setInterval(function() {
    if (socket.bufferedAmount == 0)
      socket.send(getUpdateData());
  }, 50);
};

The bufferedAmount attribute can also be used to saturate the network without sending the data at a higher rate than the network can handle, though this requires more careful monitoring of the value of the attribute over time.


The following are the event handlers that must be supported, as IDL attributes, by all objects implementing the WebSocket interface:

Event handler Event handler event type
onopen open
onmessage message
onerror error
onclose close

10.3.3 Feedback from the protocol

When the WebSocket connection is established, the user agent must queue a task to first change the readyState attribute's value to OPEN (1), and then fire a simple event named open at the WebSocket object.

When a WebSocket message has been received with text data, the user agent must create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is not cancelable, has no default action, and whose data attribute is set to data, and queue a task to check to see if the readyState attribute's value is OPEN (1) or CLOSING (2), and if so, dispatch the event at the WebSocket object.

When a WebSocket error has been detected, the user agent must queue a task to check to see if the readyState attribute's value is OPEN (1) or CLOSING (2), and if so, fire a simple event named error at the WebSocket object.

When the WebSocket closing handshake has started, the user agent must queue a task to change the readyState attribute's value to CLOSING (2). (If the close() method was called, the readyState attribute's value will already be set to CLOSING (2) when this task runs.)

When the WebSocket connection is closed, possibly cleanly, the user agent must create an event that uses the CloseEvent interface, with the event name close, which does not bubble, is not cancelable, has no default action, and whose wasClean attribute is set to true if the connection closed cleanly and false otherwise; and queue a task to first change the readyState attribute's value to CLOSED (3), and then dispatch the event at the WebSocket object.

The task source for all tasks queued in this section is the WebSocket task source.

10.3.3.1 Event definitions
interface CloseEvent : Event {
  readonly attribute boolean wasClean;
  void initCloseEvent(in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in boolean wasCleanArg);
};

The initCloseEvent() method must initialize the event in a manner analogous to the similarly-named method in the DOM Events interfaces. [DOMEVENTS]

The wasClean attribute represents whether the connection closed cleanly or not.
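
As an illustration, a script might use the event handlers and the wasClean attribute as sketched below. The watchConnection() function and its log array are hypothetical names used only for this example.

```javascript
// Attaches handlers that record what happens to the connection.
// "log" is a hypothetical array used here in place of real UI updates.
function watchConnection(socket, log) {
  socket.onopen = function () {
    log.push('open');
  };
  socket.onmessage = function (e) {
    log.push('message: ' + e.data);
  };
  socket.onclose = function (e) {
    // wasClean is true if the connection closed cleanly.
    log.push(e.wasClean ? 'closed cleanly' : 'connection lost');
  };
}
```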

10.3.3.2 Garbage collection

A WebSocket object with an open connection must not be garbage collected if there are any event listeners registered for message events.

If a WebSocket object is garbage collected while its connection is still open, the user agent must close the WebSocket connection.

10.3.4 The WebSocket protocol

10.3.4.1 Introduction
10.3.4.1.1 Background

This section is non-normative.

Historically, creating an instant messenger chat client as a Web application has required an abuse of HTTP to poll the server for updates while sending upstream notifications as distinct HTTP calls.

This results in a variety of problems:

A simpler solution would be to use a single TCP connection for traffic in both directions. This is what the WebSocket protocol provides. Combined with the WebSocket API, it provides an alternative to HTTP polling for two-way communication from a Web page to a remote server.

The same technique can be used for a variety of Web applications: games, stock tickers, multiuser applications with simultaneous editing, user interfaces exposing server-side services in real time, etc.

10.3.4.1.2 Protocol overview

This section is non-normative.

The protocol has two parts: a handshake, and then the data transfer.

The handshake from the client looks as follows:

GET /demo HTTP/1.1
Host: example.com
Connection: Upgrade
Sec-WebSocket-Key2: 12998 5 Y3 1  .P00
Sec-WebSocket-Protocol: sample
Upgrade: WebSocket
Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5
Origin: http://example.com

^n:ds[4U

The handshake from the server looks as follows:

HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Location: ws://example.com/demo
Sec-WebSocket-Protocol: sample

8jKS'y:G*Co,Wxa-

The leading line from the client follows the Request-Line format. The leading line from the server follows the Status-Line format. The Request-Line and Status-Line productions are defined in the HTTP specification.

After the leading line in both cases come an unordered ASCII case-insensitive set of fields, one per line, that each match the following non-normative ABNF: [ABNF]

field         = 1*name-char colon [ space ] *any-char cr lf
colon         = %x003A ; U+003A COLON (:)
space         = %x0020 ; U+0020 SPACE
cr            = %x000D ; U+000D CARRIAGE RETURN (CR)
lf            = %x000A ; U+000A LINE FEED (LF)
name-char     = %x0000-0009 / %x000B-000C / %x000E-0039 / %x003B-10FFFF
                ; a Unicode character other than U+000A LINE FEED (LF), U+000D CARRIAGE RETURN (CR), or U+003A COLON (:)
any-char      = %x0000-0009 / %x000B-000C / %x000E-10FFFF
                ; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)

The character set for the above ABNF is Unicode. The fields themselves are encoded as UTF-8.

Lines that don't match the above production cause the connection to be aborted.
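
The field production above can be illustrated with a small parser. This is only a sketch operating on an already-decoded string; the parseField() name and its return shape are assumptions, and the normative parsing rules are given later in this specification.

```javascript
// Parses one handshake line (including its trailing CR LF) into a
// name/value pair, or returns null if the line does not match the ABNF.
function parseField(line) {
  // The line must end with CR LF and contain no other CR or LF.
  if (line.slice(-2) !== '\r\n')
    return null;
  var body = line.slice(0, -2);
  if (/[\r\n]/.test(body))
    return null;
  var colon = body.indexOf(':');
  if (colon < 1) // no colon, or an empty name
    return null;
  // Field names are ASCII case-insensitive, so lowercase A-Z only.
  var name = body.slice(0, colon).replace(/[A-Z]/g, function (c) {
    return c.toLowerCase();
  });
  var value = body.slice(colon + 1);
  if (value.charAt(0) === ' ')
    value = value.slice(1); // the single optional space after the colon
  return { name: name, value: value };
}
```

Only one space is removed after the colon, so any further leading spaces remain part of the value.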

Finally, after the last field, the client sends 10 bytes starting with 0x0D 0x0A and followed by 8 random bytes, part of a challenge, and the server sends 18 bytes starting with 0x0D 0x0A and followed by 16 bytes consisting of a challenge response. The details of this challenge and other parts of the handshake are described in the next section.


Once the client and server have both sent their handshakes, and if the handshake was successful, then the data transfer part starts. This is a two-way communication channel where each side can, independently from the other, send data at will.

Data is sent in the form of UTF-8 text. Each frame of data starts with a 0x00 byte and ends with a 0xFF byte, with the UTF-8 text in between.

The WebSocket protocol uses this framing so that specifications that use the WebSocket protocol can expose such connections using an event-based mechanism instead of requiring users of those specifications to implement buffering and piecing together of messages manually.

To close the connection cleanly, a frame consisting of just a 0xFF byte followed by a 0x00 byte is sent from one peer to ask that the other peer close the connection.
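
The two frames described above can be built as follows. This sketch assumes an environment that provides TextEncoder and Uint8Array; the function names are illustrative only.

```javascript
// Builds a text frame: 0x00, the UTF-8 encoding of the text, then 0xFF.
function encodeTextFrame(text) {
  var payload = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  var frame = new Uint8Array(payload.length + 2);
  frame[0] = 0x00;
  frame.set(payload, 1);
  frame[frame.length - 1] = 0xFF;
  return frame;
}

// Builds the closing frame: 0xFF followed by 0x00.
function encodeClosingFrame() {
  return new Uint8Array([0xFF, 0x00]);
}
```

No escaping of the payload is needed, because the byte 0xFF never occurs in valid UTF-8.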

The protocol is designed to support other frame types in future. Instead of the 0x00 and 0xFF bytes, other bytes might in future be defined. Frames denoted by bytes that do not have the high bit set (0x00 to 0x7F) are treated as a stream of bytes terminated by 0xFF. Frames denoted by bytes that have the high bit set (0x80 to 0xFF) have a leading length indicator, which is encoded as a series of 7-bit bytes stored in octets with the 8th bit being set for all but the last byte. The remainder of the frame is then as much data as was specified. (The closing handshake contains no data and therefore has a length byte of 0x00.)

This wire format for the data transfer part is described by the following non-normative ABNF, which is given in two alternative forms: the first describing the wire format as allowed by this specification, and the second describing how an arbitrary bytestream would be parsed. [ABNF]

; the wire protocol as allowed by this specification
frames        = *frame
frame         = text-frame / closing-frame
text-frame    = %x00 *( UTF8-char ) %xFF
closing-frame = %xFF %x00

; the wire protocol including error-handling and forward-compatible parsing rules
frames        = *frame
frame         = text-frame / binary-frame
text-frame    = (%x00-7F) *(%x00-FE) %xFF
binary-frame  = (%x80-FF) length < as many bytes as given by the length >
length        = *(%x80-FF) (%x00-7F)

The UTF8-char rule is defined in the UTF-8 specification. [RFC3629]

The above ABNF is intended for a binary octet environment.

At this time, the WebSocket protocol cannot be used to send binary data. Using any of the frame types other than 0x00 and 0xFF is invalid.
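
The second, forward-compatible form of the ABNF can be illustrated by the following sketch, which splits a complete Uint8Array into frames. It assumes that the seven-bit groups of the length are sent most significant group first, and the parseFrames() name and return shape are made up for this example. A real implementation would read from the network incrementally rather than from a complete buffer.

```javascript
// Splits a complete byte sequence (a Uint8Array) into frames following the
// forward-compatible parsing rules. Returns an array of { type, data }
// objects; throws if the input ends in the middle of a frame.
function parseFrames(bytes) {
  var frames = [];
  var i = 0;
  while (i < bytes.length) {
    var type = bytes[i++];
    var start, end;
    if (type & 0x80) {
      // High bit set: read the length, seven bits per byte; the last
      // length byte is the one with its high bit clear.
      var length = 0, b;
      do {
        if (i >= bytes.length)
          throw new Error('truncated length');
        b = bytes[i++];
        length = length * 128 + (b & 0x7F);
      } while (b & 0x80);
      start = i;
      end = i + length;
      if (end > bytes.length)
        throw new Error('truncated frame');
      i = end;
    } else {
      // High bit clear: the data runs up to the next 0xFF byte.
      start = i;
      end = bytes.indexOf(0xFF, i);
      if (end === -1)
        throw new Error('unterminated frame');
      i = end + 1;
    }
    frames.push({ type: type, data: bytes.slice(start, end) });
  }
  return frames;
}
```

With this parser, a closing frame appears as a frame of type 0xFF with no data.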


The following diagram summarises the protocol:

Handshake
   |
   V
Frame type byte <--------------------------------------.
   |      |                                            |
   |      `--> (0x00 to 0x7F) --> Data... --> 0xFF -->-+
   |                                                   |
   `--> (0x80 to 0xFE) --> Length --> Data... ------->-'

10.3.4.1.3 Opening handshake

This section is non-normative.

The opening handshake is intended to be compatible with HTTP-based server-side software, so that a single port can be used by both HTTP clients talking to that server and WebSocket clients talking to that server. To this end, the WebSocket client's handshake appears to HTTP servers to be a regular GET request with an Upgrade offer:

GET / HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade

Fields in the handshake are sent by the client in a random order; the order is not meaningful.

Additional fields are used to select options in the WebSocket protocol. The only options available in this version are the subprotocol selector, Sec-WebSocket-Protocol, and Cookie, which can be used for sending cookies to the server (e.g. as an authentication mechanism). The Sec-WebSocket-Protocol field takes an arbitrary string:

Sec-WebSocket-Protocol: chat

This field indicates the subprotocol (the application-level protocol layered over the WebSocket protocol) that the client intends to use. The server echoes this field in its handshake to indicate that it supports that subprotocol.

The other fields in the handshake are all security-related. The Host field is used to protect against DNS rebinding attacks and to allow multiple domains to be served from one IP address.

Host: example.com

The server includes the hostname in the Sec-WebSocket-Location field of its handshake, so that both the client and the server can verify that they agree on which host is in use.

The Origin field is used to protect against unauthorized cross-origin use of a WebSocket server by scripts using the WebSocket API in a Web browser. The server specifies which origin it is willing to receive requests from by including a Sec-WebSocket-Origin field with that origin. If multiple origins are authorized, the server echoes the value in the Origin field of the client's handshake.

Origin: http://example.com

Finally, the server has to prove to the client that it received the client's WebSocket handshake, so that the server doesn't accept connections that are not WebSocket connections. This prevents an attacker from tricking a WebSocket server by sending it carefully-crafted packets using XMLHttpRequest or a form submission.

To prove that the handshake was received, the server has to take three pieces of information and combine them to form a response. The first two pieces of information come from the Sec-WebSocket-Key1 and Sec-WebSocket-Key2 fields in the client handshake:

Sec-WebSocket-Key1: 18x 6]8vM;54 *(5:  {   U1]8  z [  8
Sec-WebSocket-Key2: 1_ tx7X d  <  nw  334J702) 7]o}` 0

For each of these fields, the server has to take the digits from the value to obtain a number (in this case 1868545188 and 1733470270 respectively), then divide that number by the number of space characters in the value (in this case 12 and 10) to obtain a 32-bit number (155712099 and 173347027). These two resulting numbers are then used in the server handshake, as described below.

The counting of spaces is intended to make it impossible to smuggle this field into the resource name; making this even harder is the presence of two such fields, and the use of a newline as the only reliable indicator that the end of the key has been reached. The use of random characters interspersed with the spaces and the numbers ensures that the implementor actually looks for spaces and newlines, instead of treating any character like a space, which would again make it easy to smuggle the fields into the path and trick the server. Finally, dividing by this number of spaces is intended to make sure that even the most naïve of implementations will check for spaces, since if the server does not verify that there are some spaces, the server will try to divide by zero, which is usually fatal (a correct handshake will always have at least one space).

The third piece of information is given after the fields, in the last eight bytes of the handshake, expressed here as they would be seen if interpreted as ASCII:

Tm[K T2u

The concatenation of the number obtained from processing the Sec-WebSocket-Key1 field, expressed as a big-endian 32 bit number, the number obtained from processing the Sec-WebSocket-Key2 field, again expressed as a big-endian 32 bit number, and finally the eight bytes at the end of the handshake, form a 128 bit string whose MD5 sum is then used by the server to prove that it read the handshake.
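
The server-side computation described above might be sketched as follows. This example assumes a Node.js environment, whose built-in crypto module and Buffer type are not part of this specification; the keyNumber() and challengeResponse() names are likewise illustrative.

```javascript
var crypto = require('crypto'); // assumption: Node.js built-in module

// Extracts the number carried by a Sec-WebSocket-Key1 or -Key2 value:
// the digits, read as one base-ten number, divided by the number of spaces.
function keyNumber(value) {
  var digits = value.replace(/[^0-9]/g, '');
  var spaces = value.split(' ').length - 1;
  if (spaces === 0 || digits === '')
    throw new Error('invalid key'); // avoids dividing by zero
  var number = parseInt(digits, 10);
  // A client builds the key as a product of the space count, so a
  // remainder indicates a malformed handshake.
  if (number % spaces !== 0)
    throw new Error('invalid key');
  return number / spaces;
}

// Builds the 16-byte challenge and returns its MD5 sum as a Buffer.
// key3 is a Buffer holding the eight bytes that follow the fields.
function challengeResponse(key1, key2, key3) {
  var challenge = Buffer.alloc(16);
  challenge.writeUInt32BE(keyNumber(key1), 0); // big-endian 32 bit number
  challenge.writeUInt32BE(keyNumber(key2), 4); // big-endian 32 bit number
  key3.copy(challenge, 8, 0, 8);
  return crypto.createHash('md5').update(challenge).digest();
}
```

The returned 16 bytes are what the server would send after its fields.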


The handshake from the server is much simpler than the client handshake. The first line is an HTTP Status-Line, with the status code 101 (the HTTP version and reason phrase aren't important):

HTTP/1.1 101 WebSocket Protocol Handshake

The fields follow. Two of the fields are just for compatibility with HTTP:

Upgrade: WebSocket
Connection: Upgrade

Two of the fields are part of the security model described above, echoing the origin and stating the exact host, port, resource name, and whether the connection is expected to be encrypted:

Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Location: ws://example.com/

These fields are checked by the Web browser when it is acting as a WebSocket client for scripted pages. A server that only handles one origin and only serves one resource can therefore just return hard-coded values and does not need to parse the client's handshake to verify the correctness of the values.

Option fields can also be included. In this version of the protocol, the main option field is Sec-WebSocket-Protocol, which indicates the subprotocol that the server speaks. Web browsers verify that the server included the same value as was specified in the WebSocket constructor, so a server that speaks multiple subprotocols has to make sure it selects one based on the client's handshake and specifies the right one in its handshake.

Sec-WebSocket-Protocol: chat

The server can also set cookie-related option fields to set cookies, as in HTTP.

After the fields, the server sends the aforementioned MD5 sum, a 16 byte (128 bit) value, shown here as if interpreted as ASCII:

fQJ,fN/4F4!~K~MH

This value depends on what the client sends, as described above. If it doesn't match what the client is expecting, the client would disconnect.

Having part of the handshake appear after the fields ensures that both the server and the client verify that the connection is not being interrupted by an HTTP intermediary such as a man-in-the-middle cache or proxy.

10.3.4.1.4 Closing handshake

This section is non-normative.

The closing handshake is far simpler than the opening handshake.

Either peer can send a 0xFF frame with length 0x00 to begin the closing handshake. Upon receiving a 0xFF frame, the other peer sends an identical 0xFF frame in acknowledgement, if it hasn't already sent one. Upon receiving that 0xFF frame, the first peer then closes the connection, safe in the knowledge that no further data is forthcoming.

After sending a 0xFF frame, a peer does not send any further data; after receiving a 0xFF frame, a peer discards any further data received.

It is safe for both peers to initiate this handshake simultaneously.
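
The behavior of one peer during the closing handshake can be sketched as a small state tracker. The ClosingHandshake name and the sendBytes and closeConnection callbacks are hypothetical; only the behavior described in this section is modeled.

```javascript
// Tracks the closing handshake for one peer. "sendBytes" and
// "closeConnection" are hypothetical callbacks supplied by the caller.
function ClosingHandshake(sendBytes, closeConnection) {
  this.sent = false;     // have we sent our 0xFF frame?
  this.received = false; // have we received the peer's 0xFF frame?
  this.sendBytes = sendBytes;
  this.closeConnection = closeConnection;
}

// Sends the 0xFF frame with length 0x00, at most once.
ClosingHandshake.prototype.sendClose = function () {
  if (this.sent)
    return;
  this.sent = true;
  this.sendBytes([0xFF, 0x00]);
};

// To be called when a 0xFF frame arrives from the other peer.
ClosingHandshake.prototype.onCloseFrame = function () {
  var alreadySent = this.sent;
  this.received = true;
  if (alreadySent)
    this.closeConnection(); // our frame is out and the peer's has arrived
  else
    this.sendClose();       // acknowledge the peer's frame
};

// No further data is sent after our own 0xFF frame.
ClosingHandshake.prototype.canSend = function () {
  return !this.sent;
};

// Data received after the peer's 0xFF frame is discarded.
ClosingHandshake.prototype.shouldDiscardIncoming = function () {
  return this.received;
};
```

If both peers initiate at the same time, each has already sent its frame when the other's arrives, so each simply closes the connection.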

The closing handshake is intended to replace the TCP closing handshake (FIN/ACK), on the basis that the TCP closing handshake is not always reliable end-to-end, especially in the presence of man-in-the-middle proxies and other intermediaries.

10.3.4.1.5 Design philosophy

This section is non-normative.

The WebSocket protocol is designed on the principle that there should be minimal framing (the only framing that exists is to make the protocol frame-based instead of stream-based, and to support a distinction between Unicode text and binary frames). It is expected that metadata would be layered on top of WebSocket by the application layer, in the same way that metadata is layered on top of TCP by the application layer (HTTP).

Conceptually, WebSocket is really just a layer on top of TCP that adds a Web "origin"-based security model for browsers; adds an addressing and protocol naming mechanism to support multiple services on one port and multiple host names on one IP address; layers a framing mechanism on top of TCP to get back to the IP packet mechanism that TCP is built on, but without length limits; and reimplements the closing handshake in-band. Other than that, it adds nothing. Basically it is intended to be as close to just exposing raw TCP to script as possible given the constraints of the Web. It's also designed in such a way that its servers can share a port with HTTP servers, by having its handshake be a valid HTTP Upgrade handshake also.

The protocol is intended to be extensible; future versions will likely introduce a mechanism to compress data and might support sending binary data.

10.3.4.1.6 Security model

This section is non-normative.

The WebSocket protocol uses the origin model used by Web browsers to restrict which Web pages can contact a WebSocket server when the WebSocket protocol is used from a Web page. Naturally, when the WebSocket protocol is used by a dedicated client directly (i.e. not from a Web page through a Web browser), the origin model is not useful, as the client can provide any arbitrary origin string.

This protocol is intended to fail to establish a connection with servers of pre-existing protocols like SMTP or HTTP, while allowing HTTP servers to opt-in to supporting this protocol if desired. This is achieved by having a strict and elaborate handshake, and by limiting the data that can be inserted into the connection before the handshake is finished (thus limiting how much the server can be influenced).

It is similarly intended to fail to establish a connection when data from other protocols, especially HTTP, is sent to a WebSocket server, for example as might happen if an HTML form were submitted to a WebSocket server. This is primarily achieved by requiring that the server prove that it read the handshake, which it can only do if the handshake contains the appropriate parts which themselves can only be sent by a WebSocket handshake; in particular, fields starting with Sec- cannot be set by an attacker from a Web browser, even when using XMLHttpRequest.

10.3.4.1.7 Relationship to TCP and HTTP

This section is non-normative.

The WebSocket protocol is an independent TCP-based protocol. Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request.

Based on the expert recommendation of the IANA, the WebSocket protocol by default uses port 80 for regular WebSocket connections and port 443 for WebSocket connections tunneled over TLS.

10.3.4.1.8 Establishing a connection

This section is non-normative.

There are several options for establishing a WebSocket connection.

On the face of it, the simplest method would seem to be to use port 80 to get a direct connection to a WebSocket server. Port 80 traffic, however, will often be intercepted by man-in-the-middle HTTP proxies, which can lead to the connection failing to be established.

The most reliable method, therefore, is to use TLS encryption and port 443 to connect directly to a WebSocket server. This has the advantage of being more secure; however, TLS encryption can be computationally expensive.

When a connection is to be made to a port that is shared by an HTTP server (a situation that is quite likely to occur with traffic to ports 80 and 443), the connection will appear to the HTTP server to be a regular GET request with an Upgrade offer. In relatively simple setups with just one IP address and a single server for all traffic to a single hostname, this might allow a practical way for systems based on the WebSocket protocol to be deployed. In more elaborate setups (e.g. with load balancers and multiple servers), a dedicated set of hosts for WebSocket connections separate from the HTTP servers is probably easier to manage.

10.3.4.1.9 Subprotocols using the WebSocket protocol

This section is non-normative.

The client can request that the server use a specific subprotocol by including the Sec-WebSocket-Protocol field in its handshake. If it is specified, the server needs to include the same field and value in its response for the connection to be established.

These subprotocol names do not need to be registered, but if a subprotocol is intended to be implemented by multiple independent WebSocket servers, potential clashes with the names of subprotocols defined independently can be avoided by using names that contain the domain name of the subprotocol's originator. For example, if Example Corporation were to create a Chat subprotocol to be implemented by many servers around the Web, they could name it "chat.example.com". If the Example Organisation called their competing subprotocol "example.org's chat protocol", then the two subprotocols could be implemented by servers simultaneously, with the server dynamically selecting which subprotocol to use based on the value sent by the client.

Subprotocols can be versioned in backwards-incompatible ways by changing the subprotocol name, e.g. going from "bookings.example.net" to "bookings.example.net2". These subprotocols would be considered completely separate by WebSocket clients. Backwards-compatible versioning can be implemented by reusing the same subprotocol string but carefully designing the actual subprotocol to support this kind of extensibility.

10.3.4.1.10 Terminology

When an implementation is required to send data as part of the WebSocket protocol, the implementation may delay the actual transmission arbitrarily, e.g. buffering data so as to send fewer IP packets.

10.3.4.2 WebSocket URLs
10.3.4.2.1 Parsing WebSocket URLs

The steps to parse a WebSocket URL's components from a string url are as follows. These steps return either a host, a port, a resource name, and a secure flag, or they fail.

  1. If the url string is not an absolute URL, then fail this algorithm. [WEBADDRESSES]

  2. Resolve the url string using the resolve a Web address algorithm defined by the Web addresses specification, with the URL character encoding set to UTF-8. [WEBADDRESSES] [RFC3629]

    It doesn't matter what it is resolved relative to, since we already know it is an absolute URL at this point.

  3. If url does not have a <scheme> component whose value, when converted to ASCII lowercase, is either "ws" or "wss", then fail this algorithm.

  4. If url has a <fragment> component, then fail this algorithm.

  5. If the <scheme> component of url, converted to ASCII lowercase, is "ws", set secure to false; otherwise (the <scheme> component is "wss"), set secure to true.

  6. Let host be the value of the <host> component of url, converted to ASCII lowercase.

  7. If url has a <port> component, then let port be that component's value; otherwise, there is no explicit port.

  8. If there is no explicit port, then: if secure is false, let port be 80, otherwise let port be 443.

  9. Let resource name be the value of the <path> component (which might be empty) of url.

  10. If resource name is the empty string, set it to a single character U+002F SOLIDUS (/).

  11. If url has a <query> component, then append a single U+003F QUESTION MARK character (?) to resource name, followed by the value of the <query> component.

  12. Return host, port, resource name, and secure.
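
The steps above can be approximated in script as sketched below. This sketch relies on the URL interface found in current browsers and Node.js rather than on the Web addresses algorithm referenced above, so it is not an exact implementation: for instance, it does not preserve an empty query component. The parseWebSocketURL() name is an assumption.

```javascript
// Returns { host, port, resourceName, secure }, or null on failure.
function parseWebSocketURL(input) {
  var url;
  try {
    url = new URL(input); // throws unless input is an absolute URL
  } catch (e) {
    return null;
  }
  // The URL interface reports the scheme in lowercase, with a colon.
  if (url.protocol !== 'ws:' && url.protocol !== 'wss:')
    return null;
  if (input.indexOf('#') !== -1) // any fragment, even an empty one
    return null;
  var secure = url.protocol === 'wss:';
  // An omitted (or default) port is reported as the empty string.
  var port = url.port === '' ? (secure ? 443 : 80) : parseInt(url.port, 10);
  // For ws: and wss: URLs, an empty path is reported as "/".
  var resourceName = url.pathname + url.search;
  return { host: url.hostname, port: port, resourceName: resourceName, secure: secure };
}
```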

10.3.4.2.2 Constructing WebSocket URLs

The steps to construct a WebSocket URL from a host, a port, a resource name, and a secure flag, are as follows:

  1. Let url be the empty string.
  2. If the secure flag is false, then append the string "ws://" to url. Otherwise, append the string "wss://" to url.
  3. Append host to url.
  4. If the secure flag is false and port is not 80, or if the secure flag is true and port is not 443, then append the string ":" followed by port to url.
  5. Append resource name to url.
  6. Return url.
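
These steps translate directly into script, as in the following sketch. The function name is illustrative; port is assumed to be a number.

```javascript
// Builds a WebSocket URL from its components, omitting the default port.
function constructWebSocketURL(host, port, resourceName, secure) {
  var url = secure ? 'wss://' : 'ws://';
  url += host;
  if (port !== (secure ? 443 : 80))
    url += ':' + port;
  url += resourceName;
  return url;
}
```

For example, constructWebSocketURL('example.com', 80, '/demo', false) returns "ws://example.com/demo".
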

10.3.4.3 Client-side requirements

This section only applies to user agents, not to servers.

This specification doesn't currently define a limit to the number of simultaneous connections that a client can establish to a server.

10.3.4.3.1 Opening handshake

When the user agent is to establish a WebSocket connection to a host host, on a port port, from an origin whose ASCII serialization is origin, with a flag secure, with a string giving a resource name, and optionally with a string giving a protocol, it must run the following steps. The host must be ASCII-only (i.e. it must have been punycode-encoded already if necessary). The resource name and protocol strings must be non-empty strings of ASCII characters in the range U+0020 to U+007E. The resource name string must start with a U+002F SOLIDUS character (/) and must not contain a U+0020 SPACE character. [ORIGIN]

  1. If the user agent already has a WebSocket connection to the remote host (IP address) identified by host, even if known by another name, wait until that connection has been established or for that connection to have failed. If multiple connections to the same IP address are attempted simultaneously, the user agent must serialize them so that there is no more than one connection at a time running through the following steps.

    This makes it harder for a script to perform a denial of service attack by just opening a large number of WebSocket connections to a remote host.

    There is no limit to the number of established WebSocket connections a user agent can have with a single remote host. Servers can refuse to connect users with an excessive number of connections, or disconnect resource-hogging users when suffering high load.

  2. Connect: If the user agent is configured to use a proxy when using the WebSocket protocol to connect to host host and/or port port, then connect to that proxy and ask it to open a TCP connection to the host given by host and the port given by port.

    For example, if the user agent uses an HTTP proxy for all traffic, then if it were to try to connect to port 80 on server example.com, it might send the following lines to the proxy server:

    CONNECT example.com:80 HTTP/1.1
    Host: example.com

    If there was a password, the connection might look like:

    CONNECT example.com:80 HTTP/1.1
    Host: example.com
    Proxy-authorization: Basic ZWRuYW1vZGU6bm9jYXBlcyE=

    Otherwise, if the user agent is not configured to use a proxy, then open a TCP connection to the host given by host and the port given by port.

    Implementations that do not expose explicit UI for selecting a proxy for WebSocket connections separate from other proxies are encouraged to use a SOCKS proxy for WebSocket connections, if available, or failing that, to prefer the proxy configured for HTTPS connections over the proxy configured for HTTP connections.

    For the purpose of proxy autoconfiguration scripts, the URL to pass the function must be constructed from host, port, resource name, and the secure flag using the steps to construct a WebSocket URL.

    The WebSocket protocol can be identified in proxy autoconfiguration scripts from the scheme ("ws:" for unencrypted connections and "wss:" for encrypted connections).

  3. If the connection could not be opened, then fail the WebSocket connection and abort these steps.

  4. If secure is true, perform a TLS handshake over the connection. If this fails (e.g. the server's certificate could not be verified), then fail the WebSocket connection and abort these steps. Otherwise, all further communication on this channel must run through the encrypted tunnel. [RFC2246]

    User agents must use the Server Name Indication extension in the TLS handshake. [RFC4366]

  5. Send the UTF-8 string "GET" followed by a UTF-8-encoded U+0020 SPACE character to the remote side (the server).

    Send the resource name value, encoded as UTF-8.

    Send another UTF-8-encoded U+0020 SPACE character, followed by the UTF-8 string "HTTP/1.1", followed by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF).

  6. Let fields be an empty list of strings.

  7. Add the string "Upgrade: WebSocket" to fields.

  8. Add the string "Connection: Upgrade" to fields.

  9. Let hostport be an empty string.

  10. Append the host value, converted to ASCII lowercase, to hostport.

  11. If secure is false, and port is not 80, or if secure is true, and port is not 443, then append a U+003A COLON character (:) followed by the value of port, expressed as a base-ten integer, to hostport.

  12. Add the string consisting of the concatenation of the string "Host:", a U+0020 SPACE character, and hostport, to fields.

  13. Add the string consisting of the concatenation of the string "Origin:", a U+0020 SPACE character, and the origin value, converted to ASCII lowercase, to fields.

  14. If there is no protocol, then skip this step.

    Otherwise, add the string consisting of the concatenation of the string "Sec-WebSocket-Protocol:", a U+0020 SPACE character, and the protocol value, to fields.

  15. If the client has any cookies that would be relevant to a resource accessed over HTTP, if secure is false, or HTTPS, if it is true, on host host, port port, with resource name as the path (and possibly query parameters), then add to fields any HTTP headers that would be appropriate for that information. [HTTP] [COOKIES]

    This includes "HttpOnly" cookies (cookies with the http-only-flag set to true); the WebSocket protocol is not considered a non-HTTP API for the purpose of cookie processing.

  16. Let spaces1 be a random integer from 1 to 12 inclusive.

    Let spaces2 be a random integer from 1 to 12 inclusive.

    For example, 5 and 9.

  17. Let max1 be the largest integer not greater than 4,294,967,295 divided by spaces1.

    Let max2 be the largest integer not greater than 4,294,967,295 divided by spaces2.

    Continuing the example, 858,993,459 and 477,218,588.

  18. Let number1 be a random integer from 0 to max1 inclusive.

    Let number2 be a random integer from 0 to max2 inclusive.

    For example, 777,007,543 and 114,997,259.

  19. Let product1 be the result of multiplying number1 and spaces1 together.

    Let product2 be the result of multiplying number2 and spaces2 together.

    Continuing the example, 3,885,037,715 and 1,034,975,331.

  20. Let key1 be a string consisting of product1, expressed in base ten using the numerals in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).

    Let key2 be a string consisting of product2, expressed in base ten using the numerals in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).

    Continuing the example, "3885037715" and "1034975331".

  21. Insert spaces1 U+0020 SPACE characters into key1 at random positions.

    Insert spaces2 U+0020 SPACE characters into key2 at random positions.

    Continuing the example, this could lead to "388 5037  7  15" and "1   0 3  4 97 53 31".

  22. Insert between one and twelve random characters from the ranges U+0021 to U+002F and U+003A to U+007E into key1 at random positions.

    Insert between one and twelve random characters from the ranges U+0021 to U+002F and U+003A to U+007E into key2 at random positions.

    This corresponds to random printable ASCII characters other than the digits and the U+0020 SPACE character.

    Continuing the example, this could lead to "388P O503D&ul7 {K%gX( %7  15" and "1 N ?|k UT0or 3o  4 I97N 5-S3O 31".

  23. Add the string consisting of the concatenation of the string "Sec-WebSocket-Key1:", a U+0020 SPACE character, and the key1 value, to fields.

    Add the string consisting of the concatenation of the string "Sec-WebSocket-Key2:", a U+0020 SPACE character, and the key2 value, to fields.

  24. For each string in fields, in a random order: send the string, encoded as UTF-8, followed by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF). It is important that the fields be output in a random order so that servers do not depend on the particular order used by any particular client.

  25. Send a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF).

  26. Let key3 be a string consisting of eight random bytes (or equivalently, a random 64 bit integer encoded in big-endian order).

    For example, 0x47 0x30 0x22 0x2D 0x5A 0x3F 0x47 0x58.

  27. Send key3 to the server.

  28. Read bytes from the server until either the connection closes, or a 0x0A byte is read. Let field be these bytes, including the 0x0A byte.

    If field is not at least seven bytes long, or if the last two bytes aren't 0x0D and 0x0A respectively, or if it does not contain at least two 0x20 bytes, then fail the WebSocket connection and abort these steps.

    User agents may apply a timeout to this step, failing the WebSocket connection if the server does not send back data in a suitable time period.

  29. Let code be the substring of field that starts from the byte after the first 0x20 byte, and ends with the byte before the second 0x20 byte.

  30. If code is not three bytes long, or if any of the bytes in code are not in the range 0x30 to 0x39, then fail the WebSocket connection and abort these steps.

  31. If code, interpreted as UTF-8, is "101", then move to the next step.

    If code, interpreted as UTF-8, is "407", then either close the connection and jump back to step 2, providing appropriate authentication information, or fail the WebSocket connection. 407 is the code used by HTTP meaning "Proxy Authentication Required". User agents that support proxy authentication must interpret the response as defined by HTTP (e.g. to find and interpret the Proxy-Authenticate header).

    Otherwise, fail the WebSocket connection and abort these steps.

  32. Let fields be a list of name-value pairs, initially empty.

  33. Field: Let name and value be empty byte arrays.

  34. Read a byte from the server.

    If the connection closes before this byte is received, then fail the WebSocket connection and abort these steps.

    Otherwise, handle the byte as described in the appropriate entry below:

    If the byte is 0x0D (ASCII CR)
    If the name byte array is empty, then jump to the fields processing step. Otherwise, fail the WebSocket connection and abort these steps.
    If the byte is 0x0A (ASCII LF)
    Fail the WebSocket connection and abort these steps.
    If the byte is 0x3A (ASCII :)
    Move on to the next step.
    If the byte is in the range 0x41 to 0x5A (ASCII A-Z)
    Append a byte whose value is the byte's value plus 0x20 to the name byte array and redo this step for the next byte.
    Otherwise
    Append the byte to the name byte array and redo this step for the next byte.

    This reads a field name, terminated by a colon, converting upper-case ASCII letters to lowercase, and aborting if a stray CR or LF is found.

  35. Read a byte from the server.

    If the connection closes before this byte is received, then fail the WebSocket connection and abort these steps.

    Otherwise, handle the byte as described in the appropriate entry below:

    If the byte is 0x20 (ASCII space)
    Ignore the byte and move on to the next step.
    Otherwise
    Treat the byte as described by the list in the next step, then move on to that next step for real.

    This skips past a space character after the colon, if necessary.

  36. Read a byte from the server.

    If the connection closes before this byte is received, then fail the WebSocket connection and abort these steps.

    Otherwise, handle the byte as described in the appropriate entry below:

    If the byte is 0x0D (ASCII CR)
    Move on to the next step.
    If the byte is 0x0A (ASCII LF)
    Fail the WebSocket connection and abort these steps.
    Otherwise
    Append the byte to the value byte array and redo this step for the next byte.

    This reads a field value, terminated by a CRLF.

  37. Read a byte from the server.

    If the connection closes before this byte is received, or if the byte is not a 0x0A byte (ASCII LF), then fail the WebSocket connection and abort these steps.

    This skips past the LF byte of the CRLF after the field.

  38. Append an entry to the fields list that has the name given by the string obtained by interpreting the name byte array as a UTF-8 byte stream and the value given by the string obtained by interpreting the value byte array as a UTF-8 byte stream.

  39. Return to the "Field" step above.

  40. Fields processing: Read a byte from the server.

    If the connection closes before this byte is received, or if the byte is not a 0x0A byte (ASCII LF), then fail the WebSocket connection and abort these steps.

    This skips past the LF byte of the CRLF after the blank line after the fields.

  41. If there is not exactly one entry in the fields list whose name is "upgrade", or if there is not exactly one entry in the fields list whose name is "connection", or if there is not exactly one entry in the fields list whose name is "sec-websocket-origin", or if there is not exactly one entry in the fields list whose name is "sec-websocket-location", or if the protocol was specified but there is not exactly one entry in the fields list whose name is "sec-websocket-protocol", or if there are any entries in the fields list whose names are the empty string, then fail the WebSocket connection and abort these steps. Otherwise, handle each entry in the fields list as follows:

    If the entry's name is "upgrade"

    If the value is not exactly equal to the string "WebSocket", then fail the WebSocket connection and abort these steps.

    If the entry's name is "connection"

    If the value, converted to ASCII lowercase, is not exactly equal to the string "upgrade", then fail the WebSocket connection and abort these steps.

    If the entry's name is "sec-websocket-origin"

    If the value is not exactly equal to origin, converted to ASCII lowercase, then fail the WebSocket connection and abort these steps. [ORIGIN]

    If the entry's name is "sec-websocket-location"

    If the value is not exactly equal to a string obtained from the steps to construct a WebSocket URL from host, port, resource name, and the secure flag, then fail the WebSocket connection and abort these steps.

    If the entry's name is "sec-websocket-protocol"

    If there was a protocol specified, and the value is not exactly equal to protocol, then fail the WebSocket connection and abort these steps. (If no protocol was specified, the field is ignored.)

    If the entry's name is "set-cookie" or "set-cookie2" or another cookie-related field name

    If the relevant specification is supported by the user agent, handle the cookie as defined by the appropriate specification, with the resource being the one with the host host, the port port, the path (and possibly query parameters) resource name, and the scheme http if secure is false and https if secure is true. [COOKIES]

    Any other name
    Ignore it.

  42. Let challenge be the concatenation of number1, expressed as a big-endian 32 bit integer, number2, expressed as a big-endian 32 bit integer, and the eight bytes of key3 in the order they were sent on the wire.

    Using the examples given earlier, this leads to the 16 bytes 0x2E 0x50 0x31 0xB7 0x06 0xDA 0xB8 0x0B 0x47 0x30 0x22 0x2D 0x5A 0x3F 0x47 0x58.

  43. Let expected be the MD5 fingerprint of challenge as a big-endian 128 bit string. [RFC1321]

    Using the examples given earlier, this leads to the 16 bytes 0x30 0x73 0x74 0x33 0x52 0x6C 0x26 0x71 0x2D 0x32 0x5A 0x55 0x5E 0x77 0x65 0x75. In ASCII, these bytes correspond to the string "0st3Rl&q-2ZU^weu".

  44. Read sixteen bytes from the server. Let reply be those bytes.

    If the connection closes before these bytes are received, then fail the WebSocket connection and abort these steps.

  45. If reply does not exactly equal expected, then fail the WebSocket connection and abort these steps.

  46. The WebSocket connection is established. Now the user agent must send and receive to and from the connection as described in the next section.
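
The key construction in steps 19 to 23 and the expected value in steps 42 and 43 can be illustrated with the following non-normative sketch. It assumes Node.js (for the built-in crypto module and Buffer). The names JUNK, insertAtRandom, makeKey, buildChallenge and expectedReply are invented for this illustration, Math.random stands in for whatever random source a user agent would really use, and keeping insertions away from the start and end of the string is a conservative choice made here rather than something the steps above require.

```javascript
const crypto = require("crypto");

// Junk alphabet for step 22: U+0021-U+002F and U+003A-U+007E.
const JUNK = [];
for (let c = 0x21; c <= 0x7e; c++) {
  if (c < 0x30 || c > 0x39) JUNK.push(String.fromCharCode(c));
}

// Insert `ch` into `s` at a random interior position. Staying away from the
// start and end is an assumption of this sketch; it needs s.length >= 2.
function insertAtRandom(s, ch) {
  const pos = 1 + Math.floor(Math.random() * (s.length - 1));
  return s.slice(0, pos) + ch + s.slice(pos);
}

// Steps 19-22: multiply, render in base ten, then add spaces and junk.
function makeKey(number, spaces) {
  let key = String(number * spaces);
  for (let i = 0; i < spaces; i++) key = insertAtRandom(key, " ");
  const junkCount = 1 + Math.floor(Math.random() * 12); // one to twelve
  for (let i = 0; i < junkCount; i++) {
    key = insertAtRandom(key, JUNK[Math.floor(Math.random() * JUNK.length)]);
  }
  return key;
}

// Step 42: number1 and number2 as big-endian 32-bit integers, then key3.
function buildChallenge(number1, number2, key3) {
  const challenge = Buffer.alloc(16);
  challenge.writeUInt32BE(number1, 0);
  challenge.writeUInt32BE(number2, 4);
  Buffer.from(key3).copy(challenge, 8);
  return challenge;
}

// Step 43: the 16-byte MD5 digest of the challenge.
function expectedReply(challenge) {
  return crypto.createHash("md5").update(challenge).digest();
}

// Values from the running example in the steps above.
const key1 = makeKey(777007543, 5);
const key3 = [0x47, 0x30, 0x22, 0x2d, 0x5a, 0x3f, 0x47, 0x58];
const challenge = buildChallenge(777007543, 114997259, key3);

console.log("Sec-WebSocket-Key1: " + key1);
console.log(challenge.toString("hex")); // 2e5031b706dab80b4730222d5a3f4758
console.log(expectedReply(challenge).toString("hex"));
```

Stripping everything but the digits from key1 gives back "3885037715", and counting its spaces gives 5, which is exactly the information the server needs to recover number1.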

10.3.4.3.2 Data framing

Once a WebSocket connection is established, the user agent must run through the following state machine for the bytes sent by the server. If at any point during these steps a read is attempted but fails because the WebSocket connection is closed, then abort.

  1. Try to read a byte from the server. Let frame type be that byte.

  2. Let error be false.

  3. Handle the frame type byte as follows:

    If the high-order bit of the frame type byte is set (i.e. if frame type anded with 0x80 returns 0x80)

    Run these steps:

    1. Let length be zero.

    2. Length: Read a byte, let b be that byte.

    3. Let bv be an integer corresponding to the low 7 bits of b (the value you would get by anding b with 0x7F).

    4. Multiply length by 128, add bv to that result, and store the final result in length.

    5. If the high-order bit of b is set (i.e. if b anded with 0x80 returns 0x80), then return to the step above labeled length.

    6. Read length bytes.

      It is possible for a server to (innocently or maliciously) send frames with lengths greater than 2^31 or 2^32 bytes, overflowing a signed or unsigned 32-bit integer. User agents may therefore impose implementation-specific limits on the lengths of invalid frames that they will skip; even supporting frames 2GB in length is considered, at the time of writing, as going well above and beyond the call of duty.

    7. Discard the read bytes.

    8. If the frame type is 0xFF and the length was 0, then run the following substeps:

      1. If the WebSocket closing handshake has not yet started, then start the WebSocket closing handshake.
      2. Wait until either the WebSocket closing handshake has started or the WebSocket connection is closed.
      3. If the WebSocket connection is not already closed, then close the WebSocket connection: The WebSocket closing handshake has finished. (If the connection closes before this happens, then the closing handshake doesn't finish.)
      4. Abort these steps. Any data on the connection after the 0xFF frame is discarded.

      Otherwise, let error be true.

    If the high-order bit of the frame type byte is not set (i.e. if frame type anded with 0x80 returns 0x00)

    Run these steps:

    1. Let raw data be an empty byte array.

    2. Data: Read a byte, let b be that byte.

    3. If b is not 0xFF, then append b to raw data and return to the previous step (labeled data).

    4. Interpret raw data as a UTF-8 string, and store that string in data.

    5. If frame type is 0x00, then a WebSocket message has been received with text data. Otherwise, discard the data and let error be true.

  4. If error is true, then a WebSocket error has been detected.

  5. Return to the first step to read the next byte.

If the user agent is faced with content that is too large to be handled appropriately, runs out of resources for buffering incoming data, or hits an artificial resource limit intended to avoid resource starvation, then it must fail the WebSocket connection.
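
The state machine above can be sketched, non-normatively, as a function over bytes that have already been received. The name parseFrames and the shape of the returned event objects are invented for this illustration; a real user agent would read from the connection incrementally and would apply its own resource limits. The sketch simply stops when it runs out of bytes in the middle of a frame.

```javascript
// Returns a list of events: {type: "message", data}, {type: "error"},
// or {type: "close"}. `bytes` is a Uint8Array of data sent by the server.
function parseFrames(bytes) {
  const decoder = new TextDecoder("utf-8");
  const events = [];
  let i = 0;
  while (i < bytes.length) {
    const frameType = bytes[i++];
    let error = false;
    if (frameType & 0x80) {
      // Length-prefixed frame: each length byte carries 7 bits, and a set
      // high-order bit means another length byte follows.
      let length = 0;
      let b;
      do {
        if (i >= bytes.length) return events; // incomplete frame
        b = bytes[i++];
        length = length * 128 + (b & 0x7f);
      } while (b & 0x80);
      if (i + length > bytes.length) return events; // incomplete frame
      i += length; // read and discard the payload
      if (frameType === 0xff && length === 0) {
        events.push({ type: "close" });
        return events; // data after the 0xFF frame is discarded
      }
      error = true;
    } else {
      // Sentinel-terminated frame: collect bytes up to the next 0xFF.
      const end = bytes.indexOf(0xff, i);
      if (end === -1) return events; // incomplete frame
      const data = decoder.decode(bytes.subarray(i, end));
      i = end + 1;
      if (frameType === 0x00) events.push({ type: "message", data });
      else error = true;
    }
    if (error) events.push({ type: "error" });
  }
  return events;
}

const received = Uint8Array.of(
  0x00, 0x68, 0x69, 0xff,       // text frame "hi"
  0x80, 0x02, 0xaa, 0xbb,       // unknown frame type 0x80, two bytes skipped
  0xff, 0x00,                   // closing frame
  0x00, 0x78, 0xff              // ignored: arrives after the closing frame
);
console.log(JSON.stringify(parseFrames(received)));
```

Scanning for 0xFF is safe for text frames because that byte never occurs in valid UTF-8.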


Once a WebSocket connection is established, but before the WebSocket closing handshake has started, the user agent must use the following steps to send data using the WebSocket:

  1. Send a 0x00 byte to the server.

  2. Encode data using UTF-8 and send the resulting byte stream to the server.

  3. Send a 0xFF byte to the server.

Once the WebSocket closing handshake has started, the user agent must not send any further data on the connection.
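
As a non-normative sketch, the three sending steps amount to wrapping the UTF-8 encoding of the data between a 0x00 byte and a 0xFF byte. The function name encodeTextFrame is invented for this illustration, and actually writing the bytes to the connection is left out.

```javascript
// Builds the bytes for one outgoing text frame: 0x00, UTF-8 data, 0xFF.
function encodeTextFrame(text) {
  const payload = new TextEncoder().encode(text); // always UTF-8
  const frame = new Uint8Array(payload.length + 2);
  frame[0] = 0x00;
  frame.set(payload, 1);
  frame[frame.length - 1] = 0xff;
  return frame;
}

// "h" is 0x68 and U+00E9 encodes as 0xC3 0xA9.
console.log(Array.from(encodeTextFrame("h\u00e9"))); // [ 0, 104, 195, 169, 255 ]
```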


Once a WebSocket connection is established, the user agent must use the following steps to start the WebSocket closing handshake. These steps must be run asynchronously relative to whatever algorithm invoked this one.

  1. If the WebSocket closing handshake has started, then abort these steps.

  2. Send a 0xFF byte to the server.

  3. Send a 0x00 byte to the server.

  4. The WebSocket closing handshake has started.

  5. Wait a user-agent-determined length of time, or until the WebSocket connection is closed.

  6. If the WebSocket connection is not already closed, then close the WebSocket connection. (If this happens, then the closing handshake doesn't finish.)

The closing handshake finishes once the server returns the 0xFF packet, as described above.


If at any point there is a fatal problem with sending data to the server, the user agent must fail the WebSocket connection.

10.3.4.3.3 Handling errors in UTF-8 from the server

When a client is to interpret a byte stream as UTF-8 but finds that the byte stream is not in fact a valid UTF-8 stream, then any bytes or sequences of bytes that are not valid UTF-8 sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER.
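
In JavaScript environments that provide TextDecoder, its default (non-fatal) mode behaves in this spirit: invalid input is decoded to U+FFFD rather than raising an error. The following is a non-normative illustration only; exactly how many replacement characters are produced for a longer invalid sequence is defined by the decoder, not by this sketch.

```javascript
// 0x61 is "a", 0xFF can never appear in UTF-8, 0x62 is "b".
const invalidBytes = Uint8Array.of(0x61, 0xff, 0x62);

// The `fatal` option defaults to false, so errors become U+FFFD.
const decoded = new TextDecoder("utf-8").decode(invalidBytes);

console.log(decoded === "a\uFFFDb"); // true
```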

10.3.4.4 Server-side requirements

This section only applies to servers.

10.3.4.4.1 Reading the client's opening handshake

When a client starts a WebSocket connection, it sends its part of the opening handshake. The server must parse at least part of this handshake in order to obtain the necessary information to generate the server part of the handshake.

The client handshake consists of the following parts. If the server, while reading the handshake, finds that the client did not send a handshake that matches the description below, the server should abort the WebSocket connection.

  1. The three-character UTF-8 string "GET".

  2. A UTF-8-encoded U+0020 SPACE character (0x20 byte).

  3. A string consisting of all the bytes up to the next UTF-8-encoded U+0020 SPACE character (0x20 byte). The result of decoding this string as a UTF-8 string is the name of the resource requested by the client. If the server only supports one resource, then this can safely be ignored; the client verifies that the right resource is supported based on the information included in the server's own handshake. The resource name will begin with U+002F SOLIDUS character (/) and will only include characters in the range U+0021 to U+007E.

  4. A string of bytes terminated by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF). All the characters from the second 0x20 byte up to the first 0x0D 0x0A byte pair in the data from the client can be safely ignored. (It will probably be the string "HTTP/1.1".)

  5. A series of fields.

    Each field is terminated by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF). The end of the fields is denoted by the terminating CRLF pair being followed immediately by another CRLF pair.

    In other words, the fields start with the first 0x0D 0x0A byte pair, end with the first 0x0D 0x0A 0x0D 0x0A byte sequence, and are separated from each other by 0x0D 0x0A byte pairs.

    The fields are encoded as UTF-8.

    Each field consists of a name, consisting of one or more characters in the ranges U+0021 to U+0039 and U+003B to U+007E, followed by a U+003A COLON character (:) and a U+0020 SPACE character, followed by zero or more characters forming the value.

    The expected field names, the meaning of their corresponding values, and the processing servers are required to apply to those fields, are described below, after the description of the client handshake.

  6. After the first 0x0D 0x0A 0x0D 0x0A byte sequence, indicating the end of the fields, the client sends eight random bytes. These are used in constructing the server handshake.

The expected field names, and the meaning of their corresponding values, are as follows. Field names must be compared in an ASCII case-insensitive manner.

Upgrade

Invariant part of the handshake. Will always have a value that is an ASCII case-insensitive match for the string "WebSocket".

Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a different value, to avoid vulnerability to cross-protocol attacks.

Connection

Invariant part of the handshake. Will always have a value that is an ASCII case-insensitive match for the string "Upgrade".

Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a different value, to avoid vulnerability to cross-protocol attacks.

Host

The value gives the hostname that the client intended to use when opening the WebSocket. It would be of interest in particular to virtual hosting environments, where one server might serve multiple hosts, and might therefore want to return different data.

Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a value that does not match the server's host name, to avoid vulnerability to cross-protocol attacks and DNS rebinding attacks.

Origin

The value gives the scheme, hostname, and port (if it's not the default port for the given scheme) of the page that asked the client to open the WebSocket. This is of interest if the server's operator has deals with operators of other sites, since the server can then decide how to respond (or indeed, whether to respond) based on which site is requesting a connection. [ORIGIN]

Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a value that does not match one of the origins the server is expecting to communicate with, to avoid vulnerability to cross-protocol attacks and cross-site scripting attacks.

Sec-WebSocket-Protocol

The value gives the name of a subprotocol that the client is intending to select. This is of interest if the server supports multiple protocols or protocol versions.

Can be safely ignored, though the server may abort the WebSocket connection if the field is absent but the conventions for communicating with the server are such that the field is expected; and the server should abort the WebSocket connection if the field has a value that does not match one of the subprotocols that the server supports, to avoid integrity errors once the connection is established.

Sec-WebSocket-Key1
Sec-WebSocket-Key2

The values provide the information required for computing the server's handshake, as described in the next section.

Other fields

Other fields can be used, such as "Cookie", for authentication purposes. Their semantics are equivalent to the semantics of the HTTP headers with the same names.

If a server reads fields for authentication purposes (such as Cookie), or if a server assumes that its clients are authorized on the basis that they can connect (e.g. because they are on an intranet firewalled from the public Internet), then the server should also verify that the client's handshake includes the invariant "Upgrade" and "Connection" parts of the handshake, and should send the server's handshake before changing any user data. Otherwise, an attacker could trick a client into sending WebSocket frames to a server (e.g. using XMLHttpRequest) and cause the server to perform actions on behalf of the user without the user's consent. (Sending the server's handshake ensures that the frames were not sent as part of a cross-protocol attack, since other protocols do not send the necessary components in the client's initial handshake for forming the server's handshake.)

Unrecognized fields can be safely ignored, and are probably either the result of intermediaries injecting fields unrelated to the operation of the WebSocket protocol, or clients that support future versions of the protocol offering options that the server doesn't support.
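
A server might split the client's handshake into its parts along the following lines. This is a non-normative sketch assuming Node.js Buffers; the function name parseClientHandshake and the shape of its result are invented here, repeated fields simply overwrite each other, and the whole handshake is assumed to be available in one buffer.

```javascript
// Returns {resource, fields, key3}, or null if more bytes are needed.
// Throws if the data does not look like a WebSocket client handshake.
function parseClientHandshake(buf) {
  const end = buf.indexOf("\r\n\r\n");           // end of the fields
  if (end === -1 || buf.length < end + 12) return null;

  const lines = buf.subarray(0, end).toString("utf8").split("\r\n");

  // "GET", a space, the resource name, a space, then bytes to ignore.
  const m = /^GET (\/[\x21-\x7e]*) /.exec(lines[0]);
  if (!m) throw new Error("not a WebSocket request line");

  const fields = new Map();
  for (const line of lines.slice(1)) {
    const sep = line.indexOf(":");
    if (sep < 1 || line[sep + 1] !== " ") throw new Error("malformed field");
    // Field names are compared ASCII case-insensitively, so store lowercase.
    fields.set(line.slice(0, sep).toLowerCase(), line.slice(sep + 2));
  }

  // The eight bytes after the blank line are used for the server handshake.
  return { resource: m[1], fields, key3: buf.subarray(end + 4, end + 12) };
}

const sampleHandshake = Buffer.from(
  "GET /demo HTTP/1.1\r\n" +
  "Upgrade: WebSocket\r\n" +
  "Connection: Upgrade\r\n" +
  "Host: example.com\r\n" +
  "Origin: http://example.com\r\n" +
  "Sec-WebSocket-Key1: 3e6b263  4 17 80\r\n" +
  "Sec-WebSocket-Key2: 17  9 G`ZD9   2 2b 7X 3 /r90\r\n" +
  "\r\n" +
  "WjN}|M(6",
  "latin1"
);

const parsed = parseClientHandshake(sampleHandshake);
console.log(parsed.resource);                 // /demo
console.log(parsed.fields.get("host"));       // example.com
console.log(parsed.key3.toString("latin1"));  // WjN}|M(6
```

With the fields in a map keyed by lowercase name, the recommended checks on "Upgrade", "Connection", "Host" and "Origin" become simple lookups.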

10.3.4.4.2 Sending the server's opening handshake

When a client establishes a WebSocket connection to a server, the server must run the following steps.

  1. If the server supports encryption, perform a TLS handshake over the connection. If this fails (e.g. the client indicated a host name in the extended client hello "server_name" extension that the server does not host), then close the connection; otherwise, all further communication for the connection (including the server handshake) must run through the encrypted tunnel. [RFC2246]

  2. Establish the following information:

    host
    The host name or IP address of the WebSocket server, as it is to be addressed by clients. The host name must be punycode-encoded if necessary. If the server can respond to requests to multiple hosts (e.g. in a virtual hosting environment), then the value should be derived from the client's handshake, specifically from the "Host" field.
    port
    The port number on which the server expected and/or received the connection.
    resource name
    An identifier for the service provided by the server. If the server provides multiple services, then the value should be derived from the resource name given in the client's handshake.
    secure flag
    True if the connection is encrypted or if the server expected it to be encrypted; false otherwise.
    origin
    The ASCII serialization of the origin that the server is willing to communicate with. If the server can respond to requests from multiple origins (or indeed, all origins), then the value should be derived from the client's handshake, specifically from the "Origin" field. [ORIGIN]
    subprotocol
    Either null, or a string representing the subprotocol the server is ready to use. If the server supports multiple subprotocols, then the value should be derived from the client's handshake, specifically from the "Sec-WebSocket-Protocol" field. The absence of such a field is equivalent to the null value. The empty string is not the same as the null value for these purposes.
    key1
    The value of the "Sec-WebSocket-Key1" field in the client's handshake.
    key2
    The value of the "Sec-WebSocket-Key2" field in the client's handshake.
    key3
    The eight random bytes sent after the first 0x0D 0x0A 0x0D 0x0A sequence in the client's handshake.

  3. Let location be the string that results from constructing a WebSocket URL from host, port, resource name, and secure flag.

  4. Let key-number1 be the digits (characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9)) in key1, interpreted as a base ten integer, ignoring all other characters in key1.

    Let key-number2 be the digits (characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9)) in key2, interpreted as a base ten integer, ignoring all other characters in key2.

    For example, assume that the client handshake was:

    GET / HTTP/1.1
    Connection: Upgrade
    Host: example.com
    Upgrade: WebSocket
    Sec-WebSocket-Key1: 3e6b263  4 17 80
    Origin: http://example.com
    Sec-WebSocket-Key2: 17  9 G`ZD9   2 2b 7X 3 /r90
    
    WjN}|M(6

    The key-number1 would be the number 3,626,341,780, and the key-number2 would be the number 1,799,227,390.

    In this example, incidentally, key3 is "WjN}|M(6", or 0x57 0x6A 0x4E 0x7D 0x7C 0x4D 0x28 0x36.

  5. Let spaces1 be the number of U+0020 SPACE characters in key1.

    Let spaces2 be the number of U+0020 SPACE characters in key2.

    If either spaces1 or spaces2 is zero, then abort the WebSocket connection. This is a symptom of a cross-protocol attack.

    In the example above, spaces1 would be 4 and spaces2 would be 10.

  6. If key-number1 is not an integral multiple of spaces1, then abort the WebSocket connection.

    If key-number2 is not an integral multiple of spaces2, then abort the WebSocket connection.

    This can only happen if the client is not a conforming WebSocket client.

  7. Let part1 be key-number1 divided by spaces1.

    Let part2 be key-number2 divided by spaces2.

    In the example above, part1 would be 906,585,445 and part2 would be 179,922,739.

  8. Let challenge be the concatenation of part1, expressed as a big-endian 32 bit integer, part2, expressed as a big-endian 32 bit integer, and the eight bytes of key3 in the order they were sent on the wire.

    In the example above, this would be the 16 bytes 0x36 0x09 0x65 0x65 0x0A 0xB9 0x67 0x33 0x57 0x6A 0x4E 0x7D 0x7C 0x4D 0x28 0x36.

  9. Let response be the MD5 fingerprint of challenge as a big-endian 128 bit string. [RFC1321]

    In the example above, this would be the 16 bytes 0x6E 0x60 0x39 0x65 0x42 0x6B 0x39 0x7A 0x24 0x52 0x38 0x70 0x4F 0x74 0x56 0x62, or "n`9eBk9z$R8pOtVb" in ASCII.

  10. Send the following line, terminated by the two characters U+000D CARRIAGE RETURN and U+000A LINE FEED (CRLF) and encoded as UTF-8, to the client:

    HTTP/1.1 101 WebSocket Protocol Handshake

    This line may be sent differently if necessary, but must match the Status-Line production defined in the HTTP specification, with the Status-Code having the value 101.

  11. Send the following fields to the client. Each field must be sent as a line consisting of the field name, which must be an ASCII case-insensitive match for the field name in the list below, followed by a U+003A COLON character (:) and a U+0020 SPACE character, followed by the field value as specified in the list below, followed by the two characters U+000D CARRIAGE RETURN and U+000A LINE FEED (CRLF). The lines must be encoded as UTF-8. The lines may be sent in any order.

    Upgrade

    The value must be the string "WebSocket".

    Connection

    The value must be the string "Upgrade".

    Sec-WebSocket-Location

    The value must be location

    Sec-WebSocket-Origin

    The value must be origin

    Sec-WebSocket-Protocol

    This field must be included if subprotocol is not null, and must not be included if subprotocol is null.

    If included, the value must be subprotocol

    Optionally, include "Set-Cookie", "Set-Cookie2", or other cookie-related fields, with values equal to the values that would be used for the identically named HTTP headers. [COOKIES]

  12. Send two bytes 0x0D 0x0A (ASCII CRLF).

  13. Send response.

This completes the server's handshake. If the server finishes these steps without aborting the WebSocket connection, and if the client does not then fail the connection, then the connection is established and the server may begin sending and receiving data, as described in the next section.
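
Steps 4 to 13 can be sketched, non-normatively, as follows. The sketch assumes Node.js (crypto and Buffer); the names keyPart, challengeResponse and serverHandshake are invented for this illustration, the location and origin values are simply passed in rather than derived, and ordinary JavaScript numbers are used for the arithmetic, which is adequate only for keys of the size a conforming client sends.

```javascript
const crypto = require("crypto");

// Steps 4-7: digits form the key number, spaces form the divisor.
function keyPart(key) {
  const keyNumber = Number(key.replace(/[^0-9]/g, ""));
  const spaces = key.split(" ").length - 1;
  if (spaces === 0) throw new Error("abort: no spaces in key");
  if (keyNumber % spaces !== 0) throw new Error("abort: not a multiple");
  const part = keyNumber / spaces;
  if (part > 0xffffffff) throw new Error("abort: part exceeds 32 bits");
  return part;
}

// Steps 8-9: 16-byte challenge, then its MD5 digest.
function challengeResponse(key1, key2, key3) {
  const challenge = Buffer.alloc(16);
  challenge.writeUInt32BE(keyPart(key1), 0);
  challenge.writeUInt32BE(keyPart(key2), 4);
  Buffer.from(key3).copy(challenge, 8);
  const response = crypto.createHash("md5").update(challenge).digest();
  return { challenge, response };
}

// Steps 10-13: status line, fields, blank line, then the response bytes.
function serverHandshake(location, origin, subprotocol, response) {
  const lines = [
    "HTTP/1.1 101 WebSocket Protocol Handshake",
    "Upgrade: WebSocket",
    "Connection: Upgrade",
    "Sec-WebSocket-Location: " + location,
    "Sec-WebSocket-Origin: " + origin,
  ];
  if (subprotocol !== null) lines.push("Sec-WebSocket-Protocol: " + subprotocol);
  const head = Buffer.from(lines.join("\r\n") + "\r\n\r\n", "utf8");
  return Buffer.concat([head, response]);
}

// The example client handshake from step 4.
const result = challengeResponse(
  "3e6b263  4 17 80",
  "17  9 G`ZD9   2 2b 7X 3 /r90",
  Buffer.from("WjN}|M(6", "latin1")
);
console.log(result.challenge.toString("hex")); // 360965650ab96733576a4e7d7c4d2836
console.log(result.response.toString("latin1"));

const reply = serverHandshake("ws://example.com/", "http://example.com", null, result.response);
console.log(reply.length);
```

The challenge printed here matches the 16 bytes given in step 8; the second line printed is the value that step 9 describes.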

10.3.4.4.3 Data framing

The server must run through the following steps to process the bytes sent by the client. If at any point during these steps a read is attempted but fails because the WebSocket connection is closed, then abort.

  1. Frame: Read a byte from the client. Let type be that byte.

  2. If type is not a 0x00 byte, then the server may disconnect from the client.

  3. If the most significant bit of type is not set, then run the following steps:

    1. Let raw data be an empty byte array.

    2. Data: Read a byte, let b be that byte.

    3. If b is not 0xFF, then append b to raw data and return to the previous step (labeled data).

    4. Interpret raw data as a UTF-8 string, and apply whatever server-specific processing is to occur for the resulting string (the message from the client).

    Otherwise, the most significant bit of type is set. Run the following steps. This can never happen if type is 0x00, and therefore these steps are not necessary if the server aborts when type is not 0x00, as allowed above.

    1. Let length be zero.

    2. Length: Read a byte, let b be that byte.

    3. Let bv be an integer corresponding to the low 7 bits of b (the value you would get by anding b with 0x7F).

    4. Multiply length by 128, add bv to that result, and store the final result in length.

    5. If the high-order bit of b is set (i.e. if b anded with 0x80 returns 0x80), then return to the step above labeled length.

    6. Read length bytes.

      It is possible for a malicious client to send frames with lengths greater than 2^31 or 2^32 bytes, overflowing a signed or unsigned 32-bit integer. Servers may therefore impose implementation-specific limits on the lengths of invalid frames that they will skip, if they support skipping such frames at all. If a server cannot correctly skip past a long frame, then the server must abort these steps (discarding all future data), and should either immediately disconnect from the client or set the client terminated flag.

    7. Discard the read bytes.

    8. If type is 0xFF and length is 0, then set the client terminated flag and abort these steps. All further data sent by the client should be discarded.

  4. Return to the step labeled frame.
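
The length computation used for frames whose type has the most significant bit set, together with an implementation-specific limit, might look like the following non-normative sketch. The name decodeFrameLength, the constant MAX_SKIPPABLE and its value of one mebibyte are all assumptions made for this illustration.

```javascript
// Hypothetical limit on the length of a frame the server is willing to skip.
const MAX_SKIPPABLE = 1 << 20;

// Decodes the length bytes starting at `offset`. Returns the length and the
// offset of the first payload byte, or null if more bytes are needed.
// Throws once the running length exceeds `limit`.
function decodeFrameLength(bytes, offset, limit = MAX_SKIPPABLE) {
  let length = 0;
  for (let i = offset; i < bytes.length; i++) {
    const b = bytes[i];
    length = length * 128 + (b & 0x7f);       // low 7 bits carry the value
    if (length > limit) throw new RangeError("frame too long to skip");
    if ((b & 0x80) === 0) return { length, payloadStart: i + 1 };
  }
  return null;
}

console.log(decodeFrameLength(Uint8Array.of(0x00), 0));        // length 0
console.log(decodeFrameLength(Uint8Array.of(0x81, 0x00), 0));  // length 128
console.log(decodeFrameLength(Uint8Array.of(0x82, 0x2c), 0));  // length 300
```

Checking the limit inside the loop means a hostile run of length bytes is rejected before the accumulated value can grow without bound.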


The server must run through the following steps to send strings to the client:

  1. Send a 0x00 byte to the client to indicate the start of a string.

  2. Encode data using UTF-8 and send the resulting byte stream to the client.

  3. Send a 0xFF byte to the client to indicate the end of the message.


At any time, the server may decide to terminate the WebSocket connection by running through the following steps:

  1. Send a 0xFF byte and a 0x00 byte to the client to indicate the start of the closing handshake.

  2. Wait until the client terminated flag has been set, or until a server-defined timeout expires.

  3. Close the WebSocket connection.

Once these steps have started, the server must not send any further data to the client. The 0xFF 0x00 bytes indicate the end of the server's data, and further bytes will be discarded by the client.

10.3.4.4.4 Handling errors in UTF-8 from the client

When a server is to interpret a byte stream as UTF-8 but finds that the byte stream is not in fact a valid UTF-8 stream, behavior is undefined. A server could close the connection, convert invalid byte sequences to U+FFFD REPLACEMENT CHARACTERs, store the data verbatim, or perform application-specific processing. Subprotocols layered on the WebSocket protocol might define specific behavior for servers.
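
One of the options mentioned above, rejecting invalid input so that the connection can be closed, can be sketched non-normatively for a JavaScript server using TextDecoder in its fatal mode. The function name decodeClientMessage and its result shape are invented for this illustration; what the server does with a rejected message remains its own choice.

```javascript
// Strictly decodes the raw data of one frame. Returns {ok, text}; ok is
// false when the bytes are not valid UTF-8.
function decodeClientMessage(rawData) {
  try {
    const text = new TextDecoder("utf-8", { fatal: true }).decode(rawData);
    return { ok: true, text };
  } catch (err) {
    return { ok: false, text: null };
  }
}

console.log(decodeClientMessage(Uint8Array.of(0x68, 0x69)).ok); // true
console.log(decodeClientMessage(Uint8Array.of(0xc3, 0x28)).ok); // false
```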

10.3.4.5 Closing the connection
10.3.4.5.1 Client-initiated closure

Certain algorithms require the user agent to fail the WebSocket connection. To do so, the user agent must close the WebSocket connection, and may report the problem to the user (which would be especially useful for developers).

Except as indicated above or as specified by the application layer (e.g. a script using the WebSocket API), user agents should not close the connection.

User agents must not convey any failure information to scripts in a way that would allow a script to distinguish the following situations:

10.3.4.5.2 Server-initiated closure

Certain algorithms require or recommend that the server abort the WebSocket connection during the opening handshake. To do so, the server must simply close the WebSocket connection.

10.3.4.5.3 Closure

To close the WebSocket connection, the user agent or server must close the TCP connection, using whatever mechanism possible (e.g. either the TCP RST or FIN mechanisms). When a user agent notices that the server has closed its connection, it must immediately close its side of the connection also. Whether the user agent or the server closes the connection first, it is said that the WebSocket connection is closed. If the connection was closed after the client finished the WebSocket closing handshake, then the WebSocket connection is said to have been closed cleanly.

Servers may close the WebSocket connection whenever desired. User agents should not close the WebSocket connection arbitrarily.

10.3.4.6 Security considerations

While this protocol is intended to be used by scripts in Web pages, it can also be used directly by hosts. Such hosts are acting on their own behalf, and can therefore send fake "Origin" fields, misleading the server. Servers should therefore be careful about assuming that they are talking directly to scripts from known origins, and must consider that they might be accessed in unexpected ways. In particular, a server should not trust that any input is valid.

For example, if the server uses input as part of SQL queries, all input text should be escaped before being passed to the SQL server, lest the server be susceptible to SQL injection.


Servers that are not intended to process input from any Web page but only for certain sites should verify the "Origin" field is an origin they expect, and should only respond with the corresponding "Sec-WebSocket-Origin" if it is an accepted origin. Servers that only accept input from one origin can just send back that value in the "Sec-WebSocket-Origin" field, without bothering to check the client's value.
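
A server restricted to certain sites might implement this check along the following lines. This is a non-normative sketch: the set ACCEPTED_ORIGINS, its contents and the function name originToSend are invented for this illustration, and the fields are assumed to be available in a Map keyed by lowercase field name.

```javascript
// Hypothetical list of origins this server is willing to talk to.
const ACCEPTED_ORIGINS = new Set(["http://example.com", "https://example.com"]);

// Returns the value to send in Sec-WebSocket-Origin, or null if the server
// should disconnect instead of completing the handshake.
function originToSend(fields) {
  const origin = fields.get("origin");
  if (origin === undefined) return null;          // field absent
  return ACCEPTED_ORIGINS.has(origin) ? origin : null;
}

console.log(originToSend(new Map([["origin", "http://example.com"]]))); // http://example.com
console.log(originToSend(new Map([["origin", "http://evil.example"]]))); // null
```

As noted above, a client that is not a browser can send any "Origin" value it likes, so this check limits which pages can use the server and is not a form of authentication.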


If at any time a server is faced with data that it does not understand, or that violates some criteria by which the server determines safety of input, or when the server sees a handshake that does not correspond to the values the server is expecting (e.g. incorrect path or origin), the server should just disconnect. It is always safe to disconnect.


The biggest security risk when sending text data using this protocol is sending data using the wrong encoding. If an attacker can trick the server into sending data encoded as ISO-8859-1 verbatim (for instance), rather than encoded as UTF-8, then the attacker could inject arbitrary frames into the data stream.
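
This example is non-normative. The following sketch illustrates why the encoding matters. It assumes that text frames are terminated by a 0xFF byte, as in the framing defined earlier in this protocol (not repeated in this section). The helper names are hypothetical.

```javascript
// Assumption: a 0xFF byte marks the end of a text frame.
var FRAME_END = 0xFF;

// Encode a string as UTF-8 bytes.
function encodeUtf8(s) {
  return Array.from(new TextEncoder().encode(s));
}

// ISO-8859-1 maps each code point up to U+00FF to the byte of the same value.
function encodeLatin1(s) {
  var bytes = [];
  for (var i = 0; i < s.length; i += 1) {
    var c = s.charCodeAt(i);
    if (c > 0xFF)
      throw new Error('not representable in ISO-8859-1');
    bytes.push(c);
  }
  return bytes;
}

function containsFrameEnd(bytes) {
  return bytes.indexOf(FRAME_END) != -1;
}

var attackerText = 'hi\u00FF'; // ends with U+00FF
containsFrameEnd(encodeUtf8(attackerText));   // false: U+00FF becomes 0xC3 0xBF
containsFrameEnd(encodeLatin1(attackerText)); // true: U+00FF becomes a raw 0xFF
```

UTF-8 never produces a 0xFF byte, so attacker-supplied text cannot end a frame early when it is encoded correctly.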

10.3.4.7 IANA considerations
10.3.4.7.1 Registration of ws: scheme

A ws: URL identifies a WebSocket server and resource name.

URI scheme name.
ws
Status.
Permanent.
URI scheme syntax.

In ABNF terms using the terminals from the URI specifications: [ABNF] [RFC3986]

"ws" ":" hier-part [ "?" query ]

The path and query components form the resource name sent to the server to identify the kind of service desired. Other components have the meanings described in RFC3986.

URI scheme semantics.
The only operation for this scheme is to open a connection using the WebSocket protocol.
Encoding considerations.

Characters in the host component that are excluded by the syntax defined above must be converted from Unicode to ASCII by applying the IDNA ToASCII algorithm to the Unicode host name, with both the AllowUnassigned and UseSTD3ASCIIRules flags set, and using the result of this algorithm as the host in the URI. [RFC3490]

Characters in other components that are excluded by the syntax defined above must be converted from Unicode to ASCII by first encoding the characters as UTF-8 and then replacing the corresponding bytes using their percent-encoded form as defined in the URI and IRI specification. [RFC3986] [RFC3987]

Applications/protocols that use this URI scheme name.
WebSocket protocol.
Interoperability considerations.
None.
Security considerations.
See "Security considerations" section above.
Contact.
Ian Hickson <ian@hixie.ch>
Author/Change controller.
Ian Hickson <ian@hixie.ch>
References.
This document.
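
This example is non-normative. The second encoding rule in the registration above (encode as UTF-8, then percent-encode the bytes) could be sketched as follows. The function name and the "allowed" character set are hypothetical simplifications; the set actually permitted depends on the URI component, as defined in RFC3986.

```javascript
// Hypothetical, simplified "allowed" set: unreserved characters plus "/".
// Existing "%" escapes are not recognized, so "%" itself becomes "%25".
var ALLOWED = /^[A-Za-z0-9\-._~\/]$/;

function percentEncodeUtf8(s) {
  var encoder = new TextEncoder(); // always encodes to UTF-8
  var out = '';
  for (var ch of s) {              // iterate by code point, not UTF-16 unit
    if (ALLOWED.test(ch)) {
      out += ch;
      continue;
    }
    var bytes = encoder.encode(ch);
    for (var i = 0; i < bytes.length; i += 1) {
      var hex = bytes[i].toString(16).toUpperCase();
      out += '%' + (hex.length < 2 ? '0' : '') + hex;
    }
  }
  return out;
}

percentEncodeUtf8('/caf\u00E9 menu'); // "/caf%C3%A9%20menu"
```
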
10.3.4.7.2 Registration of wss: scheme

A wss: URL identifies a WebSocket server and resource name, and indicates that traffic over that connection is to be encrypted.

URI scheme name.
wss
Status.
Permanent.
URI scheme syntax.

In ABNF terms using the terminals from the URI specifications: [ABNF] [RFC3986]

"wss" ":" hier-part [ "?" query ]

The path and query components form the resource name sent to the server to identify the kind of service desired. Other components have the meanings described in RFC3986.

URI scheme semantics.
The only operation for this scheme is to open a connection using the WebSocket protocol, encrypted using TLS.
Encoding considerations.

Characters in the host component that are excluded by the syntax defined above must be converted from Unicode to ASCII by applying the IDNA ToASCII algorithm to the Unicode host name, with both the AllowUnassigned and UseSTD3ASCIIRules flags set, and using the result of this algorithm as the host in the URI. [RFC3490]

Characters in other components that are excluded by the syntax defined above must be converted from Unicode to ASCII by first encoding the characters as UTF-8 and then replacing the corresponding bytes using their percent-encoded form as defined in the URI and IRI specification. [RFC3986] [RFC3987]

Applications/protocols that use this URI scheme name.
WebSocket protocol over TLS.
Interoperability considerations.
None.
Security considerations.
See "Security considerations" section above.
Contact.
Ian Hickson <ian@hixie.ch>
Author/Change controller.
Ian Hickson <ian@hixie.ch>
References.
This document.
10.3.4.7.3 Registration of the "WebSocket" HTTP Upgrade keyword
Name of token.
WebSocket
Author/Change controller.
Ian Hickson <ian@hixie.ch>
Contact.
Ian Hickson <ian@hixie.ch>
References.
This document.
10.3.4.7.4 Sec-WebSocket-Key1 and Sec-WebSocket-Key2

This section describes two header fields for registration in the Permanent Message Header Field Registry. [RFC3864]

Header field name
Sec-WebSocket-Key1
Applicable protocol
http
Status
reserved; do not use outside WebSocket handshake
Author/Change controller
IETF
Specification document(s)
This document is the relevant specification.
Related information
None.
Header field name
Sec-WebSocket-Key2
Applicable protocol
http
Status
reserved; do not use outside WebSocket handshake
Author/Change controller
IETF
Specification document(s)
This document is the relevant specification.
Related information
None.

The Sec-WebSocket-Key1 and Sec-WebSocket-Key2 headers are used in the WebSocket handshake. They are sent from the client to the server to provide part of the information used by the server to prove that it received a valid WebSocket handshake. This helps ensure that the server does not accept connections from non-WebSocket clients (e.g. HTTP clients) that are being abused to send data to unsuspecting WebSocket servers.

10.3.4.7.5 Sec-WebSocket-Location

This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]

Header field name
Sec-WebSocket-Location
Applicable protocol
http
Status
reserved; do not use outside WebSocket handshake
Author/Change controller
IETF
Specification document(s)
This document is the relevant specification.
Related information
None.

The Sec-WebSocket-Location header is used in the WebSocket handshake. It is sent from the server to the client to confirm the URL of the connection. This enables the client to verify that the connection was established to the right server, port, and path, instead of relying on the server to verify that the requested host, port, and path are correct.

10.3.4.7.6 Sec-WebSocket-Origin

This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]

Header field name
Sec-WebSocket-Origin
Applicable protocol
http
Status
reserved; do not use outside WebSocket handshake
Author/Change controller
IETF
Specification document(s)
This document is the relevant specification.
Related information
None.

The Sec-WebSocket-Origin header is used in the WebSocket handshake. It is sent from the server to the client to confirm the origin of the script that opened the connection. This enables user agents to verify that the server is willing to serve the script that opened the connection.

10.3.4.7.7 Sec-WebSocket-Protocol

This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]

Header field name
Sec-WebSocket-Protocol
Applicable protocol
http
Status
reserved; do not use outside WebSocket handshake
Author/Change controller
IETF
Specification document(s)
This document is the relevant specification.
Related information
None.

The Sec-WebSocket-Protocol header is used in the WebSocket handshake. It is sent from the client to the server and back from the server to the client to confirm the subprotocol of the connection. This enables scripts to both select a subprotocol and be sure that the server agreed to serve that subprotocol.
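
This example is non-normative. The way a client might use the three server-sent fields registered above could be sketched as follows. The function name and the shapes of the `expected` and `fields` objects are hypothetical, and the normative comparison rules are those of the handshake algorithm, which this sketch does not reproduce.

```javascript
// expected: { url, origin, protocol } as the client requested them
//           (protocol is undefined if no subprotocol was requested).
// fields:   response header fields, keyed by lowercased field name.
function verifyHandshakeFields(expected, fields) {
  // The server must confirm the URL the client meant to connect to.
  if (fields['sec-websocket-location'] !== expected.url)
    return false;
  // The server must confirm it is willing to serve the script's origin.
  if (fields['sec-websocket-origin'] !== expected.origin)
    return false;
  // If a subprotocol was requested, the server must echo it back.
  if (expected.protocol !== undefined &&
      fields['sec-websocket-protocol'] !== expected.protocol)
    return false;
  return true;
}
```

If any check fails, the client would treat the handshake as failed and close the connection.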

10.3.4.8 Using the WebSocket protocol from other specifications

The WebSocket protocol is intended to be used by another specification to provide a generic mechanism for dynamic author-defined content, e.g. in a specification defining a scripted API.

Such a specification first needs to "establish a WebSocket connection", providing that algorithm with, among other inputs, a host, a port, a resource name, and a secure flag.

The host, port, resource name, and secure flag are usually obtained from a URL using the steps to parse a WebSocket URL's components. These steps fail if the URL does not specify a WebSocket.
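
This example is non-normative. Obtaining these four values from a URL could be sketched as follows, using the URL interface rather than the parsing steps themselves. The function name is hypothetical, the default ports (80 for ws:, 443 for wss:) are taken from the protocol's definition elsewhere in this specification, and the sketch does not reproduce every failure condition of the real steps.

```javascript
function parseWebSocketUrl(url) {
  var u = new URL(url); // throws TypeError if the input cannot be parsed
  if (u.protocol != 'ws:' && u.protocol != 'wss:')
    throw new SyntaxError('not a WebSocket URL: ' + url);
  var secure = u.protocol == 'wss:';
  return {
    host: u.hostname,
    // An empty port string means the scheme's default port is in use.
    port: u.port ? Number(u.port) : (secure ? 443 : 80),
    // The path and query components together form the resource name.
    resourceName: (u.pathname || '/') + u.search,
    secure: secure
  };
}

parseWebSocketUrl('ws://example.com:8080/chat?room=1');
// { host: 'example.com', port: 8080, resourceName: '/chat?room=1', secure: false }
```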

If a connection can be established, then it is said that the "WebSocket connection is established".

If at any time the connection is to be closed, then the specification needs to use the "close the WebSocket connection" algorithm.

When the connection is closed, for any reason including failure to establish the connection in the first place, it is said that the "WebSocket connection is closed".

While a connection is open, the specification will need to handle the cases when "a WebSocket message has been received" with text data.

To send some text data to an open connection, the specification needs to "send data using the WebSocket".

10.4 Cross-document messaging

Web browsers, for security and privacy reasons, prevent documents in different domains from affecting each other; that is, cross-site scripting is disallowed.

While this is an important security feature, it prevents pages from different domains from communicating even when those pages are not hostile. This section introduces a messaging system that allows documents to communicate with each other regardless of their source domain, in a way designed to not enable cross-site scripting attacks.

The task source for the tasks in cross-document messaging is the posted message task source.

10.4.1 Introduction

This section is non-normative.

For example, if document A contains an iframe element that contains document B, and script in document A calls postMessage() on the Window object of document B, then a message event will be fired on that object, marked as originating from the Window of document A. The script in document A might look like:

var o = document.getElementsByTagName('iframe')[0];
o.contentWindow.postMessage('Hello world', 'http://b.example.org/');

To register an event handler for incoming events, the script would use addEventListener() (or similar mechanisms). For example, the script in document B might look like:

window.addEventListener('message', receiver, false);
function receiver(e) {
  if (e.origin == 'http://example.com') {
    if (e.data == 'Hello world') {
      e.source.postMessage('Hello', e.origin);
    } else {
      alert(e.data);
    }
  }
}

This script first checks that the message comes from the expected origin, and then looks at the message, which it either displays to the user or responds to by sending a message back to the document that sent it.

10.4.2 Security

10.4.2.1 Authors

Use of this API requires extra care to protect users from hostile entities abusing a site for their own purposes.

Authors should check the origin attribute to ensure that messages are only accepted from domains that they expect to receive messages from. Otherwise, bugs in the author's message handling code could be exploited by hostile sites.

Furthermore, even after checking the origin attribute, authors should also check that the data in question is of the expected format. Otherwise, if the source of the event has been attacked using a cross-site scripting flaw, further unchecked processing of information sent using the postMessage() method could result in the attack being propagated into the receiver.
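
This example is non-normative. A receiver that checks both the origin and the format of the data could be sketched as follows. The message shape (an object with a type of "resize" and a numeric height), the size limit, and the function names are hypothetical.

```javascript
var EXPECTED_ORIGIN = 'http://example.com'; // placeholder origin

// Accept only objects of the exact shape this page knows how to handle.
function isResizeRequest(data) {
  return typeof data == 'object' && data !== null &&
         data.type === 'resize' &&
         typeof data.height == 'number' && isFinite(data.height) &&
         data.height >= 0 && data.height <= 10000;
}

// e is a message event, or any object with origin and data properties.
function acceptMessage(e) {
  if (e.origin !== EXPECTED_ORIGIN)
    return false; // unexpected sender: ignore
  return isResizeRequest(e.data); // expected sender, but still validate
}
```

A message handler would call acceptMessage(e) first and do nothing further with e.data if it returns false.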

Authors should not use the wildcard keyword (*) in the targetOrigin argument in messages that contain any confidential information, as otherwise there is no way to guarantee that the message is only delivered to the recipient for which it was intended.

10.4.2.2 User agents

The integrity of this API is based on the inability of scripts of one origin to post arbitrary events (using dispatchEvent() or otherwise) to objects in other origins (those that are not the same).

Implementors are urged to take extra care in the implementation of this feature. It allows authors to transmit information from one domain to another domain, which is normally disallowed for security reasons. It also requires that UAs be careful to allow access to certain properties but not others.

10.4.3 Posting messages

window . postMessage(message, [ ports, ] targetOrigin)

Posts a message, optionally with an array of ports, to the given window.

If the origin of the target window doesn't match the given origin, the message is discarded, to avoid information leakage. To send the message to the target regardless of origin, set the target origin to "*". To restrict the message to same-origin targets only, without needing to explicitly state the origin, set the target origin to "/".

Throws an INVALID_STATE_ERR if the ports array is not null and it contains either null entries or duplicate ports.
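
This example is non-normative. The condition under which INVALID_STATE_ERR is thrown could be sketched as the following check. The function name is hypothetical, and any objects can stand in for ports, since only identity is compared.

```javascript
// Returns false in exactly the cases described above: the array is not
// null and contains a null entry or the same port more than once.
function portsArrayIsValid(ports) {
  if (ports === null || ports === undefined)
    return true;
  for (var i = 0; i < ports.length; i += 1) {
    if (ports[i] === null)
      return false; // null entry
    if (ports.indexOf(ports[i]) != i)
      return false; // duplicate of an earlier entry
  }
  return true;
}
```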

When a script invokes the postMessage(message, targetOrigin) method (with only two arguments) on a Window object, the user agent must follow these steps:

  1. If the value of the targetOrigin argument is neither a single U+002A ASTERISK character (*), a single U+002F SOLIDUS character (/), nor an absolute URL with a <host-specific> component that is either empty or a single U+002F SOLIDUS character (/), then throw a SYNTAX_ERR exception and abort the overall set of steps.

  2. Let message clone be the result of obtaining a structured clone of the message argument. If this throws an exception, then throw that exception and abort these steps.

  3. Return from the postMessage() method, but asynchronously continue running these steps.

  4. If the targetOrigin argument is a single literal U+002F SOLIDUS character (/), and the Document of the Window object on which the method was invoked does not have the same origin as the entry script's document, then abort these steps silently.

    Otherwise, if the targetOrigin argument is an absolute URL, and the Document of the Window object on which the method was invoked does not have the same origin as targetOrigin, then abort these steps silently.

    Otherwise, the targetOrigin argument is a single literal U+002A ASTERISK character (*), and no origin check is made.

  5. Create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is not cancelable, and has no default action. The data attribute must be set to the value of message clone, the origin attribute must be set to the Unicode serialization of the origin of the script that invoked the method, and the source attribute must be set to the script's global object's WindowProxy object.

  6. Queue a task to dispatch the event created in the previous step at the Window object on which the method was invoked. The task source for this task is the posted message task source.
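
This example is non-normative. The origin check in step 4 above could be sketched as follows. Origins are represented here as serialized strings such as "http://example.com", which is a simplification of how user agents compare origins, and the function and parameter names are hypothetical. The sketch assumes targetOrigin has already passed the validation in step 1.

```javascript
// targetOrigin:    the argument passed to postMessage()
// targetDocOrigin: origin of the Document of the Window object on which
//                  postMessage() was invoked
// entryDocOrigin:  origin of the entry script's document
// Returns true if the steps continue, false if they abort silently.
function passesOriginCheck(targetOrigin, targetDocOrigin, entryDocOrigin) {
  if (targetOrigin == '*')
    return true; // no origin check is made
  if (targetOrigin == '/')
    return targetDocOrigin == entryDocOrigin; // same-origin targets only
  // Otherwise targetOrigin is an absolute URL; compare its origin.
  return new URL(targetOrigin).origin == targetDocOrigin;
}

passesOriginCheck('http://b.example.org/', 'http://b.example.org', 'http://example.com'); // true
passesOriginCheck('/', 'http://b.example.org', 'http://example.com');                     // false
```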