A later version of the API, though, might want to offload all the crypto work onto subworkers. This could be done as follows:
function handleMessage(e) {
  if (e.data == "genkeys")
    genkeys(e.ports[0]);
  else if (e.data == "encrypt")
    encrypt(e.ports[0]);
  else if (e.data == "decrypt")
    decrypt(e.ports[0]);
}
function genkeys(p) {
  var generator = new Worker('libcrypto-v2-generator.js');
  generator.postMessage('', [p]);
}
function encrypt(p) {
  p.onmessage = function (e) {
    var key = e.data;
    var encryptor = new Worker('libcrypto-v2-encryptor.js');
    encryptor.postMessage(key, [p]);
  };
}
function decrypt(p) {
  p.onmessage = function (e) {
    var key = e.data;
    var decryptor = new Worker('libcrypto-v2-decryptor.js');
    decryptor.postMessage(key, [p]);
  };
}
// support being used as a shared worker as well as a dedicated worker
if ('onmessage' in this) // dedicated worker
  onmessage = handleMessage;
else // shared worker
  onconnect = function (e) { e.ports[0].onmessage = handleMessage };
The little subworkers would then be as follows.
For generating key pairs:
onmessage = function (e) {
  var k = _generateKeyPair();
  e.ports[0].postMessage(k[0]);
  e.ports[0].postMessage(k[1]);
  close();
}
function _generateKeyPair() {
  return [Math.random(), Math.random()];
}
For encrypting:
onmessage = function (e) {
  var key = e.data;
  var port = e.ports[0];
  port.onmessage = function (e) {
    var s = e.data;
    // reply on the transferred port so the result reaches the API user
    port.postMessage(_encrypt(key, s));
  }
}
function _encrypt(k, s) {
  return 'encrypted-' + k + ' ' + s;
}
For decrypting:
onmessage = function (e) {
  var key = e.data;
  var port = e.ports[0];
  port.onmessage = function (e) {
    var s = e.data;
    // reply on the transferred port so the result reaches the API user
    port.postMessage(_decrypt(key, s));
  }
}
function _decrypt(k, s) {
  return s.substr(s.indexOf(' ')+1);
}
Notice how the users of the API don't have to even know that this is happening — the API hasn't changed; the library can delegate to subworkers without changing its API, even though it is accepting data using message channels.
There are two kinds of workers: dedicated workers and shared workers. Dedicated workers, once created, are linked to their creator, but message ports can be used to communicate from a dedicated worker to multiple other browsing contexts or workers. Shared workers, on the other hand, are named, and once created any script running in the same origin can obtain a reference to that worker and communicate with it.
The global scope is the "inside" of a worker.
WorkerGlobalScope abstract interface
interface WorkerGlobalScope {
readonly attribute WorkerGlobalScope self;
readonly attribute WorkerLocation location;
void close();
attribute Function onerror;
};
WorkerGlobalScope implements WorkerUtils;
WorkerGlobalScope implements EventTarget;
The self attribute
must return the WorkerGlobalScope object itself.
The location
attribute must return the WorkerLocation object created
for the WorkerGlobalScope object when the worker was
created. It represents the absolute URL of the script
that was used to initialize the worker, after any redirects.
When a script invokes the close()
method on a WorkerGlobalScope object, the user agent
must run the following steps (atomically):
Discard any tasks that have been added to the event loop's task queues.
Set the worker's WorkerGlobalScope object's
closing flag to
true. (This prevents any further tasks from being queued.)
Disentangle all the ports in the list of the worker's ports.
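The three steps above can be modelled in plain JavaScript. This is only a minimal sketch of the bookkeeping, not how a user agent implements it: `WorkerScopeModel`, `taskQueue`, `ports`, `queueTask`, and the `entangledWith` field are all hypothetical names introduced for illustration.

```javascript
// Hypothetical model of a worker global scope; not a real browser API.
class WorkerScopeModel {
  constructor() {
    this.closing = false; // the closing flag, initially false
    this.taskQueue = [];  // stands in for the event loop's task queues
    this.ports = [];      // stands in for the list of the worker's ports
  }

  // Tasks are only accepted while the closing flag is false.
  queueTask(task) {
    if (this.closing) return false;
    this.taskQueue.push(task);
    return true;
  }

  close() {
    this.taskQueue.length = 0;        // 1. discard queued tasks
    this.closing = true;              // 2. set the closing flag
    for (const port of this.ports) {  // 3. disentangle every port
      if (port.entangledWith) port.entangledWith.entangledWith = null;
      port.entangledWith = null;
    }
    this.ports.length = 0;
  }
}

const scope = new WorkerScopeModel();
const inside = { entangledWith: null };
const outside = { entangledWith: inside };
inside.entangledWith = outside;
scope.ports.push(inside);
scope.queueTask(() => {});

scope.close();
const acceptedAfterClose = scope.queueTask(() => {});
console.log(scope.closing, scope.taskQueue.length, acceptedAfterClose);
```

In this model, a task queued after `close()` is rejected, which mirrors the note that the closing flag prevents any further tasks from being queued.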
The following are the event handlers (and their
corresponding event handler
event types) that must be supported, as IDL attributes, by
objects implementing the WorkerGlobalScope
interface:
| Event handler | Event handler event type |
|---|---|
| onerror | error |
The WorkerGlobalScope interface must not exist if
the interface's relevant namespace object is a
Window object. [WEBIDL]
DedicatedWorkerGlobalScope interface
[Supplemental, NoInterfaceObject]
interface DedicatedWorkerGlobalScope : WorkerGlobalScope {
void postMessage(in any message, in optional MessagePortArray ports);
attribute Function onmessage;
};
DedicatedWorkerGlobalScope objects act as if they
had an implicit MessagePort associated with them. This
port is part of a channel that is set up when the worker is created,
but it is not exposed. This object must never be garbage collected
before the DedicatedWorkerGlobalScope object.
All messages received by that port must immediately be retargeted
at the DedicatedWorkerGlobalScope object.
The postMessage()
method on
DedicatedWorkerGlobalScope objects must act as if, when
invoked, it immediately invoked the
method of the same name on the port, with the same arguments, and
returned the same return value.
The following are the event handlers (and their
corresponding event handler
event types) that must be supported, as IDL attributes, by
objects implementing the DedicatedWorkerGlobalScope
interface:
| Event handler | Event handler event type |
|---|---|
| onmessage | message |
For the purposes of the application cache networking model, a dedicated worker is an extension of the cache host from which it was created.
SharedWorkerGlobalScope interface
[Supplemental, NoInterfaceObject]
interface SharedWorkerGlobalScope : WorkerGlobalScope {
readonly attribute DOMString name;
readonly attribute ApplicationCache applicationCache;
attribute Function onconnect;
};
Shared workers receive message ports through connect events on
their global object for each connection.
The name
attribute must return the value it was assigned when the
SharedWorkerGlobalScope object was created by the
"run a worker" algorithm. Its value represents the name
that can be used to obtain a reference to the worker using the
SharedWorker constructor.
The following are the event handlers (and their
corresponding event handler
event types) that must be supported, as IDL attributes, by
objects implementing the SharedWorkerGlobalScope
interface:
| Event handler | Event handler event type |
|---|---|
| onconnect | connect |
For the purposes of the application cache networking model, a shared worker is its own cache host. The run a worker algorithm takes care of associating the worker with an application cache.
The applicationCache
attribute returns the ApplicationCache object for the
worker.
Both the origin and effective script
origin of scripts running in workers are the
origin of the absolute URL that
the worker's location attribute
represents.
Each WorkerGlobalScope object has an event
loop distinct from those defined for units of related
similar-origin browsing contexts. This event
loop has no associated browsing context, and its
task queues only have events,
callbacks, and networking activity as tasks. The processing model of these
event loops is defined below in the
run a worker algorithm.
Each WorkerGlobalScope object also has a closing flag, which must
initially be false, but which can get set to true by the algorithms
in the processing model section below.
Once the WorkerGlobalScope's closing flag is set to
true, the event loop's task
queues must discard any further tasks that would be added to them (tasks
already on the queue are unaffected except where otherwise
specified). Effectively, once the closing flag is true,
timers stop firing, notifications for all pending asynchronous
operations are dropped, etc.
Workers communicate with other workers and with browsing contexts through message channels and their
MessagePort objects.
Each WorkerGlobalScope worker global
scope has a list of the worker's ports, which
consists of all the MessagePort objects that are
entangled with another port and that have one (but only one) port
owned by worker global scope. This list includes
the implicit
MessagePort in the case of dedicated workers.
Each WorkerGlobalScope also has a list of the
worker's workers. Initially this list is empty; it is
populated when the worker creates or obtains further workers.
Finally, each WorkerGlobalScope also has a list of
the worker's Documents. Initially this list
is empty; it is populated when the worker is created.
Whenever a Document d is added to the
worker's Documents, the user agent must, for each
worker q in the list of the worker's workers whose list
of the worker's Documents does not contain
d, add d to q's
WorkerGlobalScope owner's list of the worker's
Documents.
Whenever a Document object is discarded, it must be removed from the list of
the worker's Documents of each worker
whose list contains that Document.
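The two bookkeeping rules above (propagating a newly added Document to nested workers, and removing a discarded Document everywhere) might be sketched as follows. The record shape `{ name, documents, workers }` and the function names are assumptions made for this example only.

```javascript
// Hypothetical worker records: { name, documents: [], workers: [] }.
function addDocument(worker, d) {
  if (worker.documents.includes(d)) return; // already listed, nothing to do
  worker.documents.push(d);
  // Propagate to every nested worker q that does not list d yet.
  for (const q of worker.workers) addDocument(q, d);
}

// When a Document is discarded, drop it from every worker that lists it.
function discardDocument(allWorkers, d) {
  for (const w of allWorkers) {
    w.documents = w.documents.filter((entry) => entry !== d);
  }
}

const innermost = { name: "innermost", documents: [], workers: [] };
const middle = { name: "middle", documents: [], workers: [innermost] };
const outermost = { name: "outermost", documents: [], workers: [middle] };
const page = { title: "page" };

addDocument(outermost, page);
const reachedInnermost = innermost.documents.includes(page);

discardDocument([outermost, middle, innermost], page);
const remaining = outermost.documents.length + middle.documents.length +
  innermost.documents.length;
console.log(reachedInnermost, remaining);
```

Because adding a Document to a nested worker triggers the same rule for that worker's own nested workers, the propagation is naturally recursive.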
Given a script's global object o
when creating or obtaining a worker, the list of relevant
Document objects to add depends on the type of
o. If o is a
WorkerGlobalScope object (i.e. if we are creating a
nested worker), then the relevant Documents are the
Documents that are in o's own list
of the worker's Documents. Otherwise, o is a Window object, and the relevant
Document is just the Document that is the
active document of the Window object o.
A worker is said to be a permissible worker if its
list of the worker's Documents is not
empty.
A worker is said to be a protected worker if it is a
permissible worker and either it has outstanding
timers, database transactions, or network connections, or its list
of the worker's ports is not empty, or its
WorkerGlobalScope is actually a
SharedWorkerGlobalScope object (i.e. the worker is a
shared worker).
A worker is said to be an active needed worker if any
of the Document objects in the worker's
Documents are fully active.
A worker is said to be a suspendable worker if it is not an active needed worker but it is a permissible worker.
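The four definitions above can be read as simple predicates. The sketch below assumes a hypothetical worker record with `documents`, `ports`, `hasOutstandingWork` (timers, database transactions, or network connections), and `isShared` fields; none of these names come from the specification.

```javascript
// Hypothetical record describing a worker; field names are illustrative.
function isPermissible(w) {
  return w.documents.length > 0;
}

function isProtected(w) {
  return isPermissible(w) &&
    (w.hasOutstandingWork || w.ports.length > 0 || w.isShared);
}

function isActiveNeeded(w) {
  return w.documents.some((d) => d.fullyActive);
}

function isSuspendable(w) {
  return !isActiveNeeded(w) && isPermissible(w);
}

// A dedicated worker whose only Document is no longer fully active
// (for example, a page kept in the session history).
const backgroundWorker = {
  documents: [{ fullyActive: false }],
  ports: [],
  hasOutstandingWork: false,
  isShared: false,
};

console.log(
  isPermissible(backgroundWorker),
  isProtected(backgroundWorker),
  isActiveNeeded(backgroundWorker),
  isSuspendable(backgroundWorker)
);
```

Under these assumptions the example worker is permissible and suspendable, but neither protected nor an active needed worker.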
When a user agent is to run a worker for a script with
URL url, a browsing
context owner browsing context, a
Document owner document, an
origin owner origin, and with
global scope worker global scope, it must run
the following steps:
Create a completely separate and parallel execution environment (i.e. a separate thread or process or equivalent construct), and run the rest of these steps asynchronously in that context.
If worker global scope is actually a
SharedWorkerGlobalScope object (i.e. the worker is a
shared worker), and there are any relevant application caches that are identified by a
manifest URL with the same origin as url and that have url as one of
their entries, not excluding entries marked as foreign, then associate the
worker global scope with the most appropriate application
cache of those that match.
Attempt to fetch the resource identified by url, from the owner origin.
If the attempt fails, or if the attempt involves any redirects
to URIs that do not have the same origin as url (even if the final URI is at the same
origin as the original url), then for
each Worker or SharedWorker object
associated with worker global scope,
queue a task to fire a simple event
named error at that
object. Abort these steps.
If the attempt succeeds, then convert the script resource to Unicode by assuming it was encoded as UTF-8, to obtain its source.
Let language be JavaScript.
As with script elements, the MIME
type of the script is ignored. Unlike with script
elements, there is no way to override the type. It's always
assumed to be JavaScript.
A new script is now created, as follows.
Create a new script execution environment set up as appropriate for the scripting language language.
Parse/compile/initialize source using that script execution environment, as appropriate for language, and thus obtain a list of code entry-points; set the initial code entry-point to the entry-point for any executable code to be immediately run.
Set the script's global object to worker global scope.
Set the script's browsing context to owner browsing context.
Set the script's document to owner document.
Set the script's URL character encoding to UTF-8. (This is just used for encoding non-ASCII characters in the query component of URLs.)
Set the script's base URL to url.
Closing orphan workers: Start monitoring the worker such that no sooner than it stops being either a protected worker or a suspendable worker, and no later than it stops being a permissible worker, worker global scope's closing flag is set to true.
Suspending workers: Start monitoring the worker, such that whenever worker global scope's closing flag is false and the worker is a suspendable worker, the user agent suspends execution of script in that worker until such time as either the closing flag switches to true or the worker stops being a suspendable worker.
Jump to the script's initial code entry-point, and let that run until it either returns, fails to catch an exception, or gets prematurely aborted by the "kill a worker" or "terminate a worker" algorithms defined below.
If worker global scope is actually a
DedicatedWorkerGlobalScope object (i.e. the worker is
a dedicated worker), then enable the port message
queue of the worker's implicit port.
Event loop: Wait until either there is a task in one of the event loop's task queues or worker global scope's closing flag is set to true.
Run the oldest task on one of the event loop's task queues, if any. The user agent may pick any task queue.
The handling of events or the execution of callbacks might get prematurely aborted by the "kill a worker" or "terminate a worker" algorithms defined below.
Remove the task run in the previous step, if any, from its task queue.
If there are any more events in the event loop's task queues or if worker global scope's closing flag is set to false, then jump back to the step above labeled event loop.
If there are any outstanding transactions that have callbacks that involve scripts whose global object is the worker global scope, roll them back (without invoking any of the callbacks).
Empty the worker global scope's list of active timeouts and its list of active intervals.
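The tail of the algorithm above (from the step labeled event loop to the final clean-up) can be illustrated with a synchronous model. This is a rough sketch under stated assumptions: the `scope` record and its `taskQueue`, `closing`, `timeouts`, and `intervals` fields are hypothetical, there is a single task queue, and where a user agent would block waiting for a task the sketch simply returns `"idle"`.

```javascript
// Hypothetical single-queue model of the worker event loop.
function runEventLoop(scope) {
  for (;;) {
    // "Event loop": a user agent would wait here until a task arrives or
    // the closing flag becomes true; this sketch just stops when idle.
    if (scope.taskQueue.length === 0 && !scope.closing) return "idle";

    // Run the oldest task, if any, and remove it from the queue.
    const task = scope.taskQueue.shift();
    if (task) task(scope);

    // Jump back while tasks remain or the closing flag is still false.
    if (scope.taskQueue.length > 0 || !scope.closing) continue;
    break;
  }
  // Final clean-up: empty the lists of active timeouts and intervals.
  scope.timeouts.length = 0;
  scope.intervals.length = 0;
  return "closed";
}

const ran = [];
const loopScope = {
  closing: false,
  timeouts: [1, 2],
  intervals: [3],
  taskQueue: [
    () => ran.push("first"),
    (s) => {
      ran.push("second");
      // Model a call to close(): discard queued tasks, set the flag.
      s.taskQueue.length = 0;
      s.closing = true;
    },
    () => ran.push("never runs"),
  ],
};

const outcome = runEventLoop(loopScope);
console.log(outcome, ran);
```

The third task is discarded by the modelled `close()` call, so only the first two tasks run before the loop ends.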
When a user agent is to kill a worker it must run the following steps in parallel with the worker's main loop (the "run a worker" processing model defined above):
Set the worker's WorkerGlobalScope object's closing flag to
true.
If there are any tasks queued in the event loop's task queues, discard them without processing them.
Wait a user-agent-defined amount of time.
Abort the script currently running in the worker.
User agents may invoke the "kill a worker" processing model on a worker at any time, e.g. in response to user requests, in response to CPU quota management, or when a worker stops being an active needed worker if the worker continues executing even after its closing flag was set to true.
When a user agent is to terminate a worker it must run the following steps in parallel with the worker's main loop (the "run a worker" processing model defined above):
Set the worker's WorkerGlobalScope object's
closing flag to
true.
If there are any tasks queued in the event loop's task queues, discard them without processing them.
Abort the script currently running in the worker.
If the worker's WorkerGlobalScope object is
actually a DedicatedWorkerGlobalScope object (i.e. the
worker is a dedicated worker), then empty the port message
queue of the port that the worker's implicit port is
entangled with.
The task source for the tasks mentioned above is the DOM manipulation task source.
Whenever an uncaught runtime script error occurs in one of the
worker's scripts, if the error did not occur while handling a
previous script error, the user agent must report the
error using the WorkerGlobalScope object's onerror
attribute.
For shared workers, if the error is still not handled afterwards, or if the error occurred while handling a previous script error, the error may be reported to the user.
For dedicated workers, if the error is still not handled afterwards, or if
the error occurred while handling a previous script error, the user
agent must queue a task to fire a worker error
event at the Worker object associated with the
worker.
When the user agent is to fire a worker error event at
a Worker object, it must dispatch an event that uses
the ErrorEvent interface, with the name error, that doesn't bubble and is
cancelable, with its message, filename, and lineno attributes set
appropriately. The default action of this event depends on whether
the Worker object is itself in a worker. If it is, and
that worker is also a dedicated worker, then the user agent must
again queue a task to fire a worker error
event at the Worker object associated with
that worker. Otherwise, the error may be reported
to the user.
The task source for the tasks mentioned above is the DOM manipulation task source.
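The way an unhandled error travels outward through nested dedicated workers might be modelled as below. It is a simplified sketch: the records (`name`, `onerror`, `containingWorker`, `dedicated`, `workerObject`) are hypothetical, events are plain objects rather than real ErrorEvent instances, and the queued tasks are replaced by direct recursive calls.

```javascript
// workerObject models a Worker object as seen by its creator.
// workerObject.containingWorker is the worker whose script created it,
// or null when it was created from a Window.
function fireWorkerError(workerObject, details, log) {
  const event = {
    type: "error",
    message: details.message,
    filename: details.filename,
    lineno: details.lineno,
    canceled: false,
    preventDefault() { this.canceled = true; },
  };
  log.push("error at " + workerObject.name);
  if (workerObject.onerror) workerObject.onerror(event);
  if (event.canceled) return; // handled, so the default action is skipped

  const container = workerObject.containingWorker;
  if (container && container.dedicated) {
    // Default action: fire again at the Worker object of the containing worker.
    fireWorkerError(container.workerObject, details, log);
  } else {
    log.push("reported to user: " + details.message);
  }
}

// A page creates "outer"; the outer worker's script creates "inner".
const outerObject = { name: "outer", onerror: null, containingWorker: null };
const outerWorker = { dedicated: true, workerObject: outerObject };
const innerObject = { name: "inner", onerror: null, containingWorker: outerWorker };

const details = { message: "boom", filename: "inner.js", lineno: 7 };
const unhandledLog = [];
fireWorkerError(innerObject, details, unhandledLog);

outerObject.onerror = (event) => event.preventDefault();
const handledLog = [];
fireWorkerError(innerObject, details, handledLog);
console.log(unhandledLog, handledLog);
```

In the first run nobody cancels the event, so it reaches the outermost Worker object and is then reported; in the second run the handler on `outer` cancels it and propagation stops there.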
interface ErrorEvent : Event {
readonly attribute DOMString message;
readonly attribute DOMString filename;
readonly attribute unsigned long lineno;
void initErrorEvent(in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in DOMString messageArg, in DOMString filenameArg, in unsigned long linenoArg);
};
The initErrorEvent()
method must initialize the event in a manner analogous to the
similarly-named method in the DOM Events interfaces. [DOMEVENTS]
The message
attribute represents the error message.
The filename
attribute represents the absolute URL of the script in
which the error originally occurred.
The lineno
attribute represents the line number where the error occurred in the
script.
AbstractWorker abstract interface
[Supplemental, NoInterfaceObject]
interface AbstractWorker {
attribute Function onerror;
};
AbstractWorker implements EventTarget;
The following are the event handlers (and their
corresponding event handler
event types) that must be supported, as IDL attributes, by
objects implementing the AbstractWorker interface:
| Event handler | Event handler event type |
|---|---|
| onerror | error |
Worker interface
[Constructor(in DOMString scriptURL)]
interface Worker : AbstractWorker {
void terminate();
void postMessage(in any message, in optional MessagePortArray ports);
attribute Function onmessage;
};
The terminate() method,
when invoked, must cause the "terminate a worker"
algorithm to be run on the worker with which the object is
associated.
Worker objects act as if they had an implicit
MessagePort associated with them. This port is part of
a channel that is set up when the worker is created, but it is not
exposed. This object must never be garbage collected before the
Worker object.
All messages received by that port must immediately be retargeted
at the Worker object.
The postMessage()
method on Worker objects
must act as if, when invoked, it
immediately invoked the method of the same name on the port, with
the same arguments, and returned the same return value.
The following are the event handlers (and their
corresponding event handler
event types) that must be supported, as IDL attributes, by
objects implementing the Worker interface:
| Event handler | Event handler event type |
|---|---|
| onmessage | message |
When the Worker(scriptURL) constructor is invoked, the
user agent must run the following steps:
Resolve the scriptURL argument relative to the entry script's base URL, when the method is invoked.
If this fails, throw a SYNTAX_ERR
exception.
If the origin of the resulting absolute
URL is not the same as the
origin of the entry script, then throw a
SECURITY_ERR exception.
Thus, scripts must be external files with the same
scheme as the original page: you can't load a script from a data: URL or javascript:
URL, and a https: page couldn't start workers using
scripts with http: URLs.
Create a new DedicatedWorkerGlobalScope
object. Let worker global scope be this new
object.
Create a new Worker object, associated with
worker global scope. Let worker be this new object.
Create a new MessagePort object
owned by the global
object of the script that
invoked the constructor. Let this be the outside
port.
Associate the outside port with worker.
Create a new MessagePort object
owned by worker global scope. Let inside port be this new object.
Associate inside port with worker global scope.
Entangle outside port and inside port.
Return worker, and run the following steps asynchronously.
Enable outside port's port message queue.
Let docs be the list of relevant
Document objects to add given the global object of the script that invoked the
constructor.
Add to
worker global scope's list of the
worker's Documents the
Document objects in docs.
If the global object
of the script that invoked the
constructor is a WorkerGlobalScope object (i.e. we
are creating a nested worker), add worker global
scope to the list of the worker's workers of the
WorkerGlobalScope object that is the global object of the script that invoked the
constructor.
Run a worker for the resulting absolute URL, with the script's browsing context of the script that invoked the method as the owner browsing context, with the script's document of the script that invoked the method as the owner document, with the origin of the entry script as the owner origin, and with worker global scope as the global scope.
This constructor must be visible when the script's global
object is either a Window object or an object
implementing the WorkerUtils interface.
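The first two constructor steps (resolving the URL and checking its origin) can be approximated with the standard `URL` class. This is a sketch only: `resolveWorkerScript` is a hypothetical helper, plain `Error` objects stand in for the SYNTAX_ERR and SECURITY_ERR exceptions, and the origin of the entry script is assumed to be the origin of its base URL.

```javascript
// Hypothetical helper mirroring the resolve and same-origin steps.
function resolveWorkerScript(scriptURL, entryScriptBaseURL) {
  let absolute;
  try {
    absolute = new URL(scriptURL, entryScriptBaseURL);
  } catch (err) {
    throw new Error("SYNTAX_ERR"); // the URL could not be resolved
  }
  // Assumption: the entry script's origin equals its base URL's origin.
  if (absolute.origin !== new URL(entryScriptBaseURL).origin) {
    throw new Error("SECURITY_ERR");
  }
  return absolute.href;
}

const base = "https://example.com/app/page.html";

function outcomeFor(scriptURL) {
  try {
    return resolveWorkerScript(scriptURL, base);
  } catch (err) {
    return err.message;
  }
}

console.log(outcomeFor("worker.js"));
console.log(outcomeFor("http://example.com/app/worker.js"));
console.log(outcomeFor("data:text/javascript,close()"));
```

With these inputs the relative URL resolves against the page, while the http: and data: URLs are rejected because their origins differ from that of the https: page.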
SharedWorker interface
[Constructor(in DOMString scriptURL, in optional DOMString name)]
interface SharedWorker : AbstractWorker {
readonly attribute MessagePort port;
};
The port
attribute must return the value it was assigned by the object's
constructor. It represents the MessagePort for
communicating with the shared worker.
When the SharedWorker(scriptURL, name)
constructor is invoked, the user agent must run the following
steps:
Resolve the scriptURL argument.
If this fails, throw a SYNTAX_ERR
exception.
Otherwise, let scriptURL be the resulting absolute URL.
Let name be the value of the second argument, or the empty string if the second argument was omitted.
If the origin of scriptURL is
not the same as the origin of the
entry script, then throw a SECURITY_ERR
exception.
Thus, scripts must be external files with the same
scheme as the original page: you can't load a script from a data: URL or javascript:
URL, and a https: page couldn't start workers using
scripts with http: URLs.
Let docs be the list of relevant
Document objects to add given the global object of the script that invoked the
constructor.
Execute the following substeps atomically:
Create a new SharedWorker object, which will
shortly be associated with a SharedWorkerGlobalScope
object. Let this SharedWorker object be worker.
Create a new MessagePort object
owned by the global
object of the script that invoked the method. Let this be
the outside port.
Assign outside port to the port attribute of worker.
Let worker global scope be null.
If name is not the empty string and
there exists a SharedWorkerGlobalScope object whose
closing flag
is false, whose name attribute is
exactly equal to name, and whose location attribute
represents an absolute URL with the same
origin as scriptURL, then let worker global scope be that
SharedWorkerGlobalScope object.
Otherwise, if name is the empty string
and there exists a SharedWorkerGlobalScope object
whose closing
flag is false, and whose location attribute
represents an absolute URL that is exactly equal to
scriptURL, then let worker
global scope be that SharedWorkerGlobalScope
object.
If worker global scope is not null, then run these steps:
If worker global scope's location
attribute represents an absolute URL that is not
exactly equal to scriptURL, then throw a
URL_MISMATCH_ERR exception and abort all these
steps.
Associate worker with worker global scope.
Create a new MessagePort
object owned by worker global
scope. Let this be the inside
port.
Entangle outside port and inside port.
Return worker and perform the next step asynchronously.
Create an event that uses the MessageEvent
interface, with the name connect, which does not bubble, is
not cancelable, has no default action, has a data attribute whose value
is the empty string and has a ports attribute whose
value is an array containing only the newly created port, and
queue a task to dispatch the event at worker global scope.
Add to
worker global scope's list of the
worker's Documents the
Document objects in docs.
If the global
object of the script
that invoked the constructor is a
WorkerGlobalScope object, add worker global scope to the list of the
worker's workers of the WorkerGlobalScope
object that is the global
object of the script
that invoked the constructor.
Abort all these steps.
Create a new SharedWorkerGlobalScope
object. Let worker global scope be this new
object.
Associate worker with worker global scope.
Set the name attribute of
worker global scope to name.
Create a new MessagePort object
owned by worker global scope. Let inside port be this new object.
Entangle outside port and inside port.
Return worker and perform the remaining steps asynchronously.
Create an event that uses the MessageEvent
interface, with the name connect, which does not bubble, is not
cancelable, has no default action, has a data attribute whose value is
the empty string and has a ports attribute whose value
is an array containing only the newly created port, and queue
a task to dispatch the event at worker global
scope.
Add to
worker global scope's list of the
worker's Documents the
Document objects in docs.
If the global object
of the script that invoked the
constructor is a WorkerGlobalScope object, add worker global scope to the list of the
worker's workers of the WorkerGlobalScope
object that is the global
object of the script
that invoked the constructor.
Run a worker for scriptURL, with the script's browsing context of the script that invoked the method as the owner browsing context, with the script's document of the script that invoked the method as the owner document, with the origin of the entry script as the owner origin, and with worker global scope as the global scope.
This constructor must be visible when the script's global
object is either a Window object or an object
implementing the WorkerUtils interface.
The task source for the tasks mentioned above is the DOM manipulation task source.
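The lookup that decides whether the SharedWorker constructor reuses an existing global scope could be sketched as follows. The `scopes` array, its record fields (`name`, `location`, `closing`), and the function name are assumptions for illustration; `location` is taken to hold an already-normalized absolute URL string, and a plain `Error` stands in for URL_MISMATCH_ERR.

```javascript
// Hypothetical lookup over existing shared worker global scopes.
function findSharedWorkerScope(scopes, scriptURL, name) {
  const wanted = new URL(scriptURL);
  let match = null;
  if (name !== "") {
    // Named lookup: same name and same origin, closing flag false.
    match = scopes.find((s) => !s.closing && s.name === name &&
      new URL(s.location).origin === wanted.origin) || null;
  } else {
    // Unnamed lookup: the URL must be exactly equal.
    match = scopes.find((s) => !s.closing &&
      s.location === wanted.href) || null;
  }
  if (match && match.location !== wanted.href) {
    throw new Error("URL_MISMATCH_ERR");
  }
  return match; // null means a new SharedWorkerGlobalScope is needed
}

const existing = [
  { name: "chat", location: "https://example.com/chat.js", closing: false },
  { name: "", location: "https://example.com/log.js", closing: false },
];

const reused = findSharedWorkerScope(existing, "https://example.com/chat.js", "chat");
const unnamed = findSharedWorkerScope(existing, "https://example.com/log.js", "");
const fresh = findSharedWorkerScope(existing, "https://example.com/new.js", "");
console.log(reused === existing[0], unnamed === existing[1], fresh);
```

Asking for the name `"chat"` with a different script URL on the same origin would throw in this model, matching the URL_MISMATCH_ERR substep.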
[Supplemental, NoInterfaceObject]
interface WorkerUtils {
void importScripts(in DOMString... urls);
readonly attribute WorkerNavigator navigator;
};
WorkerUtils implements WindowTimers;
The DOM APIs (Node objects, Document
objects, etc) are not available to workers in this version of this
specification.
When a script invokes the importScripts(urls) method on a
WorkerGlobalScope object, the user agent must run the
following steps:
If there are no arguments, return without doing anything. Abort these steps.
Resolve each argument.
If any fail, throw a SYNTAX_ERR
exception.
Attempt to fetch each resource identified by the resulting absolute URLs, from the entry script's origin.
For each argument in turn, in the order given, starting with the first one, run these substeps:
Wait for the fetching attempt for the corresponding resource to complete.
If the fetching attempt failed, throw a
NETWORK_ERR exception and abort all these
steps.
If the attempt succeeds, then convert the script resource to Unicode by assuming it was encoded as UTF-8, to obtain its source.
Let language be JavaScript.
As with the worker's script, the script here is always assumed to be JavaScript, regardless of the MIME type.
Create a script, using source as the script source and language as the scripting language, using the same global object, browsing context, URL character encoding, base URL, and script group as the script that was created by the worker's run a worker algorithm.
Let the newly created script run until it either returns, fails to parse, fails to catch an exception, or gets prematurely aborted by the "kill a worker" or "terminate a worker" algorithms defined above.
If it failed to parse, then throw an ECMAScript
SyntaxError exception and abort all these
steps. [ECMA262]
If an exception was raised or if the script was prematurely
aborted, then abort all these steps, letting the exception or
aborting continue to be processed by the script that called the
importScripts()
method.
If the "kill a worker" or "terminate a worker" algorithms abort the script then abort all these steps.
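The ordering guarantees of the steps above (resolve everything, start all fetches, then run the scripts strictly in argument order) can be shown with an in-memory model. This is a sketch, not the real `importScripts()`: `importScriptsModel` is a hypothetical function, the `resources` map stands in for the network, plain `Error` objects stand in for SYNTAX_ERR and NETWORK_ERR, and indirect `eval` is used to run each source in the global scope.

```javascript
// Hypothetical in-memory model of importScripts().
function importScriptsModel(resources, baseURL, ...urls) {
  if (urls.length === 0) return; // no arguments: nothing to do

  // Resolve every argument first; any failure aborts before fetching.
  const resolved = urls.map((u) => {
    try {
      return new URL(u, baseURL).href;
    } catch (err) {
      throw new Error("SYNTAX_ERR");
    }
  });

  // "Fetch" all resources up front (undefined marks a failed fetch).
  const fetched = resolved.map((href) => resources.get(href));

  // Run each script in the order given, stopping at the first failure.
  for (const source of fetched) {
    if (source === undefined) throw new Error("NETWORK_ERR");
    (0, eval)(source); // a parse failure surfaces as a SyntaxError
  }
}

const resources = new Map([
  ["https://example.com/a.js", "globalThis.importOrder = ['a'];"],
  ["https://example.com/b.js", "globalThis.importOrder.push('b');"],
]);

importScriptsModel(resources, "https://example.com/worker.js", "a.js", "b.js");
console.log(globalThis.importOrder);
```

If a later script is missing, the earlier ones have already run by the time the error is thrown, which is the behavior the per-argument substeps describe.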
WorkerNavigator object
The navigator attribute
of the WorkerUtils interface must return an instance of
the WorkerNavigator interface, which represents the
identity and state of the user agent (the client):
interface WorkerNavigator {};
WorkerNavigator implements NavigatorID;
WorkerNavigator implements NavigatorOnLine;
Objects implementing the WorkerNavigator interface
also implement the NavigatorID and
NavigatorOnLine interfaces.
This WorkerNavigator interface must not exist if the
interface's relevant namespace object is a
Window object. [WEBIDL]
The openDatabase() and
openDatabaseSync()
methods are defined in the Web SQL Database specification. [WEBSQL]
There must be no interface objects and constructors available in
the global scope of scripts whose script's global
object is a WorkerGlobalScope object except for
the following:
XMLHttpRequest and all interface objects and
constructors defined by the XMLHttpRequest specifications, except
that the document response entity body must always be
null. The XMLHttpRequest base URL is the
script's base URL; the
XMLHttpRequest origin is the script's
origin. [XHR]
The interface objects and constructors defined by this specification.
Constructors defined by specifications that explicitly say
that they should be visible when the script's global
object is a DedicatedWorkerGlobalScope, a
SharedWorkerGlobalScope, or an object implementing the
WorkerUtils interface; the interfaces of any objects
with such constructors; and the interfaces of any objects made
accessible through APIs exposed by those constructors or made
accessible through interfaces to be implemented by any objects that
are themselves accessible to scripts whose script's global
object implements the WorkerUtils
interface.
These requirements do not override the requirements
defined by the Web IDL specification, in particular concerning the
visibility of interfaces annotated with the [NoInterfaceObject] extended attribute.
interface WorkerLocation {
readonly attribute DOMString href;
readonly attribute DOMString protocol;
readonly attribute DOMString host;
readonly attribute DOMString hostname;
readonly attribute DOMString port;
readonly attribute DOMString pathname;
readonly attribute DOMString search;
readonly attribute DOMString hash;
};
A WorkerLocation object represents an absolute
URL set at its creation.
The href
attribute must return the absolute URL that the object
represents.
The WorkerLocation interface also has the complement
of URL decomposition IDL attributes, protocol,
host, port, hostname,
pathname,
search,
and hash. These must
follow the rules given for URL decomposition IDL attributes, with the
input being the
absolute URL that the object represents (same as the
href attribute), and
the common setter action
being a no-op, since the attributes are defined to be readonly.
The WorkerLocation interface must not exist if the
interface's relevant namespace object is a
Window object. [WEBIDL]
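As an illustration of the URL decomposition attributes, the standard `URL` class exposes the same component names. The sketch below builds a frozen snapshot from a made-up script URL; `makeWorkerLocation` is a hypothetical helper, not part of the interface.

```javascript
// Hypothetical helper producing a read-only snapshot of the components.
function makeWorkerLocation(absoluteURL) {
  const u = new URL(absoluteURL);
  return Object.freeze({
    href: u.href,
    protocol: u.protocol,
    host: u.host,
    hostname: u.hostname,
    port: u.port,
    pathname: u.pathname,
    search: u.search,
    hash: u.hash,
  });
}

const workerLocation =
  makeWorkerLocation("https://example.com:8443/js/worker.js?v=2#init");
console.log(workerLocation);
```

For this URL, `host` includes the port (`example.com:8443`) while `hostname` does not, and `search` and `hash` keep their leading `?` and `#`.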
Messages in server-sent events, Web
sockets, cross-document messaging, and
channel messaging use the message event.
The following interface is defined for this event:
interface MessageEvent : Event {
readonly attribute any data;
readonly attribute DOMString origin;
readonly attribute DOMString lastEventId;
readonly attribute WindowProxy source;
readonly attribute MessagePortArray ports;
void initMessageEvent(in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in any dataArg, in DOMString originArg, in DOMString lastEventIdArg, in WindowProxy sourceArg, in MessagePortArray portsArg);
};
data: Returns the data of the message.
origin: Returns the origin of the message, for server-sent events and cross-document messaging.
lastEventId: Returns the last event ID, for server-sent events.
source: Returns the WindowProxy of the source window, for cross-document messaging.
ports: Returns the MessagePortArray sent with the message, for cross-document messaging and channel messaging.
The initMessageEvent()
method must initialize the event in a manner analogous to the
similarly-named method in the DOM Events interfaces. [DOMEVENTS]
The data
attribute represents the message being sent.
The origin attribute
represents, in server-sent events and
cross-document messaging, the origin of
the document that sent the message (typically the scheme, hostname,
and port of the document, but not its path or fragment
identifier).
The lastEventId
attribute represents, in server-sent events, the last event ID
string of the event source.
The source attribute
represents, in cross-document messaging, the
WindowProxy of the browsing context of the
Window object from which the message came.
The ports
attribute represents, in cross-document messaging and
channel messaging, the MessagePortArray
being sent, if any.
Except where otherwise specified, when the user agent creates and
dispatches a message event in the
algorithms described in the following sections, the lastEventId attribute
must be the empty string, the origin attribute must be the
empty string, the source attribute must be
null, and the ports
attribute must be null.
This section is non-normative.
To enable servers to push data to Web pages over HTTP or using
dedicated server-push protocols, this specification introduces the
EventSource interface.
Using this API consists of creating an EventSource
object and registering an event listener.
var source = new EventSource('updates.cgi');
source.onmessage = function (event) {
alert(event.data);
};
On the server-side, the script ("updates.cgi" in this case) sends messages in the
following form, with the text/event-stream MIME
type:
data: This is the first message.

data: This is the second message, it
data: has two lines.

data: This is the third message.
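As a non-normative illustration, a server-side script could produce messages in this form with a helper along the following lines. The function name formatMessage is an assumption of this sketch; it only builds the text and leaves sending it, with the text/event-stream MIME type, to the surrounding server code.

```javascript
// Hypothetical server-side helper: serializes one message in the
// text/event-stream form shown above.
function formatMessage(text) {
  // Each line of the message gets its own "data: " prefix.
  const lines = String(text).split(/\r\n|\n|\r/);
  const fields = lines.map(function (line) {
    return 'data: ' + line + '\n';
  });
  // A blank line marks the end of the message.
  return fields.join('') + '\n';
}

const body =
  formatMessage('This is the first message.') +
  formatMessage('This is the second message, it\nhas two lines.') +
  formatMessage('This is the third message.');
```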
Using this API rather than emulating it using
XMLHttpRequest or an iframe allows the
user agent to make better use of network resources in cases where
the user agent implementor and the network operator are able to
coordinate in advance. Amongst other benefits, this can result in
significant savings in battery life on portable devices. This is
discussed further in the section below on connectionless push.
EventSource interface
[Constructor(in DOMString url)]
interface EventSource {
readonly attribute DOMString URL;
// ready state
const unsigned short CONNECTING = 0;
const unsigned short OPEN = 1;
const unsigned short CLOSED = 2;
readonly attribute unsigned short readyState;
// networking
attribute Function onopen;
attribute Function onmessage;
attribute Function onerror;
void close();
};
EventSource implements EventTarget;
The EventSource(url) constructor takes one argument,
url, which specifies the URL to
which to connect. When the EventSource() constructor is
invoked, the UA must run these steps:
Resolve the URL specified in url, relative to the entry script's base URL.
If the previous step failed, then throw a
SYNTAX_ERR exception.
Return a new EventSource object, and continue
these steps in the background (without blocking scripts).
Fetch the resource identified by the resulting absolute URL, from the entry script's origin, and process it as described below.
The definition of the fetching algorithm is such that if the browser is already fetching the resource identified by the given absolute URL, that connection can be reused, instead of a new connection being established. All messages received up to this point are dispatched immediately, in this case.
This constructor must be visible when the script's global
object is either a Window object or an object
implementing the WorkerUtils interface.
The URL
attribute must return the absolute URL that resulted
from resolving the value that was
passed to the constructor.
The readyState
attribute represents the state of the connection. It can have the
following values:
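CONNECTING (numeric value 0)
The connection has not yet been established, or it was closed and the user agent is reconnecting.
OPEN (numeric value 1)
The user agent has an open connection and is dispatching events as it receives them.
CLOSED (numeric value 2)
The connection is not open, and the user agent is not trying to reconnect. Either there was a fatal error or the close() method was invoked.
When the object is created its readyState must be set to
CONNECTING (0). The
rules given below for handling the connection define when the value
changes.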
The close()
method must close the connection, if any; must abort any
reconnection attempt, if any; and must set the readyState attribute to
CLOSED. If the
connection is already closed, the method must do nothing.
The following are the event handlers (and their
corresponding event handler
event types) that must be supported, as IDL attributes, by
all objects implementing the EventSource interface:
| Event handler | Event handler event type |
|---|---|
| onopen | open |
| onmessage | message |
| onerror | error |
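In addition to the above, each EventSource object
has the following associated with it:
A reconnection time, in milliseconds. This must initially be a user-agent-defined value, probably in the region of a few seconds.
A last event ID string. This must initially be the empty string.
These values are not currently exposed on the interface.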
The resource indicated in the argument to the EventSource constructor is fetched when the constructor is run.
For HTTP connections, the Accept header may
be included; if included, it must contain only formats of event
framing that are supported by the user agent (one of which must be
text/event-stream, as described below).
If the event source's last event ID string is not the empty
string, then a Last-Event-ID
HTTP header must be included with the request, whose value is the
value of the event source's last event ID string.
User agents should use the Cache-Control: no-cache
header in requests to bypass any caches for requests of event
sources. User agents should ignore HTTP cache headers in the
response, never caching event sources.
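The header requirements above can be sketched as follows. This is a non-normative illustration; the function name buildRequestHeaders and the plain-object representation of the headers are assumptions, and a user agent that supports further event framing formats would list them in the Accept header as well.

```javascript
// Hypothetical helper: builds the HTTP request headers for an event source
// request, given the event source's last event ID string.
function buildRequestHeaders(lastEventId) {
  const headers = {
    // Only event framing formats that the client actually supports.
    'Accept': 'text/event-stream',
    // Ask intermediaries to bypass their caches.
    'Cache-Control': 'no-cache'
  };
  // Last-Event-ID is sent only when the string is not empty.
  if (lastEventId !== '') {
    headers['Last-Event-ID'] = lastEventId;
  }
  return headers;
}
```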
User agents must act as if the connection had failed due to a
network error if the origin of the URL of
the resource to be fetched is not the
same origin as that of the entry script
when the EventSource()
constructor is invoked.
As data is received, the tasks queued by the networking task source to handle the data must act as follows.
HTTP 200 OK responses with a Content-Type header
specifying the type text/event-stream must be processed
line by line as described
below.
When a successful response with a supported MIME type is received, such that the user agent begins parsing the contents of the stream, the user agent must announce the connection.
The task that the networking task source places on the task queue once the fetching algorithm for such a resource (with the correct MIME type) has completed must reestablish the connection. This applies whether the connection is closed gracefully or unexpectedly. It doesn't apply for the error conditions listed below.
HTTP 200 OK responses that have a Content-Type other
than text/event-stream (or some other supported type)
must cause the user agent to fail the connection.
HTTP 204 No Content, and 205 Reset Content responses are equivalent to 200 OK responses with the right MIME type but no content, and thus must reestablish the connection.
Other HTTP response codes in the 2xx range must similarly reestablish the connection. They are, however, likely to indicate an error has occurred somewhere and may cause the user agent to emit a warning.
HTTP 301 Moved Permanently responses must cause the user agent to
reconnect using the new server specified URL instead of the
previously specified URL for all subsequent requests for this event
source. (It doesn't affect other EventSource objects
with the same URL unless they also receive 301 responses, and it
doesn't affect future sessions, e.g. if the page is reloaded.)
HTTP 302 Found, 303 See Other, and 307 Temporary Redirect responses must cause the user agent to connect to the new server-specified URL, but if the user agent needs to again request the resource at a later point, it must return to the previously specified URL for this event source.
The Origin specification also introduces some relevant requirements when dealing with redirects. [ORIGIN]
HTTP 305 Use Proxy, HTTP 401 Unauthorized, and 407 Proxy Authentication Required should be treated transparently as for any other subresource.
Any other HTTP response code not listed here, and any network error that prevents the HTTP connection from being established in the first place (e.g. DNS errors), must cause the user agent to fail the connection.
For non-HTTP protocols, UAs should act in equivalent ways.
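The response handling rules above can be summarized by the following non-normative sketch. The function name classifyResponse and the action labels it returns are invented for this illustration, and comparing only the media type (ignoring any parameters such as charset) is an assumption of the sketch.

```javascript
// Hypothetical helper: maps an HTTP status code and Content-Type header value
// to the action described above. The returned labels are illustrative only.
function classifyResponse(status, contentType) {
  // Assumption: compare the media type only, ignoring parameters.
  const type = String(contentType || '').split(';')[0].trim().toLowerCase();
  if (status === 200) {
    return type === 'text/event-stream' ? 'process-stream' : 'fail';
  }
  if (status >= 200 && status <= 299) {
    return 'reestablish'; // 204, 205, and any other 2xx code
  }
  if (status === 301) {
    return 'redirect-permanent';
  }
  if (status === 302 || status === 303 || status === 307) {
    return 'redirect-temporary';
  }
  if (status === 305 || status === 401 || status === 407) {
    return 'handle-transparently';
  }
  return 'fail'; // anything not listed above
}
```

For example, classifyResponse(200, 'text/html') yields 'fail', whereas classifyResponse(204, '') yields 'reestablish'.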
When a user agent is to announce the connection, the
user agent must set the readyState attribute to
OPEN and queue a
task to fire a simple event named open at the
EventSource object.
When a user agent is to reestablish the connection, the user
agent must set the readyState attribute to
CONNECTING,
queue a task to fire a simple event named
error at the
EventSource object, and then fetch the
event source resource again after a delay equal to the reconnection
time of the event source, from the same origin as the
original request triggered by the EventSource()
constructor. Only if the user agent reestablishes the connection does the connection get
opened anew!
When a user agent is to fail the connection, the user
agent must set the readyState attribute to
CLOSED and queue a
task to fire a simple event named error at the EventSource
object. Once the user agent has failed the connection, it does not
attempt to reconnect!
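The three operations defined above can be modelled with a small non-normative sketch. The model object, its queued and willReconnect fields, and the function names are assumptions made for illustration; a real user agent queues tasks and performs the fetch rather than recording strings.

```javascript
const CONNECTING = 0, OPEN = 1, CLOSED = 2;

// Hypothetical model of an event source. `queued` records the names of the
// simple events that would be queued as tasks; `willReconnect` records
// whether another fetch is scheduled after the reconnection time.
function createModel() {
  return { readyState: CONNECTING, queued: [], willReconnect: false };
}

function announce(m) {
  m.readyState = OPEN;
  m.queued.push('open');
}

function reestablish(m) {
  m.readyState = CONNECTING;
  m.queued.push('error');
  m.willReconnect = true; // fetch again after the reconnection time
}

function fail(m) {
  m.readyState = CLOSED;
  m.queued.push('error');
  m.willReconnect = false; // no further attempts
}
```

Both reestablishing and failing the connection fire an error event, so in this model a listener would look at the ready state to tell them apart.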
The task source for any tasks that are queued by EventSource objects is the
remote event task source.
This event stream format's MIME type is
text/event-stream.
The event stream format is as described by the stream production of the following ABNF, the
character set for which is Unicode. [ABNF]
stream = [ bom ] *event
event = *( comment / field ) end-of-line
comment = colon *any-char end-of-line
field = 1*name-char [ colon [ space ] *any-char ] end-of-line
end-of-line = ( cr lf / cr / lf / eof )
eof = < matches repeatedly at the end of the stream >
; characters
lf = %x000A ; U+000A LINE FEED (LF)
cr = %x000D ; U+000D CARRIAGE RETURN (CR)
space = %x0020 ; U+0020 SPACE
colon = %x003A ; U+003A COLON (:)
bom = %xFEFF ; U+FEFF BYTE ORDER MARK
name-char = %x0000-0009 / %x000B-000C / %x000E-0039 / %x003B-10FFFF
; a Unicode character other than U+000A LINE FEED (LF), U+000D CARRIAGE RETURN (CR), or U+003A COLON (:)
any-char = %x0000-0009 / %x000B-000C / %x000E-10FFFF
; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)
Event streams in this format must always be encoded as UTF-8.
Lines must be separated by either a U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF) character, or a single U+000D CARRIAGE RETURN (CR) character.
Since connections established to remote servers for such resources are expected to be long-lived, UAs should ensure that appropriate buffering is used. In particular, while line buffering with lines defined to end with a single U+000A LINE FEED (LF) character is safe, block buffering or line buffering with different expected line endings can cause delays in event dispatch.
Bytes or sequences of bytes that are not valid UTF-8 sequences must be interpreted as the U+FFFD REPLACEMENT CHARACTER.
One leading U+FEFF BYTE ORDER MARK character must be ignored if any are present.
The stream must then be parsed by reading everything line by line, with a U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF) character not preceded by a U+000D CARRIAGE RETURN (CR) character, a single U+000D CARRIAGE RETURN (CR) character not followed by a U+000A LINE FEED (LF) character, and the end of the file being the four ways in which a line can end.
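The line-reading rule above can be illustrated with the following non-normative sketch. It operates on text that has already been decoded from UTF-8; the function name splitStreamLines is an assumption of this example.

```javascript
// Hypothetical helper: takes already-decoded stream text, drops one leading
// U+FEFF BYTE ORDER MARK if present, and splits the rest into lines.
function splitStreamLines(text) {
  if (text.charCodeAt(0) === 0xFEFF) {
    text = text.slice(1);
  }
  // CRLF is listed first so that it counts as a single line ending.
  const lines = text.split(/\r\n|\n|\r/);
  // A terminator at the very end of the text ends the last line; it does
  // not start a new, empty one.
  if (lines[lines.length - 1] === '') {
    lines.pop();
  }
  return lines;
}
```

A streaming implementation would additionally need to handle a CR that arrives at the end of one network chunk with its LF at the start of the next; this sketch assumes the whole text is available at once.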
When a stream is parsed, a data buffer and an event name buffer must be associated with it. They must be initialized to the empty string.
Lines must be processed, in the order they are received, as follows:
If the line is empty (a blank line):
Dispatch the event, as defined below.
If the line starts with a U+003A COLON character (:):
Ignore the line.
If the line contains a U+003A COLON character (:):
Collect the characters on the line before the first U+003A COLON character (:), and let field be that string.
Collect the characters on the line after the first U+003A COLON character (:), and let value be that string. If value starts with a U+0020 SPACE character, remove it from value.
Process the field using the steps described below, using field as the field name and value as the field value.
Otherwise, the line is not empty but does not contain a U+003A COLON character (:):
Process the field using the steps described below, using the whole line as the field name, and the empty string as the field value.
Once the end of the file is reached, the user agent must dispatch the event one final time, as defined below.
The steps to process the field given a field name and a field value depend on the field name, as given in the following list. Field names must be compared literally, with no case folding performed.
If the field name is "event":
Set the event name buffer to field value.
If the field name is "data":
Append the field value to the data buffer, then append a single U+000A LINE FEED (LF) character to the data buffer.
If the field name is "id":
Set the event stream's last event ID to the field value.
If the field name is "retry":
If the field value consists of only characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), then interpret the field value as an integer in base ten, and set the event stream's reconnection time to that integer. Otherwise, ignore the field.
Otherwise:
The field is ignored.
When the user agent is required to dispatch the event, then the user agent must act as follows:
If the data buffer is an empty string, set the data buffer and the event name buffer to the empty string and abort these steps.
If the data buffer's last character is a U+000A LINE FEED (LF) character, then remove the last character from the data buffer.
If the event name buffer is not the empty string but is also not a valid event type name, as defined by the DOM Events specification, set the data buffer and the event name buffer to the empty string and abort these steps. [DOMEVENTS]
Otherwise, create an event that uses the
MessageEvent interface, with the event name message, which does not bubble, is not
cancelable, and has no default action. The data attribute must be set to
the value of the data buffer, the origin attribute must be set
to the Unicode
serialization of the origin of the event
stream's URL, and the lastEventId attribute
must be set to the last event ID
string of the event source.
If the event name buffer has a value other than the empty string, change the type of the newly created event to equal the value of the event name buffer.
Set the data buffer and the event name buffer to the empty string.
Queue a task to dispatch the newly created
event at the EventSource object.
If an event doesn't have an "id" field, but an
earlier event did set the event source's last event ID
string, then the event's lastEventId field will
be set to the value of whatever the last seen "id" field was.
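The parsing and dispatch steps above can be sketched, non-normatively, as a function over the decoded text of a complete stream. The name parseEventStream and the shape of its return value are assumptions of this example. The sketch handles the "event", "data", "id" and "retry" fields of the text/event-stream format, and it does not check that an event name is a valid DOM event type name; it treats every non-empty name as valid.

```javascript
// Hypothetical parser: returns the events that would be dispatched, in
// order, plus the final last event ID and reconnection time.
function parseEventStream(text) {
  const events = [];
  let dataBuffer = '';
  let eventNameBuffer = '';
  let lastEventId = '';
  let reconnectionTime = null; // null: keep the user agent's default

  function processField(name, value) {
    if (name === 'event') {
      eventNameBuffer = value;
    } else if (name === 'data') {
      dataBuffer += value + '\n';
    } else if (name === 'id') {
      lastEventId = value;
    } else if (name === 'retry') {
      if (/^[0-9]+$/.test(value)) reconnectionTime = parseInt(value, 10);
    }
    // Any other field name is ignored.
  }

  function dispatch() {
    if (dataBuffer === '') {
      eventNameBuffer = '';
      return;
    }
    if (dataBuffer.endsWith('\n')) dataBuffer = dataBuffer.slice(0, -1);
    events.push({
      type: eventNameBuffer !== '' ? eventNameBuffer : 'message',
      data: dataBuffer,
      lastEventId: lastEventId
    });
    dataBuffer = '';
    eventNameBuffer = '';
  }

  // Drop one leading BOM, then split on CRLF, LF, or CR.
  if (text.charCodeAt(0) === 0xFEFF) text = text.slice(1);
  const lines = text.split(/\r\n|\n|\r/);
  if (lines[lines.length - 1] === '') lines.pop();

  for (const line of lines) {
    const colon = line.indexOf(':');
    if (line === '') {
      dispatch();
    } else if (colon === 0) {
      // Comment line: ignored.
    } else if (colon > 0) {
      let value = line.slice(colon + 1);
      if (value.startsWith(' ')) value = value.slice(1);
      processField(line.slice(0, colon), value);
    } else {
      processField(line, '');
    }
  }
  dispatch(); // end of file: dispatch one final time
  return { events: events, lastEventId: lastEventId, reconnectionTime: reconnectionTime };
}
```

Each returned entry carries the last event ID string as it stood when the event was dispatched, so an event without its own "id" field inherits the most recently seen one.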
The following event stream, once followed by a blank line:
data: YHOO
data: +2
data: 10
...would cause an event message with the interface
MessageEvent to be dispatched on the
EventSource object. The event's data attribute would contain
the string YHOO\n+2\n10 (where \n
represents a newline).
This could be used as follows:
var stocks = new EventSource("http://stocks.example.com/ticker.php");
stocks.onmessage = function (event) {
var data = event.data.split('\n');
updateStocks(data[0], data[1], data[2]);
};
...where updateStocks() is a function defined as:
function updateStocks(symbol, delta, value) { ... }
...or some such.
The following stream contains four blocks. The first block has
just a comment, and will fire nothing. The second block has two
fields with names "data" and "id" respectively; an event will be
fired for this block, with the data "first event", and will then
set the last event ID to "1" so that if the connection died between
this block and the next, the server would be sent a Last-Event-ID header with the
value "1". The third block fires an event with data "second event",
and also has an "id" field, this time with no value, which resets
the last event ID to the empty string (meaning no Last-Event-ID header will now be
sent in the event of a reconnection being attempted). Finally, the
last block just fires an event with the data " third event"
(with a single leading space character). Note that the last block
doesn't have to end with a blank line; the end of the stream is
enough to trigger the dispatch of the last event.
: test stream

data: first event
id: 1

data:second event
id

data:  third event
The following stream fires just one event:
data

data
data

data:
The first and last blocks do nothing, since they do not contain any actual data (the data buffer remains at the empty string, and so nothing gets dispatched). The middle block fires an event with the data set to a single newline character.
The following stream fires two identical events:
data:test

data: test
This is because the space after the colon is ignored if present.
Legacy proxy servers are known to, in certain cases, drop HTTP connections after a short timeout. To protect against such proxy servers, authors can include a comment line (one starting with a ':' character) every 15 seconds or so.
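One way an author might implement such a keep-alive on the server is sketched below, non-normatively. The names KEEP_ALIVE_LINE and startKeepAlive are assumptions, as is the write callback, which stands in for whatever the server uses to send text on the open response.

```javascript
// A comment line: it starts with ':' and is ignored by the client.
const KEEP_ALIVE_LINE = ': keep-alive\n';

// Hypothetical helper: calls write() with a comment line at a fixed
// interval (15 seconds unless specified) and returns a function that
// stops the timer, to be called when the connection closes.
function startKeepAlive(write, intervalMs) {
  const timer = setInterval(function () {
    write(KEEP_ALIVE_LINE);
  }, intervalMs || 15000);
  return function stop() {
    clearInterval(timer);
  };
}
```

A server would call startKeepAlive once per open event stream and invoke the returned function when that stream ends, so that timers do not accumulate.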
Authors wishing to relate event source connections to each other or to specific documents previously served might find that relying on IP addresses doesn't work, as individual clients can have multiple IP addresses (due to having multiple proxy servers) and individual IP addresses can have multiple clients (due to sharing a proxy server). It is better to include a unique identifier in the document when it is served and then pass that identifier as part of the URL when the connection is established.
Authors are also cautioned that HTTP chunking can have unexpected negative effects on the reliability of this protocol. Where possible, chunking should be disabled for serving event streams unless the rate of messages is high enough for this not to matter.
Clients that support HTTP's per-server connection limitation
might run into trouble when opening multiple pages from a site if
each page has an EventSource to the same
domain. Authors can avoid this using the relatively complex
mechanism of using unique domain names per connection, or by
allowing the user to enable or disable the EventSource
functionality on a per-page basis, or by sharing a single
EventSource object using a shared worker.
User agents running in controlled environments, e.g. browsers on mobile handsets tied to specific carriers, may offload the management of the connection to a proxy on the network. In such a situation, the user agent for the purposes of conformance is considered to include both the handset software and the network proxy.
For example, a browser on a mobile device, after having established a connection, might detect that it is on a supporting network and request that a proxy server on the network take over the management of the connection. The timeline for such a situation might be as follows:
EventSource constructor.
EventSource constructor (possibly including a Last-Event-ID HTTP header, etc).
This can reduce the total data usage, and can therefore result in considerable power savings.
As well as implementing the existing API and
text/event-stream wire format as defined by this
specification and in more distributed ways as described above,
formats of event framing defined by other applicable
specifications may be supported. This specification does not
define how they are to be parsed or processed.
While an EventSource object's readyState is not CLOSED, and the object has one
or more event listeners registered for message events, there must be a strong
reference from the Window or WorkerUtils
object that the EventSource object's constructor was
invoked from to the EventSource object itself.
If an EventSource object is garbage collected while
its connection is still open, the connection must be closed.
text/event-stream
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
An event stream from an origin distinct from the origin of the content consuming the event stream can result in information leakage. To avoid this, user agents are required to block all cross-origin loads.
Event streams can overwhelm a user agent; a user agent is expected to apply suitable restrictions to avoid depleting local resources because of an overabundance of information from an event stream.
Servers can be overwhelmed if a situation develops in which the server is causing clients to reconnect rapidly. Servers should use a 5xx status code to indicate capacity problems, as this will prevent conforming clients from reconnecting automatically.
Fragment identifiers have no meaning with
text/event-stream resources.
Last-Event-ID
This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]
This section is non-normative.
To enable Web applications to maintain bidirectional
communications with server-side processes, this specification
introduces the WebSocket interface.
This interface does not allow for raw access to the underlying network. For example, this interface could not be used to implement an IRC client without proxying messages through a custom server.
WebSocket interface
[Constructor(in DOMString url, in optional DOMString protocol)]
interface WebSocket {
readonly attribute DOMString URL;
// ready state
const unsigned short CONNECTING = 0;
const unsigned short OPEN = 1;
const unsigned short CLOSING = 2;
const unsigned short CLOSED = 3;
readonly attribute unsigned short readyState;
readonly attribute unsigned long bufferedAmount;
// networking
attribute Function onopen;
attribute Function onmessage;
attribute Function onerror;
attribute Function onclose;
boolean send(in DOMString data);
void close();
};
WebSocket implements EventTarget;
The WebSocket(url, protocol)
constructor takes one or two arguments. The first argument, url, specifies the URL to which to
connect. The second, protocol, if present,
specifies a sub-protocol that the server must support for the
connection to be successful. The sub-protocol name must be a
non-empty ASCII string with no control characters in it (i.e. only
characters in the range U+0020 to U+007E).
When the WebSocket() constructor is invoked, the UA
must run these steps:
Parse a WebSocket URL's components from the
url argument, to obtain host, port, resource name, and secure. If
this fails, throw a SYNTAX_ERR exception and abort
these steps.
If port is a port to which the user
agent is configured to block access, then throw a
SECURITY_ERR exception. (User agents typically block
access to well-known ports like SMTP.)
If protocol is present but is either the
empty string or contains characters with Unicode code points less
than U+0020 or greater than U+007E (i.e. any characters that are
not printable ASCII characters), then throw a
SYNTAX_ERR exception and abort these steps.
Let origin be the ASCII serialization of the
origin of the script that invoked the WebSocket() constructor,
converted to ASCII lowercase.
Return a new WebSocket object, and continue
these steps in the background (without blocking scripts).
Establish a WebSocket connection to a host host, on port port (if one was specified), from origin, with the flag secure, with resource name as the resource name, and with protocol as the protocol (if it is present).
If the "establish a WebSocket
connection" algorithm fails, it triggers the "fail
the WebSocket connection" algorithm, which then invokes
the "close the WebSocket connection" algorithm,
which then establishes that the "WebSocket connection is
closed", which fires the close event as described below.
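The check on the protocol argument can be illustrated with the following non-normative sketch; the function name isValidSubProtocol is an assumption of this example.

```javascript
// Hypothetical helper: true if `protocol` is a non-empty string consisting
// only of characters in the range U+0020 to U+007E.
function isValidSubProtocol(protocol) {
  return typeof protocol === 'string' && /^[\u0020-\u007E]+$/.test(protocol);
}
```

A constructor following the steps above would throw a SYNTAX_ERR exception whenever the argument is present and this check returns false.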
This constructor must be visible when the script's global
object is either a Window object or an object
implementing the WorkerUtils interface.
The URL
attribute must return the result of resolving the URL that was passed to the
constructor. (It doesn't matter what it is resolved relative to,
since we already know it is an absolute URL.)
The readyState
attribute represents the state of the connection. It can have the
following values:
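CONNECTING (numeric value 0)
The connection has not yet been established.
OPEN (numeric value 1)
The WebSocket connection is established and communication is possible.
CLOSING (numeric value 2)
The connection is going through the closing handshake.
CLOSED (numeric value 3)
The connection has been closed or could not be opened.
When the object is created its readyState must be set to
CONNECTING (0).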
The send(data) method transmits data using the
connection. If the readyState attribute is
CONNECTING, it must
raise an INVALID_STATE_ERR exception. If the data argument has any unpaired surrogates, then it
must raise SYNTAX_ERR. If the connection is
established, and the string has no unpaired surrogates, and the WebSocket
closing handshake has not yet started, then the user agent
must send data using the
WebSocket. If the data cannot be sent, e.g. because it would
need to be buffered but the buffer is full, the user agent must
close the WebSocket connection. The method must then
return true if the connection is still established (and the data was
queued or sent successfully), or false if the connection is closing
or closed (e.g. because the user agent just had a buffer overflow
and failed to send the data, or because the WebSocket closing
handshake has started).
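The unpaired-surrogate condition can be checked as in the following non-normative sketch; the function name hasUnpairedSurrogate is an assumption of this example.

```javascript
// Hypothetical helper: true if the string contains a high surrogate that is
// not followed by a low surrogate, or a low surrogate that is not preceded
// by a high surrogate.
function hasUnpairedSurrogate(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xD800 && c <= 0xDBFF) {
      // High surrogate: the next code unit must be a low surrogate.
      const next = s.charCodeAt(i + 1); // NaN past the end of the string
      if (!(next >= 0xDC00 && next <= 0xDFFF)) return true;
      i++; // skip the low half of the valid pair
    } else if (c >= 0xDC00 && c <= 0xDFFF) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}
```

Strings for which this returns true cannot be encoded as UTF-8, which is why send() rejects them.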
The close()
method must run the first matching steps from the following list:
If the readyState attribute is in the CLOSING (2) or CLOSED (3) state:
Do nothing.
The connection is already closing or is already
closed. If it has not already, a close event will eventually fire as described below.
If the WebSocket connection is not yet established:
Fail the WebSocket connection and set the readyState attribute's
value to CLOSING
(2).
The "fail the WebSocket connection"
algorithm invokes the "close the WebSocket
connection" algorithm, which then establishes that the
"WebSocket connection is closed", which fires the
close event as described below.
If the WebSocket closing handshake has not yet been started:
Start the WebSocket closing handshake and set the
readyState
attribute's value to CLOSING (2).
The "start the WebSocket closing
handshake" algorithm eventually invokes the "close
the WebSocket connection" algorithm, which then establishes
that the "WebSocket connection is closed", which
fires the close event as described below.
Otherwise:
Set the readyState attribute's
value to CLOSING
(2).
The WebSocket closing handshake has
started, and will eventually invoke the "close the
WebSocket connection" algorithm, which will establish that
the "WebSocket connection is closed", and thus the
close event will fire, as described below.
The bufferedAmount
attribute must return the number of bytes that have been queued but
not yet sent. This does not include framing overhead incurred by the
protocol. If the connection is closed, this attribute's value will
only increase with each call to the send() method (the number does not
reset to zero once the connection closes).
In this simple example, the bufferedAmount
attribute is used to ensure that updates are sent either at the
rate of one update every 50ms, if the network can handle that rate,
or at whatever rate the network can handle, if that is too
fast.
var socket = new WebSocket('ws://game.example.com:12010/updates');
socket.onopen = function () {
setInterval(function() {
if (socket.bufferedAmount == 0)
socket.send(getUpdateData());
}, 50);
};
The bufferedAmount
attribute can also be used to saturate the network without sending
the data at a higher rate than the network can handle, though this
requires more careful monitoring of the value of the attribute over
time.
The following are the event handlers that must be
supported, as IDL attributes, by all objects implementing the
WebSocket interface:
| Event handler | Event handler event type |
|---|---|
| onopen | open |
| onmessage | message |
| onerror | error |
| onclose | close |
When the WebSocket connection is established, the user
agent must queue a task to first change the readyState attribute's value
to OPEN (1), and then
fire a simple event named open at the WebSocket
object.
When a WebSocket message has been received with text data, the user agent must create an event that uses
the MessageEvent interface, with the event name message, which does not bubble, is not
cancelable, has no default action, and whose data attribute is set to data, and queue a task to check to see
if the readyState
attribute's value is OPEN
(1) or CLOSING (2), and
if so, dispatch the event at the WebSocket object.
When a WebSocket error has been detected, the user agent
must queue a task to check to see if the readyState attribute's value
is OPEN (1) or CLOSING (2), and if so,
fire a simple event named error at the WebSocket
object.
When the WebSocket closing handshake has started, the user
agent must queue a task to change the readyState attribute's value
to CLOSING (2). (If the
close() method was called,
the readyState
attribute's value will already be set to CLOSING (2) when this task
runs.)
When the WebSocket connection is
closed, possibly cleanly, the user agent must
create an event that uses the CloseEvent interface,
with the event name close, which
does not bubble, is not cancelable, has no default action, and whose
wasClean attribute is
set to true if the connection closed cleanly and
false otherwise; and queue a task to first change the
readyState attribute's
value to CLOSED (3), and
then dispatch the event at the WebSocket object.
The task source for all tasks queued in this section is the WebSocket task source.
interface CloseEvent : Event {
readonly attribute boolean wasClean;
void initCloseEvent(in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in boolean wasCleanArg);
};
The initCloseEvent()
method must initialize the event in a manner analogous to the
similarly-named method in the DOM Events interfaces. [DOMEVENTS]
The wasClean
attribute represents whether the connection closed cleanly or
not.
A WebSocket object with an open connection must not
be garbage collected if there are any event listeners registered for
message events.
If a WebSocket object is garbage collected while its
connection is still open, the user agent must close the
WebSocket connection.
This section is non-normative.
Historically, creating an instant messenger chat client as a Web application has required an abuse of HTTP to poll the server for updates while sending upstream notifications as distinct HTTP calls.
This results in a variety of problems:
A simpler solution would be to use a single TCP connection for traffic in both directions. This is what the WebSocket protocol provides. Combined with the WebSocket API, it provides an alternative to HTTP polling for two-way communication from a Web page to a remote server.
The same technique can be used for a variety of Web applications: games, stock tickers, multiuser applications with simultaneous editing, user interfaces exposing server-side services in real time, etc.
This section is non-normative.
The protocol has two parts: a handshake, and then the data transfer.
The handshake from the client looks as follows:
GET /demo HTTP/1.1
Host: example.com
Connection: Upgrade
Sec-WebSocket-Key2: 12998 5 Y3 1 .P00
Sec-WebSocket-Protocol: sample
Upgrade: WebSocket
Sec-WebSocket-Key1: 4 @1 46546xW%0l 1 5
Origin: http://example.com

^n:ds[4U
The handshake from the server looks as follows:
HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Location: ws://example.com/demo
Sec-WebSocket-Protocol: sample

8jKS'y:G*Co,Wxa-
The leading line from the client follows the Request-Line format. The leading line from the server follows the Status-Line format. The Request-Line and Status-Line productions are defined in the HTTP specification.
After the leading line in both cases comes an unordered set of fields, one per line, whose names are ASCII case-insensitive and that each match the following non-normative ABNF: [ABNF]
field = 1*name-char colon [ space ] *any-char cr lf
colon = %x003A ; U+003A COLON (:)
space = %x0020 ; U+0020 SPACE
cr = %x000D ; U+000D CARRIAGE RETURN (CR)
lf = %x000A ; U+000A LINE FEED (LF)
name-char = %x0000-0009 / %x000B-000C / %x000E-0039 / %x003B-10FFFF
; a Unicode character other than U+000A LINE FEED (LF), U+000D CARRIAGE RETURN (CR), or U+003A COLON (:)
any-char = %x0000-0009 / %x000B-000C / %x000E-10FFFF
; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)
The character set for the above ABNF is Unicode. The fields themselves are encoded as UTF-8.
Lines that don't match the above production cause the connection to be aborted.
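The field production above can be matched with a regular expression, as in the following non-normative sketch. The function name parseHandshakeField is an assumption of this example, and lowercasing the ASCII letters of the name is simply one way to honour the case-insensitivity of field names.

```javascript
// Hypothetical helper: parses one handshake field line, including its
// terminating CR LF. Returns { name, value }, or null if the line does not
// match the field production.
function parseHandshakeField(line) {
  const m = /^([^\r\n:]+): ?([^\r\n]*)\r\n$/.exec(line);
  if (m === null) return null;
  // Lowercase ASCII letters only, leaving all other characters untouched.
  const name = m[1].replace(/[A-Z]/g, function (c) {
    return c.toLowerCase();
  });
  return { name: name, value: m[2] };
}
```

A null result corresponds to a line that does not match the production, which causes the connection to be aborted.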
Finally, after the last field, the client sends 10 bytes starting with 0x0D 0x0A and followed by 8 random bytes, part of a challenge, and the server sends 18 bytes starting with 0x0D 0x0A and followed by 16 bytes consisting of a challenge response. The details of this challenge and other parts of the handshake are described in the next section.
Once the client and server have both sent their handshakes, and if the handshake was successful, then the data transfer part starts. This is a two-way communication channel where each side can, independently from the other, send data at will.
Data is sent in the form of UTF-8 text. Each frame of data starts with a 0x00 byte and ends with a 0xFF byte, with the UTF-8 text in between.
The WebSocket protocol uses this framing so that specifications that use the WebSocket protocol can expose such connections using an event-based mechanism instead of requiring users of those specifications to implement buffering and piecing together of messages manually.
To close the connection cleanly, a frame consisting of just a 0xFF byte followed by a 0x00 byte is sent from one peer to ask that the other peer close the connection.
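The framing described above can be sketched as follows. This is a minimal illustration, assuming a JavaScript environment that provides TextEncoder; the names encodeTextFrame and CLOSING_FRAME are hypothetical and not part of the protocol or of any API.

```javascript
// Hypothetical helper: wrap a string in a text frame (0x00, UTF-8 bytes, 0xFF).
function encodeTextFrame(text) {
  const payload = new TextEncoder().encode(text); // UTF-8 encoding of the text
  const frame = new Uint8Array(payload.length + 2);
  frame[0] = 0x00;                // frame type byte for text
  frame.set(payload, 1);          // the UTF-8 text in between
  frame[frame.length - 1] = 0xFF; // terminator; 0xFF never occurs in valid UTF-8
  return frame;
}

// The frame that asks the other peer to close the connection.
const CLOSING_FRAME = Uint8Array.of(0xFF, 0x00);
```

Because the byte 0xFF cannot appear inside well-formed UTF-8, the terminator needs no escaping.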
The protocol is designed to support other frame types in future. Instead of the 0x00 and 0xFF bytes, other bytes might in future be defined. Frames denoted by bytes that do not have the high bit set (0x00 to 0x7F) are treated as a stream of bytes terminated by 0xFF. Frames denoted by bytes that have the high bit set (0x80 to 0xFF) have a leading length indicator, which is encoded as a series of 7-bit bytes stored in octets with the 8th bit being set for all but the last byte. The remainder of the frame is then as much data as was specified. (The closing handshake contains no data and therefore has a length byte of 0x00.)
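The length indicator can be illustrated with the following sketch. The function names encodeLength and decodeLength are hypothetical; the sketch assumes lengths small enough to be represented exactly as JavaScript numbers.

```javascript
// Hypothetical helper: encode a length as 7-bit groups, most significant
// group first, with the high bit set on every byte except the last.
function encodeLength(length) {
  if (!Number.isSafeInteger(length) || length < 0) {
    throw new RangeError('length must be a non-negative safe integer');
  }
  const groups = [length % 128]; // last byte: high bit clear
  length = Math.floor(length / 128);
  while (length > 0) {
    groups.unshift((length % 128) | 0x80); // earlier bytes: high bit set
    length = Math.floor(length / 128);
  }
  return Uint8Array.from(groups);
}

// Hypothetical helper: decode a length starting at `offset`.
// Returns { length, next } or null if the bytes run out first.
function decodeLength(bytes, offset = 0) {
  let length = 0;
  for (let i = offset; i < bytes.length; i++) {
    const b = bytes[i];
    length = length * 128 + (b & 0x7F);
    if ((b & 0x80) === 0) return { length, next: i + 1 };
  }
  return null;
}
```

Under this encoding a length of 300 (2 × 128 + 44) becomes the two bytes 0x82 0x2C, and the closing frame's length of zero is the single byte 0x00.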
This wire format for the data transfer part is described by the following non-normative ABNF, which is given in two alternative forms: the first describing the wire format as allowed by this specification, and the second describing how an arbitrary bytestream would be parsed. [ABNF]
; the wire protocol as allowed by this specification
frames        = *frame
frame         = text-frame / closing-frame
text-frame    = %x00 *( UTF8-char ) %xFF
closing-frame = %xFF %x00

; the wire protocol including error-handling and forward-compatible parsing rules
frames        = *frame
frame         = text-frame / binary-frame
text-frame    = (%x00-7F) *(%x00-FE) %xFF
binary-frame  = (%x80-FF) length < as many bytes as given by the length >
length        = *(%x80-FF) (%x00-7F)
The UTF8-char rule is defined in the UTF-8 specification. [RFC3629]
The above ABNF is intended for a binary octet environment.
At this time, the WebSocket protocol cannot be used to send binary data. Using any of the frame types other than 0x00 and 0xFF is invalid.
The following diagram summarises the protocol:
Handshake
   |
   V
Frame type byte <--------------------------------------.
   |      |                                            |
   |      `--> (0x00 to 0x7F) --> Data... --> 0xFF -->-+
   |                                                   |
   `--> (0x80 to 0xFE) --> Length --> Data... ------->-'
This section is non-normative.
The opening handshake is intended to be compatible with HTTP-based server-side software, so that a single port can be used by both HTTP clients talking to that server and WebSocket clients talking to that server. To this end, the WebSocket client's handshake appears to HTTP servers to be a regular GET request with an Upgrade offer:
GET / HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Fields in the handshake are sent by the client in a random order; the order is not meaningful.
Additional fields are used to select options in the WebSocket protocol. The only options available in this version are the subprotocol selector, Sec-WebSocket-Protocol, and Cookie, which can be used for sending cookies to the server (e.g. as an authentication mechanism). The Sec-WebSocket-Protocol field takes an arbitrary string:
Sec-WebSocket-Protocol: chat
This field indicates the subprotocol (the application-level protocol layered over the WebSocket protocol) that the client intends to use. The server echoes this field in its handshake to indicate that it supports that subprotocol.
The other fields in the handshake are all security-related. The
Host field is used to protect against
DNS rebinding attacks and to allow multiple domains to be served
from one IP address.
Host: example.com
The server includes the hostname in the Sec-WebSocket-Location
field of its handshake, so that both the client and the server can
verify that they agree on which host is in use.
The Origin field is used to
protect against unauthorized cross-origin use of a WebSocket server
by scripts using the WebSocket API in a Web
browser. The server specifies which origin it is willing to receive
requests from by including a Sec-WebSocket-Origin field
with that origin. If multiple origins are authorized, the server
echoes the value in the Origin
field of the client's handshake.
Origin: http://example.com
Finally, the server has to prove to the client that it received
the client's WebSocket handshake, so that the server doesn't accept
connections that are not WebSocket connections. This prevents an
attacker from tricking a WebSocket server by sending it
carefully-crafted packets using XMLHttpRequest or a
form submission.
To prove that the handshake was received, the server has to take
three pieces of information and combine them to form a response. The
first two pieces of information come from the Sec-WebSocket-Key1 and Sec-WebSocket-Key2 fields in
the client handshake:
Sec-WebSocket-Key1: 18x 6]8vM;54 *(5: { U1]8 z [ 8
Sec-WebSocket-Key2: 1_ tx7X d < nw 334J702) 7]o}` 0
For each of these fields, the server has to take the digits from the value to obtain a number (in this case 1868545188 and 1733470270 respectively), then divide that number by the number of space characters in the value (in this case 12 and 10) to obtain a 32-bit number (155712099 and 173347027). These two resulting numbers are then used in the server handshake, as described below.
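A server might process each key field along the following lines. This is only a sketch: keyToNumber is a hypothetical name, and rejecting a value whose digits are not an exact multiple of the space count is an assumption made here, not something stated in this section.

```javascript
// Hypothetical helper: derive the 32-bit number from a Sec-WebSocket-Key value.
// Returns null when the value cannot have come from a correct handshake.
function keyToNumber(key) {
  const digits = key.replace(/[^0-9]/g, '');     // keep only U+0030..U+0039
  const spaces = (key.match(/ /g) || []).length; // count U+0020 only
  if (spaces === 0) return null;                 // would be a division by zero
  const value = Number(digits);                  // a valid key has at most 10 digits
  if (value % spaces !== 0) return null;         // assumption: must divide evenly
  return value / spaces;
}
```

Only U+0020 SPACE is counted; treating other characters as spaces would defeat the purpose of the random characters mixed into the key.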
The counting of spaces is intended to make it impossible to smuggle this field into the resource name; making this even harder is the presence of two such fields, and the use of a newline as the only reliable indicator that the end of the key has been reached. The use of random characters interspersed with the spaces and the numbers ensures that the implementor actually looks for spaces and newlines, instead of treating any character like a space, which would make it again easy to smuggle the fields into the path and trick the server. Finally, dividing by this number of spaces is intended to make sure that even the most naïve of implementations will check for spaces, since if the server does not verify that there are some spaces, the server will try to divide by zero, which is usually fatal (a correct handshake will always have at least one space).
The third piece of information is given after the fields, in the last eight bytes of the handshake, expressed here as they would be seen if interpreted as ASCII:
Tm[K T2u
The concatenation of the number obtained from processing the
Sec-WebSocket-Key1
field, expressed as a big-endian 32 bit number, the number
obtained from processing the Sec-WebSocket-Key2 field,
again expressed as a big-endian 32 bit number, and finally the
eight bytes at the end of the handshake, form a 128 bit string whose
MD5 sum is then used by the server to prove that it read the
handshake.
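The construction of the 128 bit string and its MD5 sum can be sketched as follows, assuming a Node.js environment (the crypto module's createHash function). The names buildChallenge and challengeResponse are hypothetical.

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: concatenate the two 32-bit numbers (big-endian)
// and the eight trailing bytes of the client handshake into 16 bytes.
function buildChallenge(number1, number2, key3) {
  if (key3.length !== 8) throw new RangeError('key3 must be exactly 8 bytes');
  const challenge = new Uint8Array(16);
  const view = new DataView(challenge.buffer);
  view.setUint32(0, number1, false); // false selects big-endian byte order
  view.setUint32(4, number2, false);
  challenge.set(key3, 8);
  return challenge;
}

// Hypothetical helper: the 16-byte MD5 sum the server sends after its fields.
function challengeResponse(number1, number2, key3) {
  const digest = createHash('md5')
    .update(buildChallenge(number1, number2, key3))
    .digest();
  return new Uint8Array(digest);
}
```

The same two functions serve the client, which computes the value it expects and compares it with the sixteen bytes the server sends.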
The handshake from the server is much simpler than the client handshake. The first line is an HTTP Status-Line, with the status code 101 (the HTTP version and reason phrase aren't important):
HTTP/1.1 101 WebSocket Protocol Handshake
The fields follow. Two of the fields are just for compatibility with HTTP:
Upgrade: WebSocket
Connection: Upgrade
Two of the fields are part of the security model described above, echoing the origin and stating the exact host, port, resource name, and whether the connection is expected to be encrypted:
Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Location: ws://example.com/
These fields are checked by the Web browser when it is acting as
a WebSocket client for scripted pages. A server that
only handles one origin and only serves one resource can therefore
just return hard-coded values and does not need to parse the
client's handshake to verify the correctness of the values.
Option fields can also be included. In this version of the
protocol, the main option field is Sec-WebSocket-Protocol,
which indicates the subprotocol that the server speaks. Web browsers
verify that the server included the same value as was specified in
the WebSocket constructor, so a server that speaks
multiple subprotocols has to make sure it selects one based on the
client's handshake and specifies the right one in its handshake.
Sec-WebSocket-Protocol: chat
The server can also set cookie-related option fields to set cookies, as in HTTP.
After the fields, the server sends the aforementioned MD5 sum, a 16 byte (128 bit) value, shown here as if interpreted as ASCII:
fQJ,fN/4F4!~K~MH
This value depends on what the client sends, as described above. If it doesn't match what the client is expecting, the client would disconnect.
Having part of the handshake appear after the fields ensures that both the server and the client verify that the connection is not being interrupted by an HTTP intermediary such as a man-in-the-middle cache or proxy.
This section is non-normative.
The closing handshake is far simpler than the opening handshake.
Either peer can send a 0xFF frame with length 0x00 to begin the closing handshake. Upon receiving a 0xFF frame, the other peer sends an identical 0xFF frame in acknowledgement, if it hasn't already sent one. Upon receiving that 0xFF frame, the first peer then closes the connection, safe in the knowledge that no further data is forthcoming.
After sending a 0xFF frame, a peer does not send any further data; after receiving a 0xFF frame, a peer discards any further data received.
It is safe for both peers to initiate this handshake simultaneously.
The closing handshake is intended to replace the TCP closing handshake (FIN/ACK), on the basis that the TCP closing handshake is not always reliable end-to-end, especially in the presence of man-in-the-middle proxies and other intermediaries.
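The bookkeeping a peer needs for the closing handshake can be sketched as follows. The class and callback names are hypothetical, and the sketch assumes that a peer which merely acknowledges a close then waits for the other side to close the connection.

```javascript
const CLOSE_FRAME = Uint8Array.of(0xFF, 0x00); // 0xFF frame with length 0x00

// Hypothetical helper tracking one peer's side of the closing handshake.
class ClosingHandshake {
  constructor(sendBytes, closeConnection) {
    this.sendBytes = sendBytes;             // callback that writes bytes to the peer
    this.closeConnection = closeConnection; // callback that closes the connection
    this.sentClose = false;
    this.receivedClose = false;
  }
  // Begin the closing handshake (or acknowledge one); sends at most one frame.
  start() {
    if (this.sentClose) return;
    this.sentClose = true;
    this.sendBytes(CLOSE_FRAME);
  }
  // No further data may be sent after our own 0xFF frame.
  canSend() { return !this.sentClose; }
  // Data received after the peer's 0xFF frame is discarded.
  shouldDiscardIncoming() { return this.receivedClose; }
  // Call when a 0xFF frame with length 0 arrives.
  onCloseFrame() {
    if (this.receivedClose) return;
    this.receivedClose = true;
    if (this.sentClose) {
      this.closeConnection(); // we started it (or both did): safe to close now
    } else {
      this.start();           // acknowledge with an identical frame
    }
  }
}
```

When both peers call start() at the same time, each then receives the other's frame with sentClose already set, so both close without sending anything further.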
This section is non-normative.
The WebSocket protocol is designed on the principle that there should be minimal framing (the only framing that exists is to make the protocol frame-based instead of stream-based, and to support a distinction between Unicode text and binary frames). It is expected that metadata would be layered on top of WebSocket by the application layer, in the same way that metadata is layered on top of TCP by the application layer (HTTP).
Conceptually, WebSocket is really just a layer on top of TCP that adds a Web "origin"-based security model for browsers; adds an addressing and protocol naming mechanism to support multiple services on one port and multiple host names on one IP address; layers a framing mechanism on top of TCP to get back to the IP packet mechanism that TCP is built on, but without length limits; and reimplements the closing handshake in-band. Other than that, it adds nothing. Basically it is intended to be as close to just exposing raw TCP to script as possible given the constraints of the Web. It's also designed in such a way that its servers can share a port with HTTP servers, by having its handshake be a valid HTTP Upgrade handshake also.
The protocol is intended to be extensible; future versions will likely introduce a mechanism to compress data and might support sending binary data.
This section is non-normative.
The WebSocket protocol uses the origin model used by Web browsers to restrict which Web pages can contact a WebSocket server when the WebSocket protocol is used from a Web page. Naturally, when the WebSocket protocol is used by a dedicated client directly (i.e. not from a Web page through a Web browser), the origin model is not useful, as the client can provide any arbitrary origin string.
This protocol is intended to fail to establish a connection with servers of pre-existing protocols like SMTP or HTTP, while allowing HTTP servers to opt-in to supporting this protocol if desired. This is achieved by having a strict and elaborate handshake, and by limiting the data that can be inserted into the connection before the handshake is finished (thus limiting how much the server can be influenced).
It is similarly intended to fail to establish a connection when
data from other protocols, especially HTTP, is sent to a WebSocket
server, for example as might happen if an HTML form
were submitted to a WebSocket server. This is primarily achieved by
requiring that the server prove that it read the handshake, which it
can only do if the handshake contains the appropriate parts which
themselves can only be sent by a WebSocket handshake; in
particular, fields starting with Sec- cannot
be set by an attacker from a Web browser, even when using
XMLHttpRequest.
This section is non-normative.
The WebSocket protocol is an independent TCP-based protocol. Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request.
Based on the expert recommendation of the IANA, the WebSocket protocol by default uses port 80 for regular WebSocket connections and port 443 for WebSocket connections tunneled over TLS.
This section is non-normative.
There are several options for establishing a WebSocket connection.
On the face of it, the simplest method would seem to be to use port 80 to get a direct connection to a WebSocket server. Port 80 traffic, however, will often be intercepted by man-in-the-middle HTTP proxies, which can lead to the connection failing to be established.
The most reliable method, therefore, is to use TLS encryption and port 443 to connect directly to a WebSocket server. This has the advantage of being more secure; however, TLS encryption can be computationally expensive.
When a connection is to be made to a port that is shared by an HTTP server (a situation that is quite likely to occur with traffic to ports 80 and 443), the connection will appear to the HTTP server to be a regular GET request with an Upgrade offer. In relatively simple setups with just one IP address and a single server for all traffic to a single hostname, this might allow a practical way for systems based on the WebSocket protocol to be deployed. In more elaborate setups (e.g. with load balancers and multiple servers), a dedicated set of hosts for WebSocket connections separate from the HTTP servers is probably easier to manage.
This section is non-normative.
The client can request that the server use a specific subprotocol
by including the Sec-WebSocket-Protocol
field in its handshake. If it is specified, the server needs to
include the same field and value in its response for the connection
to be established.
These subprotocol names do not need to be registered, but if a subprotocol is intended to be implemented by multiple independent WebSocket servers, potential clashes with the names of subprotocols defined independently can be avoided by using names that contain the domain name of the subprotocol's originator. For example, if Example Corporation were to create a Chat subprotocol to be implemented by many servers around the Web, they could name it "chat.example.com". If the Example Organisation called their competing subprotocol "example.org's chat protocol", then the two subprotocols could be implemented by servers simultaneously, with the server dynamically selecting which subprotocol to use based on the value sent by the client.
Subprotocols can be versioned in backwards-incompatible ways by changing the subprotocol name, e.g. going from "bookings.example.net" to "bookings.example.net2". These subprotocols would be considered completely separate by WebSocket clients. Backwards-compatible versioning can be implemented by reusing the same subprotocol string but carefully designing the actual subprotocol to support this kind of extensibility.
When an implementation is required to send data as part of the WebSocket protocol, the implementation may delay the actual transmission arbitrarily, e.g. buffering data so as to send fewer IP packets.
The steps to parse a WebSocket URL's components from a string url are as follows. These steps return either a host, a port, a resource name, and a secure flag, or they fail.
If the url string is not an absolute URL, then fail this algorithm. [WEBADDRESSES]
Resolve the url string using the resolve a Web address algorithm defined by the Web addresses specification, with the URL character encoding set to UTF-8. [WEBADDRESSES] [RFC3629]
It doesn't matter what it is resolved relative to, since we already know it is an absolute URL at this point.
If url does not have a <scheme> component whose value,
when converted to ASCII lowercase, is either "ws" or "wss", then fail this
algorithm.
If url has a <fragment> component, then fail this algorithm.
If the <scheme> component of url is "ws", set secure to false; otherwise (the <scheme> component is "wss"), set secure to true.
Let host be the value of the <host> component of url, converted to ASCII lowercase.
If url has a <port> component, then let port be that component's value; otherwise, there is no explicit port.
If there is no explicit port, then: if secure is false, let port be 80, otherwise let port be 443.
Let resource name be the value of the <path> component (which might be empty) of url.
If resource name is the empty string, set it to a single character U+002F SOLIDUS (/).
If url has a <query> component, then append a single U+003F QUESTION MARK character (?) to resource name, followed by the value of the <query> component.
Return host, port, resource name, and secure.
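The parsing steps above can be approximated with the URL class available in browsers and Node.js. This is a sketch under stated assumptions: the URL class follows its own parsing rules rather than the Web addresses specification, parseWebSocketUrl is a hypothetical name, and an empty query component (a bare "?") is dropped by this approach.

```javascript
// Hypothetical helper: returns { host, port, resourceName, secure } or null on failure.
function parseWebSocketUrl(input) {
  let url;
  try {
    url = new URL(input); // throws for strings that are not absolute URLs
  } catch {
    return null;
  }
  if (url.protocol !== 'ws:' && url.protocol !== 'wss:') return null;
  if (url.href.includes('#')) return null; // any fragment, even an empty one, fails
  const secure = url.protocol === 'wss:';
  const host = url.hostname.toLowerCase();
  // url.port is the empty string when the port is absent (or the scheme default).
  const port = url.port !== '' ? Number(url.port) : (secure ? 443 : 80);
  let resourceName = url.pathname === '' ? '/' : url.pathname;
  resourceName += url.search; // already includes the leading "?" when non-empty
  return { host, port, resourceName, secure };
}
```

A caller can then pass the four returned values directly to the connection-establishment steps.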
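The steps to construct a WebSocket URL from a host, a port, a resource name, and a secure flag, are as follows:
Let url be the empty string.
If the secure flag is false, then append the string "ws://" to url. Otherwise, append the string "wss://" to url.
Append host to url.
If the secure flag is false and port is not 80, or if the secure flag is true and port is not 443, then append the string ":" followed by port to url.
Append resource name to url.
Return url.
This section only applies to user agents, not to servers.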
This specification doesn't currently define a limit to the number of simultaneous connections that a client can establish to a server.
When the user agent is to establish a WebSocket connection to a host host, on a port port, from an origin whose ASCII serialization is origin, with a flag secure, with a string giving a resource name, and optionally with a string giving a protocol, it must run the following steps. The host must be ASCII-only (i.e. it must have been punycode-encoded already if necessary). The resource name and protocol strings must be non-empty strings of ASCII characters in the range U+0020 to U+007E. The resource name string must start with a U+002F SOLIDUS character (/) and must not contain a U+0020 SPACE character. [ORIGIN]
If the user agent already has a WebSocket connection to the remote host (IP address) identified by host, even if known by another name, wait until that connection has been established or for that connection to have failed. If multiple connections to the same IP address are attempted simultaneously, the user agent must serialize them so that there is no more than one connection at a time running through the following steps.
This makes it harder for a script to perform a denial of service attack by just opening a large number of WebSocket connections to a remote host.
There is no limit to the number of established WebSocket connections a user agent can have with a single remote host. Servers can refuse to connect users with an excessive number of connections, or disconnect resource-hogging users when suffering high load.
Connect: If the user agent is configured to use a proxy when using the WebSocket protocol to connect to host host and/or port port, then connect to that proxy and ask it to open a TCP connection to the host given by host and the port given by port.
For example, if the user agent uses an HTTP proxy for all traffic, then if it was to try to connect to port 80 on server example.com, it might send the following lines to the proxy server:
CONNECT example.com:80 HTTP/1.1
Host: example.com
If there was a password, the connection might look like:
CONNECT example.com:80 HTTP/1.1
Host: example.com
Proxy-authorization: Basic ZWRuYW1vZGU6bm9jYXBlcyE=
Otherwise, if the user agent is not configured to use a proxy, then open a TCP connection to the host given by host and the port given by port.
Implementations that do not expose explicit UI for selecting a proxy for WebSocket connections separate from other proxies are encouraged to use a SOCKS proxy for WebSocket connections, if available, or failing that, to prefer the proxy configured for HTTPS connections over the proxy configured for HTTP connections.
For the purpose of proxy autoconfiguration scripts, the URL to pass the function must be constructed from host, port, resource name, and the secure flag using the steps to construct a WebSocket URL.
The WebSocket protocol can be identified in proxy
autoconfiguration scripts from the scheme ("ws:" for unencrypted connections and "wss:" for encrypted connections).
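The URL handed to a proxy autoconfiguration script, and later compared against Sec-WebSocket-Location, could be built roughly as follows. The function name is hypothetical, and the sketch assumes, as with the Host field, that the port is written only when it differs from the scheme's default.

```javascript
// Hypothetical helper: build a WebSocket URL string from its four components.
function constructWebSocketUrl(host, port, resourceName, secure) {
  let url = secure ? 'wss://' : 'ws://';
  url += host;
  const defaultPort = secure ? 443 : 80;
  if (port !== defaultPort) url += ':' + port; // omit the default port
  return url + resourceName;                   // resource name starts with "/"
}
```

With the values host example.com, port 80, resource name /demo and secure false, this produces ws://example.com/demo.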
If the connection could not be opened, then fail the WebSocket connection and abort these steps.
If secure is true, perform a TLS handshake over the connection. If this fails (e.g. the server's certificate could not be verified), then fail the WebSocket connection and abort these steps. Otherwise, all further communication on this channel must run through the encrypted tunnel. [RFC2246]
User agents must use the Server Name Indication extension in the TLS handshake. [RFC4366]
Send the UTF-8 string "GET" followed by a UTF-8-encoded U+0020 SPACE character to the remote side (the server).
Send the resource name value, encoded as UTF-8.
Send another UTF-8-encoded U+0020 SPACE character, followed by the UTF-8 string "HTTP/1.1", followed by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF).
Let fields be an empty list of strings.
Add the string "Upgrade: WebSocket" to fields.
Add the string "Connection: Upgrade" to fields.
Let hostport be an empty string.
Append the host value, converted to ASCII lowercase, to hostport.
If secure is false, and port is not 80, or if secure is true, and port is not 443, then append a U+003A COLON character (:) followed by the value of port, expressed as a base-ten integer, to hostport.
Add the string consisting of the concatenation of the string "Host:", a U+0020 SPACE character, and hostport, to fields.
Add the string consisting of the concatenation of the string "Origin:", a U+0020 SPACE character, and the origin value, converted to ASCII lowercase, to fields.
If there is no protocol, then skip this step.
Otherwise, add the string consisting of the concatenation of the string "Sec-WebSocket-Protocol:", a U+0020 SPACE character, and the protocol value, to fields.
If the client has any cookies that would be relevant to a resource accessed over HTTP, if secure is false, or HTTPS, if it is true, on host host, port port, with resource name as the path (and possibly query parameters), then add to fields any HTTP headers that would be appropriate for that information. [HTTP] [COOKIES]
This includes "HttpOnly" cookies (cookies with the
http-only-flag set to true); the WebSocket protocol is not
considered a non-HTTP API for the purpose of cookie
processing.
Let spaces1 be a random integer from 1 to 12 inclusive.
Let spaces2 be a random integer from 1 to 12 inclusive.
For example, 5 and 9.
Let max1 be the largest integer not greater than 4,294,967,295 divided by spaces1.
Let max2 be the largest integer not greater than 4,294,967,295 divided by spaces2.
Continuing the example, 858,993,459 and 477,218,588.
Let number1 be a random integer from 0 to max1 inclusive.
Let number2 be a random integer from 0 to max2 inclusive.
For example, 777,007,543 and 114,997,259.
Let product1 be the result of multiplying number1 and spaces1 together.
Let product2 be the result of multiplying number2 and spaces2 together.
Continuing the example, 3,885,037,715 and 1,034,975,331.
Let key1 be a string consisting of product1, expressed in base ten using the numerals in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).
Let key2 be a string consisting of product2, expressed in base ten using the numerals in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).
Continuing the example, "3885037715" and "1034975331".
Insert spaces1 U+0020 SPACE characters into key1 at random positions.
Insert spaces2 U+0020 SPACE characters into key2 at random positions.
Continuing the example, this could lead to "388 5037 7 15" and "1 0 3 4 97 53 31".
Insert between one and twelve random characters from the ranges U+0021 to U+002F and U+003A to U+007E into key1 at random positions.
Insert between one and twelve random characters from the ranges U+0021 to U+002F and U+003A to U+007E into key2 at random positions.
This corresponds to random printable ASCII characters other than the digits and the U+0020 SPACE character.
Continuing the example, this could lead to "388P O503D&ul7 {K%gX( %7 15" and "1 N ?|k UT0or 3o 4 I97N 5-S3O 31".
Add the string consisting of the concatenation of the string "Sec-WebSocket-Key1:", a U+0020 SPACE character, and the key1 value, to fields.
Add the string consisting of the concatenation of the string "Sec-WebSocket-Key2:", a U+0020 SPACE character, and the key2 value, to fields.
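The key generation steps can be sketched as follows. This is an illustration only: generateKey and its helpers are hypothetical names, Math.random stands in for whatever random source an implementation would really use, and the sketch inserts the random characters before the spaces so that the spaces can always be kept away from the start and end of the string (a conservative choice made here, not a requirement stated above).

```javascript
// Random integer in [min, max], both inclusive.
function randomInt(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

function insertAt(s, pos, ch) {
  return s.slice(0, pos) + ch + s.slice(pos);
}

// Hypothetical helper: returns the key string plus the values used to build it.
function generateKey() {
  const spaces = randomInt(1, 12);
  const max = Math.floor(4294967295 / spaces);
  const number = randomInt(0, max);
  let key = String(number * spaces); // the product, in base ten

  // Between one and twelve characters from U+0021..U+002F and U+003A..U+007E.
  const noiseCount = randomInt(1, 12);
  for (let i = 0; i < noiseCount; i++) {
    let code = randomInt(0x21, 0x74); // 84 candidate code points
    if (code > 0x2F) code += 10;      // skip over the digits U+0030..U+0039
    key = insertAt(key, randomInt(0, key.length), String.fromCharCode(code));
  }

  // The spaces, at interior positions only (key has at least two characters here).
  for (let i = 0; i < spaces; i++) {
    key = insertAt(key, randomInt(1, key.length - 1), ' ');
  }
  return { key, number, spaces };
}
```

The returned number is the value the client later uses when computing the expected challenge response.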
For each string in fields, in a random order: send the string, encoded as UTF-8, followed by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF). It is important that the fields be output in a random order so that servers not depend on the particular order used by any particular client.
Send a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF).
Let key3 be a string consisting of eight random bytes (or equivalently, a random 64 bit integer encoded in big-endian order).
For example, 0x47 0x30 0x22 0x2D 0x5A 0x3F 0x47 0x58.
Send key3 to the server.
Read bytes from the server until either the connection closes, or a 0x0A byte is read. Let field be these bytes, including the 0x0A byte.
If field is not at least seven bytes long, or if the last two bytes aren't 0x0D and 0x0A respectively, or if it does not contain at least two 0x20 bytes, then fail the WebSocket connection and abort these steps.
User agents may apply a timeout to this step, failing the WebSocket connection if the server does not send back data in a suitable time period.
Let code be the substring of field that starts from the byte after the first 0x20 byte, and ends with the byte before the second 0x20 byte.
If code is not three bytes long, or if any of the bytes in code are not in the range 0x30 to 0x39, then fail the WebSocket connection and abort these steps.
If code, interpreted as UTF-8, is "101", then move to the next step.
If code, interpreted as UTF-8, is "407", then either close the connection and jump
back to step 2, providing appropriate authentication information,
or fail the WebSocket connection. 407 is the code
used by HTTP meaning "Proxy Authentication Required". User agents
that support proxy authentication must interpret the response as
defined by HTTP (e.g. to find and interpret the Proxy-Authenticate
header).
Otherwise, fail the WebSocket connection and abort these steps.
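The checks on the first line of the server's response can be sketched as follows. The function name readStatusCode is hypothetical; it takes the bytes read up to and including the 0x0A byte and returns the three-digit code as a string, or null where the steps above would fail the connection.

```javascript
// Hypothetical helper: extract the status code from the server's leading line.
function readStatusCode(field) {
  const n = field.length;
  if (n < 7 || field[n - 2] !== 0x0D || field[n - 1] !== 0x0A) return null;
  const first = field.indexOf(0x20);
  const second = first === -1 ? -1 : field.indexOf(0x20, first + 1);
  if (second === -1) return null;                // fewer than two 0x20 bytes
  const code = field.subarray(first + 1, second); // bytes between the two spaces
  if (code.length !== 3) return null;
  for (const b of code) {
    if (b < 0x30 || b > 0x39) return null;       // must be ASCII digits
  }
  return String.fromCharCode(...code);
}
```

A caller would continue on "101", optionally retry with proxy credentials on "407", and fail the connection for any other result.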
Let fields be a list of name-value pairs, initially empty.
Field: Let name and value be empty byte arrays.
Read a byte from the server.
If the connection closes before this byte is received, then fail the WebSocket connection and abort these steps.
Otherwise, handle the byte as described in the appropriate entry below:
This reads a field name, terminated by a colon, converting upper-case ASCII letters to lowercase, and aborting if a stray CR or LF is found.
Read a byte from the server.
If the connection closes before this byte is received, then fail the WebSocket connection and abort these steps.
Otherwise, handle the byte as described in the appropriate entry below:
This skips past a space character after the colon, if necessary.
Read a byte from the server.
If the connection closes before this byte is received, then fail the WebSocket connection and abort these steps.
Otherwise, handle the byte as described in the appropriate entry below:
This reads a field value, terminated by a CRLF.
Read a byte from the server.
If the connection closes before this byte is received, or if the byte is not a 0x0A byte (ASCII LF), then fail the WebSocket connection and abort these steps.
This skips past the LF byte of the CRLF after the field.
Append an entry to the fields list that has the name given by the string obtained by interpreting the name byte array as a UTF-8 byte stream and the value given by the string obtained by interpreting the value byte array as a UTF-8 byte stream.
Return to the "Field" step above.
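The field-reading loop can be sketched as follows. The per-byte rules used here are inferred from the explanatory notes above (lowercase the name, stop at the colon, skip one optional space, end the value at CRLF, fail on a stray CR or LF) and should be treated as assumptions; parseFields is a hypothetical name, and running out of bytes stands in for the connection closing.

```javascript
// Hypothetical helper: parse "name: value" lines up to and including the blank line.
// Returns { fields, rest } or null where the connection would be failed.
function parseFields(bytes) {
  const decoder = new TextDecoder('utf-8');
  const fields = [];
  let i = 0;
  const next = () => (i < bytes.length ? bytes[i++] : -1); // -1: no more bytes
  for (;;) {
    const name = [];
    const value = [];
    // Read the field name, lowercasing ASCII A-Z.
    for (;;) {
      const b = next();
      if (b === -1 || b === 0x0A) return null;
      if (b === 0x0D) {
        if (name.length !== 0) return null; // stray CR inside a name
        if (next() !== 0x0A) return null;   // LF ending the blank line
        return { fields, rest: bytes.subarray(i) };
      }
      if (b === 0x3A) break;                // colon ends the name
      name.push(b >= 0x41 && b <= 0x5A ? b + 0x20 : b);
    }
    // Skip one optional space, then read the value up to CR.
    let b = next();
    if (b === 0x20) b = next();
    while (b !== 0x0D) {
      if (b === -1 || b === 0x0A) return null;
      value.push(b);
      b = next();
    }
    if (next() !== 0x0A) return null;       // LF of the CRLF after the field
    fields.push({
      name: decoder.decode(Uint8Array.from(name)),
      value: decoder.decode(Uint8Array.from(value)),
    });
  }
}
```

The rest value holds whatever follows the blank line, which for a server handshake begins with the sixteen-byte challenge response.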
Fields processing: Read a byte from the server.
If the connection closes before this byte is received, or if the byte is not a 0x0A byte (ASCII LF), then fail the WebSocket connection and abort these steps.
This skips past the LF byte of the CRLF after the blank line after the fields.
If there is not exactly one entry in the fields list whose name is "upgrade", or
if there is not exactly one entry in the fields list whose name is "connection", or
if there is not exactly one entry in the fields list whose name is "sec-websocket-origin", or
if there is not exactly one entry in the fields list whose name is "sec-websocket-location", or
if the protocol was specified but there is not exactly one entry in the fields list whose name is "sec-websocket-protocol", or
if there are any entries in the fields list
whose names are the empty string, then fail the WebSocket
connection and abort these steps. Otherwise, handle each
entry in the fields list as follows:
"upgrade": If the value is not exactly equal to the string "WebSocket", then fail the WebSocket connection and abort these steps.
"connection": If the value, converted to ASCII lowercase, is not exactly equal to the string "upgrade", then fail the WebSocket connection and abort these steps.
"sec-websocket-origin": If the value is not exactly equal to origin, converted to ASCII lowercase, then fail the WebSocket connection and abort these steps. [ORIGIN]
"sec-websocket-location": If the value is not exactly equal to a string obtained from the steps to construct a WebSocket URL from host, port, resource name, and the secure flag, then fail the WebSocket connection and abort these steps.
"sec-websocket-protocol": If there was a protocol specified, and the value is not exactly equal to protocol, then fail the WebSocket connection and abort these steps. (If no protocol was specified, the field is ignored.)
"set-cookie" or "set-cookie2" or another cookie-related field name: If the relevant specification is supported by the user agent, handle the cookie as defined by the appropriate specification, with the resource being the one with the host host, the port port, the path (and possibly query parameters) resource name, and the scheme http if secure is false and https if secure is true. [COOKIES]
Let challenge be the concatenation of number1, expressed as a big-endian 32 bit integer, number2, expressed as a big-endian 32 bit integer, and the eight bytes of key3 in the order they were sent on the wire.
Using the examples given earlier, this leads to the 16 bytes 0x2E 0x50 0x31 0xB7 0x06 0xDA 0xB8 0x0B 0x47 0x30 0x22 0x2D 0x5A 0x3F 0x47 0x58.
Let expected be the MD5 fingerprint of challenge as a big-endian 128 bit string. [RFC1321]
Using the examples given earlier, this leads to the 16 bytes 0x30 0x73 0x74 0x33 0x52 0x6C 0x26 0x71 0x2D 0x32 0x5A 0x55 0x5E 0x77 0x65 0x75. In ASCII, these bytes correspond to the string "0st3Rl&q-2ZU^weu".
Read sixteen bytes from the server. Let reply be those bytes.
If the connection closes before these bytes are received, then fail the WebSocket connection and abort these steps.
If reply does not exactly equal expected, then fail the WebSocket connection and abort these steps.
The WebSocket connection is established. Now the user agent must send and receive to and from the connection as described in the next section.
Once a WebSocket connection is established, the user agent must run through the following state machine for the bytes sent by the server. If at any point during these steps a read is attempted but fails because the WebSocket connection is closed, then abort.
Try to read a byte from the server. Let frame type be that byte.
Let error be false.
Handle the frame type byte as follows:
Run these steps:
Let length be zero.
Length: Read a byte, let b be that byte.
Let bv be an integer corresponding to the low 7 bits of b (the value you would get by anding b with 0x7F).
Multiply length by 128, add bv to that result, and store the final result in length.
If the high-order bit of b is set (i.e. if b anded with 0x80 returns 0x80), then return to the step above labeled length.
Read length bytes.
It is possible for a server to (innocently or maliciously) send frames with lengths greater than 2^31 or 2^32 bytes, overflowing a signed or unsigned 32-bit integer. User agents may therefore impose implementation-specific limits on the lengths of invalid frames that they will skip; even supporting frames 2GB in length is considered, at the time of writing, as going well above and beyond the call of duty.
Discard the read bytes.
If the frame type is 0xFF and the length was 0, then run the following substeps:
Otherwise, let error be true.
Run these steps:
Let raw data be an empty byte array.
Data: Read a byte, let b be that byte.
If b is not 0xFF, then append b to raw data and return to the previous step (labeled data).
Interpret raw data as a UTF-8 string, and store that string in data.
If frame type is 0x00, then a WebSocket message has been received with text data. Otherwise, discard the data and let error be true.
If error is true, then a WebSocket error has been detected.
Return to the first step to read the next byte.
If the user agent is faced with content that is too large to be handled appropriately, runs out of resources for buffering incoming data, or hits an artificial resource limit intended to avoid resource starvation, then it must fail the WebSocket connection.
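As a non-normative illustration, the state machine above could be sketched in JavaScript as follows. The function name parseFrames and the shape of the returned event objects are assumptions made for this sketch. It operates on a buffer of bytes that has already been received rather than on a live connection, and, because the substeps for the 0xFF closing frame are not reproduced above, it only records that such a frame was seen.

```javascript
// Hypothetical helper: walks a Uint8Array of received bytes and reports
// what the client-side state machine would observe.
function parseFrames(bytes) {
  const events = [];
  let i = 0;
  while (i < bytes.length) {
    const frameType = bytes[i++];
    let error = false;
    if (frameType & 0x80) {
      // High-order bit set: a length-prefixed frame, 7 bits of length per byte.
      let length = 0;
      let b;
      do {
        if (i >= bytes.length) return events; // read failed: abort
        b = bytes[i++];
        length = length * 128 + (b & 0x7f);
      } while (b & 0x80);
      if (i + length > bytes.length) return events; // not all bytes available
      i += length; // read the bytes and discard them
      if (frameType === 0xff && length === 0) {
        events.push({ type: 'closing' }); // closing frame seen (assumed event name)
      } else {
        error = true;
      }
    } else {
      // High-order bit clear: data runs up to the next 0xFF byte.
      const end = bytes.indexOf(0xff, i);
      if (end === -1) return events; // terminator not received yet
      const data = new TextDecoder('utf-8').decode(bytes.subarray(i, end));
      i = end + 1;
      if (frameType === 0x00) {
        events.push({ type: 'message', data });
      } else {
        error = true; // unknown frame type: the data is discarded
      }
    }
    if (error) events.push({ type: 'error' });
  }
  return events;
}
```

A real user agent would run these steps incrementally as bytes arrive and would apply its own limit to length before skipping a frame.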
Once a WebSocket connection is established, but before the WebSocket closing handshake has started, the user agent must use the following steps to send data using the WebSocket:
Send a 0x00 byte to the server.
Encode data using UTF-8 and send the resulting byte stream to the server.
Send a 0xFF byte to the server.
Once the WebSocket closing handshake has started, the user agent must not send any further data on the connection.
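As a non-normative illustration, the three sending steps above amount to wrapping the UTF-8 encoding of the text between a 0x00 byte and a 0xFF byte. The function name frameText is an assumption made for this sketch.

```javascript
// Hypothetical helper: builds the bytes of one text frame.
function frameText(data) {
  const payload = new TextEncoder().encode(data); // always UTF-8
  const frame = new Uint8Array(payload.length + 2);
  frame[0] = 0x00;                 // start of frame
  frame.set(payload, 1);           // UTF-8 encoded data
  frame[frame.length - 1] = 0xff;  // end of frame
  return frame;
}
```

Because UTF-8 never produces a 0xFF byte, the terminator cannot appear inside the payload.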
Once a WebSocket connection is established, the user agent must use the following steps to start the WebSocket closing handshake. These steps must be run asynchronously relative to whatever algorithm invoked this one.
If the WebSocket closing handshake has started, then abort these steps.
Send a 0xFF byte to the server.
Send a 0x00 byte to the server.
The WebSocket closing handshake has started.
Wait a user-agent-determined length of time, or until the WebSocket connection is closed.
The closing handshake finishes once the server returns the 0xFF packet, as described above.
If at any point there is a fatal problem with sending data to the server, the user agent must fail the WebSocket connection.
When a client is to interpret a byte stream as UTF-8 but finds that the byte stream is not in fact a valid UTF-8 stream, then any bytes or sequences of bytes that are not valid UTF-8 sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER.
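As a non-normative illustration, a decoder that substitutes U+FFFD for invalid input is available in environments that provide the TextDecoder interface. The function name decodeLenient is an assumption made for this sketch, and TextDecoder follows its own specification's rules for how many replacement characters a given run of invalid bytes produces, which might not match every reading of the requirement above.

```javascript
// TextDecoder is non-fatal by default: invalid sequences become U+FFFD
// instead of causing an exception.
const lenientDecoder = new TextDecoder('utf-8');

// Hypothetical helper wrapping the decoder.
function decodeLenient(bytes) {
  return lenientDecoder.decode(bytes);
}
```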
This section only applies to servers.
When a client starts a WebSocket connection, it sends its part of the opening handshake. The server must parse at least part of this handshake in order to obtain the necessary information to generate the server part of the handshake.
The client handshake consists of the following parts. If the server, while reading the handshake, finds that the client did not send a handshake that matches the description below, the server should abort the WebSocket connection.
The three-character UTF-8 string "GET".
A UTF-8-encoded U+0020 SPACE character (0x20 byte).
A string consisting of all the bytes up to the next UTF-8-encoded U+0020 SPACE character (0x20 byte). The result of decoding this string as a UTF-8 string is the name of the resource requested by the client. If the server only supports one resource, then this can safely be ignored; the client verifies that the right resource is supported based on the information included in the server's own handshake. The resource name will begin with U+002F SOLIDUS character (/) and will only include characters in the range U+0021 to U+007E.
A string of bytes terminated by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF). All the characters from the second 0x20 byte up to the first 0x0D 0x0A byte pair in the data from the client can be safely ignored. (It will probably be the string "HTTP/1.1".)
A series of fields.
Each field is terminated by a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF). The end of the fields is denoted by the terminating CRLF pair being followed immediately by another CRLF pair.
In other words, the fields start with the first 0x0D 0x0A byte pair, end with the first 0x0D 0x0A 0x0D 0x0A byte sequence, and are separated from each other by 0x0D 0x0A byte pairs.
The fields are encoded as UTF-8.
Each field consists of a name, consisting of one or more characters in the ranges U+0021 to U+0039 and U+003B to U+007E, followed by a U+003A COLON character (:) and a U+0020 SPACE character, followed by zero or more characters forming the value.
The expected field names, the meaning of their corresponding values, and the processing servers are required to apply to those fields, are described below, after the description of the client handshake.
After the first 0x0D 0x0A 0x0D 0x0A byte sequence, indicating the end of the fields, the client sends eight random bytes. These are used in constructing the server handshake.
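As a non-normative illustration, a server running on Node.js could split the client handshake into the parts described above roughly as follows. The function name parseClientHandshake and the property names of the returned object are assumptions made for this sketch, and Buffer is specific to Node.js.

```javascript
// Hypothetical helper: splits a client handshake held in a Node.js Buffer.
// Returns null if the handshake has not been received in full yet.
function parseClientHandshake(buf) {
  const end = buf.indexOf('\r\n\r\n');
  if (end === -1 || buf.length < end + 12) return null; // need fields + 8 key bytes
  const lines = buf.subarray(0, end).toString('utf8').split('\r\n');
  const requestLine = lines.shift().split(' ');
  if (requestLine[0] !== 'GET' || requestLine.length < 2) {
    throw new Error('not a WebSocket handshake');
  }
  const fields = new Map();
  for (const line of lines) {
    const sep = line.indexOf(':');
    if (sep < 1 || line[sep + 1] !== ' ') throw new Error('malformed field');
    // Field names are compared case-insensitively, so store them lowercased.
    fields.set(line.slice(0, sep).toLowerCase(), line.slice(sep + 2));
  }
  return {
    resource: requestLine[1],                  // e.g. "/"
    fields,                                    // lowercased name -> value
    key3: buf.subarray(end + 4, end + 12),     // the eight bytes after the fields
  };
}
```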
The expected field names, and the meaning of their corresponding values, are as follows. Field names must be compared in an ASCII case-insensitive manner.
Upgrade: Invariant part of the handshake. Will always have a value that is an ASCII case-insensitive match for the string "WebSocket".
Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a different value, to avoid vulnerability to cross-protocol attacks.
Connection: Invariant part of the handshake. Will always have a value that is an ASCII case-insensitive match for the string "Upgrade".
Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a different value, to avoid vulnerability to cross-protocol attacks.
Host: The value gives the hostname that the client intended to use when opening the WebSocket. It would be of interest in particular to virtual hosting environments, where one server might serve multiple hosts, and might therefore want to return different data.
Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a value that does not match the server's host name, to avoid vulnerability to cross-protocol attacks and DNS rebinding attacks.
Origin: The value gives the scheme, hostname, and port (if it's not the default port for the given scheme) of the page that asked the client to open the WebSocket. It would be interesting if the server's operator had deals with operators of other sites, since the server could then decide how to respond (or indeed, whether to respond) based on which site was requesting a connection. [ORIGIN]
Can be safely ignored, though the server should abort the WebSocket connection if this field is absent or has a value that does not match one of the origins the server is expecting to communicate with, to avoid vulnerability to cross-protocol attacks and cross-site scripting attacks.
Sec-WebSocket-Protocol: The value gives the name of a subprotocol that the client is intending to select. It would be interesting if the server supports multiple protocols or protocol versions.
Can be safely ignored, though the server may abort the WebSocket connection if the field is absent but the conventions for communicating with the server are such that the field is expected; and the server should abort the WebSocket connection if the field has a value that does not match one of the subprotocols that the server supports, to avoid integrity errors once the connection is established.
Sec-WebSocket-Key1, Sec-WebSocket-Key2: The values provide the information required for computing the server's handshake, as described in the next section.
Other fields can be used, such as "Cookie", for authentication
purposes. Their semantics are equivalent to the semantics of the
HTTP headers with the same names.
If a server reads fields for authentication
purposes (such as Cookie), or if a server
assumes that its clients are authorized on the basis that they can
connect (e.g. because they are on an intranet firewalled from the
public Internet), then the server should also verify that the
client's handshake includes the invariant "Upgrade" and
"Connection" parts of the handshake, and should send the server's
handshake before changing any user data. Otherwise, an attacker
could trick a client into sending WebSocket frames to a server
(e.g. using XMLHttpRequest) and cause the server to
perform actions on behalf of the user without the user's
consent. (Sending the server's handshake ensures that the frames
were not sent as part of a cross-protocol attack, since other
protocols do not send the necessary components in the client's
initial handshake for forming the server's handshake.)
Unrecognized fields can be safely ignored, and are probably either the result of intermediaries injecting fields unrelated to the operation of the WebSocket protocol, or clients that support future versions of the protocol offering options that the server doesn't support.
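As a non-normative illustration, the recommended checks on the "Upgrade", "Connection", "Host", and "Origin" fields could be sketched as follows. The function name checkClientFields, the Map keyed by lowercased field names, and the return convention are assumptions made for this sketch.

```javascript
// Hypothetical helper: returns null if the fields look acceptable, or the
// name of the first field that should cause the server to abort.
// `fields` is assumed to be a Map from lowercased field name to value.
function checkClientFields(fields, expectedHost, allowedOrigins) {
  const same = (value, expected) =>
    typeof value === 'string' && value.toLowerCase() === expected.toLowerCase();
  if (!same(fields.get('upgrade'), 'WebSocket')) return 'Upgrade';
  if (!same(fields.get('connection'), 'Upgrade')) return 'Connection';
  if (!same(fields.get('host'), expectedHost)) return 'Host';
  if (!allowedOrigins.includes(fields.get('origin'))) return 'Origin';
  return null; // unrecognized fields are simply ignored
}
```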
When a client establishes a WebSocket connection to a server, the server must run the following steps.
If the server supports encryption, perform a TLS handshake over the connection. If this fails (e.g. the client indicated a host name in the extended client hello "server_name" extension that the server does not host), then close the connection; otherwise, all further communication for the connection (including the server handshake) must run through the encrypted tunnel. [RFC2246]
Establish the following information:
host: Can be derived from the "Host" field.
origin: Can be derived from the "Origin" field. [ORIGIN]
subprotocol: Can be derived from the "Sec-WebSocket-Protocol" field. The absence of such a field is equivalent to the null value. The empty string is not the same as the null value for these purposes.
key1: The value of the "Sec-WebSocket-Key1" field in the client's handshake.
key2: The value of the "Sec-WebSocket-Key2" field in the client's handshake.
Let location be the string that results from constructing a WebSocket URL from host, port, resource name, and secure flag.
Let key-number1 be the digits (characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9)) in key1, interpreted as a base ten integer, ignoring all other characters in key1.
Let key-number2 be the digits (characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9)) in key2, interpreted as a base ten integer, ignoring all other characters in key2.
For example, assume that the client handshake was:
GET / HTTP/1.1
Connection: Upgrade
Host: example.com
Upgrade: WebSocket
Sec-WebSocket-Key1: 3e6b263  4 17 80
Origin: http://example.com
Sec-WebSocket-Key2: 17  9 G`ZD9   2 2b 7X 3 /r90

WjN}|M(6
The key-number1 would be the number 3,626,341,780, and the key-number2 would be the number 1,799,227,390.
In this example, incidentally, key3 is "WjN}|M(6", or 0x57 0x6A 0x4E 0x7D 0x7C 0x4D 0x28 0x36.
Let spaces1 be the number of U+0020 SPACE characters in key1.
Let spaces2 be the number of U+0020 SPACE characters in key2.
If either spaces1 or spaces2 is zero, then abort the WebSocket connection. This is a symptom of a cross-protocol attack.
In the example above, spaces1 would be 4 and spaces2 would be 10.
If key-number1 is not an integral multiple of spaces1, then abort the WebSocket connection.
If key-number2 is not an integral multiple of spaces2, then abort the WebSocket connection.
This can only happen if the client is not a conforming WebSocket client.
Let part1 be key-number1 divided by spaces1.
Let part2 be key-number2 divided by spaces2.
In the example above, part1 would be 906,585,445 and part2 would be 179,922,739.
Let challenge be the concatenation of part1, expressed as a big-endian 32 bit integer, part2, expressed as a big-endian 32 bit integer, and the eight bytes of key3 in the order they were sent on the wire.
In the example above, this would be the 16 bytes 0x36 0x09 0x65 0x65 0x0A 0xB9 0x67 0x33 0x57 0x6A 0x4E 0x7D 0x7C 0x4D 0x28 0x36.
Let response be the MD5 fingerprint of challenge as a big-endian 128 bit string. [RFC1321]
In the example above, this would be the 16 bytes 0x6E 0x60 0x39 0x65 0x42 0x6B 0x39 0x7A 0x24 0x52 0x38 0x70 0x4F 0x74 0x56 0x62, or "n`9eBk9z$R8pOtVb" in ASCII.
Send the following line, terminated by the two characters U+000D CARRIAGE RETURN and U+000A LINE FEED (CRLF) and encoded as UTF-8, to the client:
HTTP/1.1 101 WebSocket Protocol Handshake
This line may be sent differently if necessary, but must match the Status-Line production defined in the HTTP specification, with the Status-Code having the value 101.
Send the following fields to the client. Each field must be sent as a line consisting of the field name, which must be an ASCII case-insensitive match for the field name in the list below, followed by a U+003A COLON character (:) and a U+0020 SPACE character, followed by the field value as specified in the list below, followed by the two characters U+000D CARRIAGE RETURN and U+000A LINE FEED (CRLF). The lines must be encoded as UTF-8. The lines may be sent in any order.
Upgrade: The value must be the string "WebSocket".
Connection: The value must be the string "Upgrade".
Sec-WebSocket-Location: The value must be location.
Sec-WebSocket-Origin: The value must be origin.
Sec-WebSocket-Protocol: This field must be included if subprotocol is not null, and must not be included if subprotocol is null. If included, the value must be subprotocol.
Optionally, include "Set-Cookie", "Set-Cookie2", or other
cookie-related fields, with values equal to the values that would
be used for the identically named HTTP headers.
[COOKIES]
Send two bytes 0x0D 0x0A (ASCII CRLF).
Send response.
This completes the server's handshake. If the server finishes these steps without aborting the WebSocket connection, and if the client does not then fail the connection, then the connection is established and the server may begin sending and receiving data, as described in the next section.
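As a non-normative illustration, the computation of challenge and response in the steps above could be sketched for Node.js as follows. The function names keyPart and computeResponse are assumptions made for this sketch, and the crypto module is specific to Node.js. The key strings in the usage lines are those of the example handshake, written with the four and ten spaces that the example states for spaces1 and spaces2.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derives part1 or part2 from a key field value.
// Returns null where the steps above say to abort the connection.
function keyPart(key) {
  const keyNumber = Number(key.replace(/[^0-9]/g, '')); // digits only, base ten
  const spaces = (key.match(/ /g) || []).length;        // U+0020 characters
  if (spaces === 0) return null;                         // cross-protocol attack symptom
  if (!Number.isSafeInteger(keyNumber) || keyNumber % spaces !== 0) return null;
  const part = keyNumber / spaces;
  return part <= 0xffffffff ? part : null;               // must fit in 32 bits
}

// Hypothetical helper: builds the 16-byte challenge and its MD5 digest.
function computeResponse(key1, key2, key3) {
  const part1 = keyPart(key1);
  const part2 = keyPart(key2);
  if (part1 === null || part2 === null || key3.length !== 8) return null;
  const challenge = Buffer.alloc(16);
  challenge.writeUInt32BE(part1, 0);   // big-endian 32 bit integer
  challenge.writeUInt32BE(part2, 4);   // big-endian 32 bit integer
  Buffer.from(key3).copy(challenge, 8); // key3 in wire order
  const response = crypto.createHash('md5').update(challenge).digest();
  return { part1, part2, challenge, response };
}

// Usage with the values from the example handshake:
const result = computeResponse(
  '3e6b263  4 17 80',
  '17  9 G`ZD9   2 2b 7X 3 /r90',
  Buffer.from('WjN}|M(6', 'latin1')
);
console.log(result.part1, result.part2, result.challenge.toString('hex'));
```

The 16 bytes in response are what the server sends after the blank line that ends its fields.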
The server must run through the following steps to process the bytes sent by the client. If at any point during these steps a read is attempted but fails because the WebSocket connection is closed, then abort.
Frame: Read a byte from the client. Let type be that byte.
If type is not a 0x00 byte, then the server may disconnect from the client.
If the most significant bit of type is not set, then run the following steps:
Let raw data be an empty byte array.
Data: Read a byte, let b be that byte.
If b is not 0xFF, then append b to raw data and return to the previous step (labeled data).
Interpret raw data as a UTF-8 string, and apply whatever server-specific processing is to occur for the resulting string (the message from the client).
Otherwise, the most significant bit of type is set. Run the following steps. This can never happen if type is 0x00, and therefore these steps are not necessary if the server aborts when type is not 0x00, as allowed above.
Let length be zero.
Length: Read a byte, let b be that byte.
Let bv be an integer corresponding to the low 7 bits of b (the value you would get by anding b with 0x7F).
Multiply length by 128, add bv to that result, and store the final result in length.
If the high-order bit of b is set (i.e. if b anded with 0x80 returns 0x80), then return to the step above labeled length.
Read length bytes.
It is possible for a malicious client to send frames with lengths greater than 2^31 or 2^32 bytes, overflowing a signed or unsigned 32-bit integer. Servers may therefore impose implementation-specific limits on the lengths of invalid frames that they will skip, if they support skipping such frames at all. If a server cannot correctly skip past a long frame, then the server must abort these steps (discarding all future data), and should either immediately disconnect from the client or set the client terminated flag.
Discard the read bytes.
If type is 0xFF and length is 0, then set the client terminated flag and abort these steps. All further data sent by the client should be discarded.
Return to the step labeled frame.
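As a non-normative illustration, the length-decoding loop above, combined with an implementation-specific limit, could be sketched as follows. The function name readLength, the maxLength parameter, and the status strings are assumptions made for this sketch.

```javascript
// Hypothetical helper: decodes the length prefix that starts at `offset`.
// Each byte contributes its low 7 bits; a set high bit means more bytes follow.
function readLength(bytes, offset, maxLength) {
  let length = 0;
  let i = offset;
  for (;;) {
    if (i >= bytes.length) return { status: 'incomplete' };
    const b = bytes[i++];
    length = length * 128 + (b & 0x7f);
    // Stop early instead of letting the value grow without bound.
    if (length > maxLength) return { status: 'too-long' };
    if ((b & 0x80) === 0) return { status: 'ok', length, next: i };
  }
}
```

A server that receives 'too-long' cannot skip the frame and would abort the steps as described above.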
The server must run through the following steps to send strings to the client:
Send a 0x00 byte to the client to indicate the start of a string.
Encode data using UTF-8 and send the resulting byte stream to the client.
Send a 0xFF byte to the client to indicate the end of the message.
At any time, the server may decide to terminate the WebSocket connection by running through the following steps:
Send a 0xFF byte and a 0x00 byte to the client to indicate the start of the closing handshake.
Wait until the client terminated flag has been set, or until a server-defined timeout expires.
Once these steps have started, the server must not send any further data to the client. The 0xFF 0x00 bytes indicate the end of the server's data, and further bytes will be discarded by the client.
When a server is to interpret a byte stream as UTF-8 but finds that the byte stream is not in fact a valid UTF-8 stream, behavior is undefined. A server could close the connection, convert invalid byte sequences to U+FFFD REPLACEMENT CHARACTERs, store the data verbatim, or perform application-specific processing. Subprotocols layered on the WebSocket protocol might define specific behavior for servers.
Certain algorithms require the user agent to fail the WebSocket connection. To do so, the user agent must close the WebSocket connection, and may report the problem to the user (which would be especially useful for developers).
Except as indicated above or as specified by the application layer (e.g. a script using the WebSocket API), user agents should not close the connection.
User agents must not convey any failure information to scripts in a way that would allow a script to distinguish the following situations:
Certain algorithms require or recommend that the server abort the WebSocket connection during the opening handshake. To do so, the server must simply close the WebSocket connection.
To close the WebSocket connection, the user agent or server must close the TCP connection, using whatever mechanism possible (e.g. either the TCP RST or FIN mechanisms). When a user agent notices that the server has closed its connection, it must immediately close its side of the connection also. Whether the user agent or the server closes the connection first, it is said that the WebSocket connection is closed. If the connection was closed after the client finished the WebSocket closing handshake, then the WebSocket connection is said to have been closed cleanly.
Servers may close the WebSocket connection whenever desired. User agents should not close the WebSocket connection arbitrarily.
While this protocol is intended to be used by scripts in Web pages, it can also be used directly by hosts. Such hosts are acting on their own behalf, and can therefore send fake "Origin" fields, misleading the server. Servers should therefore be careful about assuming that they are talking directly to scripts from known origins, and must consider that they might be accessed in unexpected ways. In particular, a server should not trust that any input is valid.
For example, if the server uses input as part of SQL queries, all input text should be escaped before being passed to the SQL server, lest the server be susceptible to SQL injection.
Servers that are not intended to process input from any Web page but only for certain sites should verify the "Origin" field is an origin they expect, and should only respond with the corresponding "Sec-WebSocket-Origin" if it is an accepted origin. Servers that only accept input from one origin can just send back that value in the "Sec-WebSocket-Origin" field, without bothering to check the client's value.
If at any time a server is faced with data that it does not understand, or that violates some criteria by which the server determines safety of input, or when the server sees a handshake that does not correspond to the values the server is expecting (e.g. incorrect path or origin), the server should just disconnect. It is always safe to disconnect.
The biggest security risk when sending text data using this protocol is sending data using the wrong encoding. If an attacker can trick the server into sending data encoded as ISO-8859-1 verbatim (for instance), rather than encoded as UTF-8, then the attacker could inject arbitrary frames into the data stream.
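As a non-normative illustration of this risk, the following Node.js sketch encodes the same text in both encodings and checks for the 0xFF byte that terminates a frame. The function name containsFrameTerminator and the sample text are assumptions made for this sketch.

```javascript
// Hypothetical helper: true if the payload contains the frame terminator.
function containsFrameTerminator(bytes) {
  return bytes.includes(0xff);
}

const sampleText = 'na\u00efve \u00ff';                    // ends with U+00FF
const utf8Bytes = Buffer.from(sampleText, 'utf8');        // U+00FF -> 0xC3 0xBF
const latin1Bytes = Buffer.from(sampleText, 'latin1');    // U+00FF -> 0xFF

console.log(containsFrameTerminator(utf8Bytes));   // false
console.log(containsFrameTerminator(latin1Bytes)); // true
```

In the ISO-8859-1 case, everything the attacker places after U+00FF would be read by the recipient as the start of a new frame.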
ws: scheme
A ws: URL identifies a WebSocket server and resource name.
In ABNF terms using the terminals from the URI specifications: [ABNF] [RFC3986]
"ws" ":" hier-part [ "?" query ]
The path and query components form the resource name sent to the server to identify the kind of service desired. Other components have the meanings described in RFC3986.
Characters in the host component that are excluded by the syntax defined above must be converted from Unicode to ASCII by applying the IDNA ToASCII algorithm to the Unicode host name, with both the AllowUnassigned and UseSTD3ASCIIRules flags set, and using the result of this algorithm as the host in the URI. [RFC3490]
Characters in other components that are excluded by the syntax defined above must be converted from Unicode to ASCII by first encoding the characters as UTF-8 and then replacing the corresponding bytes using their percent-encoded form as defined in the URI and IRI specification. [RFC3986] [RFC3987]
wss: scheme
A wss: URL identifies a WebSocket server and resource name, and indicates that traffic over that connection is to be encrypted.
In ABNF terms using the terminals from the URI specifications: [ABNF] [RFC3986]
"wss" ":" hier-part [ "?" query ]
The path and query components form the resource name sent to the server to identify the kind of service desired. Other components have the meanings described in RFC3986.
Characters in the host component that are excluded by the syntax defined above must be converted from Unicode to ASCII by applying the IDNA ToASCII algorithm to the Unicode host name, with both the AllowUnassigned and UseSTD3ASCIIRules flags set, and using the result of this algorithm as the host in the URI. [RFC3490]
Characters in other components that are excluded by the syntax defined above must be converted from Unicode to ASCII by first encoding the characters as UTF-8 and then replacing the corresponding bytes using their percent-encoded form as defined in the URI and IRI specification. [RFC3986] [RFC3987]
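As a non-normative illustration, the components of a ws: or wss: URL could be extracted with the URL interface as follows. This is not the parsing algorithm of this specification. The function name webSocketUrlComponents is an assumption made for this sketch, as are the default ports of 80 and 443, which are not stated in this section.

```javascript
// Hypothetical helper: splits a WebSocket URL into host, port,
// resource name, and secure flag using the URL interface.
function webSocketUrlComponents(input) {
  const url = new URL(input); // throws on strings that are not absolute URLs
  if (url.protocol !== 'ws:' && url.protocol !== 'wss:') {
    throw new Error('not a WebSocket URL');
  }
  const secure = url.protocol === 'wss:';
  return {
    host: url.hostname,
    // Assumed defaults: 80 for ws:, 443 for wss:.
    port: url.port === '' ? (secure ? 443 : 80) : Number(url.port),
    // The path and query components form the resource name.
    resourceName: url.pathname + url.search,
    secure,
  };
}
```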
"WebSocket" HTTP Upgrade keyword
Sec-WebSocket-Key1 and Sec-WebSocket-Key2
This section describes two header fields for registration in the Permanent Message Header Field Registry. [RFC3864]
The Sec-WebSocket-Key1 and Sec-WebSocket-Key2 headers
are used in the WebSocket handshake. They are sent from the client
to the server to provide part of the information used by the server
to prove that it received a valid WebSocket handshake. This helps
ensure that the server does not accept connections from
non-Web-Socket clients (e.g. HTTP clients) that are being abused to
send data to unsuspecting WebSocket servers.
Sec-WebSocket-Location
This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]
The Sec-WebSocket-Location
header is used in the WebSocket handshake. It is sent from the
server to the client to confirm the URL of the
connection. This enables the client to verify that the connection
was established to the right server, port, and path, instead of
relying on the server to verify that the requested host, port, and
path are correct.
Sec-WebSocket-Origin
This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]
The Sec-WebSocket-Origin header
is used in the WebSocket handshake. It is sent from the server to
the client to confirm the origin of the script that
opened the connection. This enables user agents to verify that the
server is willing to serve the script that opened the
connection.
Sec-WebSocket-Protocol
This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]
The Sec-WebSocket-Protocol
header is used in the WebSocket handshake. It is sent from the
client to the server and back from the server to the client to
confirm the subprotocol of the connection. This enables scripts to
both select a subprotocol and be sure that the server agreed to
serve that subprotocol.
The WebSocket protocol is intended to be used by another specification to provide a generic mechanism for dynamic author-defined content, e.g. in a specification defining a scripted API.
Such a specification first needs to "establish a WebSocket connection", providing that algorithm with:
The host, port, resource name, and secure flag are usually obtained from a URL using the steps to parse a WebSocket URL's components. These steps fail if the URL does not specify a WebSocket.
If a connection can be established, then it is said that the "WebSocket connection is established".
If at any time the connection is to be closed, then the specification needs to use the "close the WebSocket connection" algorithm.
When the connection is closed, for any reason including failure to establish the connection in the first place, it is said that the "WebSocket connection is closed".
While a connection is open, the specification will need to handle the cases when "a WebSocket message has been received" with text data.
To send some text data to an open connection, the specification needs to "send data using the WebSocket".
Web browsers, for security and privacy reasons, prevent documents in different domains from affecting each other; that is, cross-site scripting is disallowed.
While this is an important security feature, it prevents pages from different domains from communicating even when those pages are not hostile. This section introduces a messaging system that allows documents to communicate with each other regardless of their source domain, in a way designed to not enable cross-site scripting attacks.
The task source for the tasks in cross-document messaging is the posted message task source.
This section is non-normative.
For example, if document A contains an iframe
element that contains document B, and script in document A calls
postMessage() on the
Window object of document B, then a message event will
be fired on that object, marked as originating from the
Window of document A. The script in document A might
look like:
var o = document.getElementsByTagName('iframe')[0];
o.contentWindow.postMessage('Hello world', 'http://b.example.org/');
To register an event handler for incoming events, the script
would use addEventListener() (or similar
mechanisms). For example, the script in document B might look
like:
window.addEventListener('message', receiver, false);
function receiver(e) {
if (e.origin == 'http://example.com') {
if (e.data == 'Hello world') {
e.source.postMessage('Hello', e.origin);
} else {
alert(e.data);
}
}
}
This script first checks the domain is the expected domain, and then looks at the message, which it either displays to the user, or responds to by sending a message back to the document which sent the message in the first place.
Use of this API requires extra care to protect users from hostile entities abusing a site for their own purposes.
Authors should check the origin attribute to ensure
that messages are only accepted from domains that they expect to
receive messages from. Otherwise, bugs in the author's message
handling code could be exploited by hostile sites.
Furthermore, even after checking the origin attribute, authors
should also check that the data in question is of the expected
format. Otherwise, if the source of the event has been attacked
using a cross-site scripting flaw, further unchecked processing of
information sent using the postMessage() method could
result in the attack being propagated into the receiver.
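As a non-normative illustration, a receiver that applies both checks (origin first, then the format of the data) could be sketched as follows. The function name makeReceiver, the list of allowed origins, and the table of handlers are assumptions made for this sketch.

```javascript
// Hypothetical helper: builds a message event handler that ignores events
// from unexpected origins and data that is not one of the expected strings.
function makeReceiver(allowedOrigins, handlers) {
  return function receiver(e) {
    if (!allowedOrigins.includes(e.origin)) return false;   // unexpected origin
    if (typeof e.data !== 'string') return false;            // unexpected type
    if (!Object.prototype.hasOwnProperty.call(handlers, e.data)) return false;
    handlers[e.data](e);                                      // known message
    return true;
  };
}

// In a page, the handler would be registered along these lines:
// window.addEventListener('message',
//   makeReceiver(['http://example.com'], { 'Hello world': reply }), false);
```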
Authors should not use the wildcard keyword (*) in the targetOrigin argument in messages that contain any confidential information, as otherwise there is no way to guarantee that the message is only delivered to the recipient for which it was intended.
The integrity of this API is based on the inability for scripts
of one origin to post arbitrary events (using dispatchEvent() or otherwise) to objects in other
origins (those that are not the same).
Implementors are urged to take extra care in the implementation of this feature. It allows authors to transmit information from one domain to another domain, which is normally disallowed for security reasons. It also requires that UAs be careful to allow access to certain properties but not others.
postMessage(message, [ ports, ] targetOrigin)
Posts a message, optionally with an array of ports, to the given window.
If the origin of the target window doesn't match the given
origin, the message is discarded, to avoid information leakage. To
send the message to the target regardless of origin, set the
target origin to "*". To restrict the
message to same-origin targets only, without needing to explicitly
state the origin, set the target origin to "/".
Throws an INVALID_STATE_ERR if the ports array is not null and it contains either null
entries or duplicate ports.
When a script invokes the postMessage(message, targetOrigin) method (with only two
arguments) on a Window object, the user agent must
follow these steps:
If the value of the targetOrigin argument
is neither a single U+002A ASTERISK character (*), a single U+002F
SOLIDUS character (/), nor an absolute URL with a
<host-specific>
component that is either empty or a single U+002F SOLIDUS
character (/), then throw a SYNTAX_ERR exception and
abort the overall set of steps.
Let message clone be the result of obtaining a structured clone of the message argument. If this throws an exception, then throw that exception and abort these steps.
Return from the postMessage() method, but
asynchronously continue running these steps.
If the targetOrigin argument is a single
literal U+002F SOLIDUS character (/), and the
Document of the Window object on which
the method was invoked does not have the same origin
as the entry script's document, then abort these steps silently.
Otherwise, if the targetOrigin argument is
an absolute URL, and the Document of the
Window object on which the method was invoked does
not have the same origin as targetOrigin, then abort these steps silently.
Otherwise, the targetOrigin argument is a single literal U+002A ASTERISK character (*), and no origin check is made.
Create an event that uses the MessageEvent
interface, with the event name message, which does not bubble, is
not cancelable, and has no default action. The data attribute must be set to
the value of message clone, the origin attribute must be
set to the Unicode serialization of the origin of
the script that invoked the method, and the source attribute must be
set to the script's global object's
WindowProxy object.
Queue a task to dispatch the event created in the
previous step at the Window object on which the
method was invoked. The task source for this task is the posted message task
source.
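As a non-normative illustration, the origin checks in the steps above decide whether the message is delivered or silently dropped. The function name shouldDeliver and its parameters are assumptions made for this sketch: origins are passed in as serialized strings, and the sketch does not cover the syntax check on targetOrigin or the cloning of the message.

```javascript
// Hypothetical helper mirroring the three targetOrigin cases.
// targetDocumentOrigin: origin of the Document of the Window that
//   postMessage() was invoked on.
// callerOrigin: origin of the entry script's document.
function shouldDeliver(targetOrigin, targetDocumentOrigin, callerOrigin) {
  if (targetOrigin === '*') return true;                // no origin check is made
  if (targetOrigin === '/') {
    return targetDocumentOrigin === callerOrigin;       // same-origin targets only
  }
  // Absolute URL: the target document must have that origin.
  return new URL(targetOrigin).origin === targetDocumentOrigin;
}
```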