C3: An Experimental, Extensible, Reconfigurable Platform for HTML-based Applications

Benjamin S. Lerner Brian Burg

University of Washington

Herman Venter Wolfram Schulte

Microsoft Research

Abstract

The common conception of a (client-side) web applica-

tion is some collection of HTML, CSS and JavaScript

(JS) that is hosted within a web browser and that interacts

with the user in some non-trivial ways. The common

conception of a web browser is a monolithic program

that can render HTML, execute JS, and give the user a

portal to navigate the web. Both of these are misconcep-

tions: nothing inherently conﬁnes webapps to a browser’s

page-navigation idiom, and browsers can do far more

than merely render content. Indeed, browsers and web

apps are converging in functionality, but their underlying

technologies are so far largely distinct.

We present C3, an implementation of the

HTML/CSS/JS platform designed for web-client

research and experimentation. C3’s typesafe, modular

architecture lowers the barrier to webapp and browser

research. Additionally, C3 explores the role of extensibil-

ity throughout the web platform for customization and

research efforts, by introducing novel extension points

and generalizing existing ones. We discuss and evaluate

C3’s design choices for ﬂexibility, and provide examples

of various extensions that we and others have built.

1 Introduction

We spend vast amounts of time using web browsers: ca-

sual users view them as portals to the web, while power

users enjoy them as ﬂexible, sophisticated tools with

countless uses. Researchers of all stripes view browsers

and the web itself as systems worthy of study: browsers

are a common thread for web-related research in ﬁelds

such as HCI, security, information retrieval, sociology,

software engineering, and systems research. Yet today’s

production-quality browsers are all monolithic, complex

systems that do not lend themselves to easy experimenta-

tion. Instead, researchers often must modify the source

code of the browsers—usually tightly-optimized, obscure,

and sprawling C/C++ code—and this requirement of deep

domain knowledge poses a high barrier to entry, correct-

ness, and adoption of research results.

Of course, this is a simpliﬁed depiction: browsers are

not entirely monolithic. Modern web browsers, such

as Internet Explorer, Firefox or Chrome, support exten-

sions, pieces of code—usually a mix of HTML, CSS and

JavaScript (JS)—that are written by third-party develop-

ers and downloaded by end users, that build on top of the

browser and customize it dynamically at runtime.

To date, such customizations focus primarily on modifying

the user interfaces of browsers. (Browsers also support

plug-ins, binary components that provide functionality,

such as playing new multimedia ﬁle types, not otherwise

available from the base browser. Unlike extensions, plug-

ins cannot interact directly with each other or extend each

other further.)

Extensions are widely popular among both users and

developers: millions of Firefox users have downloaded

thousands of different extensions over two billion times.

Some research projects have used extensions to imple-

ment their ideas. But because current extension mecha-

nisms have limited power and ﬂexibility, many research

projects still must resort to patching browser sources:

• XML3D [ ] defines new HTML tags and renders them with a 3D ray-tracing engine, but neither HTML nor the layout algorithm is extensible.

• Maverick [ ] permits writing device drivers in JS and connecting the devices (e.g., webcams, USB thumb drives, GPUs, etc.) to web pages, but JS cannot send raw USB packets to the USB root hub.

• RePriv [ ] experiments with new ways to securely expose and interact with private browsing information (e.g. topics inferred from browsing history) via reference-monitored APIs, but neither plug-ins nor JS extensions can guarantee the integrity or security of the mined data as it flows through the browser.

Footnote (browser extensions): Opera supports widgets, which do not interact with the browser or content, and Safari recently added small but slowly-growing support for extensions in a manner similar to Chrome. We ignore these browsers in the following discussions.

Footnote (Firefox extension downloads): https://addons.mozilla.org/en-US/statistics/

These projects incur development and maintenance costs

well above the inherent complexity of their added func-

tionality. Moreover, patching browser sources makes

it difﬁcult to update the projects for new versions of

the browsers. This overhead obscures the fact that such

research projects are essentially extensions to the web-

browsing experience, and would be much simpler to real-

ize on a ﬂexible platform with more powerful extension

mechanisms. Though existing extension points in main-

stream browsers vary widely in both design and power,

none can support the research projects described above.

1.1 The extensible future of web browsers

Web browsers have evolved from their beginnings as mere

document viewers into web-application runtime platforms.

Applications such as Outlook Web Access or Google

Documents are sophisticated programs written in HTML,

CSS and JS that use the browser only for rendering and

execution and ignore everything else browsers provide

(bookmarks, navigation, tab management, etc.). Projects

like Mozilla Prism

strip away all the browser “chrome”

while reusing the underlying HTML/CSS/JS implementa-

tion (in this case, Gecko), letting webapps run like native

apps, outside of the typical browser. Taken to an extreme,

“traditional” applications such as Firefox or Thunderbird

are written using Gecko’s HTML/CSS/JS engine, and

clearly are not themselves hosted within a browser.

While browsers and web apps are growing closer,

they are still mostly separate with no possibility of

tight, customizable integration between them. Blogging

clients such as WordPress, instant messaging clients such

as Gchat, and collaborative document editors such as

Mozilla Skywriter are three disjoint web applications, all

designed to create and share content. An author might be

using all three simultaneously, and searching for relevant

web resources to include as she writes. Yet the only way

to do so is to “escape the system”, copying and pasting

web content via the operating system.

1.2 Contributions

The time has come to reconsider browser architectures

with a focus on extensibility. We present C3: a reconﬁg-

urable, extensible implementation of HTML, CSS and

JS designed for web client research and experimentation.

C3 is written entirely in C# and takes advantage of .Net’s libraries and type-safety. Similar to Firefox building atop Gecko, we have built a prototype browser atop C3, using only HTML, CSS and JS.

Footnote (Mozilla Prism): http://prism.mozillalabs.com/

By reconﬁgurable, we mean that each of the modules

in our browser—Document Object Model (DOM) imple-

mentation, HTML parser, JS engine, etc.—is loosely cou-

pled by narrow, typesafe interfaces and can be replaced

with alternate implementations compiled separately from

C3 itself. By extensible, we mean that the default imple-

mentations of the modules support run-time extensions

that can be systematically introduced to

1. extend the syntax and implementation of HTML

2. transform the DOM when being parsed from HTML

3. extend the UI of the running browser

4. extend the environment for executing JS, and

5. transform and modify running JS code.

Compared to existing browsers, C3 introduces novel ex-

tension points (1) and (5), and generalizes existing exten-

sion points (2)–(4). These extension points are treated in

order in Section 3. We discuss their functionality and their

security implications with respect to the same-origin pol-

icy [

]. We also provide examples of various extensions

that we and others have built.

The rest of the paper is structured as follows. Sec-

tion 2 gives an overview of C3’s architecture and high-

lights the software engineering choices made to further

our modularity and extensibility design goals. Section 3

presents the design rationale for our extension points and

discusses their implementation. Section 4 evaluates the

performance, expressiveness, and security implications

of our extension points. Section 5 describes future work.

Section 6 concludes.

2 C3 architecture and design choices

As a research platform, C3’s explicit design goals are

architectural modularity and ﬂexibility where possible,

instead of raw performance. Supporting the various ex-

tension mechanisms above requires hooks at many levels

of the system. These goals are realized through careful

design and implementation choices. Since many require-

ments of an HTML platform are standardized, aspects of

our architecture are necessarily similar to other HTML

implementations. C3 lacks some of the features present in

mature implementations, but contains all of the essential

architectural details of an HTML platform.

C3’s clean-slate implementation presented an opportunity to leverage modern software engineering tools and practices. Using a managed language such as C# sidesteps the headaches of memory management, buffer overruns, and many of the common vulnerabilities in production browsers. Using a higher-level language better preserves abstractions and simplifies many implementation details. Code Contracts [ ] are used throughout C3 to ensure implementation-level invariants and safety properties, something that is not feasible in existing browsers.

[Figure 1: C3’s modular architecture. The diagram shows the event loop, the DOM implementation (Node, Element, . . .), the default download manager, the JS engine, the browser executable, the WinForms renderer, layout, HtmlParser and CssParser, together with the interfaces ILayoutListener, IRenderer, IHtmlParser, ICssParser and IDownloadManager. Legend: assembly, class, interface, communication, implementation, threads.]

Below, we sketch C3’s module-level architecture, and

elaborate on several core design choices and resulting

customization opportunities. We also highlight features

that enable the extension points examined in Section 3.

2.1 Pieces of an HTML platform

The primary task of any web platform is to parse, ren-

der, and display an HTML document. For interactivity,

web applications additionally require the managing of

events such as user input, network connections, and script

evaluation. Many of these sub-tasks are independent; Fig-

ure 1 shows C3’s module-level decomposition of these

tasks. The HTML parser converts a text stream into an

object tree, while the CSS parser recognizes stylesheets.

The JS engine dispatches and executes event handlers.

The DOM implementation implements the API of DOM

nodes, and implements bindings to expose these methods

to JS scripts. The download manager handles actual net-

work communication and interactions with any on-disk

cache. The layout engine computes the visual structure

and appearance of a DOM tree given current CSS styles.

The renderer displays a computed layout. The browser’s

UI displays the output of the renderer on the screen, and

routes user input to the DOM.

2.2 Modularity

Unlike many modern browsers, C3’s design embraces

loose coupling between browser components. For ex-

ample, it is trivial to replace the HTML parser, renderer

frontend, or JS engine without modifying the DOM im-

plementation or layout algorithm. To make such drop-in

replacements feasible, C3 shares no data structures be-

tween modules when possible (i.e., each module is heap-

disjoint). This design decision also simpliﬁes threading

disciplines, and is further discussed in Section 2.7.

Simple implementation-agnostic interfaces describe the

operations of the DOM implementation, HTML parser,

CSS parser, JS engine, layout engine, and front-end ren-

derer modules. Each module is implemented as a separate

.Net assembly, which prevents modules from breaking ab-

stractions and makes swapping implementations simple.

Parsers could be replaced with parallel [

] or speculative

versions; layout might be replaced with a parallel [

] or

incrementalizing version, and so on. The default module

implementations are intended as straightforward, unopti-

mized reference implementations. This permits easy per-

module evaluations of alternate implementation choices.
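
The loose coupling described above can be illustrated with a small sketch. It is written in Python rather than C3’s C#, and the names HtmlParser, ReferenceParser, UppercasingParser and Browser are hypothetical, not C3’s actual interfaces. The host depends only on a narrow parser interface, so an alternate implementation can be dropped in without touching the host.

```python
from typing import Protocol


class HtmlParser(Protocol):
    """Hypothetical narrow interface: all the host knows about a parser."""

    def parse(self, source: str) -> list[str]: ...


class ReferenceParser:
    """Straightforward, unoptimized reference implementation (toy tokenizer)."""

    def parse(self, source: str) -> list[str]:
        # Toy behaviour: report opening tag names in document order.
        tags = []
        for chunk in source.split("<")[1:]:
            name = chunk.split(">")[0].strip()
            if name and not name.startswith("/"):
                tags.append(name.split()[0].lower())
        return tags


class UppercasingParser:
    """An alternate implementation that satisfies the same interface."""

    def parse(self, source: str) -> list[str]:
        return [tag.upper() for tag in ReferenceParser().parse(source)]


class Browser:
    def __init__(self, parser: HtmlParser) -> None:
        # The parser is injected; Browser never names a concrete class.
        self.parser = parser

    def load(self, source: str) -> list[str]:
        return self.parser.parse(source)
```

Swapping `ReferenceParser()` for `UppercasingParser()` when constructing `Browser` changes the parsing behaviour without any change to `Browser` itself, which is the property that makes per-module evaluation practical.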

2.3 DOM implementation

The DOM API is a large set of interfaces, methods and

properties for interacting with a document tree. We high-

light two key design choices in our implementation: what

the object graph for the tree looks like, and the bindings

of these interfaces to C# classes. Our choices aim to

minimize overhead and “boilerplate” coding burdens for

extension authors.

Object trees:

The DOM APIs are used throughout the

browser: by the HTML parser (Section 2.4) to construct

the document tree, by JS scripts to manipulate that tree’s

structure and query its properties, and by the layout engine

to traverse and render the tree efﬁciently. These clients

use distinct but overlapping subsets of the APIs, which

means they must be exposed both to JS and to C#, which
in turn leads to the first design choice.

One natural choice is to maintain a tree of “implementation” objects in the C# heap separate from a set of “wrapper” objects in the JS heap containing pointers to their C# counterparts: the JS objects are a “view” of the underlying C# “model”. The JS objects contain stubs for all the DOM APIs, while the C# objects contain implementations and additional helper routines. This design incurs the overheads of extra pointer dereferences (from the JS APIs to the C# helpers) and of keeping the wrappers synchronized with the implementation tree. However, it permits specializing both representations for their respective usages, and the extra indirection enables multiple views of the model: this is essentially the technical basis of Chrome extensions’ “isolated worlds” [ ], where the indirection is used to ensure security properties about extensions’ JS access to the DOM. Firefox also uses the split to improve JS memory locality with “compartments” [15].

Footnote (speculative parsing, Section 2.2): http://hsivonen.iki.fi/speculative-html5-parsing/

Footnote (objects in the JS heap): Expert readers will recognize that “objects in the JS heap” are implemented by C# “backing” objects; we are distinguishing these from objects that do not “back” any JS object.

By contrast, C3 uses a single tree of objects visible to both languages, with each DOM node being a subclass of an ordinary JS object, and each DOM API being a standard C# method that is exposed to JS. This design choice avoids both overheads mentioned above. Further, Spur [ ], the tracing JS engine currently used by C3, can trace from JS into DOM code for better optimization opportunities. To date, no other DOM implementation/JS engine pair can support this optimization.

DOM language bindings: The second design choice stems from our choice for the first: how to represent DOM objects such that their properties are callable from both C# and JS. This representation must be open: extensions such as XML3D must be able to define new types of DOM nodes that are instantiable from the parser (see Section 3.1) and capable of supplying new DOM APIs to both languages as well. Therefore any new DOM classes must be able to subclass our C# DOM class hierarchy easily, and to use the same mechanisms as the built-in DOM classes. Our chosen approach is a thin marshaling layer around a C# implementation, as follows:

• All Spur JS objects are instances of C# classes deriving from ObjectInstance. Our DOM class hierarchy derives from this too, and so DOM objects are JS objects, as above.

• All JS objects are essentially property bags, or key/value dictionaries, and “native” objects (e.g. Math, Date) may contain properties that are implemented by the JS runtime and have access to runtime-internal state. All DOM objects are native, and their properties (the DOM APIs) access the internal representation of the document.

• The JS dictionary is represented within Spur as a TypeObject field of each ObjectInstance. To expose a native method on a JS object, the implementation simply adds a property to the TypeObject mapping the (JS) name to the (C#) function that implements it. This means that a single C# function can be called from both languages, and need not be implemented twice.

The ObjectInstance and TypeObject classes are public Spur APIs, and so our DOM implementation is readily extensible by new node types.

Footnote (mapping JS names to C# functions): Technically, to a C# function that unwraps the JS values into strongly-typed C# values, then calls a second C# function with them.
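
The thin marshaling layer can be pictured with the following sketch. It is a Python analogy, not Spur’s API: the classes below merely borrow the names ObjectInstance and TypeObject, and js_call, expose and set_attribute are hypothetical helpers introduced for illustration. A single host function implements the behaviour, and a small shim converts loosely-typed script values before calling it.

```python
from typing import Any, Callable


class TypeObject:
    """Hypothetical stand-in: table mapping script-visible names to native functions."""

    def __init__(self) -> None:
        self.natives: dict[str, Callable[..., Any]] = {}

    def expose(self, js_name: str, fn: Callable[..., Any]) -> None:
        self.natives[js_name] = fn


class ObjectInstance:
    """Hypothetical stand-in for a script-visible object carrying a TypeObject field."""

    def __init__(self) -> None:
        self.type_object = TypeObject()

    def js_call(self, js_name: str, *js_args: Any) -> Any:
        # Script-side entry point: look the name up, then invoke the native.
        return self.type_object.natives[js_name](*js_args)


class Element(ObjectInstance):
    """A DOM node is itself a script object, so no separate wrapper is needed."""

    def __init__(self, tag: str) -> None:
        super().__init__()
        self.tag = tag
        self.attrs: dict[str, str] = {}
        # Marshaling shim: unwrap script values into strongly-typed host
        # values, then call the one host implementation below.
        self.type_object.expose(
            "setAttribute",
            lambda name, value: self.set_attribute(str(name), str(value)),
        )

    def set_attribute(self, name: str, value: str) -> None:
        """The single implementation, also callable directly from host code."""
        self.attrs[name] = value
```

Host code calls `set_attribute` directly, while script code reaches the same function through `js_call("setAttribute", ...)`; an extension-defined node type would subclass `Element` and call `expose` for its own methods in the same way.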

2.4 The HTML parser

The HTML parser is concerned with transforming HTML

source into a DOM tree, just as a standard compiler’s

parser turns source into an AST. Extensible compilers’

parsers can recognize supersets of their original language

via extensions; similarly, C3’s default HTML parser supports extensions that add new HTML tags (which are implemented by new C# DOM classes as described above; see also Section 3.1).

An extensible HTML parser has only two dependen-

cies: a means for constructing a new node given a tag

name, and a factory method for creating a new node and

inserting it into a tree. This interface is far simpler than

that of any DOM node, and so exists as the separate INode interface. The parser has no hard dependency on a specific DOM implementation, and a minimal implementation of the INode interface can be used to test the parser independently of the DOM implementation. The default parser implementation is given a DOM node factory that can construct INodes for the built-in HTML tag names. Extending the parser via this factory is discussed

in Section 3.1.

2.5 Computing visual structure

The layout engine takes a document and its stylesheets,

and produces as output a layout tree, an intermediate data

structure that contains sufﬁcient information to display

a visual representation of the document. The renderer

then consults the layout tree to draw the document in a

platform- or toolkit-speciﬁc manner.

Computing a layout tree requires three steps: ﬁrst,

DOM nodes are attributed with style information accord-

ing to any present stylesheets; second, the layout tree’s

structure is determined; and third, nodes of the layout tree

are annotated with concrete styles (placement and sizing,

fonts and colors, etc.) for the renderer to use. Each of

these steps admits a naïve reference implementation, but

both more efﬁcient and more extensible algorithms are

possible. We focus on the former here; layout extensibil-

ity is revisited in Section 3.3.

Assigning node styles

The algorithm that decorates

DOM nodes with CSS styles does not depend on any

other parts of layout computation. Despite the top-down

implementation suggested by the name “cascading style

sheets”, several efﬁcient strategies exist, including recent

and ongoing research in parallel approaches [11].

Our default style “cascading” algorithm is self-

contained, single-threaded and straightforward. It deco-

rates each DOM node with an immutable calculated style

object, which is then passed to the related layout tree

node during construction. This immutable style sufﬁces

thereafter in determining visual appearance.
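
A minimal sketch of such a single-threaded cascade follows, in Python rather than C3’s C#. It makes several simplifying assumptions that are not from the paper: selectors match tag names only, every property is inherited, and the names ComputedStyle, Node and cascade are hypothetical. The point it illustrates is the immutable calculated style attached to each node.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class ComputedStyle:
    """Immutable calculated style; only two properties for brevity."""

    color: str = "black"
    font_size: int = 16


@dataclass
class Node:
    tag: str
    children: list["Node"] = field(default_factory=list)
    style: Optional[ComputedStyle] = None


def cascade(
    node: Node,
    rules: dict[str, dict],
    inherited: ComputedStyle = ComputedStyle(),
) -> None:
    """Top-down pass: start from the parent's style, then apply matching rules."""
    # replace() builds a new frozen object, so a parent's style is never mutated.
    node.style = replace(inherited, **rules.get(node.tag, {}))
    for child in node.children:
        cascade(child, rules, node.style)
```

Because `ComputedStyle` is frozen, a style object handed to another component (for example a layout node) can be read there without locking, which matches the data-sharing discipline described in Section 2.7.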

Determining layout tree structure

The layout tree is

generated from the DOM tree in a single traversal. The

two trees are approximately the same shape; the layout

tree may omit nodes for invisible DOM elements (e.g. <script>), and may insert “synthetic” nodes to simplify

later layout invariants. For consistency, this transforma-

tion must be serialized between DOM mutations, and so

runs on the DOM thread (see Section 2.7). The layout tree

must preserve a mapping between DOM elements and

the layout nodes they engender, so that mouse movement

(which occurs in the renderer’s world of screen pixels and

layout tree nodes) can be routed to the correct target node

(i.e. a DOM element). A naïve pointer-based solution

runs afoul of an important design decision: C3’s archi-

tectural goals of modularity require that the layout and

DOM trees share no pointers. Instead, all DOM nodes

are given unique numeric ids, which are preserved by the

DOM-to-layout tree transformation. Mouse targeting can

now be deﬁned in terms of these ids while preserving

pointer-isolation of the DOM from layout.
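
The id-based mapping can be sketched as follows, in Python rather than C3’s C#. The types DomNode and LayoutNode, the functions build_layout and hit_test, and the pre-computed boxes table are all hypothetical; the sketch also assumes that later siblings paint on top of earlier ones. Layout nodes carry only a numeric id, never a reference to a DOM node.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)
Box = tuple[int, int, int, int]  # x, y, width, height
INVISIBLE = {"script"}  # tags that produce no layout node in this sketch


@dataclass
class DomNode:
    tag: str
    children: list["DomNode"] = field(default_factory=list)
    node_id: int = field(default_factory=lambda: next(_ids))


@dataclass
class LayoutNode:
    dom_id: int  # numeric id only; no pointer back into the DOM
    box: Box
    children: list["LayoutNode"] = field(default_factory=list)


def build_layout(dom: DomNode, boxes: dict[int, Box]) -> Optional[LayoutNode]:
    """Single traversal of the DOM tree that preserves ids and drops invisible nodes."""
    if dom.tag in INVISIBLE:
        return None
    kids = [build_layout(child, boxes) for child in dom.children]
    return LayoutNode(dom.node_id, boxes[dom.node_id], [k for k in kids if k])


def hit_test(node: LayoutNode, x: int, y: int) -> Optional[int]:
    """Return the id of the deepest layout node containing the point, if any."""
    bx, by, width, height = node.box
    if not (bx <= x < bx + width and by <= y < by + height):
        return None
    for child in reversed(node.children):  # assumed: later siblings are on top
        hit = hit_test(child, x, y)
        if hit is not None:
            return hit
    return node.dom_id
```

The renderer side only ever returns an integer from `hit_test`; resolving that id back to a DOM element is left to the DOM side, so the two trees stay pointer-isolated.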

Solving layout constraints

The essence of any layout

algorithm is to solve constraints governing the placement

and appearance of document elements. In HTML, these

constraints are irregular and informally speciﬁed (if at

all). Consequently the constraints are typically solved

by a manual, multi-pass algorithm over the layout tree,

rather than a generic constraint-solver [

]. The manual

algorithms found in production HTML platforms are often

tightly optimized to eliminate some passes for efﬁciency.

C3’s architecture admits such optimized approaches,

too; our reference implementation keeps the steps separate

for clarity and ease of experimentation. Indeed, because

the layout tree interface does not assume a particular

implementation strategy, several layout algorithm variants

have been explored in C3 with minimal modiﬁcations to

the layout algorithm or components dependent on the

computed layout tree.

2.6 Accommodating Privileged UI

Both Firefox and Chrome implement some (or all) of

their user interface (e.g. address bar, tabs, etc.) in declar-

ative markup, rather than hard-coded native controls. In

both cases this gives the browsers increased ﬂexibility; it

also enables Firefox’s extension ecosystem. The markup

used by these browsers is trusted, and can access inter-

nal APIs not available to web content. To distinguish

the two, trusted UI ﬁles are accessed via a different

URL scheme: e.g., Firefox’s main UI is loaded using

chrome://browser/content/browser.xul.

We chose to implement our prototype browser’s UI in

HTML for two reasons. First, we wanted to experiment

with writing sophisticated applications entirely within the

HTML/CSS/JS platform and experience ﬁrst-hand what

challenges arose. Even in our prototype, such experi-

ence led to the two security-related changes described

below. Secondly, having our UI in HTML opens the

door to the extensions described in Section 3; the en-

tirety of a C3-based application is available for extension.

Like Firefox, our browser’s UI is available at a privileged URL: launching C3 with the command-line argument chrome://browser/tabbrowser.html will display the browser UI. Launching it with the URL of any website will display that site without any surrounding browser chrome. Currently, we only permit HTML file resources bundled within the C3 assembly itself to be given privileged chrome:// URLs.

Designing this prototype exposed deliberate limitations

in HTML when examining the navigation history of child

windows (popups or <iframe>s): the APIs restrict access

to same-origin sites only, and are write-only. A parent

window cannot see what site a child is on unless it is from

the same origin as the parent, and can never see what sites

a child has visited. A browser must avoid both of these

restrictions so that it can implement the address bar.

Rather than change API visibility, C3 extends the DOM API in two ways. First, it gives privileged pages (i.e., from chrome:// URLs) a new childnavigated notification when their children are navigated, just before the onbeforeunload events that the children already receive. Second, it treats chrome:// URLs as trusted origins that always pass same-origin checks. The trusted-origin mechanism and the custom navigation event suffice to implement our browser UI.
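
The trusted-origin rule can be sketched in a few lines. This is a Python illustration, not C3’s implementation: the function names origin and may_access are hypothetical, and the origin comparison is deliberately simplified (for instance, it does not normalize default ports).

```python
from urllib.parse import urlsplit

TRUSTED_SCHEMES = {"chrome"}  # privileged UI pages, as described above


def origin(url: str) -> tuple:
    """Simplified origin triple: (scheme, host, port)."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def may_access(accessor_url: str, target_url: str) -> bool:
    """Same-origin check in which trusted pages always pass."""
    if urlsplit(accessor_url).scheme in TRUSTED_SCHEMES:
        return True
    return origin(accessor_url) == origin(target_url)
```

The rule is asymmetric by design: the browser UI at a `chrome://` URL may inspect any child window, but ordinary web content gains no access to the UI page.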

2.7 Threading architecture

One important point of ﬂexibility is the mapping between

threads and the HTML platform components described

above. We do not impose any threading discipline be-

yond necessary serialization required by HTML and DOM

standards. This is made possible by our decision to pre-

vent data races by design: in our architecture, data is

either immutable, or it is not shared amongst multiple

components. Thus, it is possible to choose any thread-

ing discipline within a single component; a single thread

could be shared among all components for debugging, or

several threads could be used within each component to

implement worker queues.

Below, we describe the default allocation of threads

among components, as well as key concurrency concerns

for each component.

2.7.1 The DOM/JS thread(s)

The DOM event dispatch loop and JS execution are single-threaded within a set of related web pages. “Separate” pages that are unrelated can run entirely in parallel with each other. Thus, sessions with several tabs or windows open simultaneously use multiple DOM event dispatch loops.

In C3, each distinct event loop consists of two threads:

a mutator to run script and a watchdog to abort run-away

scripts. Our system maintains the invariant that all mu-

tator threads are heap-disjoint: JS code executing in a

task on one event loop can only access the DOM nodes of

documents sharing that event loop. This invariant, com-

bined with the single-threaded execution model of JS

(from the script’s point of view), means all DOM nodes

and synchronous DOM operations can be lock-free. (Op-

erations involving local storage are asynchronous and

must be protected by the storage mutex.) When a window or <iframe> is navigated, the relevant event loop may

change. An event loop manager is responsible for main-

taining the mappings between windows and event loops

to preserve the disjoint-heap invariant.

Every DOM manipulation (node creation, deletion, in-

sertion or removal; attribute creation or modiﬁcation; etc.)

notiﬁes any registered DOM listener via a straightforward

interface. One such listener is used to inform the layout

engine of all document manipulations; others could be

used for testing or various diagnostic purposes.
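
The listener mechanism might look roughly like the sketch below, written in Python rather than C3’s C#. The Document class, its method names, and the (kind, node id, detail) callback shape are assumptions made for illustration; the paper does not specify the listener interface.

```python
from typing import Any, Callable

Listener = Callable[[str, int, Any], None]


class Document:
    """Toy document that reports every mutation to its registered listeners."""

    def __init__(self) -> None:
        self._listeners: list[Listener] = []
        self._tags: dict[int, str] = {}
        self._attrs: dict[int, dict[str, str]] = {}
        self._next_id = 1

    def add_listener(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def _notify(self, kind: str, node_id: int, detail: Any = None) -> None:
        for listener in self._listeners:
            listener(kind, node_id, detail)

    def create_node(self, tag: str) -> int:
        node_id = self._next_id
        self._next_id += 1
        self._tags[node_id] = tag
        self._attrs[node_id] = {}
        self._notify("created", node_id, tag)
        return node_id

    def set_attribute(self, node_id: int, name: str, value: str) -> None:
        self._attrs[node_id][name] = value
        self._notify("attribute", node_id, (name, value))

    def remove_node(self, node_id: int) -> None:
        del self._tags[node_id]
        del self._attrs[node_id]
        self._notify("removed", node_id)
```

A layout engine would register one listener to learn about document changes, while a test harness could register another that simply records the events, without either knowing about the other.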

2.7.2 The layout thread(s)

Each top-level browser window is assigned a layout

thread, responsible for resolving layout constraints as

described in Section 2.5. Several browser windows might

be simultaneously visible on screen, so their layout com-

putations must proceed in parallel for each window to

quickly reﬂect mutations to the underlying documents.

Once the DOM thread computes a layout tree, it transfers

ownership of the tree to the layout thread, and begins

building a new tree. Any external resources necessary for

layout or display (such as image data), are also passed

to the layout thread as uninterpreted .Net streams. This

isolates the DOM thread from any computational errors

on the layout threads.

2.7.3 The UI thread

It is common for GUI toolkits to impose threading restrictions, such as only accessing UI widgets from their creating thread. These restrictions influence the platform insofar as replaced elements (such as buttons or text boxes) are implemented by toolkit widgets.

Footnote (single-threaded event loops, Section 2.7.1): We ignore for now web-workers, which are an orthogonal concern.

Footnote (separate pages, Section 2.7.1): Defining when pages are actually separate is non-trivial, and is a refinement of the same-origin policy, which in turn has been the subject of considerable research [7, 2].

C3 does not commit to a particular toolkit; rather, it exposes abstract interfaces for the few widget properties actually needed by layout. Our prototype currently

uses the .Net WinForms toolkit, which designates one

thread as the “UI thread”, to which all input events are

dispatched and on which all widgets must be accessed.

When the DOM encounters a replaced element, an actual

WinForms widget must be constructed so that layout can

in turn set style properties on that widget. This requires

synchronous calls from the DOM and layout threads to

the UI thread. Note, however, that responding to events

(such as mouse clicks or key presses) is asynchronous,

due to the indirection introduced by numeric node ids: the

UI thread simply adds a message to the DOM event loop

with the relevant ids; the DOM thread will process that

message in due course.
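
The asynchronous hand-off from the UI thread to the DOM thread can be sketched with a queue. This is a Python illustration under stated assumptions, not C3’s code: DomEventLoop, post and stop are hypothetical names, and handlers are keyed by numeric node id as the text describes.

```python
import queue
import threading
from typing import Callable, Optional


class DomEventLoop:
    """Toy DOM event loop: one mutator thread draining a message queue."""

    def __init__(self, handlers: dict[int, Callable[[str], None]]) -> None:
        self._handlers = handlers  # node id -> handler; touched only by the DOM thread
        self._tasks: "queue.Queue[Optional[tuple[str, int]]]" = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def post(self, event_type: str, node_id: int) -> None:
        """Called from the UI thread: enqueue the message and return immediately."""
        self._tasks.put((event_type, node_id))

    def stop(self) -> None:
        self._tasks.put(None)  # sentinel: finish queued work, then exit
        self._thread.join()

    def _run(self) -> None:
        while True:
            task = self._tasks.get()
            if task is None:
                break
            event_type, node_id = task
            handler = self._handlers.get(node_id)
            if handler is not None:  # ids with no handler are ignored
                handler(event_type)
```

Since the UI thread passes only an event type and an integer id, it never holds a reference to a DOM node, and the DOM thread remains the sole mutator of its heap.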

3 C3 Extension points

The extension mechanisms we introduce into C3 stem

from a principled examination of the various semantics of

HTML. Our interactions with webapps tacitly rely on ma-

nipulating HTML in two distinct ways: we can interpret

it operationally via the DOM and JS programs, and we

can interpret it visually via CSS and its associated layout

algorithms. Teasing these interpretations apart leads to

the following two transformation pipelines:

• JS global object + HTML source (1,2) --[HTML parsing]--> DOM subtrees --[onload]--> DOM document --[JS events]--> DOM document . . .

• DOM document + CSS source --[CSS parsing]--> CSS content model --[layout]--> CSS box model

The ﬁrst pipeline distinguishes four phases of the docu-

ment lifecycle, from textual sources through to the event-

based running of JS: the initial

onload

event marks the

transition point after which the document is asserted to

be fully loaded; before this event ﬁres, the page may

be inconsistent as critical resources in the page may not

yet have loaded, or scripts may still be writing into the

document stream.

    public interface IDOMTagFactory {
        IEnumerable<Element> TagTemplates { get; }
    }

    public class HelloWorldTag : Element {
        string TagName { get { return "HelloWorld"; } }
        ...
    }

    public class HelloWorldFactory : IDOMTagFactory {
        // Implicit interface implementations must be public in C#.
        public IEnumerable<Element> TagTemplates { get {
            yield return new HelloWorldTag();
        } }
    }

Figure 2: Factory and simple extension defining new tags

Explicitly highlighting these pipeline stages leads to designing extension points in a principled way: we can extend the inputs accepted or the outputs produced by each stage, as long as we produce outputs that are acceptable inputs to the following stages. This is in contrast to the extension models of existing browsers, which support various extension points without relating them to other possibilities or to the browser’s behavior as a whole. The extension points engendered by the pipelines above are (as numbered):

1. Before beginning HTML parsing, extensions may provide new tag names and DOM-node implementations for the parser to support.

2. Before running any scripts, extensions may modify the JS global scope by adding or removing bindings.

3. Before inserting subtrees into the document, extensions may preprocess them using arbitrary C# code.

4. Before firing the onload event, extensions may declaratively inject new content into the nearly-complete tree using overlays.

5. Once the document is complete and events are running, extensions may modify existing event handlers using aspects.

6. Before beginning CSS parsing, extensions may provide new CSS properties and values for the parser to support.

7. Before computing layout, extensions may provide new layout box types and implementations to affect layout and rendering.

Some of these extension points are simpler than others

due to regularities in the input language, others are more

complicated, and others are as yet unimplemented. Points

(1) and (5) are novel to C3. C3 does not yet implement

points (6) or (7), though they are planned future work;

they are also novel. We explain points (1), (3) and (4) in

Section 3.1, points (2) and (5) in Section 3.2, and ﬁnally

points (6) and (7) in Section 3.3.

3.1 HTML parsing/document construction

Point (1): New tags and DOM nodes

The HTML parser recognizes concrete syntax resembling <tagName attrName="val"> and constructs new DOM nodes for each tag. In most browsers, the choices of which tag names to recognize, and what corresponding objects to construct, are tightly coupled into the parser. In C3, however, we abstract both of these decisions behind a factory, whose interface is shown at the top of Figure 2.

Besides simplifying our code’s internal structure, this

approach permits extensions to contribute factories too.

Our default implementation of this interface provides one “template” element for each of the standard HTML tag names; these templates inform the parser which tag names are recognized, and are then cloned as needed by the parser. Any unknown tag names fall back to returning an HTMLUnknownElement object, as defined by the HTML specification. However, if an extension contributes another factory that provides additional templates, the parser can seamlessly clone those instead of using the fallback: effectively, this extends the language recognized by the parser, as XML3D needed, for example. A trivial example that adds support for a <HelloWorld/> tag is shown in Figure 2. A more realistic example is used by C3 to support overlays (see Figure 4 and below).

The factory abstraction also gives us the ﬂexibility

to support additional experiments: rather than adding

new tags, a researcher might wish to modify existing tags.

Therefore, we permit factories to provide a new template

for existing tag names—and we require that at most one

extension does so per tag name. This permits extensions

to easily subclass the C3 DOM implementation, e.g. to

add instrumentation or auditing, or to modify existing

functionality. Together, these extensions yield a parser

that accepts a superset of the standard HTML tags and

still produces a DOM tree as output.
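The template-and-clone behavior described above can be sketched as follows. The sketch is in Python for brevity rather than in C3’s own C#, and every name in it (TagRegistry, add_factory, create, the lower-casing of tag names) is a hypothetical stand-in, not C3’s actual API. Conflict checks between extensions are deliberately omitted.

```python
import copy
from typing import Dict, Iterable


class Element:
    """Hypothetical stand-in for a DOM element class."""

    def __init__(self, tag_name: str) -> None:
        self.tag_name = tag_name
        self.attributes: Dict[str, str] = {}


class HTMLUnknownElement(Element):
    """Fallback node type used when no template matches a tag name."""


class HelloWorldTag(Element):
    """Extension-defined node type, mirroring the <HelloWorld/> example."""

    def __init__(self) -> None:
        super().__init__("HelloWorld")


class TagRegistry:
    """Holds one template per tag name and clones it for each parsed tag."""

    def __init__(self, default_templates: Iterable[Element]) -> None:
        self._templates: Dict[str, Element] = {}
        self.add_factory(default_templates)

    def add_factory(self, templates: Iterable[Element]) -> None:
        # A later factory may add new names or replace an existing template.
        for template in templates:
            self._templates[template.tag_name.lower()] = template

    def create(self, tag_name: str) -> Element:
        template = self._templates.get(tag_name.lower())
        if template is None:
            return HTMLUnknownElement(tag_name)  # unknown tags fall back
        return copy.deepcopy(template)  # each parsed tag gets its own node


registry = TagRegistry([Element("div"), Element("p")])
registry.add_factory([HelloWorldTag()])  # an extension contributes a factory
node = registry.create("HelloWorld")
```

Because the registry only ever clones templates, replacing the template for an existing tag name is enough to substitute a subclassed node type everywhere that tag is parsed.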

Point (3): Preprocessing subtrees

The HTML 5 parsing algorithm produces a document tree in a bottom-up manner: nodes are created and then attached to parent nodes, which eventually are attached to the root DOM node. Compiler authors have long known that it is useful to support semantic actions, callbacks that examine or preprocess subtrees as they are constructed. Indeed, the HTML parsing algorithm itself specifies some behaviors that are essentially semantic actions, e.g., “when an <img/> is inserted into the document, download the referenced image file”. Extensions might use this ability to collect statistics on the document, or to sanitize it during construction. These actions typically are local (they examine just the newly-inserted tree) and rarely mutate the surrounding document. (In HTML in particular, because inline scripts execute during the parsing phase, the document may change arbitrarily between two successive semantic-action callbacks, and so semantic actions will be challenging to write if they are not local.)

Footnote: Firefox seems not to use a factory; Chrome uses one, but the choice of factory is fixed at compile-time. C3 can load factories dynamically.

public interface IParserMutatorExtension {
  IEnumerable<string> TagNamesOfInterest { get; }
  void OnFinishedParsing(Element element);
}

Figure 3: The interface for HTML parser semantic actions

Base constructions

  <overlay/>
      Root node of extension document
  <insert selector="selector" where="before|after"/>
      Insert new content adjacent to all nodes matched by CSS selector
  <replace selector="selector"/>
      Replace existing subtrees matching selector with new content
  <self attrName="value" .../>
      Used within <replace/>, refers to node being replaced and permits modifying its attributes
  <contents/>
      Used within <replace/>, refers to children of node being replaced

Syntactic sugar

  <before .../>    is equivalent to    <insert where="before" .../>
  <after .../>     is equivalent to    <insert where="after" .../>

  <modify selector="sel" where="before">
    <self new attributes>
      new content
    </self>
  </modify>

    is equivalent to

  <replace selector="sel">
    <self new attributes>
      new content
      <contents/>
    </self>
  </replace>

  and likewise for where="after"

Figure 4: The overlay language for document construction extensions. The bottom set of tags are syntactic sugar.

Extensions in C3 can deﬁne custom semantic actions

using the interface shown in Figure 3. The interface sup-

plies a list of tag names, and a callback to be used when

tags of those names are constructed.
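How a parser might drive such semantic actions can be sketched as follows, in Python rather than C#. The sketch assumes a toy bottom-up tree builder in place of a real HTML parser; Node, Dispatcher, build and ImageCounter are hypothetical names chosen for illustration, and only the two members of the Figure 3 interface are taken from the text.

```python
from typing import Dict, List


class Node:
    """Hypothetical minimal DOM node."""

    def __init__(self, tag: str) -> None:
        self.tag = tag
        self.children: List["Node"] = []


class ImageCounter:
    """Example semantic action: counts <img> elements as they are completed."""

    tag_names_of_interest = ["img"]

    def __init__(self) -> None:
        self.count = 0

    def on_finished_parsing(self, element: Node) -> None:
        self.count += 1  # local action: looks only at the new element


class Dispatcher:
    """Indexes extensions by tag name so each callback lookup is cheap."""

    def __init__(self, mutators) -> None:
        self._by_tag: Dict[str, list] = {}
        for mutator in mutators:
            for name in mutator.tag_names_of_interest:
                self._by_tag.setdefault(name.lower(), []).append(mutator)

    def finished(self, element: Node) -> None:
        for mutator in self._by_tag.get(element.tag.lower(), []):
            mutator.on_finished_parsing(element)


def build(spec, dispatcher: Dispatcher) -> Node:
    """Build a subtree from (tag, [child specs]); children are built first."""
    tag, child_specs = spec
    node = Node(tag)
    for child in child_specs:
        node.children.append(build(child, dispatcher))
    dispatcher.finished(node)  # fires once the whole subtree exists
    return node


counter = ImageCounter()
tree = build(("body", [("img", []), ("p", [("img", [])])]), Dispatcher([counter]))
```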

Point (4): Document construction

Firefox pioneered the ability to both define application UI and define extensions to that UI using a single declarative markup language (XUL), an approach whose success is witnessed by the variety and popularity of Firefox’s extensions. The fundamental construction is the overlay, which behaves like a “tree-shaped patch”: the children of the <overlay/> select nodes in a target document and define content to be inserted into or modified within them, much as hunks within a patch select lines in a target text file. C3 adapts and generalizes this idea for HTML.

Our implementation adds eight new tags to HTML, shown in Figure 4, to define overlays and the various actions they can perform. As they are a language extension to HTML, we inform the parser of these new tags using the IDOMTagFactory described above.

Overlays can <insert/> or <replace/> elements, as matched by CSS selectors. To support modifying content, we give overlays the ability to refer to the target node (<self/>) or its <contents/>. Finally, we define syntactic sugar to make overlays easier to write.

Figure 5 shows a simple but real example used during development of our system, to simulate bulleted lists while generated content support was not yet implemented. It appends a <style/> element to the end of the <head/> subtree (and fails if no <head/> element exists), and inserts a <span/> element at the beginning of each <li/>.

<overlay>
  <modify selector="head" where="after">
    <self>
      <style>
        li > .bullet { color: blue; }
      </style>
    </self>
  </modify>
  <before selector="li > *:first-child">
    <span class="bullet">•</span>
  </before>
</overlay>

Figure 5: Simulating list bullets (in language of Fig. 4)
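The syntactic sugar of Figure 4 can be made concrete with a small rewriting function. The sketch below is Python using the standard xml.etree.ElementTree module; desugar_modify is a hypothetical helper, not part of C3, and it assumes that the new content consists of element children only (text placed directly inside <self> is ignored).

```python
import xml.etree.ElementTree as ET


def desugar_modify(modify: ET.Element) -> ET.Element:
    """Rewrite a <modify> element into the equivalent <replace> element."""
    where = modify.get("where")
    if where not in ("before", "after"):
        raise ValueError("where must be 'before' or 'after'")
    old_self = modify.find("self")
    if old_self is None:
        raise ValueError("<modify> needs a <self> child")

    replace = ET.Element("replace", {"selector": modify.get("selector", "")})
    new_self = ET.SubElement(replace, "self", dict(old_self.attrib))
    new_content = list(old_self)        # element children only
    contents = ET.Element("contents")   # stands for the original children
    if where == "before":
        new_self.extend(new_content + [contents])
    else:
        new_self.extend([contents] + new_content)
    return replace


sugar = ET.fromstring(
    '<modify selector="head" where="after"><self><style/></self></modify>'
)
core = desugar_modify(sugar)
```

For the where="after" case used in Figure 5, the resulting <self> holds <contents/> followed by the new <style/> element, which is why the style sheet ends up appended to the end of <head/>.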

The subtlety of defining the semantics of overlays lies in their interactions with scripts: when should overlays be applied to the target document? Clearly overlays must be applied after the document structure is present, so a strawman approach would apply overlays “when parsing finishes”. This exposes a potential inconsistency, as scripts that run during parsing would see a partial, not-yet-overlaid document in which two given nodes are adjacent, while scripts that run after parsing would see an overlaid document in which those nodes may no longer be adjacent. However, the HTML specification offers a way out: the DOM raises a particular event, onload, that indicates the document has finished loading and is ready to begin execution. Prior to that point, the document structure is in flux, and so we choose to apply overlays as part of that flux, immediately before the onload event is fired. This may break poorly-coded sites, but in practice has not been an issue with Firefox’s extensions.

Footnote: We apply the overlays using just one general-purpose callback within our code. This callback could be factored as a standalone, ad-hoc extension point, making overlays themselves truly an extension to C3.

3.2 JS execution

Point (2): Runtime environment

Extensions such as Maverick may wish to inject new properties into the JS global object. This object is an input to all scripts, and provides the initial set of functionality available to pages. As an input, it must be constructed before HTML parsing begins, as the constructed DOM nodes should be consistent with the properties available from the global object: e.g., document.body must be an instance of window.HTMLBodyElement. This point in the document’s execution is stable (no scripts have executed, no nodes have been constructed) and so we permit extensions to manipulate the global object as they please. (This could lead to inconsistencies, e.g. if they modify window.HTMLBodyElement but do not replace the implementation of <body/> tags using the prior extension points. We ignore such buggy extensions for now.)
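The ordering constraint in this paragraph (extensions edit the global object only before parsing begins) can be sketched as below. This is an illustrative Python model, not C3 code: Page, extend_scope, begin_parsing and the placeholder bindings are all assumptions made for the example.

```python
from typing import Callable, Dict

GlobalScope = Dict[str, object]


class Page:
    """Hypothetical model of one document's lifetime."""

    def __init__(self) -> None:
        # Placeholder bindings; a real global object has many more.
        self.global_scope: GlobalScope = {
            "HTMLBodyElement": "built-in <body> constructor",
            "XMLHttpRequest": "built-in XHR constructor",
        }
        self._parsing_started = False

    def extend_scope(self, extension: Callable[[GlobalScope], None]) -> None:
        # Only allowed while no node has been built and no script has run.
        if self._parsing_started:
            raise RuntimeError("global scope extensions must run before parsing")
        extension(self.global_scope)

    def begin_parsing(self) -> None:
        self._parsing_started = True


def add_webcam(scope: GlobalScope) -> None:
    scope["webcam"] = "hypothetical device object"  # adds a binding


def remove_xhr(scope: GlobalScope) -> None:
    scope.pop("XMLHttpRequest", None)  # removes a binding


page = Page()
page.extend_scope(add_webcam)
page.extend_scope(remove_xhr)
page.begin_parsing()
```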

Point (5): Scripts themselves

The extensions de-

scribed so far modify discrete pieces of implementation,

such as individual node types or the document structure,

because there exist ways to name each of these resources

statically: e.g., overlays can examine the HTML source

of a page and write CSS selectors to name parts of the

structure. The analogous extension to script code needs

to modify the sources of individual functions. Many JS

idioms have been developed to achieve this, but they all

suffer from JS’s dynamic nature: function names do not

exist statically, and scripts can create new functions or

alias existing ones at runtime; no static inspection of the

scripts’ sources can precisely identify these names. More-

over, the common idioms used by extensions today are

brittle and prone to silent failure.

C3 includes our prior work [10], which addresses this disparity by modifying the JS compiler to support aspect-oriented programming using a dynamic weaving mechanism to advise closures (rather than variables that point to them). Only a dynamic approach can detect runtime-evaluated functions, and this requires compiler support to advise all aliases to a function (rather than individual names). As a side benefit, aspects’ integration with the compiler often improves the performance of the advice: in the work cited, we successfully evaluated our approach on the sources of twenty Firefox extensions, and showed that they could express nearly all observed idioms with shorter, clearer and often faster code.
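The difference between patching a name and advising a closure can be illustrated with a small model. The Python sketch below is not the weaving mechanism of the cited work: advise and invoke are hypothetical, and invoke merely stands in for an engine call path that consults advice keyed on the function object itself.

```python
from typing import Callable, Dict, List

_advice: Dict[Callable, List[Callable]] = {}


def advise(target: Callable, around: Callable) -> None:
    """Attach around-advice to the function object itself, not to a name."""
    _advice.setdefault(target, []).append(around)


def invoke(fn: Callable, *args):
    """Stand-in for the engine's call path, which consults the advice table."""
    call = fn
    for around in _advice.get(fn, []):
        # Bind the current callee and advice so each layer wraps the previous.
        call = (lambda proceed, a: lambda *xs: a(proceed, *xs))(call, around)
    return call(*args)


def greet(name: str) -> str:
    return "hello " + name


alias = greet  # a second name for the same closure

# Name-based patching: only callers that use the rebound name see the change.
namespace = {"greet": greet}
namespace["greet"] = lambda name: greet(name).upper()
patched_via_name = namespace["greet"]("ada")
unpatched_via_alias = alias("ada")

# Closure-based advice: every alias of the function is affected.
advise(greet, lambda proceed, name: proceed(name) + "!")
advised_via_alias = invoke(alias, "ada")
```

The alias escapes the name-based patch but not the closure-based advice, which is the property the paragraph above attributes to dynamic weaving.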

3.3 CSS and layout

Discussion

An extensible CSS engine permits incre-

mentally adding new features to layout in a modular, clean

way. The CSS 3 speciﬁcations themselves are a step in

this direction, breaking the tightly-coupled CSS 2.1 spec-

iﬁcation into smaller pieces. A true test of our proposed

extension points’ expressiveness would be to implement

new CSS 3 features, such as generated content or the

ﬂex-box model, as extensions. An even harder test would

be to extricate older CSS 2 features, such as ﬂoats, and re-

implement them as compositional extensions. The beneﬁt

to successfully implementing these extensions is clear: a

stronger understanding of the semantics of CSS features.

We discovered the possibility of these CSS extension

points quite recently, in exploring the consequences of

making each stage of the layout pipeline extensible “in the

same way” as the DOM/JS pipeline is. To our knowledge,

implementing the extension points below has not been

done before in any browser, and is planned future work.

Point (6): Parsing CSS values

We can extend the

CSS language in four ways: 1) by adding new prop-

erty names and associated values, 2) by recognizing new

values for existing properties, 3) by extending the set of

selectors, or 4) by adding entirely new syntax outside of

style declaration blocks. The latter two are beyond the

scope of an extension, as they require more sweeping

changes to both the parser and to layout, and are better

suited to an alternate implementation of the CSS parser

altogether (i.e., a different conﬁguration of C3).

Supporting even just the ﬁrst two extension points is

nontrivial. Unlike HTML’s uniform tag syntax, nearly

every CSS attribute has its own idiosyncratic syntax:

font: italic bold 10pt/1.2em "Gentium", serif;

margin: 0 0 2em 3pt;

display: inline-block;

background-image: url(mypic.jpg);

...

However, a style declaration itself is very regular, being a semicolon-separated list of colon-separated name/value pairs. Moreover, the CSS parsing algorithm discards any un-parsable attributes (up to the semicolon), and then parses the rest of the style declaration normally.

Supporting the ﬁrst extension point—new property

names—requires making the parser table-driven and reg-

istering value-parsing routines for each known property

name. Then, like HTML tag extensions, CSS property ex-

tensions can register new property names and callbacks to

parse the values. (Those values must never contain semi-

colons, or else the underlying parsing algorithm would

not be able to separate one attribute from another.)

Supporting the second extension point is subtler. Un-

like the HTML parser’s uniqueness constraint on tag

names, here multiple extensions might contribute new

values to an existing property; we must ensure that the

syntaxes of such new values do not overlap, or else pro-

vide some ranking to choose among them.
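Since these extension points are unimplemented, the following is only one possible shape for a table-driven declaration parser, sketched in Python. DeclarationParser, register and the example properties (including the made-up spin property and hexagon value) are hypothetical; registration order is used here as the ranking among competing value parsers, which is just one of the options mentioned above.

```python
from typing import Callable, Dict, List, Optional

# A value parser returns the parsed value, or None if it rejects the text.
ValueParser = Callable[[str], Optional[object]]


class DeclarationParser:
    """Table-driven parser for the body of a style declaration block."""

    def __init__(self) -> None:
        self._table: Dict[str, List[ValueParser]] = {}

    def register(self, prop: str, parser: ValueParser) -> None:
        # Several parsers may serve one property; earlier ones rank higher.
        self._table.setdefault(prop.lower(), []).append(parser)

    def parse(self, block: str) -> Dict[str, object]:
        result: Dict[str, object] = {}
        # Naive split: relies on values never containing semicolons.
        for declaration in block.split(";"):
            name, colon, raw = declaration.partition(":")
            if not colon:
                continue  # not a name/value pair: discard
            name = name.strip().lower()
            for parser in self._table.get(name, []):
                value = parser(raw.strip())
                if value is not None:
                    result[name] = value
                    break  # first accepting parser wins
        return result


def parse_display(text: str) -> Optional[str]:
    return text if text in ("inline", "block", "inline-block") else None


def parse_display_extension(text: str) -> Optional[str]:
    return text if text == "hexagon" else None  # new value, existing property


css = DeclarationParser()
css.register("display", parse_display)
css.register("display", parse_display_extension)
css.register("spin", lambda t: float(t[:-3]) if t.endswith("deg") else None)
parsed = css.parse("display: hexagon; color: red; spin: 90deg; display bogus")
```

Here the color declaration is dropped because no parser is registered for it, and the malformed final declaration is skipped, mirroring the discard-and-continue behavior described above.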

Point (7): Composing layout

The CSS layout algo-

rithm describes how to transform the document tree (the

content model) into a tree of boxes of varying types, ap-

pearances and positions. Some boxes represent lines of

text, while others represent checkboxes, for example. This

transformation is not obviously compositional: many

CSS properties interact with each other in non-trivial

ways to determine precisely which types of boxes to con-

struct. Rather than hard-code the interactions, the layout

transformation must become table-driven as well. Then

both types of extension above become easy: extensions

can create new box subtypes, and patch entries in the

transformation table to indicate when to create them.
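A table-driven box construction step might look like the sketch below, again in Python and again hypothetical, since this extension point is future work. The ordered rule list, make_box and HexagonBox are assumptions for illustration; the one interaction encoded (a floated element receives a block box even when its display value is inline) follows CSS 2.1.

```python
from typing import Callable, Dict, List, Tuple, Type

Style = Dict[str, str]


class Box:
    def __init__(self, style: Style) -> None:
        self.style = style


class InlineBox(Box):
    pass


class BlockBox(Box):
    pass


Rule = Tuple[Callable[[Style], bool], Type[Box]]

# Ordered rules: the first predicate that matches picks the box type.
rules: List[Rule] = [
    # Property interaction: floating overrides an inline display value.
    (lambda s: s.get("float", "none") != "none", BlockBox),
    (lambda s: s.get("display") == "block", BlockBox),
    (lambda s: True, InlineBox),  # default
]


def make_box(style: Style) -> Box:
    for matches, box_type in rules:
        if matches(style):
            return box_type(style)
    raise LookupError("no rule matched")


class HexagonBox(BlockBox):
    """Hypothetical extension-defined box type."""


# An extension patches the table by inserting a higher-priority rule.
rules.insert(0, (lambda s: s.get("display") == "hexagon", HexagonBox))
```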

4 Evaluation

The C3 platform is rapidly evolving, and only a few ex-

tensions have yet been written. To evaluate our platform,

we examine: the performance of our extension points,

ensuring that the beneﬁts are not outweighed by huge

overheads; the expressiveness, both in the ease of “port-

ing” existing extensions to our model and in comparison

to other browsers’ models; and the security implications

of providing such pervasive customizations.

4.1 Performance

Any time spent running the extension manager or conﬂict

analyses slows down the perceived performance of the

browser. Fortunately, this process is very cheap: with one

extension of each supported type, it costs roughly 100ms

to run the extensions. This time includes: enumerating all

extensions (27ms), loading all extensions (4ms), and de-

tecting parser-tag conﬂicts (3ms), mutator conﬂicts (2ms),

and overlay conflicts (72ms). All but the last of these tasks run just once, at browser startup; overlay conflict detection must run per-page. Enumerating all extensions

currently reads a directory, and so scales linearly with

the number of extensions. Parser and mutator conﬂict

detection scale linearly with the number of extensions

as well; overlay conﬂict detection is more expensive as

each overlay provides more interacting constraints than

other types of extensions do. If necessary, these costs

can be amortized further by caching the results of conﬂict

detection between browser executions.
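As a quick check on the figures above, the reported component times sum to slightly more than the quoted 100ms, and only the overlay check recurs on every page:

```latex
\begin{align*}
t_{\text{total}}    &= 27 + 4 + 3 + 2 + 72 = 108\ \text{ms} \\
t_{\text{startup}}  &= 27 + 4 + 3 + 2 = 36\ \text{ms (once per browser launch)} \\
t_{\text{per-page}} &= 72\ \text{ms (overlay conflict detection)}
\end{align*}
```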

4.2 Expressiveness

Figure 6 lists several examples of extensions available

for IE, Chrome, and Firefox, and the corresponding C3

extension points they would use if ported to C3. Many of

these extensions simply overlay the browser’s user inter-

face and require no additional support from the browser.

Some, such as Smooth Gestures or LastTab, add or revise

UI functionality. As our UI is entirely script-driven, we

support these via script extensions. Others, such as the

various Native Client libraries, are sandboxed programs

that are then exposed through JS objects; we support the

JS objects and .Net provides the sandboxing.

Figure 6 also shows some research projects that are not implementable as extensions in any browser other than C3. As described below, these projects extend the HTML language, CSS layout, and JS environment to achieve their functionality. Implementing these on C3 requires no hacking of C3, leading to a much lower learning curve and fewer implementation pitfalls than modifying existing browsers. We examine some examples, and how they might look in C3, in more detail here.

4.2.1 XML3D: Extending HTML, CSS and layout

XML3D [14] is a recent project aiming to provide 3D scenes and real-time ray-traced graphics for web pages, in a declarative form analogous to <svg/> for two-dimensional content. This work uses XML namespaces to define new scene-description tags and requires modifying each browser to recognize them and construct specialized DOM nodes accordingly. To style the scenes, this work must modify the CSS engine to recognize new style attributes. Scripting the scenes and making them interactive requires constructing JS objects that expose the customized properties of the new DOM nodes. It also entails informing the browser of a new scripting language (AnySL) tailored to animating 3D scenes.

Instead of modifying the browser to recognize new tag names, we can use the new-tag extension point to define them in an extension, and provide a subclassed <script/> implementation recognizing AnySL. Similarly, we can provide new CSS values and new box subclasses for layout to use. The full XML3D extension would consist of these four extension hooks and the ray-tracer engine.

4.2.2 Maverick: Extensions to the global scope

Maverick [12] aims to connect devices such as webcams or USB keys to web content, by writing device drivers in JS and connecting them to the devices via Native Client (NaCl) [17]. NaCl exposes a socket-like interface to web JS over which all interactions with native modules are multiplexed. To expose its API to JS, Maverick injects an actual DOM <embed/> node into the document, stashing state within it, and using JS properties on that object to communicate with NaCl. This object can then transliterate the image frames from the webcam into Base64-encoded src URLs for other scripts’ use in <img/> tags, and so reuse the browser’s image decoding libraries.

There are two main annoyances with Maverick’s implementation that could be avoided in C3. First, NaCl isolates native modules in a strong sandbox that prevents direct communication with resources like devices; Maverick could not be implemented in NaCl without modifying the sandbox to expose a new system call and writing untrusted glue code to connect it to JS; in C3, trusted JS objects can be added without recompiling C3 itself. Second, implementing Maverick’s exposed API requires carefully managing low-level NPAPI routines that must mimic JS’s name-based property dispatch; in C3, exposing properties can simply reuse the JS property dispatch, as in Section 2.3.

Extension (available from): C3-equivalent extension points used

IE:
  Explorer bars: (4) overlay the main browser UI
  Context menu items: (4) overlay the context menu in the browser UI
  Accelerators: (4) overlay the context menu
  WebSlices: (4) overlay browser UI

Chrome:
  Gmail checkers (https://chrome.google.com/extensions/search?q=gmail): (4) overlay browser UI, (5) script advice
  Skype (http://go.skype.com/dc/clicktocall): (4) overlay browser UI, (2) new JS objects, (5) script advice
  Smooth Gestures (http://goo.gl/rN5Y): (4) overlay browser UI, (5) script advice
  Native Client libraries (http://code.google.com/p/nativeclient/): (2) new JS objects

Firefox:
  TreeStyleTab (https://addons.mozilla.org/en-US/firefox/addon/5890/): (4) overlay tabbar in browser UI, inject CSS
  LastTab (https://addons.mozilla.org/en-US/firefox/addon/112/): (5) script advice
  Perspectives [16]: (5) script extensions, (4) overlay error UI
  Firebug (http://getfirebug.com/): (4) overlays, (5) script extensions, (2) new JS objects

Research projects:
  XML3D [14]: (1) new HTML tags, (6) new CSS values, (7) new layouts
  Maverick [12]: (2) new JS objects
  Fine [6]: (1) HTML <script/> tag replacement
  RePriv [5]: (2) new JS objects

Figure 6: Example extensions in IE, Firefox, and Chrome, as well as research projects best implemented in C3, and the C3 extension points that they might use

Ultimately, using a DOM node to expose a device is not the right abstraction: it is not a node in the document but rather a global JS object like XMLHttpRequest. And while using Base64-encoded URLs is a convenient implementation trick, it would be far more natural to call the image-decoding libraries directly, avoiding both overhead and potential transcoding errors.

4.2.3 RePriv: Extensions hosting extensions

RePriv [5] runs in the background of the browser and mines user browsing history to infer personal interests. It carefully guards the release of that information to websites, via APIs whose uses can be verified to avoid unwanted information leakage. At the same time, it offers its own extension points for site-specific “interest miners” to use to improve the quality of inferred information. These miners are all scheduled to run during an onload event handler registered by RePriv. Finally, extensions can be written to use the collected information to reorganize web pages at the client to match the user’s interests.

While this functionality is largely implementable as a plug-in in other browsers, several factors make it much easier to implement in C3. First and foremost, RePriv’s security guarantees rely on C3 being entirely managed code: we can remove the browser from RePriv’s trusted computing base by isolating RePriv extensions in an AppDomain and leveraging .Net’s freedom from common exploits such as buffer overflows. Obtaining such a strong security guarantee in other browsers is at best very challenging. Second, the document construction hook makes it trivial for RePriv to install the onload event handler. Third, AppDomains ensure the memory isolation of every miner from each other and from the DOM of the document, except as mediated by RePriv’s own APIs; this makes proving the desired security properties much easier. Finally, RePriv uses Fine [6] for writing its interest miners; since C3, RePriv and Fine target .Net, RePriv can reuse .Net’s assembly-loading mechanisms.

4.3 Other extension models

4.3.1 Extensions to application UI

Internet Explorer 4.0 introduced two extension points per-

mitting customized toolbars (Explorer Bars) and context-

menu entries. These extensions were written in native C++

code, had full access to the browser’s internal DOM repre-

sentations, and could implement essentially any function-

ality they chose. Unsurprisingly, early extensions often

compromised the browser’s security and stability. IE 8

later introduced two new extension points that permit-

ted self-updating bookmarks of web-page snippets (Web

Slices) and context-menu items to speed access to repeti-

tive tasks (Accelerators), providing safer implementations

of common uses for Explorer Bars and context menus.

The majority of IE’s interface is not modiﬁable by ex-

tensions. By contrast, Firefox explored the possibility

that entire application interfaces could be implemented

in a markup language, and that a declarative extension

mechanism could overlay those UIs with new construc-

tions. Research projects such as Perspectives change the

way Firefox’s SSL connection errors are presented, while

others such as Xmarks or Weave synchronize bookmarks

and user settings between multiple browsers. The UI for

these extensions is written in precisely the same declar-

ative way as Firefox’s own UI, making it as simple to

extend Firefox’s browser UI as it is to design any website.

But the single most compelling feature of these ex-

tensions is also their greatest weakness: they permit im-

plementing features that were never anticipated by the

browser designers. End users can then install multiple

such extensions, thereby losing any assurance that the

composite browser is stable, or even that the extensions

are compatible with each other. Indeed, Chrome’s care-

fully curtailed extension model is largely a reaction to the

instabilities often seen with Firefox extensions. Chrome

permits extensions only minimal change to the browser’s

UI, and prevents interactions between extensions. For

comparison, Chrome directly implements bookmarks

and settings synchronization, and now permits extension

context-menu actions, but the Perspectives behavior re-

mains unimplementable by design.

Our design for overlays is based strongly on Firefox’s

declarative approach, but provides stronger semantics for

overlays so that we can detect and either prevent or correct

conﬂicts between multiple extensions. We also general-

ized several details of Firefox’s overlay mechanism for

greater convenience, without sacriﬁcing its analyzability.

4.3.2 Extensions to scripts

In tandem with the UI extensions, almost the entirety

of Firefox’s UI behaviors are driven by JS, and again

extensions can manipulate those scripts to customize

those behaviors. A similar ability lets extensions modify

or inject scripts within web pages. Extensions such as

LastTab change the tab-switching order from cyclic to

most-recently-used, while others such as Ghostery block

so-called “web tracking bugs” from executing. Firefox

exposes a huge API, opening basically the entire plat-

form to extension scripts. This ﬂexibility also poses a

problem: multiple extensions may attempt to modify the

same scripts, often leading to broken or partially-modiﬁed

scripts with unpredictable consequences.

Modern browser extensions, such as Firefox’s Jetpack or Chrome’s extensions, are typically developed using HTML, JS, and CSS. While Firefox “jetpacks” are

currently still fully-privileged, Chrome extensions run

in a sandboxed process. Chrome extensions cannot ac-

cess privileged information and cannot crash or hang the

browser. While these new guarantees are necessary for the

stability of a commercial system protecting valuable user

information, they also restrict the power of extensions.

One attempt to curtail these scripts’ interactions with each other within web pages is the Fine project [6]. Instead of directly using JS, the authors use a dependently-typed programming language to express the precise read- and write-sets of extension scripts, and a security policy constrains the information flow between them. Extensions that satisfy the security policy are provably non-conflicting. The Fine project can target C3 easily, either by compiling its scripts to .Net assemblies and loading them dynamically (by subclassing the <script/> tag), or by statically compiling its scripts to JS and dynamically injecting them into web content (via the JS global-object hook). Guha et al. successfully ported twenty Chrome extensions to Fine and compiled them to run on C3 with minimal developer effort.

As mentioned earlier, C3 includes our prior work on aspect-oriented programming for JS [10], permitting extensions clearer language mechanisms to express how their modifications apply to existing code. Beyond the performance gains and clarity improvements, eliminating the need for brittle mechanisms and exposing the intent of the extension makes compatibility analyses between extensions feasible.

4.4 Security considerations

Of the ﬁve implemented extension points, two are written

in .Net and have full access to our DOM internals. In

particular, new DOM nodes or new JS runtime objects

that subclass our implementation may use protected DOM

ﬁelds inappropriately and violate the same-origin policy.

We view this ﬂexibility as both an asset and a liability:

it permits researchers to experiment with alternatives to

the SOP, or to prototype enhancements to HTML and

the DOM. At the same time, we do not advocate these

extensions for web-scale use. The remaining extension

points are either limited to safe, narrow .Net interfaces

or are written in HTML and JS and inherently subject to

the SOP. Sanitizing potentially unsafe .Net extensions to

preserve the SOP is itself an interesting research problem.

Possible approaches include using .Net AppDomains to

segregate extensions from the main DOM, or static analy-

ses to exclude unsafe accesses to DOM internals.

5 Future work

We have focused so far on the abilities extensions have

within our system. However, the more powerful exten-

sions become, the more likely they are to conﬂict with one

another. Certain extension points are easily amenable to

conﬂict detection; for example, two parser tag extensions

cannot both contribute the same new tag name. However,

in previous work we have shown that deﬁning conﬂicts

precisely between overlay extensions, or between JS runtime extensions, is a more challenging task [9].

Assuming a suitable notion of extension conﬂict exists

for each extension type, it falls to the extension loading

mechanism to ensure that, whenever possible, conﬂicting

extensions are not loaded. In some ways this is very sim-

ilar to the job of a compile-time linker, ensuring that all

modules are compatible before producing the executable

image. Such load-time prevention gives users a much bet-

ter experience than in current browsers, where problems

never surface until runtime. However, not all conflicts are

detectable statically, and so some runtime mechanism is

still needed to detect conﬂict, blame the offending exten-

sion, and prevent the conﬂict from recurring.
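For the simplest case named above, parser tag extensions, a load-time check could be sketched as follows in Python. The function load_extensions, the manifest format, and the first-loaded-wins policy are assumptions for illustration; the text does not specify how C3’s loader resolves such conflicts.

```python
from typing import Dict, List, Set, Tuple


def load_extensions(
    manifests: Dict[str, Set[str]]
) -> Tuple[List[str], Dict[str, str]]:
    """Load extensions in order, refusing any whose tag names are taken.

    manifests maps an extension name to the tag names it contributes.
    Returns the loaded extension names and a blame message per rejection.
    """
    owner: Dict[str, str] = {}  # tag name -> extension that provides it
    loaded: List[str] = []
    rejected: Dict[str, str] = {}
    for extension, tags in manifests.items():
        clashes = sorted(t.lower() for t in tags if t.lower() in owner)
        if clashes:
            tag = clashes[0]
            rejected[extension] = f"<{tag}> is already provided by {owner[tag]}"
            continue  # conflicting extension is not loaded at all
        for t in tags:
            owner[t.lower()] = extension
        loaded.append(extension)
    return loaded, rejected


loaded, rejected = load_extensions({
    "overlays": {"overlay", "insert", "replace"},
    "hello": {"HelloWorld"},
    "rival": {"helloworld", "marquee3d"},
})
```

Like a linker reporting a duplicate symbol, the check names both the clashing tag and the extension that already owns it, so the user sees the problem before any page is loaded.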

6 Conclusion

We presented C3, a platform implementing HTML, CSS and JS, and explored how its design was tuned for

easy reconﬁguration and runtime extension. We presented

several motivating examples for each extension point,

and conﬁrmed that our design is at least as expressive as

existing extension systems, supporting current extensions

as well as new ones not previously possible.

References

[1]

BARTH, A., FELT, A. P., SAXENA, P., AND BOODMAN, A.

Protecting browsers from extension vulnerabilities. In NDSS

(2010).

[2]

BARTH, A., WEINBERGER, J., AND SONG, D. Cross-origin

JavaScript capability leaks: Detection, exploitation, and defense.

In SSYM’09: Proceedings of the 18th conference on USENIX secu-

rity symposium (Berkeley, CA, USA, 2009), USENIX Association,

pp. 187–198.

[3]

BEBENITA, M., BRANDNER, F., FÄHNDRICH, M., LOGOZZO,

F., SCHULTE, W., TILLMANN, N., AND VENTER, H. SPUR:

A trace-based JIT compiler for CIL. In OOPSLA/SPLASH ’10:

Proceedings of the 25th ACM SIGPLAN conference on Object-

Oriented Programming Systems, Languages and Applications

(New York, NY, USA, 2010), ACM.

[4]

FÄHNDRICH, M., BARNETT, M., AND LOGOZZO, F. Embedded

contract languages. In SAC ’10: Proceedings of the 2010 ACM

Symposium on Applied Computing (New York, NY, USA, 2010),

ACM, pp. 2103–2110.

[5]

FREDRIKSON, M., AND LIVSHITS, B. RePriv: Re-envisioning

in-browser privacy. Tech. rep., Microsoft Research, Aug. 2010.

[6]

GUHA, A., FREDRIKSON, M., LIVSHITS, B., AND SWAMY, N.

Veriﬁed security for browser extensions. MSR-TR to be available

11/01, September 2010.

[7]

JACKSON, C., AND BARTH, A. Beware of ﬁner-grained origins.

In Web 2.0 Security and Privacy (W2SP 2008) (2008).

[8]

JONES, C. G., LIU, R., MEYEROVICH, L., ASANOVIC, K.,

AND BODIK, R. Parallelizing the Web Browser. In HotPar ’09:

Proceedings of the Workshop on Hot Topics in Parallelism (March

2009), USENIX.

[9]

LERNER, B. S., AND GROSSMAN, D. Language support for

extensible web browsers. In APLWACA ’10: Proceedings of the

2010 Workshop on Analysis and Programming Languages for Web

Applications and Cloud Applications (New York, NY, USA, 2010),

ACM, pp. 39–43.

[10]

LERNER, B. S., VENTER, H., AND GROSSMAN, D. Support-

ing dynamic, third-party code customizations in JavaScript using

aspects. In OOPSLA ’10: Companion of the 25th annual ACM

SIGPLAN conference on Object-oriented programming, systems,

languages, and applications (New York, NY, USA, 2010), ACM.

[11]

MEYEROVICH, L. A., AND BODIK, R. Fast and parallel webpage

layout. In Proceedings of the 19th International Conference on

the World Wide Web (2010), WWW ’10, pp. 711–720.

[12]

RICHARDSON, D. W., AND GRIBBLE, S. D. Maverick: Pro-

viding web applications with safe and ﬂexible access to local

devices. In Proceedings of the 2011 USENIX Conference on Web

Application Development (June 2011), WebApps’11.

[13] RUDERMAN, J. Same origin policy for javascript, Oct. 2010.

[14]

SONS, K., KLEIN, F., RUBINSTEIN, D., BYELOZYOROV, S.,

AND SLUSALLEK, P. XML3D: interactive 3d graphics for the

web. In Web3D ’10: Proceedings of the 15th International Confer-

ence on Web 3D Technology (New York, NY, USA, 2010), ACM,

pp. 175–184.

[15]

WAGNER, G., GAL, A., WIMMER, C., EICH, B., AND FRANZ,

M. Compartmental memory management in a modern web

browser. In Proceedings of the International Symposium on Mem-

ory Management (June 2011), ACM. To appear.

[16]

WENDLANDT, D., ANDERSEN, D. G., AND PERRIG, A. Per-

spectives: Improving ssh-style host authentication with multi-path

probing. In Proceedings of the USENIX Annual Technical Confer-

ence (Usenix ATC) (June 2008).

[17]

YEE, B., SEHR, D., DARDYK, G., CHEN, J., MUTH, R., OR-

MANDY, T., OKASAKA, S., NARULA, N., AND FULLAGAR, N.

Native client: A sandbox for portable, untrusted x86 native code.

In Security and Privacy, 2009 30th IEEE Symposium on (May

2009), pp. 79 –93.