In August 2003, a jury awarded Eolas Technologies Inc. more than
$500 million in damages for Microsoft's alleged infringement of US Patent No.
5,838,906. The '906 patent, which covers a method of embedding interactive
content in Web pages, is unfortunately the type of patent that contributes to
the negative reputation of software patents and, by extension, the USPTO.
Moments after the ubiquitous anti-Microsoft celebration ended, the realization
sank in: this is not good for the Web.
In response, Tim Berners-Lee, father of the Web, author of the
initial HTTP and HTML specifications and the director of the W3C, and others in
the community urged the USPTO to re-examine the patent in light of a set of
prior art herein referred to as the Raggett references. Berners-Lee cautioned
that the patent would cause "substantial economic and technical damage to the
operation of the World Wide Web," the impact of which "...will be felt not only
by those who are alleged to directly infringe, but all whose web pages and
applications rely on the stable, standards-based operation of browsers
threatened by this patent."[1]
Like the '906 patent, the Raggett references teach a method of
embedding interactive content in Web pages using interactive external
applications. Based upon the Raggett references, the USPTO re-examined the '906
patent and issued an office action that rejected the '906 claims as being
unpatentable.
In response, Eolas prepared and submitted rebuttal arguments to
the USPTO. The bulk of the technical argument was presented through Dr. Edward
Felten, a professor at Princeton and the paid expert for Eolas during the
trial. While Dr. Felten framed his arguments in patent jargon, particularly
with regard to the understanding of those "having ordinary skill in the art,"
in reality, the acceptance of his arguments required that one in the art
suspend the application of common sense. Moreover, Dr. Felten's arguments as to
how one of skill would have understood Dave Raggett's publications seemed
directly contrary to how those working on the Web actually did understand Dave
Raggett's words at the time of their publication.
Unfortunately, those who were concerned about the impact this
patent could have on the Web may have reason to be concerned about the
subsequent events in the USPTO. Despite the plain technical disclosures of the
Raggett references, the USPTO returned a second office action which, although
it upheld the rejection[2] of the '906
patent claims, seemed swayed at least in part by Dr. Felten's arguments. Eolas
and Dr. Felten now seek to build on that through another recent filing in the
USPTO that again seeks to suspend the application of common sense to any
technical analysis. Those who urged this reexamination should be concerned that
the USPTO may act upon this one-sided record.
The following is a summary of the main points addressed in this
document. Each topic in the summary links to the section of the document that
describes it thoroughly. The intent is that the reader can digest the arguments
quickly and, at any time, drill down into a given argument to access a more
rigorous discussion.
1) Background (Section III) - Patent
infringement disputes tend to make things more complicated than they need to
be. Because of the potential for confusion, it is often a useful exercise to
take a step back from the problem and apply a measure of common sense.
A large portion of Eolas' argument is based on the interactivity provided by
the VIS application; however, the external application - including the type and
amount of interactivity it provides - is not claimed by the '906 patent.
Interactivity on the Web existed before the '906 patent.
Mosaic helper applications provided the same type and amount of interactivity
as well as the ability for the browser to support non-native formats using
external applications. The difference between helper applications and the '906
patent was simply where the output of the programs was displayed.
Methods for embedding interactive objects were well known in 1993.
Raggett teaches a method for object level embedding enabling the in-line
display of foreign data formats without modification of the browser.
Raggett teaches an EMBED tag, including a type attribute identical to that
described in the '906 patent.
Raggett teaches the use of external applications to provide interactivity with
embedded objects.
Those of skill in the art understood the teachings of Raggett as a method to
embed fully interactive content within Web pages.
Methods for displaying the output of applications in any window, as well as
methods for any client program to reparent another client program's main window
inside of its own window, were functions of the X-Window system and had been
present since its inception in 1984. Furthermore, these methods were used by
programs in and before 1993.
Marc Andreessen proposed using external applications to embed MIME documents in
browser windows in 1993, including embedded controls that enabled interactivity
with the embedded object.
Those of skill in the art understood the potential to embed interactive content
such as drawing objects, animations, spreadsheets, message objects, and
clickable areas within images.
2) Felten I (Section IV) - The
Felten I declaration presents two major arguments in an attempt to convince the
USPTO that the first office action is incorrect and the '906 patent should
stand. First, Dr. Felten contends that the Raggett references are a "slight"
extension of Berners-Lee's ability to display static images, and therefore
teach away from interactivity.[3] Second,
he argues that Raggett doesn't teach interactivity within a browser window, and
that this type of interactivity is claimed by the '906 patent.[4]
Browsers in this time period provided the same level and types of interactivity
as the '906 invention. Simply placing the output of external applications in
the browser window did not change the level of interactivity on the Web.
The '906 patent claims were adjusted to distinguish the invention from existing
interactivity on the Web. The limitations that the application providing
interactive processing must be external to the browser, running on the client
computer, and display its output within the browser window were added to
overcome forms, scripts, and helper applications.
Interactivity within Web browsers was known in 1993.
Dr. Felten describes Mosaic's ability to launch an external HTML editor in
order to edit the contents of a Web page. His argument is misleading; the
external application must enable interactive processing with the embedded
object, not the Web page itself.
Dr. Felten is also incorrect in asserting that there was no realistic way to
edit a Web page from a client workstation. This technology existed in 1993 and
is a popular feature of the Web today.
The use of static images, including rendering an image of an object
server-side, does not preclude interactivity.
Dr. Felten misleadingly applies the term static to the input and output of the
external application without regard to the fact that the '906 invention uses
the same "static" object data and image data.
The '906 patent does not claim enabling interactive processing of the embedded
object by clicking the object's display window inside the browser.
The Raggett references teach provision of interactivity within a browser
window, support for non-native data types using external applications, addition
of support for new types without modifying the browser, and the use of the
EMBED tag, including MIME content-type information.
Dr. Felten's arguments against Raggett describing server-side rendering are
misleading; his arguments against Raggett also describe how the '906 invention
works.
Although Dr. Felten declares the provision of "rich interactivity" a
differentiator, the type and level of interactivity and the external
application providing that interactivity are not claimed by the '906 patent.
Embedding interactive content and enabling interactivity within the browser
window was well known by those of skill in the art.
Contrary to Dr. Felten's declaration that Berners-Lee was never implemented,
Mosaic, admitted prior art, embodies the portions of the Berners-Lee reference
that pertain to the '906 patent.
Dr. Felten's argument that Raggett teaches away from interactivity is based on
highly-selective definitions of the terms filter, render, and return.
Filters do not preclude interactivity.
Filtering is consistent with the '906 patent.
Although Dr. Felten limits his discussion to filters, Raggett taught DLLs,
shared libraries, and separate programs in addition to filters.
The use of the terms "render" and "static images" does not preclude
interactivity. In fact, both terms are consistent with the '906 patent.
Dr. Felten defines "return" in a manner other than that known in the art.
"Returning" a rendered image does not preclude interactivity, and is consistent
with the '906 invention.
Raggett teaches embedded interactivity, as well as types of "non-static" data
such as MPEG movies. This was understood by those of skill in the art.
Dr. Felten falsely declares that interactivity must be provided with the
embedded object.
It is not important whether or not Raggett was implemented; it was disclosed
to, and well understood by, those in the field. Furthermore, the '906 patent
admits that implementing these features is conventional and within the ability
of a person of skill in the art.
One of skill would not implement Raggett to handle only single frame image
data. Raggett teaches supporting content types registered in the MIME database.
An implementer would implement a type-agnostic support mechanism to handle any
MIME-compliant data and be general enough to accommodate future MIME data. The
Raggett references support this statement by describing APIs and binding and
calling mechanisms.
At the time, MPEG movies were a registered MIME content type, so an implementer
would at least include support for multi-frame MPEG movies.
These ideas were understood by those of skill in the art.
Raggett teaches more than the ability to embed equations and postscript
documents. Raggett teaches support for data types specified by MIME content
types. At the time, MIME included support for more applications than eqn and
postscript, and also specified that a MIME compliant application should be
scalable to accommodate future registered MIME types.
Dr. Felten misleadingly describes the importance of Windows DLLs and
inter-process communication. The model he describes as an argument against
Raggett is consistent with the invention described in the '906 patent.
3) Felten II (Section V) - Dr.
Felten declares that the combination of the Berners-Lee and Raggett references
does not teach the claim element of automatically invoking an executable
application to enable interactive processing. He is incorrect. Furthermore, the
Toye reference also teaches this claim element.
Dr. Felten misleadingly restricts Berners-Lee to "static pages." In reality, an
implementation of Berners-Lee, Mosaic, provided the same type and level of
interactivity as the '906 patent.
Dr. Felten misleadingly defines "distributed hypermedia environment" to support
a future argument that Toye does not teach a hypermedia environment.
Dr. Felten misleadingly restricts hyperlinks to "simple, unidirectional
hyperlinks" to support a future argument that Toye does not teach a hypermedia
browser. Hyperlinks are not limited in direction. Furthermore, the '906 patent
does not claim any kind of hyperlink.
c) Toye (Section V.c) - Assuming
that the Raggett references lack the claim element of automatically invoking an
executable application to enable interactive processing of an embedded object,
the '906 invention is unpatentable over a combination of Berners-Lee, Raggett
I, Raggett II, and Toye. Any arguments against Toye that don't pertain to this
claim element are irrelevant, as the remaining claim elements are taught by the
Berners-Lee and Raggett references.
Use of centralized servers for content storage is consistent with the idea of a
distributed environment. For example, Web sites may be located on a single
server, but it is the presence of disparate servers that makes the Web a
distributed environment.
Toye does teach distributed documents within distributed hypermedia
environments.
This claim element is taught by Berners-Lee and Raggett.
Dr. Felten's argument is misleading. It does not matter that Toye teaches more
than the claims; the fact is that it teaches enabling interactive processing of
an embedded object.
Furthermore, the goal of NoteMail, "a tool for organizing, manipulating, and
viewing information on the Internet. Foreseeable enhancements include: a low
level hypermedia browser... and a simple pen-based interface for manipulating
objects" is consistent with the '906 patent.
The Toye reference teaches automatic invocation to display the embedded object.
As stated by Dr. Felten in his testimony, the patent does not claim that
interactivity must be provided without a mouse click.
Toye's method of interactive processing is consistent with the '906 patent.
Toye supports "seamless embedding" which Dr. Felten referred to as a principal
benefit of the '906 patent.
Patent infringement disputes tend to make things more complicated
than they need to be. Because of the potential for confusion, it is often a
useful exercise to take a step back from the problem and apply a measure of
common sense.
a. Eolas' focus on "rich interactivity" is
misleading
The Eolas team focuses the bulk of their argument on what they
refer to as "rich interactivity" provided by the VIS[5]
application.[6] Although VIS was an
impressive application for its time, Eolas' focus on interactivity is a
smokescreen that obscures the fact that the patent doesn't claim specific
implementations of external applications that provide interactivity, nor does
it require any specific degree of interactivity with the embedded object. In
reality, interactivity on the Web existed before the '906 patent, and continues
to exist independently of the '906 patent today.
While the '906 patent claims a browser that launches an
executable application, the application itself is not a part of Eolas' claimed
invention. This fact is supported by Dr. Felten's direct testimony during the
trial.
Q. Okay. Does infringement depend on the nature of the executable
application that is run when the browser runs?
A. It does not depend on the executable application that's invoked by
any particular user. [Felten Testimony at p. 1128, 7-10]
---
Q. In your opinion, is there any need to disclose that information in
order to make and use the browser that is disclosed in the '906 patent?
A. ...there's no need for the person who wants to implement a browser
to know how those other applications work.
Q. And why is that, doctor?
A. Because the description that's in the patent and the appendices is
enough to tell the person who wants to build a browser how to do it, and those
other applications are separate programs. [Felten Testimony at p. 946, 7-18]
This is an important distinction. In the most commonly cited
embodiment of the '906 patent, it is VIS that renders views of the three
dimensional data and presents them to the user.[7]
It is VIS that generates the displayed object and allows interactivity with the
object via an external control panel.[8] Although
VIS is an external application, and therefore not claimed by the '906 patent,[9][10]
it is precisely the "rich interactivity" provided by VIS that is now being
argued by advocates of the '906 patent to distract the USPTO from the invention
actually claimed by the patent.[11]
Multidimensional, real-time image manipulation software was
neither novel nor revolutionary in 1994. Despite limitations such as
insufficient bandwidth and computational capacity,[12]
a variety of visualization techniques existed prior to the '906 patent. In the
section of the '906 patent describing the background of the invention, Doyle
states:
"Much digital data available today exists in the form of
high-resolution multi-dimensional images (e.g., three dimensional images) which
is viewed on a computer while allowing the user to perform real time viewing
transformations on the data in order for the user to better understand the
data." ['906 at Col 5, 60 - 65]
The patent describes the field of medical imaging, specifically
referring to applications that embody "a variety of visualization techniques
and real time computer graphics methods," examples of which are applications
that present Magnetic Resonance Imaging (MRI) and Computed Tomography (CT)
information to the user using a visualization technique.
"An example of such type of data is in medical imaging where advanced
scanning devices, such as Magnetic Resonance Imaging (MRI) and Computed
Tomography (CT), are widely used in the fields of medicine, quality assurance
and meteorology to present physicians, technicians and meteorologists with
large amounts of data in an efficient way. Because visualization of the data is
the best way for a user to grasp the data's meaning, a variety of visualization
techniques and real time computer graphics methods have been developed." ['906
at Col 5, 66 - Col 6, 8]
Despite the fact that image manipulation software was well-known,
and despite the fact that the type of interactivity was inconsequential with
regard to the patent claims, Dr. Felten focuses primarily on this provision of
"rich interactivity" as a differentiator between the '906 invention and the
prior art. Briefly stated, the '906 patent claims a method for running an
application program in a computer network environment, comprising[13]:
A workstation and server coupled to a "distributed hypermedia environment."
Executing on the client workstation, a browser application that parses a
distributed hypermedia document to:
identify text formats in the document
respond to the text formats to initiate processing specified by the formats
display, on the client workstation, a portion of the hypermedia document within
a browser-controlled window
wherein said hypermedia document includes an embed text format that
specifies a location of an object external to the hypermedia document, the
object having:
type information used to identify and locate an executable application external
to the hypermedia document
wherein the embed text format is parsed by the browser to automatically invoke
the external application to:
execute on the client workstation and
display said object and
enable interactive processing of said object within a display area created at
the location within the portion of the hypermedia document displayed in the
browser controlled window.[14]
The fact is that there were admitted prior art browsers, e.g.,
Mosaic, that embodied the first seven elements listed above, and the Raggett
references specifically suggest modifying such browsers to include the final
six elements. Nonetheless, the '906 patent advocates focus their rebuttal
argument on the behavior of an external application that was known in the art
but not claimed by the patent.
Although originally created for Internet mailing systems,
Multipurpose Internet Mail Extensions (MIME) was adopted by the Web community
as a means of client-side data format negotiation.[15][16][17][18]
The heart and soul of the MIME specification was the ability to "...represent
non-textual material such as images and audio fragments, and generally to
facilitate later extensions defining new types of Internet mail for use by
cooperating mail agents."[19] To achieve
this goal, the MIME specification defined a Content-Type field, the purpose of
which was "to describe the data... fully enough that the receiving user agent
can pick an appropriate agent or mechanism to present the data to the user, or
otherwise deal with the data in an appropriate manner."[20]
An initial set of Content-Types was defined by RFC1521. MIME was
carefully designed as an extensible mechanism, with the expectation that the
set of content-types would increase significantly over time.[21]
At the time, it was expected that additions to the larger set of supported
types could generally be accomplished by the creation of new subtypes of these
initial types. In order to ensure that the official set of content types was
developed in an orderly, well-specified, and public manner, MIME defined a
registration process which uses the Internet Assigned Numbers Authority (IANA)
as a central type registry.[22]
The following are examples of content types in 1993:
image -- image data. Image requires a display device (such as a graphical
display, a printer, or a FAX machine) to view the information. Initial subtypes
are defined for two widely-used image formats, jpeg and gif.
audio -- audio data, with initial subtype "basic". Audio requires an audio
output device (such as a speaker or a telephone) to "display" the contents.
video -- video data. Video requires the capability to display moving images,
typically including specialized hardware and software. The initial subtype is
"mpeg".
application -- some other kind of data, typically either uninterpreted binary
data or information to be processed by a mail-based application. The primary
subtype, "octet-stream", is to be used in the case of uninterpreted binary
data, in which case the simplest recommended action is to offer to write the
information into a file for the user.
Content types had sub-types that represented the format of the
type of data that was being provided. Examples of content-type/sub-type
combinations that were defined in RFC1521 were image/jpeg, audio/basic,
video/mpeg, and application/octet-stream. A content-type value beginning with
the characters "X-" represents a private (non-registered) value that could only
be used by consenting systems by mutual agreement.[23]
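As a rough illustration of the type/sub-type convention described above, the following Python sketch splits a Content-Type value into its two parts and flags private values. The names `ContentType`, `parse_content_type`, and `is_private` are hypothetical, and treating the "x-" prefix case-insensitively on either the type or the sub-type is an assumption of this sketch, not something quoted from the references.

```python
from typing import NamedTuple


class ContentType(NamedTuple):
    major: str    # e.g. "video"
    subtype: str  # e.g. "mpeg"


def parse_content_type(value: str) -> ContentType:
    """Split a Content-Type value such as 'video/mpeg' into type and sub-type."""
    # Drop any parameters that follow a ';' and normalize case (assumption).
    essence = value.split(";", 1)[0].strip().lower()
    major, sep, subtype = essence.partition("/")
    if not sep or not major or not subtype:
        raise ValueError(f"not a type/sub-type pair: {value!r}")
    return ContentType(major, subtype)


def is_private(content_type: ContentType) -> bool:
    """True when the type or sub-type carries the private 'x-' prefix."""
    return content_type.major.startswith("x-") or content_type.subtype.startswith("x-")
```

Under these assumptions, `parse_content_type("application/x-vis")` would be reported as private, while `image/jpeg` would not.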
Prior to the '906 patent, the content-type/sub-type concept was
used by the Web community as a means for identifying data formats so that they
could be appropriately processed by the client computer. Mosaic, for example,
used MIME types to handle both native and non-native data.
"Based on a file's MIME type, Mosaic knows whether to handle the file
itself (i.e., if it is an HTML file) or to pass it off to an external viewer.
Some servers (e.g., FTP servers) do not use MIME types. In that case,
Mosaic tries to infer the MIME type from the file extension. For example, a
file ending in .gif is assumed to be a GIF file; one that ends in .html is
assumed to be an HTML file.
You are able to modify the MIME types in the following ways:
change which external viewer is used for a given MIME type
define a new MIME type for a new file type
add a new file extension and specify which MIME type it corresponds to"
[Mosaic User Guide]
MIME content types were used by Raggett in the EMBED tag of his
HTML+ specification to identify types of embedded data so that the appropriate
application could be invoked to embed that data in the page.[24][25]
MIME content types were also used in the EMBED tag of the '906 invention.[26]
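To make the role of the type attribute concrete, here is a minimal Python sketch of a parser that scans a document for EMBED tags and collects each tag's MIME content type. It uses the standard library's `html.parser` purely for illustration. The `src` attribute name, the sample file name, and the helper names `EmbedFinder` and `find_embeds` are assumptions of this sketch; only the EMBED tag and its type attribute come from the references discussed here.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class EmbedFinder(HTMLParser):
    """Collects (type, src) pairs from EMBED tags in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.embeds: List[Tuple[Optional[str], Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before this call.
        if tag == "embed":
            attributes = dict(attrs)
            # "src" is a hypothetical attribute name for the object's location.
            self.embeds.append((attributes.get("type"), attributes.get("src")))


def find_embeds(document: str) -> List[Tuple[Optional[str], Optional[str]]]:
    finder = EmbedFinder()
    finder.feed(document)
    finder.close()
    return finder.embeds
```

A browser built along these lines would then use each collected content type to decide which external application to invoke.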
c. Helper Applications provided the same
type and level of interactivity as the '906 patent
Prior to the '906 patent[27],
Mosaic provided a means by which a user could add support for data formats
without modifying the browser's code.[28]
Helper applications, as they were known, allowed the same type and level of
interactivity as the '906 patent, with two distinctions: 1) applications were
not launched automatically, and 2) applications displayed their data in
external windows.
In Mosaic, users could map external applications to non-native
data content types in a manner such that said applications were invoked to
process the data in response to the user clicking a hyperlink corresponding to
the data object.[29] These "helper"
applications enabled browsers to view any type of content available on the
Internet (provided that a viewer for it existed), and to accommodate future
formats and enhancements without modifications to the browser.[30]
Mosaic's mechanism for determining data formats and mapping them
to their appropriate helper applications was built on MIME's content type
specification.
"Based on a file's MIME type, Mosaic knows whether to handle the file
itself (i.e., if it is an HTML file) or to pass it off to an external viewer.
Some servers (e.g., FTP servers) do not use MIME types. In that case,
Mosaic tries to infer the MIME type from the file extension. For example, a
file ending in .gif is assumed to be a GIF file; one that ends in .html is
assumed to be an HTML file."
[Mosaic Users Guide]
The main advantage of helper applications was that support for
new data types was not built into a browser release or tied to a standards
upgrade. A Mosaic user could add new mappings of file types to helper
applications without modifying the browser code by editing two configuration
files: the mailcap file and the extension map file.[31]
The mailcap file is a user configurable file that maps MIME types
to external applications. To associate a helper application with mpeg video
data, for example, the user would add the following line to her mailcap file:
"video/mpeg; mpeg_play %s" where "video/mpeg" is the MIME content type and
subtype, "mpeg_play" is the helper application that can handle mpeg data, and
"%s" is a variable that passes the name of the data object to the helper
application.[32]
The extension map file is a user configurable file that maintains
a mapping of MIME types to file extensions. To associate a file extension to a
particular MIME type, the user would add an entry into her extension map file
that lists the MIME type followed by one or more file extensions that are
associated with files of that MIME type. To associate the mpeg file extensions
to their MIME type, for example, the user would add the following line to her
extension map file: "video/mpeg mpeg mpg mpe" where "video/mpeg" is the MIME
content type and sub-type and "mpeg" "mpg" and "mpe" are the common file
extensions for mpeg video files.[33]
After these files are properly configured, when the user clicks
on a hyperlink whose target is an mpeg file, e.g., file.mpeg, Mosaic uses the
extension map to determine the file's MIME type, "video/mpeg." Using the
mailcap file, Mosaic will construct the command used to launch the helper
application, "mpeg_play file.mpeg," and execute the command on the system. The
execution of this command invokes the mpeg_play executable, passing it the
movie file, file.mpeg.
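The lookup sequence just described can be sketched in a few lines of Python. This is a simplified illustration, not Mosaic's actual code: the function names are hypothetical, the parsers handle only the one-line entry forms quoted above, and the command is built but deliberately not executed.

```python
import os
import shlex
from typing import Dict, Optional

# Entries taken from the examples in the text above.
EXTENSION_MAP_TEXT = "video/mpeg mpeg mpg mpe\n"
MAILCAP_TEXT = "video/mpeg; mpeg_play %s\n"


def parse_extension_map(text: str) -> Dict[str, str]:
    """Map each file extension to its MIME type."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        mime_type, *extensions = line.split()
        for extension in extensions:
            mapping[extension.lower()] = mime_type.lower()
    return mapping


def parse_mailcap(text: str) -> Dict[str, str]:
    """Map each MIME type to its helper command template."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) >= 2 and fields[1]:
            mapping[fields[0].lower()] = fields[1]
    return mapping


def helper_command(filename: str,
                   extension_map: Dict[str, str],
                   mailcap: Dict[str, str]) -> Optional[str]:
    """Build the helper command for a file, or None if no helper is mapped."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    mime_type = extension_map.get(extension)
    if mime_type is None:
        return None
    template = mailcap.get(mime_type)
    if template is None:
        return None
    # "%s" is replaced with the (shell-quoted) name of the data object.
    return template.replace("%s", shlex.quote(filename))
```

With the two sample entries, `helper_command("file.mpeg", ...)` yields the command string `mpeg_play file.mpeg`, and supporting a new format requires only new configuration lines rather than changes to the lookup code.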
At the time of the '906 patent, the VIS application existed as a
stand-alone program that could be used by Mosaic as a helper application to
view multi-dimensional data objects. In its non-embedded form, VIS displayed a
control panel in a separate window. [See Figure 1] To use VIS as a helper
application, a user would associate the VIS application with the
application/x-vis MIME type, and associate the MIME type with the file
extension ".hdf" in her mailcap and extension map files, respectively.[34][35]
From the user's perspective, the '906 invention was not a great
leap forward in browser technology. There were just two differences between VIS
as a helper application and VIS as described in the '906 invention. First, in
the '906 invention, the user interacted with a control panel similar to the one
shown on the right in Figure 1, and the visual output of the program was
displayed within the browser window. As a helper application, VIS displayed the
embryo visualization in an external window rather than in a window embedded
inside the Web page.[36][37]
Second, a browser embodying the '906 claims would launch VIS
automatically when the page was loaded, whereas a user had to click a link to
launch the helper application.[38] As
technical innovations, removing the mouse click and displaying the output of the
program in a different window were in no way groundbreaking.
Methods for embedding external applications in arbitrary sub
windows and enabling interactive processing within those sub windows were well
known in the art and were implemented by many popular applications well before
1993. [See Section III.e]
On July 23, 1993, Dave Raggett proposed a superset of the HTML
specification that he called HTML+.[39] In
this document, Raggett specifies a mechanism for providing a form of object
level embedding to enable the in-line display of foreign data formats without
modification of the browser.[40][41]
In the specification, Raggett proposed a new tag, EMBED, which contained type
information used by the browser to identify the appropriate external
application to render the embedded data.[42][43]
Raggett indicated that external applications could be implemented as separate
programs, DLLs, shared libraries, or filters.[44][45]
Browsers can then be upgraded to display new formats without changing
their code at all. All you would need is a way of binding the MIME content type
to the function name for that format, e.g. via X resources or a config file.
The functions could be implemented as separate programs driven via pipes and
stdin/stdout or as dynamically linked library modules (Windows DLLs). [Raggett
II]
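The binding Raggett describes can be illustrated with a short Python sketch in which a table maps MIME content types to separate programs driven through stdin and stdout. Everything named here is an assumption for illustration: the content type `application/x-demo`, the `BINDINGS` table, and the stand-in filter (a one-line Python program that upper-cases its input in place of real rendering).

```python
import subprocess
import sys
from typing import Dict, List

# Hypothetical filter: a separate program that reads the embedded data on
# stdin and writes its "rendered" form to stdout. Upper-casing stands in
# for real rendering work.
DEMO_FILTER: List[str] = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

# Binding of MIME content type to filter command. In Raggett's description
# this could live in X resources or a config file; a dict is used here.
BINDINGS: Dict[str, List[str]] = {
    "application/x-demo": DEMO_FILTER,
}


def render_embedded(content_type: str, data: bytes) -> bytes:
    """Pipe embedded data through the filter bound to its content type."""
    command = BINDINGS.get(content_type.lower())
    if command is None:
        raise LookupError(f"no filter bound to {content_type!r}")
    result = subprocess.run(command, input=data,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout
```

The point of the design is that `render_embedded` never changes: a new format is supported by adding one entry to the binding table.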
Like Mosaic helper applications and the '906 invention, HTML+
leveraged the MIME content-type and sub-type specification to determine how to
appropriately process data of various types. Also like Mosaic, Raggett
indicated that the MIME mappings are user configurable resources, e.g., using a
configuration file. Because Raggett based his content negotiation scheme on the
MIME standard, HTML+ had the same extensibility for future formats as MIME.[46][47][48]
"The type attribute specifies a registered MIME content type
and is used by the browser to identify the appropriate shared library or
external filter to use to render the embedded data..." [Raggett I at p. 6,
emphasis in original]
---
"This approach avoids the danger of HTML being continuously extended
to support an every increasing variety of needs. By decoupling HTML from
special purpose formats, I believe that the latter can evolve faster and more
effectively, than if they were tied to revisions to HTML itself.
Raggett further explained that, in sophisticated browsers,
interactivity with the embedded object could be provided through the use of
external editors ("for creating or revising embedded data"),[49]
and the foreign data could be located in a separate file referenced by a URL.[50]
It was clear that the Web community understood the intention of
Raggett: to leverage external applications to enable any data format (present
or future) to be embedded in a browser window and, when appropriate, to allow
interaction with that embedded data. In response to Raggett II, Bill Janssen
presented an implementation that leveraged the extensibility of the X-Window
system.
Yes, that sounds good. My favorite way to handle this is to have the
browser create and manage an X sub-window over the area where the inset is to
be displayed, and pass the window ID of the sub-window to the subprogram which
understands the inset format, with the understanding that that program is to
handle all events and refresh on the sub-window, but the browser gets to handle
configuration and window movement.
Furthermore, displaying content in an arbitrary window was a
trivial matter, and was well known in the art.[51][52][53]
In his testimony at trial, Dr. Felten stated that it would be obvious how to
implement not only this feature but also interaction with the embedded
object within the browser window.
Q. What significance does that have to one of ordinary skill in the
art in terms of how you to build these controls either in a separate panel
outside of the browser window or within the browser window or within the image
window?
A. Motif is a big software package that lets people write interactive
graphical programs, and the Motif Programming Manual is a thick book that
describes all the different facilities and programs that you might use to do
that. So one of skill in the art reading this would recognize that the
possibilities and the methods that are discussed in the Motif Programming
Manual could be used in this invention, and the facilities in Motif allow you
to place the display areas wherever you like.
[Felten Testimony at p. 953, 15 - 954, 1]
As quoted above, Bill Janssen showed that this idea was not only
well-known, but was well-known as a preferred method for embedding interactive
content in Web pages. In this post, dated June 14, 1993, Janssen demonstrated
that the mechanism to display embedded data was well known, i.e., by displaying
embedded content using X sub-windows to manage display areas in browser windows
and passing the window ID of the sub-window to the external application that
renders the data.[54] The phrase,
"...the program is to handle all events and refresh on the sub-window..."
indicates that a method of enabling interactive processing of an object within
the embedded window was also known.[55]
For example, mouse clicks in windows generate events that are passed to the
owner of the window, in this case the external application.[56]
Janssen demonstrates that the external application can monitor the sub-window
for events, e.g., mouse clicks, keyboard strokes, etc., and will "refresh on
the sub-window" or update the display accordingly. Note that Janssen's method
for providing support for interactive embedded content within a Web page is
consistent with that of the '906 invention as described in the '906 patent and
Dr. Felten's testimony.[57][58]
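The arrangement Janssen describes can be illustrated with a toy model. The
following Python sketch does not use Xlib; ToyServer, InsetRenderer, and all of
their methods are hypothetical names invented for illustration. It assumes only
what the post states: the browser creates the sub-window, hands its ID to the
external program, and that program handles events and refresh on the
sub-window.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, Optional


@dataclass
class Window:
    wid: int
    parent: Optional[int]
    contents: str = ""
    handler: Optional[Callable[[str], None]] = None


class ToyServer:
    """Hypothetical stand-in for a display server that owns every window."""

    def __init__(self) -> None:
        self._next_id = count(1)
        self.windows: Dict[int, Window] = {}

    def create_window(self, parent: Optional[int] = None) -> int:
        wid = next(self._next_id)
        self.windows[wid] = Window(wid, parent)
        return wid

    def select_input(self, wid: int, handler: Callable[[str], None]) -> None:
        # Any client that knows the window ID may ask for its events.
        self.windows[wid].handler = handler

    def draw(self, wid: int, contents: str) -> None:
        self.windows[wid].contents = contents

    def send_event(self, wid: int, event: str) -> None:
        handler = self.windows[wid].handler
        if handler is not None:
            handler(event)


class InsetRenderer:
    """Plays the role of the external program that understands the inset."""

    def __init__(self, server: ToyServer, wid: int, data: str) -> None:
        self.server, self.wid, self.data = server, wid, data
        self.clicks = 0
        server.select_input(wid, self.on_event)
        self.refresh()

    def refresh(self) -> None:
        self.server.draw(self.wid, f"{self.data} (view {self.clicks})")

    def on_event(self, event: str) -> None:
        if event == "click":
            self.clicks += 1
            self.refresh()


# The "browser" creates its own window and a sub-window for the inset,
# then passes only the sub-window ID to the renderer.
server = ToyServer()
browser_window = server.create_window()
inset_window = server.create_window(parent=browser_window)
renderer = InsetRenderer(server, inset_window, "equation")

server.send_event(inset_window, "click")
print(server.windows[inset_window].contents)  # equation (view 1)
```

In this model the browser never interprets the inset's data or its events; it
only decides where the sub-window sits, which mirrors the division of labor in
Janssen's post.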
Janssen's preferred method of sharing sub-window IDs between
client programs leveraged a capability that had been standard in the X-Window
system well before 1993. From its inception, the X-Window system was built on
an open,
device-independent, distributed architecture that provides many fundamental
advantages over systems in which strict user interface guidelines were
enforced. In X-Window, one application, the X server, is responsible for all
input and output devices, creating and manipulating the windows on the screen,
and producing text and graphics.[59] Each
window is a resource of the server and is identified by a unique window ID.[60]
The X server maintains these and other resources and allows client programs to
use and share them transparently.[61]
According to Scheifler, "The X protocol does not define an
external user interface at all."[62] Rather,
the protocol provides mechanisms with which a variety of external interfaces
can be built. This lack of policy consequently enables client programs to
control the resources of other client programs in any way they see fit.
Although the server creates each window at the request of a specific client,
any client can request the server to manipulate the window provided it has
access to the window's ID.[63]
For example, a single client, the window manager, can provide the
external user interface independent of all the other clients.[64]
When client programs are launched, the window manager typically
creates a new window to which it adds title bars, borders, uniform icons, and a
uniform means of moving, resizing, iconifying, and closing the window.[65]
The window manager then uses the XReparentWindow function to frame the client
program's top-level window (and, therefore, all of its children) inside the
newly created window.[66][67]
As a result, the client program is seamlessly embedded in the window manager's
control window, and, while it handles its own events and display, it is also
controlled by its parent window that can resize it, minimize it, or terminate
it at its discretion. Window managers prior to 1993, including the default
window manager for the X-Window system, used this method to provide a
consistent graphical user interface independent of the design or capability of
the client programs.
The reparenting mechanism works with any client program, whether
or not it has knowledge of any specific window manager. In fact, client
programs in the X-Window system should be written in a manner sufficiently
general to accommodate any type of window manager including no window manager
at all.[68] Although typically used by
the window manager, XReparentWindow is available to all client programs. This
is due to the policy that, in X, "a window manager is a client, no different
than any other client."[69] A Web
browser, for example, could be easily modified to reparent an external
application into an embedded sub-window.
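The effect of reparenting can be shown with a small model of the window
hierarchy. This Python sketch is an illustration only and makes no Xlib calls;
the WindowTree class and the window names are hypothetical. The real Xlib
function, XReparentWindow, takes the display, the window, the new parent, and
an x, y position within that parent; the model below keeps only the
parent-child bookkeeping.

```python
from typing import Dict, List, Optional


class WindowTree:
    """Hypothetical model of a window hierarchy keyed by window name."""

    def __init__(self) -> None:
        self.parent: Dict[str, Optional[str]] = {"root": None}

    def create(self, wid: str, parent: str = "root") -> None:
        self.parent[wid] = parent

    def ancestors(self, wid: str) -> List[str]:
        chain = []
        node = self.parent[wid]
        while node is not None:
            chain.append(node)
            node = self.parent[node]
        return chain

    def reparent(self, wid: str, new_parent: str) -> None:
        # A window cannot be moved underneath itself or its own descendants.
        if new_parent == wid or wid in self.ancestors(new_parent):
            raise ValueError("reparenting would create a cycle")
        # Only one link changes; the window's children follow it implicitly.
        self.parent[wid] = new_parent


tree = WindowTree()
tree.create("browser")
tree.create("inset", parent="browser")         # sub-window inside the page
tree.create("viewer_top")                      # external application's window
tree.create("viewer_canvas", parent="viewer_top")

tree.reparent("viewer_top", "inset")
print(tree.ancestors("viewer_canvas"))
# ['viewer_top', 'inset', 'browser', 'root']
```

Because the external application's child windows move with its top-level
window, the application needs no knowledge of the browser for its output to
appear inside the page's display area.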
In 1993, there were many other well-known means by which one
could use the features of the X-Window system to display the visual output of
one client program into a window owned by another client program. For example,
a rendering program could obtain the target window's ID either through
inter-process communication or by leveraging the resources maintained by the X
server.[70] Once the ID of the target
window was obtained, the rendering program could modify the contents of that
window however it wished.[71]
Moreover, it was understood by those skilled in the art that this
functionality could be leveraged to provide distributed, interactive
processing. In his book, "The X Window System, Programming and Applications
with Xt," Young describes an example of an application that teaches the use of
interactive content in a distributed X-Window environment. In his example, an
interactive teaching program executing on a school's main computer can display
information on personal computers located at each student's desk. Each student
interacts with the program through a window on his or her local machine,
wherein this window is owned by the client application. The same program is
also connected to another display located at the teacher's desk, allowing the
teacher to check the progress of any individual student or the class as a
whole. One window on each student machine can provide interfaces to the remote
teaching program, as well as other clients running locally on the user's
machine and remotely on other machines.[72]
Kevin Altis also supported Raggett's suggestion for embedding
content, and provided examples of embedded interactive content including movies
and spreadsheets.
This is the only sane way to deal with "foreign" elements that a
browser can't understand or render itself. The number of object types
(pictures, equations, spreadsheets, movies, etc., open and platform specific
picture formats, different ways of specifying equations, etc. will quickly
overwhelm anyone writing browsers. It also clutters the very simple HTML we now
have. In some cases it may even be possible to translate certain images, such
as FAX data into text, that a browser can handle.
The HTML+ specification described in the Raggett
references existed more than a year before the '906 patent was filed,[73]
yet the references teach an EMBED tag that is very similar to that of the '906
specification.[74] Moreover, Raggett
teaches a type element that is identical to that of the '906 preferred
embodiment. Both have the form type="type" where "type" specifies a MIME
content type used by the browser to identify an appropriate external
application to handle the data.
"The EMBED tag provides a simple form of object level embedding."
[Raggett I at p. 6]
"...you could also put the foreign data in a separate file referenced by a
URL."
[Raggett II]
"HTML tag format used by the invention to embed a link to an application
program within a hypermedia document."
['906 Patent at Col 12, 54-56]
"The type attribute specifies a registered MIME content type and is used
by the browser to identify the appropriate shared library or external filter to
use to render the embedded data..."
[Raggett I at p. 6]
"The browser identifies the format of the embedded data from the "type"
attribute, specified as a MIME content type."
[Raggett II]
"The TYPE element is a Multipurpose Internet Mail Extensions (MIME) type.
Examples of values for the TYPE element are 'application x-vis' or
'video/mpeg.'"
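The role of the type attribute in the passages quoted above can be sketched in
a few lines of Python. This is an illustration under stated assumptions, not a
reconstruction of any browser: the HREF attribute name, the handler table, and
the program names in it are hypothetical, and the standard library's
html.parser stands in for the browser's own parser.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Hypothetical binding of MIME content types to external programs.
HANDLERS = {
    "video/mpeg": "mpeg_viewer",
    "text/x-equation": "equation_viewer",
}


class EmbedFinder(HTMLParser):
    """Collects (type, location, handler) for each EMBED tag encountered."""

    def __init__(self) -> None:
        super().__init__()
        self.embeds: List[Tuple[str, Optional[str], Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        if tag != "embed":
            return
        attributes = dict(attrs)
        mime = (attributes.get("type") or "").lower()
        location = attributes.get("href")  # attribute name is an assumption
        self.embeds.append((mime, location, HANDLERS.get(mime)))


page = '<p>Some text</p><EMBED TYPE="video/mpeg" HREF="http://example.org/clip.mpg">'
finder = EmbedFinder()
finder.feed(page)
finder.close()
print(finder.embeds)
# [('video/mpeg', 'http://example.org/clip.mpg', 'mpeg_viewer')]
```

A type with no entry in the table yields a handler of None, which is the point
at which a browser would have to fall back on some default behavior.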
The USPTO re-examined the '906 patent in light of the Raggett
references and issued an office action that rejected the '906 claims as being
unpatentable over a combination of Mosaic, Raggett I (The Proposed HTML+
Specification), Raggett II (post from the WWW-Talk archives), and the
Berners-Lee reference.[75] The office
action declares, "It would have been obvious to a skilled artisan to combine
the teachings of Berners-Lee regarding the processing of HTML documents
performed by a browser with the HTML browser of the admitted prior art..."[76]
The following table briefly describes how this combination of
references clearly shows each element of the '906 claims.
'906, Claim 1
Raggett I
"A method for running an application program in a computer network environment,
comprising: providing at least one client workstation and one network server
coupled to said network environment, wherein said network environment is a
distributed hypermedia environment; executing, at said client workstation, a
browser application, that parses a first distributed hypermedia document to
identify text formats included in said distributed hypermedia document..."
This simple Web browser functionality is taught by Berners-Lee and is present
in prior browsers, such as Mosaic.[77]
"...and for responding to predetermined text formats to initiate processing
specified by said text formats; utilizing said browser to display, on said
client workstation, at least a portion of a first hypermedia document received
over said network from said server, wherein the portion of said first
hypermedia document is displayed within a first browser-controlled window on
said client workstation ..."
This simple Web browser functionality is taught by Berners-Lee and Mosaic.[78]
"...wherein said first distributed hypermedia document includes an embed text
format, located at a first location in said first distributed hypermedia
document..."
Raggett I teaches an embed text format, the EMBED tag, located within a
hypermedia document.
"HTML+ represents a substantial improvement over the existing format: HTML,
offering... embedded data in foreign formats" [e.g., Raggett I at p. 1]
"...that specifies the location of at least a portion of an object external to
the first distributed hypermedia document..."
Raggett II teaches specifying the location of the external object.
"The EMBED tag provides a simple form of object level embedding." [e.g.,
Raggett I at p. 6]
"...wherein said object has type information associated with it utilized by
said browser to identify and locate an executable application external to the
first distributed hypermedia document..."
The EMBED tag has a type attribute that contains type information that is used
by the browser to identify and locate an executable application.
"The type attribute specifies a registered MIME content type and is used
by the browser to identify the appropriate shared library or external filter to
use to render the embedded data..." [e.g., Raggett I at p. 6]
"The browser identifies the format of the embedded data from the "type"
attribute, specified as a MIME content type." [e.g., Raggett II]
"...and wherein said embed text format is parsed by said browser to
automatically invoke said executable application to execute on said client
workstation..."
The parsing element of the claim is taught by Berners-Lee and Mosaic.[79]
Raggett I teaches automatically invoking an executable application on said
client workstation.
"The type attribute specifies a registered MIME content type and is used
by the browser to identify the appropriate shared library or external filter to
use to render the embedded data..." [e.g., Raggett I at p. 6]
"Browsers can then be upgraded to display new formats without changing their
code at all. All you would need is a way of binding the MIME content type to
the function name for that format, e.g., via X resources or a config file."
[e.g., Raggett II]
"The functions could be implemented as separate programs driven via pipes and
stdin/stdout or as dynamically linked library modules (Windows DLLs)." [e.g.,
Raggett II]
"...in order to display said object and enable interactive processing of said
object within a display area created at said first location within the portion
of said first distributed hypermedia document being displayed in said first
browser-controlled window."
Raggett I teaches displaying the object within the browser window.
"The type attribute specifies a registered MIME content type and is used
by the browser to identify the appropriate shared library or external filter to
use to render the embedded data..." [e.g., Raggett I at p. 6]
Raggett I teaches enabling interactive processing of said object within a
display area within the browser controlled window.
"Sophisticated browsers can link to external editors for creating or revising
embedded data." [e.g., Raggett I at p. 6]
e. Embedding interactive objects was
well-known in 1993
The concept of embedding active objects in Web pages was not novel even when
the Raggett references were first posted. For example, Marc Andreessen, creator
of the Mosaic browser, proposed using external applications to embed MIME
documents in browser windows in January of 1993. Furthermore, he proposed the
use of widgets to provide interactivity with the object, in this case, an MPEG
movie.
"Fellow WWW'ers,
If all goes as planned, X Mosaic will be expanded in the
not-too-distant future to be able to handle heterogeneous MIME documents. This
capability will probably take the form of a MIME "master widget" that will
create child widgets to display various data elements -- text, GIF images, JPEG
images, MPEG movies, and so on.
The child widgets will in turn be able to have active elements of
their own... a widget represent[ing] audio/basic will have a 'play' button, a
readout giving the duration of the audio clip, and so on. Widgets representing
unknown types will have controls to allow saving the data to a local file or
possibly passing it to an external program.
So, we're going to need widgets to display as many different common
datatypes as possible -- for example, we plan to use Dan Connolly's RTF widget
for RTF data elements -- and we'd like to solicit pointers to or donations of
widgets that can display common data formats. (Of course, we're distributing
full source for X Mosaic and its respective parts every step of the way.)
The more widgets we don't have to write ourselves, the faster we can
get multimedia X Mosaic out the door and the more capabilities it can have...
so pointers or donations will be most appreciated."
Marc Andreessen also started a discussion thread on the WWW-Talk
mailing list proposing the use of embedded images in an upcoming release of
Mosaic. His post intensified a debate among the community over whether the IMG
tag, used for embedding images, should be superseded by the EMBED tag, which
could provide client-side format negotiation to embed arbitrary content in Web
pages.
"In 1993, a debate was exploding on the fledgling HTML mailing list,
and finally a college student named Marc Andreessen added <img> to his
Mosaic browser. People objected, saying it was too limited. They wanted
<include> or <embed>, which would allow you to add any sort of
medium to a Web page with the much-touted content negotiation used on the
client. That was too big a project, according to Marc, and he need to ship
ASAP." [http://webmonkey.wired.com/webmonkey/97/17/index0a.html?tw=authoring]
---
"I don't think we should add idiosyncratic hooks for media one at a
time. Whatever happened to the enthusiasm for using the MIME typing mechanism?
I made a concrete proposal a few months ago, where HREFs can point to other
parts in a MIME multipart (and thereby to an "external-body"), and I've seen a
similar idea recently regarding embedding media clips in a "simplemail"
format."
During the course of the discussion, a variety of interactive
embedded object types were proposed, including clickable areas within images,
drawing objects, animation objects, message objects, and spreadsheet objects.
"The Macintosh and Microsoft Windows systems are moving toward
compound document architectures, as is X11R6. This shouldn't be an
insurmountable problem. With X, one could look at an [e]mbedded object, try to
determine its appropriate size, then ask a renderer to image a file into a
sub-window inside the browser's viewing area."
[http://www.webhistory.org/www.lists/www-talk.1993q2/0402.html]
---
"My objection was to the discussion of 'how are we going to support
embedded images' rather than 'how are we going to support embedded objections
in various media'.
Otherwise, next week someone is going to suggest "lets put in a new
tag <AUD SRC="file://foobar.com/foo/bar/blargh.snd">" for audio.
There shouldn't be much cost in going with something that
generalizes."
"I want to consider a whole range of possible image/line art types,
along with the possibility of format negotiation. Tim's note on supporting
clickable areas within images is also important."
"Other systems to look at which have this (fairly valuable) notion
are Andrew and Slate. Andrew is built with _insets_, each of which has some
interesting type, such as text, bitmap, drawing, animation, message,
spreadsheet, etc. The notion of arbitrary recursive embedding is present, so
that an inset of any kind can be embedded in any other kind which supports
embedding.
For example, an inset can be embedded at any point in the text of the
text widget, or in any rectangular area in the drawing widget, or in any cell
of the spreadsheet. Each 'embedding' consists of some direct information which
specifies the display area of the embedded information, and a pointer to the
actual data object (actually, in most current usage the embedded data object is
directly contained, but references do in fact work)."
The '906 advocates submitted the first Felten declaration in May
2004. The Felten I declaration presents two major arguments in an attempt to
convince the USPTO that the first office action is incorrect and the '906
patent should stand. First, Dr. Felten contends that the Raggett references are
a "slight" extension of Berners-Lee's ability to display static images, and
therefore teach away from interactivity.[80]
Second, he argues that Raggett doesn't teach interactivity within a browser
window, and that this type of interactivity is claimed by the '906 patent.[81]
a. Dr. Felten misinterprets the state of
the art in 1993-4 (Paragraphs 8-12)
In paragraph 8, Dr. Felten declares that, in 1994, browsers
offered a very limited view of interactivity, citing three specific examples:
hyperlinks, forms, and helper applications. While it may not have been
commercially implemented, the concept of embedded interactive data and methods
for its implementation were well known at the time.[82]
Although Dr. Felten discussed server-side scripting in his
testimony,[83] he omitted it from his
enumeration of types of interactivity available in 1994. Scripting was well
known in this time frame. Because it was similar in concept to the '906
invention, Doyle had to adjust the scope of the '906 claims to distinguish
his invention from this technology. Specifically, he added the element that
specifies that the executable application must execute on the client
workstation.[84]
In paragraph 8, Dr. Felten describes forms as elements that
"could be filled out by the user, with a 'submit' button which, when clicked,
caused the user to see another page." In this timeframe, forms were fairly
sophisticated; they allowed the user to interact directly with the form objects
in the browser. That is, the user entered data on an input device, e.g., a
keyboard, and the data was interactively updated in the form. The data could be
collected, manipulated by scripts server-side, and used to process content
within the browser.[85] The '906 claim
language was once amended to distinguish the invention from forms.
Dr. Felten declares that, while helper applications can link to
an external program, the program could not provide interactivity within the
browser window.[86] Again, Doyle had to
adjust the '906 claim language to distinguish his invention from helper
applications.[87] Automatic invocation
was added to distinguish from the mouse click required to launch a helper
application. The "display said object... within a display area" element also
distinguishes the invention from helper apps by specifying where the display
was located, i.e., in the browser window.
In paragraph 11, Dr. Felten declares that no existing technology
allowed fully interactive objects within the confines of a Web page's display.
As described above, any difference between the '906 invention and prior
interactivity options was not the amount or nature of the interactivity
provided. For example, with forms and server-side scripts, the difference is
where the application ran. With helper applications, the sole difference was
where the output was displayed.
Mosaic's helper applications were a powerful mechanism for
negotiating non-native formats and viewing them client-side without modifying
the browser. They worked with any external viewer (see
Section III.c), as opposed to the '906 invention, which worked with only
a handful of proprietary software types.[88]
Because all of the content provided by the '906 invention, and more, could be
provided by helper applications (including VIS), putting the display in-line
did not enhance the level of interactivity on the Web.
b. Dr. Felten understates the vision and
contribution of Berners-Lee (Paragraphs 25-28)
In paragraph 26, Dr. Felten declares, "The Berners-Lee reference
teaches a model in which Web pages are... viewed as a static item by the
browser's user. The user views a page, and then clicks a hyperlink or a button,
or enters some text, to select another page to view." Based on this definition,
he asserts in paragraph 27 that the user "interacts with the Web by moving from
one static page to another," then concludes that Berners-Lee "teaches away from
the provision of rich interactivity within a page." This understates the
vision and achievement of Berners-Lee.
From the inception of the World Wide Web in 1991, Berners-Lee
stated that a main goal of his was to enable collaboration in the scientific
community.[89] Tim Berners-Lee had a
clear appreciation of the power of the new medium he was creating. In a posting
to the WWW-Talk mailing list dated October 31, 1991, Berners-Lee recognizes
that his hypertext language can help filter the overabundance of information
available to scholars in ways that will be "a very valuable contribution to the
world of knowledge."[90]
Berners-Lee also did not see the Web as being limited to static
text and hyperlinks. In a WWW-Talk posting from March 23, 1992, he responds to
a question regarding the presentation of the emerging world of cyberspace as
maps rather than textual links with the mixture of excitement, elaboration, and
concern illustrative of a person who has considered the deeper possibilities
offered by the Web.[91] He goes on to
observe that if generating graphical representations of the network of links to
Web sites is "computationally intensive it could be done off-line." Obviously,
users of such a system will expect a degree of real-time interactivity.
Finally, Berners-Lee recognized that the concept of executing
programs on a user's computer in reaction to events within a Web browser had
considerable merit. He addresses a question from a WWW-Talk contributor on May
21, 1992 with regard to launching local applications from a button press in the
browser by citing the lack of an acceptable public domain language that is true
to the spirit of the Web as the major hindrance with the words "It isn't here
yet."[92] He goes on to describe the key
features that would be required of such a language. His choice of features,
e.g., object-oriented inheritance, clean syntax, interpretable and compilable,
demonstrates a high level of insight into the future of the Web. Interestingly,
the features that Berners-Lee describes are the core features of today's Java
programming language, which is often used to embed applications, called
applets, into Web pages. After consideration, Berners-Lee refers to the obvious
security issues inherent in this proposal and describes the early efforts to
address this technology gap. At no point does he indicate that there are
insurmountable technical problems that would make this extension of the browser
impracticable.
In paragraph 28, Dr. Felten also declares that Berners-Lee
teaches a language for authoring Web pages, but not how to build a browser or
how a browser works. The Mosaic browser, admitted prior art to the '906 patent,
was an implementation of a Web browser described by the Berners-Lee reference.[93]
c. Dr. Felten's argument with regard to
external editors and Mosaic is misleading and incorrect (Paragraphs 19-24)
In paragraphs 20-24, Dr. Felten makes a somewhat unintelligible
argument around Mosaic's ability to launch an external HTML editor in order to
edit the contents of a Web page. Dr. Felten's argument is based on using an
external editor to edit the HTML content of a live page that had been
previously downloaded using the Mosaic browser. In contrast, the patent claims
"...invoking an executable application to execute on said client workstation in
order to display said object and enable interactive processing of said
object..."[94] Since the patent defines
"object" as "an object external to the first distributed hypermedia document,"[95]
why is Dr. Felten crafting an argument around editing the distributed
hypermedia document itself?
Whereas Dr. Felten refers to editing the HTML page,
Raggett I refers to launching an editor for "creating or revising embedded
data," i.e., data specified by an EMBED tag, not the Web page.[96]
According to the patent, the EMBED tag is an "embed text format, located at a
first location in said first hypermedia document,"[97]
which is consistent with the '906 claim language of processing the external
object, not the document itself.
Dr. Felten's HTML editor argument is based on false assumptions.
In paragraphs 20-22, Dr. Felten inserts a chasm between the Web page as it
exists on the server and the client's copy of the Web page after it has been
downloaded by the Mosaic browser. Based on this argument, he concludes that, in
addition to the fact that it wouldn't make sense to let an arbitrary user edit
the contents of somebody else's Web page, there was no realistic way for the
user to edit the Web page on the client workstation. [Felten I at ¶23] This
assertion is both misleading and incorrect.
In an effort to support workgroup collaboration, Mosaic
implemented the concept of group annotations in which any user could post an
annotation to a Web page, which would be displayed at the bottom of the page as
a hyperlink.[98] As a convenience for
Web editors, Mosaic also allowed users to edit the source of a document by
choosing "Edit Source" from the "File" menu. Furthermore, the use of editors
was user configurable.[99]
Even if you conceded that it didn't make sense to let a user edit
the contents of somebody else's page[100],
it would certainly be desirable, given the distributed nature of the Web, to
allow him to edit his own Web page from a remote client. Alternatively, if the
data embedded in the page represented a loan calculator or a spreadsheet, for
example, then it would be perfectly logical for users to edit the data for
their own purposes regardless of whether they intend to repost data to the Web
server.[101][102]
In paragraph 24, Dr. Felten asserts that, because Web pages were
written in one format (HTML) and viewed in another (visual representation), it
did not make sense to talk about editing and viewing in the same window. While
the '906 patent requires that the interactive object be displayed within the
browser window, the external application can exist outside of the browser in
its own window.[103][104]
In the VIS application, for example, the user interacts with the external
control panel to update the display of the embedded visual representation of
the embryo.[105] Anyone in this field
would have known that the external editor could display, for example, a textual
representation of the HTML comprising the page, the input fields for a loan
calculator, or the filtering options for the spreadsheet, and that the visual
representation of the object embedded within the Web browser could be updated
interactively by manipulating the data in the editor.[106][107][108]
d. Dr. Felten's argument against the
Raggett references is misleading, incorrect, and contradicts both his testimony
and the '906 patent (Paragraphs 29-55)
Throughout his declaration, Dr. Felten repeatedly uses the term
"static" to craft a collection of misleading arguments that support the
assertion that the Raggett references teach away from interactivity. In Felten
I alone, the modifier "static" is applied to Web pages [Felten I at ¶31, 33,
61], images [¶34, 36, 41, 42, 46], information [¶38, 39], data [¶40], documents
[¶41], content [¶57], and pixmaps [¶53] with little or no explanation of the
meaning of the terms or the implications of those meanings.
The '906 patent requires an executable application to "...display
said object and enable interactive processing of said object within a display
area..."[109] Restated, the external
application must provide some interactivity to the user that results in the
modification of the view of the object, wherein that view is displayed within
the browser.
When Dr. Felten is building his position that Raggett teaches
away from interactivity, he repeatedly and inappropriately applies "static" to
the terms data, images, and pixmaps. In contrast, when he states his
conclusions that Raggett teaches away from interactivity, he opines that Web
pages, content, and information are static entities. In other
words, Dr. Felten's argument that Raggett teaches away from providing
interactive processing of an object is based on a misleading application of the
word "static" to both the input to the external application, i.e., the object
data itself, and the output of the external application, i.e., the
images/pixmaps to be displayed in the browser.
As described in the patent, the inputs and outputs to the
external application are naturally static content. The object specified by the
EMBED tag does not change; rather, the viewpoint of the object is changed
through processing.[110] Images are
static representations of a single view of a set of data.[111]
In the '906 invention, for example, the embedded view of the embryo is never
displayed by using more than a single "static" image at a time.[112]
Movies in the MPEG format (and animations in general) are a
series of "static" images displayed sequentially, typically about 30 per
second.[113]
By applying the word "static" to these items, Dr. Felten is not technically
incorrect. However, his use of "static" in this way is misleading. Certainly,
his use of the term static to describe displayed images within the browser
window does not support his conclusion that Raggett teaches away from
interactivity. In fact, the '906 patent discloses several embodiments of
external applications that accept object data input and return image frame data
back to the browser in the same way as that taught by Raggett.[114]
Dr. Felten also declares that Raggett teaches away from
interactivity because Raggett does not teach interactivity within the embedded
object's display area within the browser window.[115]
This statement is misleading and directly contradicts Dr. Felten's own
testimony.[116] The ability to
initiate interactive processing of the embedded object by clicking in its
visual display window inside the browser is not a requirement of the '906
patent.[117]
i. Raggett was a significant improvement
over Berners-Lee (Paragraphs 29-31)
Felten begins his deconstruction of the Raggett I reference by
asserting that its overall teaching was "very similar to that of Berners-Lee"
in that it "teaches away from rich interactivity within a page." [Felten I at
¶29, 31] As discussed above, interactivity is a relatively small part of the
'906 claims, yet it's the major focus of Felten's argument. In reality, the
Raggett references, Berners-Lee, and Mosaic teach all of the independent claim
elements of the '906 patent to those of skill in the art. [See
Section III.d.ii] After considering the admitted prior art with the
Raggett references, the USPTO agreed, and declared the invention unpatentable.[118]
The HTML+ specification described in Raggett I was an enhancement
over the HTML specification described in Berners-Lee. One improvement was the
addition of the EMBED tag, to allow browsers to invoke external applications in
order to display, embedded in the Web page, non-native content without
modification to the browser.[119][120]
With the EMBED tag, the Raggett references teach using type information to
specify a registered MIME content type that is used by the browser to identify
the appropriate external application to use to display the data, automatically
launch such an application, and provide its display in the context of the Web
page.[121] This model allowed support
for new formats without changing the browser's code, so long as the external
application supported the HTML+ common calling mechanism and name binding
scheme.[122][123]
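The dispatch model described above can be sketched in a few lines. This is an illustration only: the registry, the function names, and the content type string "application/x-eqn" are hypothetical choices made for this example, not details taken from the Raggett references or from any actual browser. The sketch shows how a browser could gain support for a new format through registration of an external renderer rather than through a change to its own code.

```python
from typing import Callable, Dict

# Hypothetical registry: MIME content type -> external renderer.
Renderer = Callable[[bytes], str]
HANDLERS: Dict[str, Renderer] = {}


def register(mime_type: str, renderer: Renderer) -> None:
    """Add support for a format without modifying the browser code below."""
    HANDLERS[mime_type] = renderer


def render_embed(mime_type: str, data: bytes) -> str:
    """Browser side: pick the renderer by content type and invoke it."""
    renderer = HANDLERS.get(mime_type)
    if renderer is None:
        return "[unsupported type: " + mime_type + "]"
    return renderer(data)


def render_equation(data: bytes) -> str:
    """Stand-in for an external equation viewer; returns a fake pixmap."""
    return "<pixmap of equation: " + data.decode("ascii") + ">"


# Registering the viewer is the only step needed to support the new type.
register("application/x-eqn", render_equation)
```

Note that `render_embed` never names a specific format; under these assumptions, everything format-specific lives in the registered renderer.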
In contrast to Felten's assertion that this enhancement teaches
away from the provision of interactivity within a page, the EMBED tag taught by
Raggett I is nearly identical to that of the '906 patent. [See
Section III.d] The only difference between the Raggett references and
the '906 specification is the specific example of embedded data and associated
interactivity that each uses. While the '906 specification
disclosed a viewer and control panel for multidimensional image frames,[124]
the Raggett references used as an example a simpler, but also interactive,
mathematical equation viewer with external editor.[125][126]
[127][128] The difference in
the type of interactivity provided is insignificant, because there is no
requirement in the claims for the type of interactivity that must be provided,
or, for that matter, the type of objects to be displayed. External applications
are not claimed by the '906 patent and specific types of interactivity are a
function of the external application, not the browser.[129]
At the time of Raggett, the ability to view and manipulate
multidimensional objects over the Internet was "limited largely by bandwidth
constraints in the various communication links in the Internet and localized
networks, and by the limited processing power, or computing constraints, of
small computer systems normally provided to most users."[130]
Whereas Doyle attempted to provide "rich interactivity" by focusing on
advancing applications to address these limitations,[131]
Raggett and the authors of the HTML+ specification focused primarily on
forming a standard that would enable Web browsers to support the embedded
interactivity of the time while also being extensible to support future types
of embedded interactivity as the state of the art progressed.[132]
This difference in focus is inconsequential because the '906 patent does not claim any particular
type of interactivity.
In paragraph 32, Dr. Felten suggests that "Raggett is motivated
by problems of Web page authors..." who "...want to include in their pages
information in a wide variety of formats." He continues, "Web page authors had
noted a need for the display of static pages in more, and more varied, data
formats." [Felten I at ¶33] It is not likely that Web authors of the time were
begging, "give us more static pages!" In fact, interactivity on the Web
was blossoming. [133] [See, e.g.,
Section III] Dr. Felten is attempting to limit the scope of Raggett I
to the display of single-frame image data. As described above, Raggett provided
a mechanism by which browsers could embed visual representations of arbitrary
data types.[134]
[135][136]
[137] At the time, members of the Internet community recognized the
ability of Raggett I to embed dynamic, interactive content. They also
understood that Raggett I demonstrated the ability to interact with external
objects within the embedded window.[138][139]
[140] [Section III.d]
In paragraph 35, Dr. Felten says that a known method for enabling
the display of more formats was to add support into the browser itself, which
required a new version of the browser. Although the purpose of this paragraph
is not clear, the Raggett references overcome this limitation by using type
information to specify a registered MIME content type that is used by the
browser to identify the appropriate application to use to display the data.[141]
With this method, the Raggett references taught adding support for new formats
without changing the browser's code at all.[142][143]
In paragraph 34, Dr. Felten describes server-side translation,
which he states was another known method for allowing a browser to display more
formats.
"In this method, a web page author would take a document in some
format, and generate a static image file from it. For example, an author might
take a file describing a diagram, and generate from that file a static image,
in GIF format, depicting the diagram. The web server could then deliver the GIF
file to the browser, which would know how to render it within a web page."
[Felten I at ¶34]
In paragraph 36, Dr. Felten declares that Raggett I proposes only
a "slight extension" to the server-side translation method for embedding data
formats. He states, "...Rather than receiving an image, the browser received
information in some foreign format, and then uses an external program to render
that information into an image, which the browser displays within the page."
While presented as an argument against Raggett, the functionality Dr. Felten
describes, i.e., using an external program to render foreign data into an
image, is precisely that which is described and claimed in the '906 patent.
"In the present example where a multidimensional image object
representing medical data for an embryo is being viewed, application server 220
could perform much of the viewing transformation and volume rendering
calculations to allow a user to interactively view the embryo data at their
client computer display screen... application server 220 performs the
mathematical calculations to compute a new view for the embryo image. Once the
new view has been computed, the image data for the new view is sent over
network 206 to application client 210 so that application client 210 can update
the viewing window currently displaying the embryo image. In a preferred
embodiment, application server 220 computes a frame buffer of raster display
data, e.g., pixel values, and transfers this frame buffer to application client
210." ['906 at Col 10, 49 - Col 11, 3]
---
"The invention allows a program to execute on a remote server or
other computers to calculate the viewing transformations and send frame data to
the client computer thus providing the user of the client computer with
interactive features and allowing the user to have access to greater computing
power than may be available at the user's client computer." ['906 Abstract]
---
"Once an image representing a new viewpoint is computed the frame
image is transmitted over the network to the user's client computer where it is
displayed at a designated position within a hypermedia document." ['906 at Col
7, 21-25]
This functionality was precisely the purpose of Raggett's
EMBED tag: to allow arbitrary data to be embedded in Web browsers by using an
external program to render object information into an image that is displayed
in the browser.[144]
[145][146] Despite the fact
that the preferred embodiment of the '906 invention uses this same technique,[147]
Dr. Felten concludes that Raggett teaches "a simple and natural extension of
the browser's ability to display static images." [Felten I at ¶36]
Additionally, Dr. Felten does not address the possibility that the external
application described in the Raggett references could also paint the contents
directly into the browser window and handle events on the window rather than
"returning" the image to the client application.[148][149]
In paragraph 30, Dr. Felten says that Raggett I does not teach
how a browser works, or how to build one. However, Mosaic, which is admitted prior art,
embodies the portions of the Berners-Lee reference that pertain to the '906
patent.[150]
ii. The use of "static" images does not
preclude interactivity (Paragraphs 37-44)
Dr. Felten declares that the USPTO's office action analysis of
the Raggett references is incorrect because they cite what he describes as
displaying "static images." [Felten I at ¶59] This is misleading, as it falsely
implies that rendering a series of static images precludes interactivity.
In paragraph 39, Dr. Felten states, "...the use of static
information is consistent with the teaching of the remainder of Raggett I and
with the teaching of Berners-Lee that preceded it." In other words, Dr. Felten
is arguing that the portions of Raggett I other than the EMBED tag, as
well as Berners-Lee, are consistent with regard to the use of "static
information," so he says that Raggett I teaches away from interactivity. The
fact that interactivity is not present before or outside of the teachings of
the EMBED tag (which teaches interactivity) is no evidence that Raggett teaches
away from interactivity. Dr. Felten's argument is also inconsistent with the
WWW-Talk discussion at the time concerning the proposed embed tag for HTML+.[151][152]
[153]
In paragraph 40, Dr. Felten states, "...Raggett I motivates its
proposed embed tag by referring to two types of data that one might want to
display: 'mathematical equations and simple drawings.' These are types of data
that one would want to display statically." He continues by describing TeX and
eqn as formats for describing the display of static data. [Felten I at ¶41]
Here, Dr. Felten is misapplying the term "static" to the output of the external
application, e.g., a rendered view of an equation. An equivalent argument could
be proposed that the '906 invention cites examples of information, e.g.,
multidimensional images, the results of a spreadsheet program, or a
MetaMAP-processed image,[154] which
are also types of data that one would want to display statically.[155]
Additionally, Raggett I teaches other types of "non-static" data,
e.g., MPEG movies, as a format that could be displayed in browsers.[156]
Raggett's use of the MIME content type further suggests support for the entire
range of MIME types that, at the time, were not limited to "static" data.[157]
During a discussion on WWW-Talk concerning HTML+, others acknowledged the
potential for interactive embedded objects.[158][159]
[160]
In paragraph 42, Dr. Felten declares, "...Raggett I teaches to
the invocation of a 'shared library or external filter to render the embedded
data, e.g. by returning a pixmap.'" By misconstruing a number of terms of art
(filter, render, return, and pixmap), Dr. Felten crafts an argument to support
his position that the Raggett references teach away from interactivity.
First, Dr. Felten defines "filter" as "a term of art that refers
to a type of non-interactive program that translates data from one format to
another." The use of the term "non-interactive" to describe a filter is
misleading. When it comes to a software program that performs filtering, either
as its sole function or as an ancillary feature, it is important to
differentiate between the configuration of the filter's parameters and the
actual execution of the filter. A filter that is not configurable, or that does not
provide for user interaction while its governing parameters are being
configured, would be limited to performing only a single predefined
function and would consequently be of limited use. Consider a spreadsheet
application that includes a function to filter the contents of a row or column.
The filter is only useful if the user can configure the parameters by which the
data is filtered.
A filter can be interactive even if it does not provide a
configuration phase. When the filter program is invoked, it responds by taking
the input data and generating filtered output. It follows that a user can
dynamically interact with a filter by providing dynamic content to the filter.
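As a hedged illustration of this point, the sketch below models a filter as a plain function with no configuration step at all. The function names and the sample data are invented for this example. Each invocation is non-interactive when viewed in isolation, yet a user who supplies new content and invokes the filter again is interacting with it through that content.

```python
def uppercase_filter(text: str) -> str:
    """A filter in the narrow sense: data in one form in, another form out."""
    return text.upper()


def session(user_inputs):
    """Pass each piece of user-supplied content through the filter in turn."""
    outputs = []
    for item in user_inputs:
        # The filter itself never changes; only the content fed to it does.
        outputs.append(uppercase_filter(item))
    return outputs
```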
The idea of filtering is consistent with the invention described
in the '906 patent. At the time, bandwidth and computing constraints made it
difficult for small, cheap client computers to handle large amounts of visual
data.[161] Therefore, it was desirable
to have a powerful system, e.g., an application server, which generates
individual views of large data objects.[162]
This model is consistent with that of a filtering application, as a substantial
portion of the data object is discarded while converting the data to an image
before sending that image to the client, thus overcoming the constraints of
viewing and rendering large data objects on small client machines.
"In the present example where a multidimensional image object
representing medical data for an embryo is being viewed, application server 220
could perform much of the viewing transformation and volume rendering
calculations to allow a user to interactively view the embryo data at their
client computer display screen... Once application server 220 receives the
information in the form of, e.g., a coordinate transformation for a new viewing
position, application server 220 performs the mathematical calculations to
compute a new view for the embryo image." ['906 at Col. 11, 47-63]
The '906 patent further supports filtering by suggesting that the
application server transmit only enough information to update the image, i.e.,
it filters out the data from the rendered view of the object that is not needed
by the client to update the image.
"Once an image representing a new viewpoint is computed the frame
image is transmitted over the network to the user's client computer where it is
displayed at a designated position within a hypermedia document. By
transmitting only enough information to update the image, the need for a high
bandwidth data connection is reduced." ['906 at Col. 7, 21-26]
In any event, Raggett is not limited to using filters. In
addition to "external filters," Raggett I and Raggett II also cite the use of
separate programs, shared libraries, and DLLs, none of which are limited by Dr.
Felten's alleged constraints of a filter.
In paragraph 42, Dr. Felten defines "render" as "generation of a
static image to be displayed." The term "static image" is redundant and does
not preclude interactivity within an application. A "static image," as Dr.
Felten describes it, is no more than a frame of visual data generated by
processing a view of the interactive object. The idea of using static images is
consistent with the '906 patent, which uses the term "render" to describe
generating a view of the interactive object data.
"The viewing transformation and volume rendering calculations may be
performed by remote distributed computer systems." ['906 at Col 7, 18-20]
---
"In the present example where a multidimensional image object
representing medical data for an embryo is being viewed, application server 220
could perform much of the viewing transformation and volume rendering
calculations to allow a user to interactively view the embryo data at their
client computer display screen." ['906 at Col 10, 49 - Col 11, 3]
---
"In a preferred embodiment, application server 220 computes a frame
buffer of raster display data, e.g., pixel values, and transfers this frame
buffer to application client 210." ['906 at Col 10, 49 - Col 11, 3]
---
"Thus, by using MEAPI a server process communicates to a client
application program to let the client application know when the server has
finished updating information, such as an image frame buffer, or pixmap..."
['906 at Col 12, 30-34]
The '906 patent also describes rendering as an interactive
process, i.e., one that responds to user activity. With the '906 invention, a
user could initiate a rendering request by modifying the settings of the
external panel, thereby requesting a new view of the data object be rendered
and displayed.
"VRServer processes respond to requests such as rendering requests to
generate image segments. The image segments are sent to VIS and combined into a
pixmap, or frame image, by VIS. The frame image is then transferred to the
Mosaic screen via communications between VIS, Panel and Mosaic." ['906 at Col
16, 38-43]
Although Dr. Felten's definition of pixmap is accurate, the
storage of a rendered image in a pixmap does not preclude interactivity. The
'906 embodiment, for example, describes rendering frame data into pixmaps as a
method for providing embedded interactive content.[163]
In paragraph 42, Dr. Felten declares, "'Return' is a term of art
that refers to the information produced by a program when that program
terminates." He continues, "A program that has returned something cannot do
anything else; for example it cannot provide interactive processing." Felten I
contrasts the term "return" with interactivity: "In particular, the program
cannot provide any interactive functionality, since the program would have
stopped running before the browser even painted the returned static pixmap onto
the screen." [Felten I at ¶53]
While the programming language return statement can, indeed,
represent the final, high-level, executable statement of a program, there is
nothing inherent in the return statement, as implemented in any number of
programming languages including C and C++, that requires the program within
which the return statement has executed to terminate upon completion of that
statement. For example, where the caller of the function is within
the same program as the function, the return statement does not terminate
the program as a whole. Rather, the execution of the return statement
represents a change in the path of execution of the program that is not
inherently terminal. Moreover, there is nothing in the context of the Raggett
references that suggests that it uses "return" in the narrow, unique context
Dr. Felten advocates.
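The behavior of a return statement within a single program can be shown with a minimal sketch. It is written in Python for brevity rather than in the C or C++ mentioned above; in this respect the semantics are the same. The function names and the string standing in for a pixmap are assumptions made for illustration.

```python
def render_view(angle: int) -> str:
    """Render one view and return it; control passes back to the caller."""
    return f"pixmap(angle={angle})"


def run_viewer(angles):
    """Caller in the same program: it keeps running after every return."""
    frames = []
    for angle in angles:
        # render_view returns on each call, yet the program does not
        # terminate and is free to request another view.
        frames.append(render_view(angle))
    return frames
```

Here each call to `render_view` "returns a pixmap," and the surrounding program goes on to respond to the next requested viewpoint.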
Raggett I teaches the use of a "shared library or external filter
to render the embedded data, e.g. by returning a pixmap." Raggett II teaches
the use of separate programs and Dynamic Linked Libraries (DLLs).
The very packaging of shared libraries and DLLs[164]
lends itself to interactivity. "A dynamic link library is brought into
action only when another module calls one of the functions in the library."[165]
Unlike an executable that is invoked, executes, and ultimately terminates, a
DLL remains loaded with its exported functions ready to be invoked until the
parent process terminates or requests that the library be unloaded.
[166]
When a separate program relies on functions provided by a DLL to
enable interactive processing with an embedded object, the program can call
these functions repeatedly and with different parameters. The fact that the
DLL's functions remain available, once initially invoked, means that the DLL is
not subject to whatever lifespan constraints may exist for a standard
executable program. In fact, most of the services provided to programs in the
Microsoft Windows environment are implemented as DLLs.[167]
Since the very purpose of DLLs and shared libraries is to provide
a service to the caller, if the return statement in one of these libraries
caused the entire application to exit, the library function would be
pointless, since the caller presumably must continue executing in
order to make meaningful use of the library's functionality. Indeed, it is not
uncommon in constructing programs to call a function or method of a DLL
repeatedly, perhaps to incrementally adjust an operating property of the
program. The MotifApp library, for example, contains a "stage" class that
consists of pixmap drawing canvases upon which animations can be displayed.[168]
The stage class supports animation by providing a function, nextframe, that
renders the next frame of animation data, stores it in a pixmap buffer, then
returns.[169] A separate program that
wishes to use the MotifApp library can display an animation by calling the
nextframe function at least thirty times per second and displaying each pixmap
at the same rate.[170]
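The calling pattern described above can be sketched as follows. The `Stage` class here is a hypothetical stand-in written for this example; it does not reproduce the MotifApp API, and its method names and frame strings are assumptions. The point it illustrates is that a library object keeps its state between calls, so a function that "renders, stores, then returns" can be invoked again and again by the caller.

```python
class Stage:
    """Hypothetical animation canvas; not the actual MotifApp class."""

    def __init__(self, frame_count: int) -> None:
        self.frame_count = frame_count
        self.current = 0      # state that survives between calls
        self.pixmap = None    # buffer holding the most recent frame

    def next_frame(self) -> str:
        """Render the next frame into the buffer, then return it."""
        self.pixmap = f"frame {self.current}"
        self.current = (self.current + 1) % self.frame_count
        return self.pixmap


def play(stage: Stage, calls: int):
    # The caller invokes next_frame repeatedly. A real player would pace
    # these calls with a timer, e.g., roughly thirty per second.
    return [stage.next_frame() for _ in range(calls)]
```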
In paragraph 43, Dr. Felten combines his misleading definitions
of "filter," "render," and "return," along with his fabricated association
between "static" and "non-interactive" to imply that Raggett teaches nothing
more than transforming a single unit of data into a single image to be
displayed in a browser. Dr. Felten declares, "...the only specific example of
the use of Raggett's proposed embed tag that is given in Raggett I involves the
use of a non-interactive filter which renders static data and returns." In
contrast to Dr. Felten's argument, Raggett I teaches an editor program that
provides interactivity,[171] e.g., by
changing the representation of the embedded object in an editor, causing the
external application to respond by updating the corresponding visual
representation of that data displayed in the Web page.[172][173]
[174][175] Besides resting on
constructions of the terms above that are
inconsistent with the definitions known in the art, Dr. Felten's
conclusion contradicts the history captured in the WWW-Talk archives,
which shows that those of skill in the art understood Raggett as teaching the
use of any conceivable external application. [See, e.g.,
Section III.d]
In paragraph 44, Dr. Felten states, "...the discussion of the FIG
and ISMAP features in Raggett I is inconsistent with the proposition that
Raggett's proposed embed tag allowed interaction with an embedded object." He
continues, "...any mouse clicks made by the user within the visual depiction of
the embedded data will be interpreted by the browser as pertaining to the
image-map feature, and will therefore be intercepted by the browser and sent by
the browser to the web server."
This argument is based on the false assumption that interactivity
with the embedded object must occur within the object area of the browser
window. Dr. Felten's own testimony and the '906 patent clearly state that the
user does not have to interact with the object within the browser window, e.g.,
by clicking the embedded representation of the object.[176][177]
"Note that the image window 352 is within Mosaic window 350 while
panel window 354 is external to mosaic window 350... By using the controls in
panel window 354 the user is able to manipulate the image within image window
352 in real time do perform such operations as scaling, rotation, translation,
color map selection, etc." ['906 at Col 16, 15-22; Figure 9, 354; Figure 10]
---
Q Okay. Now, can the user do that by grabbing with the mouse?
A Yes.
Q How about using a separate panel Window?
A That would also be within the scope of the claim.
[Felten Testimony at p. 1023, 24 - 1024, 3]
Without this limitation, it doesn't matter whether the browser
intercepts mouse click events on the embedded object window. Moreover, those
skilled in the art at the time understood how to implement Raggett's EMBED tag
to provide interactivity directly with objects embedded in the Web browser.[178]
iii. Dr. Felten's statements regarding the
implementation of Raggett are irrelevant (Paragraph 45)
In paragraph 45, Dr. Felten declares that Raggett I was never
implemented. This is inconsequential; the Raggett specification and the concept
of embedding interactive objects were disclosed to, and well understood by,
those in the field.[179]
[180][181] In fact, the '906
patent actually admits that implementing features such as these is entirely
conventional and certainly well within the ability of a person of skill in the
art.
"...routines may be implemented by any means as is known in the art.
For example, any number of computer programming languages, such as "C", Pascal,
FORTRAN, assembly language, etc., may be used. Further, various programming
approaches such as procedural, object oriented or artificial intelligence
techniques may be employed." ['906 at Col. 13, 51]
iv. Implementation of Raggett I would
not be limited to single frame data (Paragraph 46)
In paragraph 46, Dr. Felten states, "...if one of ordinary skill
in the art (at the time) were asked to implement the Raggett I feature, he
would do so by... starting with the existing code for handling IMG tags, and
modifying that code." Dr. Felten argues that the IMG tag paints static images
into the body of the page based on an input file that describes the image, so
it follows that the IMG code would be modified to invoke an external program
that returns a static image that is pasted into the Web page in the same manner
as in an IMG tag.
The implementation of Raggett I that Dr. Felten proposes here is
based on incorrect assumptions that ignore the teachings of the Raggett I
reference, i.e., the ability to embed a visual representation of new data
formats into a Web browser without changing the browser code.[182]
To implement the EMBED tag described in Raggett I, one of skill in the art
would attempt to provide a mechanism that is general enough to support the
largest possible number of data types. Because Raggett I teaches the use of
type information in the form of registered MIME types, it follows that one
would implement a type-agnostic support mechanism for the entire registered
MIME database. At the time, there were registered MIME types that contained
multiple frame rendered data, e.g., mpeg,[183][184]
therefore it is likely that the implementer of the Raggett I functionality
would not limit the browser's support to single-frame, or (as Dr. Felten
describes) "static," data. And, again, this is precisely how others on the
WWW-Talk list understood Raggett at the time. [See, e.g.,
Section III.d]
Raggett I & II teach adding support for new formats without
having to change the browser code through the use of an application programming
interface (API) that could be employed by the browser to interact with external
applications "through using a common calling mechanism and name binding
scheme."[185]
[186] Because existing registered MIME types include both single and
multiple framed data, an implementer of Raggett I would design a calling and
binding mechanism and an API that supported the exchange of multiple visual
frames.
v. Dr. Felten's argument with regard to
external editors is misleading and incorrect (Paragraphs 47-48)
Raggett I teaches the use of an EMBED tag to handle interactive
content by "linking to external editors for creating or revising embedded
data." In paragraph 47, Dr. Felten offers a contradictory theory, that
"...'linking to external editors for creating or revising embedded data' refers
to the use of external programs by a Web page's author to edit or revise the
external data before it is published..."
Nothing in the Raggett I reference suggests that the editor is
limited to external, unpublished data. Using the example from Raggett I, the
external application eqn displays visual representations of equation data. It
would be useful, for example, to be able to open an external editor that
allowed the user to edi