Digital Law Online: Abstraction, Filtration, Comparison

A few years after Whelan, the Second Circuit examined the scope of software copyrights in Computer Associates International v. Altai. {FN53: 982 F.2d 693, 23 USPQ2d 1241 (2d Cir. 1992)} Computer Associates had developed a computer program named Adapter, which it used with its other programs to handle the differences in operating system calls and services between the various IBM mainframe operating systems. A former employee of Computer Associates had gone to work for Altai, a competitor of Computer Associates. When he saw the overhead Altai had maintaining different versions of its programs for the different IBM operating systems, he thought of how Adapter simplified that for Computer Associates. Though he was supposed to have returned all program listings when he left Computer Associates, he had kept a listing of Adapter. Working from that listing, he put together a program similar to Adapter, which he called Oscar, for Altai. About 30 percent of the first version of Oscar came from Adapter.

When Computer Associates learned that Oscar was similar to Adapter and had been developed at Altai by one of its own former programmers, it registered its copyright in Adapter and filed suit for both copyright infringement and misappropriation of trade secrets. Altai conceded the copyright infringement and began an effort to reimplement Oscar from its functional specifications, using programmers who had not seen the Adapter program. Not surprisingly, the new version had a structure quite similar to the infringing version. The issue before the Second Circuit was whether the new version of Oscar infringed the copyright for Adapter.

The case was well situated to make new law. The Second Circuit, because it hears appeals from the New York center of publishing, music, and drama, has long been regarded (along with the Ninth Circuit, which includes Hollywood in its jurisdiction) as a copyright court, with decisions of Learned Hand and others still defining the bounds of copyright. At the district court level, the case was tried before Second Circuit Judge Pratt, sitting by designation, who had been provided with an independent expert (Randall Davis, a professor of computer science from MIT) to help him understand the technical aspects of computer software. His decision formed the basis for the Second Circuit’s decision.

III.B.1. The Second Circuit’s Altai Decision

Declining to adopt the broad Whelan test of the Third Circuit, the Second Circuit said:

We think that Whelan’s approach to separating idea from expression in computer programs relies too heavily on metaphysical distinctions and does not place enough emphasis on practical considerations. As the cases that we shall discuss demonstrate, a satisfactory answer to this problem cannot be reached by resorting, a priori, to philosophical first principals [sic.].

As discussed herein, we think that district courts would be well-advised to undertake a three-step procedure, based on the abstractions test utilized by the district court, in order to determine whether the non-literal elements of two or more computer programs are substantially similar. This approach breaks no new ground; rather, it draws on such familiar copyright doctrines as merger, scènes à faire, and public domain. In taking this approach, however, we are cognizant that computer technology is a dynamic field which can quickly outpace judicial decisionmaking. Thus, in cases where the technology in question does not allow for a literal application of the procedure we outline below, our opinion should not be read to foreclose the district courts of our circuit from utilizing a modified version.

In ascertaining substantial similarity under this approach, a court would first break down the allegedly infringed program into its constituent structural parts. Then, by examining each of these parts for such things as incorporated ideas, expression that is necessarily incidental to those ideas, and elements that are taken from the public domain, a court would then be able to sift out all non-protectable material. Left with a kernel, or possibly kernels, of creative expression after following this process of elimination, the court’s last step would be to compare this material with the structure of an allegedly infringing program. The result of this comparison will determine whether the protectable elements of the programs at issue are substantially similar so as to warrant a finding of infringement. It will be helpful to elaborate a bit further. {FN54: 982 F.2d at 706, 23 USPQ2d at 1252-1253 (citations omitted)}

Well, that sounds simple enough, and it certainly doesn’t have the single-idea problem of Whelan. The first step is to abstract the elements of the computer program at different levels. This idea originally came from a Second Circuit decision written by Judge Learned Hand, in which he stated:

Upon any work . . . a great number of patterns of increasing generality will fit equally well, as more and more of the incident is left out. The last may perhaps be no more than the most general statement of what the work is about, and at times might consist only of its title; but there is a point in this series of abstractions where they are no longer protected, since otherwise the author could prevent the use of his “ideas,” to which, apart from their expression, his property is never extended. {FN55: Nichols v. Universal Pictures, 45 F.2d 119, 121, 7 USPQ 84, 86 (2d Cir. 1930)}

The Second Circuit felt that this provided the framework it needed for computer programs:

As applied to computer programs, the abstractions test will comprise the first step in the examination for substantial similarity. Initially, in a manner that resembles reverse engineering on a theoretical plane, a court should dissect the allegedly copied program’s structure and isolate each level of abstraction contained within it. This process begins with the code and ends with an articulation of the program’s ultimate function. Along the way, it is necessary essentially to retrace and map each of the designer’s steps – in the opposite order in which they were taken during the program’s creation.

As an anatomical guide to this procedure, the following description is helpful:

At the lowest level of abstraction, a computer program may be thought of in its entirety as a set of individual instructions organized into a hierarchy of modules. At a higher level of abstraction, the instructions in the lowest-level modules may be replaced conceptually by the functions of those modules. At progressively higher levels of abstraction, the functions of higher-level modules conceptually replace the implementations of those modules in terms of lower-level modules and instructions, until finally, one is left with nothing but the ultimate function of the program. . . . A program has structure at every level of abstraction at which it is viewed. At low levels of abstraction, a program’s structure may be quite complex; at the highest level it is trivial. {FN56: 982 F.2d at 707, 23 USPQ2d at 1253}
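The court's description of a hierarchy of modules can be made concrete with a small sketch. The Python below is illustrative only: the `Module` class, the `view` function, and the toy payroll program are hypothetical names invented here, not anything taken from the opinion or from Adapter or Oscar. It models a program as a tree of modules and shows how, moving up one level at a time, each module's implementation is conceptually replaced by a statement of its function.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    function: str                                      # what the module does, stated abstractly
    instructions: list = field(default_factory=list)   # literal code lines (leaf modules only)
    parts: list = field(default_factory=list)          # lower-level modules


def view(module, depth):
    """Describe a program after descending `depth` layers from the top.

    depth 0 gives only the ultimate function; each extra layer replaces a
    function with the parts (or, for a leaf, the instructions) that implement it.
    """
    if depth == 0:
        return [module.function]
    if module.parts:
        described = []
        for part in module.parts:
            described.extend(view(part, depth - 1))
        return described
    # Leaf module: going one layer further exposes the literal instructions.
    return list(module.instructions)


# Hypothetical toy program, assumed for illustration.
program = Module(
    "compute payroll",
    parts=[
        Module("read timesheets", instructions=["open timesheet file", "read hours"]),
        Module("calculate pay", instructions=["pay = hours * rate"]),
        Module("print checks", instructions=["format check", "send to printer"]),
    ],
)

highest = view(program, 0)   # the ultimate function of the program
middle = view(program, 1)    # the functions of the three modules
lowest = view(program, 2)    # the individual instructions
```

In this toy model the structure at the highest level is a single statement, while at the lowest level it is the full list of instructions, which mirrors the court's remark that structure is trivial at the top and complex at the bottom. A real program would have many more layers and would rarely divide this cleanly.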

III.B.2. The Tenth Circuit’s Elaboration On Altai

In Gates Rubber v. Bando Chemical, {FN57: 9 F.3d 823, 28 USPQ2d 1503 (10th Cir. 1993)} a case decided following Computer Associates v. Altai, the Tenth Circuit added a little meat to the sparse description from the Second Circuit on how to abstract a computer program, taking much of its description of the possible levels of abstraction from a law review article written by a computer scientist and law student: {FN58: John W.L. Ogilvie, Defining Computer Program Parts Under Learned Hand’s Abstractions Test in Software Copyright Infringement Cases, 91 Mich. L. Rev. 526 (1992)}

Application of the abstractions test will necessarily vary from case-to-case and program-to-program. Given the complexity and ever-changing nature of computer technology, we decline to set forth any strict methodology for the abstraction of computer programs. Indeed, in most cases we foresee that the use of experts will provide substantial guidance to the court in applying an abstractions test. However, a computer program can often be parsed into at least six levels of generally declining abstraction: (i) the main purpose, (ii) the program structure or architecture, (iii) modules, (iv) algorithms and data structures, (v) source code, and (vi) object code.

The main purpose of a program is a description of the program’s function or what it is intended to do. When defining a program’s main purpose, the court must take care to describe the program’s function as specifically as possible without reference to the technical aspects of the program.

The program’s architecture or structure is a description of how the program operates in terms of its various functions, which are performed by discrete modules, and how each of these modules interact with each other. The architecture or structure of a program is often reduced to a flowchart, which a programmer uses visually to depict the inner workings of a program. Structure exists at nearly every level of a program and can be conceived of as including control flow, data flow, and substructure or nesting. Control flow is the sequence in which the modules perform their respective tasks. Data flow describes the movement of information through the program and the sequence with which it is operated on by the modules. Substructure or nesting describes the inner structure of a module whereby one module is subsumed within another and performs part of the second module’s task.

The next level of abstraction consists of the modules. A module typically consists of two components: operations and data types. An operation identifies a particular result or set of actions that may be performed. For example, operations in a calculator program might include adding or printing data. A data type defines the type of item that an operator acts upon such as a student record or a daily balance.

Algorithms and data structures are more specific manifestations of operations and data types, respectively. An algorithm is a specific series of steps that accomplish a particular operation. Data structure is a precise representation or specification of a data type that consists of (i) basic data type groupings such as integers or characters, (ii) values, (iii) variables, (iv) arrays or groupings of the same data type, (v) records or groupings of different data types, and (vi) pointers or connections between records that set aside space to hold the record’s values.

The computer program is written first in a programming language, such as Pascal or Fortran, and then in a binary language consisting of zeros and ones. Source code is the literal text of a program’s instructions written in a particular programming language. Object code is the literal text of a computer program written in a binary language through which the computer directly receives its instructions.

These generalized levels of abstraction will not, of course, fit all computer codes. Ordinarily, expert testimony will be helpful to organize a particular program into various levels of abstraction. In any event, as pointed out earlier, the organization of a program into abstraction levels is not an end in itself, but it is only a tool that facilitates the critical next step of filtering out unprotectable elements of the program. {FN59: 9 F.3d at 834-836, 28 USPQ2d at 1509-1510 (citations omitted)}
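The lower four of the Tenth Circuit's levels can be seen side by side in a few lines of code. The following Python is a sketch for illustration: the names `StudentRecord` and `average_grade` are hypothetical (the opinion mentions a student record only as an example of a data type), and Python bytecode is used as a loose stand-in for object code even though it targets a virtual machine rather than the hardware directly.

```python
from dataclasses import dataclass


# Data type (module level): "a student record", stated abstractly.
# Data structure (one level down): a precise representation of that type.
@dataclass
class StudentRecord:          # a record: a grouping of different data types
    name: str                 # characters
    grades: list              # an array: a grouping of the same data type


# Operation (module level): "compute the average grade".
# Algorithm (one level down): a specific series of steps that accomplishes it.
def average_grade(record):
    total = 0
    for grade in record.grades:
        total += grade
    return total / len(record.grades)


# Source code is the text of this file. The closest Python analogue to
# object code is the compiled bytecode (an assumption made for illustration).
object_code = average_grade.__code__.co_code

example = StudentRecord("A. Student", [90, 80])
result = average_grade(example)
```

The same operation could be carried out by a different algorithm (for instance, by calling the built-in `sum`), and the same data type could be represented by a different data structure. That room for choice at the lower levels is what makes the later filtration and comparison steps meaningful there.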

Once the abstraction has been performed, those abstracted elements need to be filtered to separate out those things that are not protectable by copyright. Clearly, the highest level of abstraction – the purpose of the program – is not protectable and would be filtered out. At the lowest levels – the source and object code – total or even substantial copying is likely to be an infringement. It is at the mid-levels where the filtering is generally performed.

An observation here: for a program of any complexity, there are going to be a lot of abstracted elements at every level. This procedure sounds good in theory but is very difficult to apply in practice. We will discuss this later.

III.B.3. Filtration

There are a number of reasons why something might be filtered, and whether something is filtered may depend on the particular level of abstraction being considered.

III.B.3.a. Efficiency

The first of these is that the element is dictated by efficiency considerations:

The portion of Baker v. Selden, discussed earlier, which denies copyright protection to expression necessarily incidental to the idea being expressed, appears to be the cornerstone for what has developed into the doctrine of merger. The doctrine’s underlying principle is that “when there is essentially only one way to express an idea, the idea and its expression are inseparable and copyright is no bar to copying that expression.” Under these circumstances, the expression is said to have “merged” with the idea itself. In order not to confer a monopoly of the idea upon the copyright owner, such expression should not be protected. . . .

Furthermore, when one considers the fact that programmers generally strive to create programs “that meet the user’s needs in the most efficient manner,” the applicability of the merger doctrine to computer programs becomes compelling. In the context of computer program design, the concept of efficiency is akin to deriving the most concise logical proof or formulating the most succinct mathematical computation. Thus, the more efficient a set of modules are, the more closely they approximate the idea or process embodied in that particular aspect of the program’s structure.

While, hypothetically, there might be a myriad of ways in which a programmer may effectuate certain functions within a program – i.e., express the idea embodied in a given subroutine – efficiency concerns may so narrow the practical range of choice as to make only one or two forms of expression workable options. Of course, not all program structure is informed by efficiency concerns. It follows that in order to determine whether the merger doctrine precludes copyright protection to an aspect of a program’s structure that is so oriented, a court must inquire “whether the use of this particular set of modules is necessary efficiently to implement that part of the program’s process” being implemented. If the answer is yes, then the expression represented by the programmer’s choice of a specific module or group of modules has merged with their underlying idea and is unprotected. {FN60: 982 F.2d at 707-708, 23 USPQ2d at 1254 (citations omitted)}
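One way to picture the court's point is to look at a routine for which efficiency leaves little room for choice. The sketch below is an illustration under stated assumptions, not a legal test: Euclid's algorithm is chosen here as the example (the opinion does not mention it), and `_Rename` and `normalized` are hypothetical helper names. It uses Python's standard `ast` module to parse two independently named versions of the routine, replaces every identifier with a placeholder, and checks whether anything distinguishes them once the names are gone.

```python
import ast


class _Rename(ast.NodeTransformer):
    """Replace each identifier with a placeholder (v0, v1, ...) in order of first use."""

    def __init__(self):
        self.names = {}

    def _alias(self, name):
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node


def normalized(source):
    """Return a textual dump of the syntax tree with all names made anonymous."""
    tree = _Rename().visit(ast.parse(source))
    return ast.dump(tree)


LOOP_A = """
def gcd(a, b):
    while b:
        a, b = b, a % b
    return a
"""

LOOP_B = """
def greatest_common_divisor(x, y):
    while y:
        x, y = y, x % y
    return x
"""

RECURSIVE = """
def gcd(a, b):
    return a if b == 0 else gcd(b, a % b)
"""

loops_identical = normalized(LOOP_A) == normalized(LOOP_B)
loop_vs_recursive = normalized(LOOP_A) == normalized(RECURSIVE)
```

Here the two loop versions collapse to the same structure, which suggests that little beyond the choice of names separates them, while the recursive version shows that at least one other workable form exists. Whether a handful of alternatives is few enough for expression to merge with idea is a question for the court, and a comparison of syntax trees cannot answer it.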

III.B.3.b. External Factors

Next, an element may be excluded from copyright protection because its use is dictated by external factors:

We have stated that where “it is virtually impossible to write about a particular historical era or fictional theme without employing certain ‘stock’ or standard literary devices,” such expression is not copyrightable. For example, the Hoehling case was an infringement suit stemming from several works on the Hindenberg disaster. There we concluded that similarities in representations of German beer halls, scenes depicting German greetings such as “Heil Hitler,” or the singing of certain German songs would not lead to a finding of infringement because they were “indispensable, or at least standard, in the treatment of” life in Nazi Germany. This is known as the scènes à faire doctrine, and like “merger,” it has its analogous application to computer programs.

Professor Nimmer points out that “in many instances it is virtually impossible to write a program to perform particular functions in a specific computing environment without employing standard techniques.” This is a result of the fact that a programmer’s freedom of design choice is often circumscribed by extrinsic considerations such as (1) the mechanical specifications of the computer on which a particular program is intended to run; (2) compatibility requirements of other programs with which a program is designated to operate in conjunction; (3) computer manufacturers’ design standards; (4) demands of the industry being serviced; and (5) widely accepted programming practices within the computer industry.

Courts have already considered some of these factors in denying copyright protection to various elements of computer programs. In the Plains Cotton case, the Fifth Circuit refused to reverse the district court’s denial of a preliminary injunction against an alleged program infringer because, in part, “many of the similarities between the . . . programs were dictated by the externalities of the cotton market.” {FN61: 982 F.2d at 709-710, 23 USPQ2d at 1255 (citations omitted)}

An example of an element dictated by external factors is the calling sequence for a library routine or operating system function. If one is to use the routine or function, it must be called in a particular way, so that any expressive aspects of the calling sequence have essentially merged with its utilitarian function. But that does not mean that you will never look at routine or function calls when determining whether there is a substantial similarity between two programs. At the abstraction level looking at the overall organization of the programs, the selection and arrangement of library routine or operating system function calls may be precisely the expression that is being considered. So, while you can call the same routines and functions using the same calling sequences without infringing, you may not be able to copy the structure of the program as evidenced by how those routines and functions are used.
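The distinction between an individual calling sequence and the arrangement of calls can be sketched in code. The example below is hypothetical throughout: the service names `get_storage`, `open_file`, `write_record`, and `close_file` are invented for illustration and do not belong to any real operating system, and `call_sequence` is a helper written for this sketch. It uses Python's standard `ast` module (with `ast.unparse`, available from Python 3.9) to list the routines each program calls, in the order they appear.

```python
import ast


class CallCollector(ast.NodeVisitor):
    """Record the name of every called routine in source order."""

    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        self.calls.append(ast.unparse(node.func))
        self.generic_visit(node)


def call_sequence(source):
    collector = CallCollector()
    collector.visit(ast.parse(source))
    return collector.calls


# Three hypothetical programs. Each call is written the only way the
# (imaginary) operating system permits, so the calls themselves are dictated.
PROGRAM_A = """
buf = get_storage(4096)
f = open_file("ledger", "w")
write_record(f, buf)
close_file(f)
"""

PROGRAM_B = """
f = open_file("ledger", "w")
buf = get_storage(4096)
write_record(f, buf)
close_file(f)
"""

PROGRAM_C = """
area = get_storage(4096)
out = open_file("ledger", "w")
write_record(out, area)
close_file(out)
"""

seq_a = call_sequence(PROGRAM_A)
seq_b = call_sequence(PROGRAM_B)
seq_c = call_sequence(PROGRAM_C)

same_calls_a_b = set(seq_a) == set(seq_b)   # same dictated routines are used
same_arrangement_a_b = seq_a == seq_b       # but arranged differently
same_arrangement_a_c = seq_a == seq_c       # same arrangement, different names
```

All three programs use the same routines with the same calling sequences, and that overlap would be filtered out as dictated by external factors. What remains to compare is the arrangement, which A shares with C but not with B. In a four-line example the arrangement is itself nearly dictated by function; the point carries more weight in a program that coordinates hundreds of such calls.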

III.B.3.c. Material in the Public Domain

The Second Circuit also filtered out elements taken from the public domain, since they are not protectable by copyright. In Gates Rubber v. Bando Chemical, the Tenth Circuit added two further filters. The first was processes:

When considering utilitarian works such as computer programs one of the most important of these elements is process. Section 102(b) denies protection to procedures, processes, systems and methods of operation. The legislative history of the Copyright Act clarifies any ambiguity about the status of processes.

Some concern has been expressed lest copyright in computer programs should extend protection to the methodology or processes adopted by the programmer, rather than merely to the “writing” expressing his ideas. Section 102(b) is intended, among other things, to make clear that the expression adopted by the programmer is the copyrightable element in a computer program, and that the actual processes or methods embodied in the program are not within the scope of the copyright law.

The Supreme Court addressed the copyrightability of a utilitarian process in Baker v. Selden. There, the Court considered whether the author of a book describing an accounting system could obtain protection over the system itself through copyright of the book. The Court distinguished the “art” or process from the author’s explanation thereof and found the former unprotectable. Other courts have similarly found processes unprotectable. Certain processes may be the subject of patent law protection under Title 35 of the United States Code. Although processes themselves are not copyrightable, an author’s description of that process, so long as it incorporates some originality, may be protectable.

Returning then to our levels of abstraction framework, we note that processes can be found at any level, except perhaps the main purpose level of abstraction. Most commonly, processes will be found as part of the system architecture, as operations within modules, or as algorithms. {FN62: 9 F.3d at 836-837, 28 USPQ2d at 1511 (citations omitted)}

III.B.3.d. Facts

The Tenth Circuit also added filtering out facts:

The Copyright Act also denies protection to discoveries. The Supreme Court squarely addressed the issue of the protectability of facts in its recent opinion in Feist Publications, Inc. v. Rural Telephone Services Co. In Feist, the Court considered the copyrightability of a telephone directory comprised merely of names, addresses, and phone numbers organized in alphabetical order. The Court rejected the notion that copyright law was meant to reward authors for the “sweat of the brow,” and instead concluded that protection only extends to the original components of an author’s work. As to facts, the Court found that

No one may claim originality as to facts. This is because facts do not owe their origin to an act of authorship. The distinction is one between creation and discovery: the first person to find and report a particular fact has not created the fact; he or she has merely discovered its existence. . . . One who discovers a fact is not its maker or originator. The discoverer merely finds and records.

Like ideas and processes, facts themselves are not protectable; however, an author’s original compilation, arrangement or selection of facts can be protected by copyright. However, “the copyright is limited to the particular selection or arrangement. In no event may copyright extend to the facts themselves.” In computer programs facts may be found at a number of levels of abstraction, but, will most often be found as part of data structures or literally expressed in the source or object codes. {FN63: 9 F.3d at 837, 28 USPQ2d at 1511 (citations omitted)}

Of course, while it is reasonable to say that facts are not entitled to copyright protection and should be filtered out, a particular selection of facts or arrangement of facts may have sufficient originality to warrant copyright protection. But if you filter out all the facts before you look to see if there is sufficient originality in a selection or arrangement of facts, you have nothing left to consider. This is one of the problems with the abstraction-filtration-comparison test – key information may be filtered before it can be considered. For that reason, it may be necessary to look at the overall similarity before considering things at the various levels of abstraction.
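The ordering problem described above can be shown with a toy example. Everything in the sketch below is assumed for illustration: the entries are placeholder strings rather than real data, and `filter_then_compare` and `selection_overlap` are hypothetical helpers, not steps prescribed by either court. It contrasts filtering facts element by element before comparing with comparing the selection as a whole.

```python
# A pool of available facts, and two compilations that each select from it.
# The entries are placeholders invented for this sketch.
available_facts = [f"fact-{n}" for n in range(1, 11)]
plaintiff_selection = ["fact-2", "fact-3", "fact-7", "fact-9"]
defendant_selection = ["fact-2", "fact-3", "fact-7", "fact-9"]


def is_fact(element):
    """In this toy model every entry is a bare fact."""
    return element in available_facts


def filter_then_compare(plaintiff, defendant):
    """Drop every unprotectable fact first, then return what is left to compare."""
    remaining_p = [e for e in plaintiff if not is_fact(e)]
    remaining_d = [e for e in defendant if not is_fact(e)]
    return remaining_p, remaining_d


def selection_overlap(plaintiff, defendant):
    """Fraction of the plaintiff's chosen entries that the defendant also chose."""
    chosen = set(plaintiff)
    return len(chosen & set(defendant)) / len(chosen)


left_p, left_d = filter_then_compare(plaintiff_selection, defendant_selection)
overlap = selection_overlap(plaintiff_selection, defendant_selection)
```

After element-by-element filtering both lists are empty, so a comparison finds nothing in common, even though the defendant chose exactly the same four entries out of ten. Looking at the selection before filtering reveals the overlap, although whether that selection is original enough to be protected remains a separate question.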

III.B.4. Comparison

Finally, after all the unprotectable abstracted elements have been removed by the filtering, the comparison is made to see if what is left is substantially similar:

The third and final step of the test for substantial similarity that we believe appropriate for non-literal program components entails a comparison. Once a court has sifted out all elements of the allegedly infringed program which are “ideas” or are dictated by efficiency or external factors, or taken from the public domain, there may remain a core of protectable expression. In terms of a work’s copyright value, this is the golden nugget. At this point, the court’s substantial similarity inquiry focuses on whether the defendant copied any aspect of this protected expression, as well as an assessment of the copied portion’s relative importance with respect to the plaintiff’s overall program. {FN64: 982 F.2d at 710, 23 USPQ2d at 1256 (citations omitted)}
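Taken together, the three steps resemble a simple pipeline, which can be sketched as follows. This is a deliberately mechanical caricature under stated assumptions: the `Element` class, the tag names, and the sample elements are all hypothetical, the tags are simply supplied by hand where a court would rely on argument and expert testimony, and the example does not reflect the actual findings about Adapter and Oscar.

```python
from dataclasses import dataclass

# Reasons for filtering drawn from the two opinions; the tag spellings are assumed.
UNPROTECTABLE = {"idea", "efficiency", "external", "public_domain", "process", "fact"}


@dataclass(frozen=True)
class Element:
    level: str                  # e.g. "main purpose", "architecture", "module"
    description: str
    tag: str = "expression"     # "expression" means no filter applies


def filtrate(elements):
    """Step two: sift out everything tagged as unprotectable."""
    return [e for e in elements if e.tag not in UNPROTECTABLE]


def compare(kernel, defendant_descriptions):
    """Step three: find which protectable elements also appear in the other program."""
    return [e for e in kernel if e.description in defendant_descriptions]


# Step one (abstraction) is assumed to have produced this hypothetical list.
plaintiff = [
    Element("main purpose", "bridge one program to several operating systems", "idea"),
    Element("architecture", "dispatch table keyed by operating system", "efficiency"),
    Element("module", "parameter list required by the operating system", "external"),
    Element("module", "distinctive ordering of service routines"),
    Element("algorithm", "textbook sort routine", "public_domain"),
]

defendant = {
    "dispatch table keyed by operating system",
    "distinctive ordering of service routines",
}

kernel = filtrate(plaintiff)
copied = compare(kernel, defendant)
```

In this toy run only one element survives filtration, and it also appears in the defendant's program. The court's final inquiry goes further than this sketch, since it also weighs the copied portion's relative importance to the plaintiff's program as a whole, which no simple membership test captures.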

Most circuits have adopted the abstraction-filtration-comparison test for determining if there is copyright infringement when an exact copy has not been made. But for complex software, the test can be very cumbersome, as we will discuss later. Also, as mentioned above, the test may filter out important information to be considered and then find that there is little similarity because there is nothing meaningful left to compare.

Fortunately, because of the complexity of today’s commercial software packages compared with those in Whelan v. Jaslow and Computer Associates v. Altai, it is far more likely that an infringer will simply copy the commercial program onto a machine he is selling or a CD-ROM to sell on the black market, rather than just copy a protected portion of the copyrighted software. Those who do copy the structure of a program, or its processing techniques, most likely will have had access to the source code for that program as an employee of the software vendor or under a trade secret agreement, and will be violating their duty to that vendor in their design and implementation of a competing product. So, in many instances, the complex abstraction-filtration-comparison test will not be necessary.
