Publishing, Technology, and the Future of the Academy

LOCKSS, CLOCKSS, Portico

By this argument, libraries must share in the responsibility for preservation by ensuring that the files they need remain accessible, in the event that publishers fail to do so. However, the difficulties involved in each and every library creating and maintaining a full archive of the materials to which it subscribes would be insurmountable; institution-specific, or even consortium-specific, archiving projects would produce too much duplication of effort, which might be better distributed and shared. The Mellon Foundation has thus taken the lead in funding a series of prototype projects, and then more substantially funding the establishment of two large cooperative projects focusing on the production and maintenance of digital journal archives. The first of these projects, centered at the Stanford University Libraries, is LOCKSS (or Lots Of Copies Keeps Stuff Safe). LOCKSS describes itself as an “international community initiative” (LOCKSS, “Home”), bringing together hundreds of university libraries worldwide. Each library installs the open-source, freely available LOCKSS system on an inexpensive desktop computer, which is then referred to as a “LOCKSS box”; this LOCKSS box crawls the websites of publishers who have given the LOCKSS system access, capturing the presentation files (as opposed to the source files) of the journals to which the library subscribes. The LOCKSS box then maintains communication with the full network of such LOCKSS boxes, comparing the content that it has collected with other libraries’ archives and repairing any difference or damage that is found; as the project’s name suggests, the redundancy of its distributed files creates a safety net for the material. The archives that are created via LOCKSS are referred to as “light archives,” meaning that their files are immediately accessible when needed (as opposed to a “dark archive,” which remains inaccessible except under certain specific circumstances). Additionally, the LOCKSS system “preserves the content in its original format and dynamically migrates the content to a newer format, if required, when a reader requests the preserved content” (LOCKSS, “How It Works”); migration-on-access allows the files to be preserved in their original presentation formats (thus meeting archival requirements), while keeping them readable as those formats become technologically obsolete and absolving individual libraries of migration responsibilities.
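The compare-and-repair behavior described above can be illustrated with a small sketch. It is not the LOCKSS software's actual polling protocol, which is more elaborate; it simply assumes that every box's copy can be hashed and that the version held by a majority of boxes is the good one. The function names and the in-memory `boxes` dictionary are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Fingerprint a copy so that copies can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(boxes: dict[str, bytes]) -> list[str]:
    """Overwrite any minority copy with the majority copy.

    `boxes` maps a (hypothetical) box name to its copy of one file.
    Returns the names of the boxes that were repaired.
    """
    votes = Counter(digest(copy) for copy in boxes.values())
    winner, count = votes.most_common(1)[0]
    # Assumption: without a strict majority we refuse to guess which copy is good.
    if count <= len(boxes) / 2:
        raise ValueError("no majority copy; cannot repair safely")

    good_copy = next(c for c in boxes.values() if digest(c) == winner)
    repaired = []
    for name, copy in list(boxes.items()):
        if digest(copy) != winner:
            boxes[name] = good_copy  # repair the damaged or divergent copy
            repaired.append(name)
    return repaired


# Three libraries hold the same article; one copy has been corrupted.
network = {
    "library_a": b"<html>article</html>",
    "library_b": b"<html>article</html>",
    "library_c": b"<html>artXcle</html>",
}
fixed = audit_and_repair(network)
```

The point of the sketch is the one the project's name makes: with enough independent copies, a damaged copy is outvoted and restored rather than silently propagated.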

The LOCKSS project is emblematic of a community-driven preservation program; the hardware is inexpensive, the software is free, and the network is self-correcting. The system is maintained, and the direction for its future development set, by the subscribing members of the LOCKSS Alliance. Alliance members thus have more input over the system’s functioning than do non-paying users, and they have the ability to “collect and preserve premium content not available to the general LOCKSS community” (LOCKSS, “LOCKSS Alliance”). But the basic functionality of the system is made available to any interested library. As Don Waters has described it in “Good Archives Make Good Scholars,” an archive such as this one, like Robert Frost’s wall, is of necessity a communal endeavor, but it’s also the endeavor that builds the community: “what makes good neighbors is the very act of keeping good the common resource between them – the act of making and taking the time together to preserve and mend the resource” (91). LOCKSS is thus grounded in the alliance that it has forged amongst libraries, recognizing that the strength of preservation rests in the numbers of and connections among entities doing the preserving. Moreover, the LOCKSS Alliance has begun a second cooperative project named CLOCKSS (for Controlled LOCKSS), through which a select number of member libraries archive not just the journals to which they subscribe, but all journal content to which publishers allow access, both presentation files and source files, thus seeking to provide a “sustainable, geographically distributed dark archive with which to ensure the long-term survival of Web-based scholarly publications for the benefit of the greater global research community” (CLOCKSS, “Home”). In the case of a select number of “trigger events” (including a publisher going out of business, a publication being discontinued, and so forth), CLOCKSS will make its archives of that preserved material available not just to its members but to the entire scholarly community.[4.32] CLOCKSS will not, however, provide post-cancellation access to its archives, and is thus not a substitute for the local archive provided by LOCKSS.

The second such preservation project originally funded by the Mellon Foundation is Portico, a project of Ithaka, the parent organization of JSTOR; Portico is now governed by an advisory committee of librarians and publishers, and is supported by publisher contributions and annual library payments. Portico is a centralized system that produces dark archives of electronic journal literature; rather than maintaining locally installed software, libraries participate in Portico via an annual subscription. Instead of leaving the archiving to libraries, Portico itself archives the publisher source files for all approved content, “normalizing” these files into a standard archival format that will permit their long-term management (see “Portico’s Archival Approach”). This initial migration can be followed by future migrations as technological formats become obsolete. Like CLOCKSS, Portico’s archives remain dark until the occurrence of a trigger event; unlike CLOCKSS, those archives are then opened only to Portico subscribers.
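The access rules that distinguish the three services can be summarized in a short sketch. It encodes only what is described above: a light archive is readable by the library that holds it, a dark archive stays closed until a trigger event, and the two dark archives differ in who may read afterward. The class, the field names, and the `can_read` function are hypothetical simplifications, not any service's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Archive:
    name: str
    dark: bool                    # True: closed until a trigger event occurs
    open_to_all_on_trigger: bool  # True: a trigger event opens it to everyone


LOCKSS = Archive("LOCKSS", dark=False, open_to_all_on_trigger=False)
CLOCKSS = Archive("CLOCKSS", dark=True, open_to_all_on_trigger=True)
PORTICO = Archive("Portico", dark=True, open_to_all_on_trigger=False)


def can_read(archive: Archive, is_participant: bool, trigger_event: bool) -> bool:
    """Return True if a library may read archived content under these rules."""
    if not archive.dark:
        # Light archive: a participating library reads its own local copy at any time.
        return is_participant
    if not trigger_event:
        # Dark archive: nobody reads until a trigger event occurs.
        return False
    # After a trigger event, access depends on the archive's release policy.
    return archive.open_to_all_on_trigger or is_participant


# After a publisher fails, a non-participating library can read CLOCKSS but not Portico.
outsider_clockss = can_read(CLOCKSS, is_participant=False, trigger_event=True)
outsider_portico = can_read(PORTICO, is_participant=False, trigger_event=True)
```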

The differences between LOCKSS and Portico are thus in part the difference between a co-op and a subscriber service, with very different implications for the libraries involved. As Karen Schneider wrote in Library Journal:

LOCKSS is attractive to libraries already comfortably maintaining servers and open source software; for these institutions, Portico’s proprietary software and annual licensing fees are less appealing. Librarians using Portico counter that LOCKSS has fewer publishers participating (one librarian at an institution with a large e-journal collection reported that LOCKSS had 12 percent of its titles and Portico 33 percent) and stress Portico’s ease of use, as Portico maintains the content on its own servers. (“Lots of Librarians”)

Additionally, the two projects espouse different archiving and migration philosophies, as LOCKSS maintains the original presentation files while Portico maintains standardized content in non-proprietary formats.[4.33] A study published by the Joint Information Systems Committee in 2008 comparing the performance of the two systems, among a range of other such programs, acknowledges that the preservation landscape presents “a confusing and not wholly reassuring picture to those professionals trying to make sense of what is happening and looking for simple, clear-cut guidelines” (Morrow et al 7). While LOCKSS provides immediate availability of publisher files that suddenly become inaccessible, it has lower publisher buy-in than Portico, as some publishers feel their intellectual property rights threatened by having content archived in multiple locations. By contrast, Portico, with its centralized, dark archives, has obtained far greater publisher participation, but has a much higher threshold for the release of its files, and thus subscribers may face a potentially longer delay before archived material can be made available. LOCKSS requires a relatively small investment from libraries, primarily for staff and equipment, but it does require some ongoing technical maintenance; Portico eliminates the need for such in-house maintenance, but does so by imposing significant annual subscription costs on libraries.[4.34] Don Waters has suggested that Mellon’s decision to fund the startup of both projects was meant “to give the marketplace of scholarly institutions an opportunity to vote with their own investments” (quoted in Schneider, “Lots of Librarians”). However, the JISC report concludes that neither project can as yet be considered to provide complete insurance against the potential disappearance of the digital scholarly record: “None of the current initiatives is likely to yet fulfil all the access and archival needs of a modern library” (Morrow et al 7). That said, the report strongly suggests that both approaches “deserve support,” and that libraries should invest in “well thought through and sustainable archiving solutions” (7).

Of course, these programs are for the most part journal-specific, and the more digital our publishing systems become, the more we’re going to need to think about these same questions with respect to digital books, as well as a wide variety of forms of born-digital scholarship.[4.35] As Borgman has pointed out, and as the Kindle-Orwell incident confirms, the business model for e-book publishing remains in flux; sales of digital monographs

may follow the leased bundles models of journals. Libraries and individuals could subscribe to digital books, much as they subscribe to movies with Netflix. Rather than borrow or purchase individual titles, they may have access to a fixed number of titles at a time, or ‘check books out’ for a fixed period of time. These models raise a host of questions about relationships among publishers, libraries, and readers with regard to the privacy of reading habits, the continuity of access, preservation, and censorship. (Borgman 113)

Some subset of that “host of questions” was raised by Clifford Lynch as far back as 2001:

* Can you loan or give an e-book (or access to a digital book) to someone else as you can a physical book? To what extent do digital books mimic (and perhaps even improve upon) physical books, and to what extent do they break with that tradition? What other constraints on usage (for example, printing) exist?

* Do you own objects or access? If your library of e-books is destroyed or stolen, can you replace it without purchasing the content again simply by providing proof of license or purchase? One very interesting service is a registry that allows you to replace your e-books if you lose your appliance.

* From whom are you really obtaining content: the e-book reader vendor, a publisher, or some other party? Who has to stay in business in order to ensure your continued ability to use that content? What happens if the source of your content goes out of business?

* Can you copy an e-book for private, personal use? If you own two readers, can you move a digital book from one to the other without having to purchase it again?

* Do you have the right and the ability to reformat an e-book or a digital book in response to changes in standards or technologies or do you need to repurchase it? What happens when you upgrade or replace your e-book reader with another one? What happens when you replace the PC that might house your “library”? What happens if you replace one brand of e-book reader with another, perhaps because your reader vendor goes out of business?

* Do you have to obtain e-books on a pay-per-view or other limited time rental basis or do you buy a perpetual license to the content, or ownership of a copy?

* What are the policies of the content provider with regard to your privacy and to usage monitoring? What limitations does your book reader technology place on the ability of a content supplier to collect usage data? (Lynch)

Few of these questions have as yet been adequately answered, and the answers that we do have, most of which point to increasing levels of digital rights management and decreasing user control, are unsatisfying. And other questions have continued to crop up alongside these: What plans exist for archiving multimodal scholarship? Are our open access journals and repositories adequately backed up? What provisions are being made for preserving access to data sets and other digital source material? Such questions about access will no doubt proliferate as new modes of scholarly work expand; it seems clear, however, that whatever long-term solutions to problems of preserving digital scholarly content arise will of necessity be social in origin, requiring the input and commitment of many individuals and institutions in order to succeed.

  • [4.32] Questions have been raised, for the obvious reasons, about the sustainability of a system that does not require participation in order to receive its benefits (see, for instance, Morrow et al 17). CLOCKSS, however, believes that it will be able to reduce fees at the end of five years, once an endowment has been raised (see “CLOCKSS FAQ”).
  • [4.33] The JISC report mentioned in the sentence that follows describes the benefits and drawbacks of each of these philosophies as follows: “The advantages of source file preservation [as used by Portico] is that it is very complete (and likely to include more content than appears in the journal); is received directly from the publisher and is frequently delivered or converted to a few normalized formats facilitating long-term preservation. The disadvantages are that it requires a large upfront investment; there is no assurance that the archive will actually be needed; and the presentation will almost certainly differ from that of the publisher.
       The advantages of harvesting presentation files (rendition archiving) [the LOCKSS approach] are that it is possible to retain the look and feel of the publication and initial costs are likely to be lower. The disadvantages of this technique are that it may be more difficult to preserve the content over time (for example, a strategy for the large scale migration of presentation files from one format to another is still untested)” (Morrow et al 9).
  • [4.34] See Morrow et al 16-18.
  • [4.35] Portico is moving toward the preservation of e-book holdings, with hundreds of titles (primarily published by Elsevier and Walter de Gruyter) listed as “queued” on their website.

    Source: http://mcpress.media-commons.org/plannedobsolescence/four-preservation/lockss-clockss-portico/