
database-driven scholarship

One key element in building such a network will be a shift in our understanding of the relationship between the individual text and the many other texts to which it might potentially connect. Lev Manovich has convincingly argued in The Language of New Media that the constitutive features of computerized media forms include the modularity of the media elements they involve, the automated processes that can be used to bring them together, and the variable nature of the texts that such processes create. If this is so, it stands to reason that digital publishing structures designed to facilitate work within the database logic of new media, in which textual and media objects can be created, combined, remixed, and reused, might help scholars to produce exciting new projects of the kind that I discussed near the end of the last chapter. Such a platform, for instance, might fruitfully allow authors to create complex publications by drawing together multiple pre-existing texts along with original commentary, thus giving authors access to the remix tools that can help foster curation as a sophisticated digital scholarly practice. Curated texts produced in such a platform might resemble edited volumes, whether by single or multiple authors, or they might take as yet unimagined forms, but they would share the ability to access and manipulate a multiplicity of objects contained in a variable, extensible database that could then be processed in a wide range of ways, and they would allow users to add to the database and to create their own texts from its materials.
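
To make the idea a bit more concrete, here is a minimal sketch of what such a modular, database-driven structure might look like; the class and field names are purely illustrative, not drawn from any existing platform. The point is simply that the same objects can be stored once and recombined, with new commentary, into any number of curated publications.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical data model: each text or media object is a modular item that can
# be stored once and reused in any number of curated publications.
@dataclass
class MediaObject:
    object_id: str
    creator: str
    title: str
    body: str                                # the text itself, or a pointer to a media file
    tags: List[str] = field(default_factory=list)

@dataclass
class CuratedPublication:
    """An 'edited volume' assembled from pre-existing objects plus new commentary."""
    title: str
    editor: str
    entries: List[Tuple[MediaObject, str]] = field(default_factory=list)

    def add(self, obj: MediaObject, commentary: str) -> None:
        self.entries.append((obj, commentary))

    def table_of_contents(self) -> List[str]:
        return [f"{obj.title} ({obj.creator}), with commentary by {self.editor}"
                for obj, _ in self.entries]

# Usage: the same object could appear, differently framed, in many such publications.
whitman = MediaObject("wd-001", "Walt Whitman", "Song of Myself", "I celebrate myself...")
volume = CuratedPublication("Rereading Whitman", editor="A. Scholar")
volume.add(whitman, "A brief note situating the poem among its manuscript variants.")
print(volume.table_of_contents())
```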

Numerous such databases exist, of course; extensive digital projects focused on the creation of archives and repositories have developed since the early days of popular computing. The oldest and most famous such archive may be Project Gutenberg, founded by Michael Hart in 1971. Hart’s philosophy in beginning the production of this archive was that “anything that can be entered into a computer can be reproduced indefinitely” (Hart); perhaps more importantly, anything so entered can also be processed in a wide variety of ways. The potential value of creating a full archive, in “Plain Vanilla ASCII,” of the wealth of texts available in the public domain is evident: these texts can not only be read on a wide variety of platforms but also be repurposed in a range of other projects. The scholarly value of Project Gutenberg, however, may be open to some question; as Hart has noted, “Project Gutenberg has avoided requests, demands, and pressures to create ‘authoritative editions.’ We do not write for the reader who cares whether a certain phrase in Shakespeare has a ‘:’ or a ‘;’ between its clauses. We put our sights on a goal to release etexts that are 99.9% accurate in the eyes of the general reader” (Hart). Scholars, however, do care about the authoritativeness of the objects with which they work, and a range of authoritative digital archives of work by and about a number of authors has therefore been created, including The William Blake Archive, The Walt Whitman Archive, The Swinburne Project, and so on. These projects are grounded in the large-scale digitization of published and unpublished texts, images, and other materials related to the work and lives of these authors, creating extensive searchable databases of digital objects that can potentially be reused in a wide range of scholarly projects.
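
As a small, purely illustrative example of why plain-text storage matters for reuse (the file name below is hypothetical): once a public-domain text exists as plain ASCII, a few lines of code suffice to repurpose it, here as a rough keyword-in-context concordance.

```python
import re

# Hypothetical file: any Project Gutenberg-style plain-text transcription.
with open("moby_dick.txt", encoding="ascii", errors="ignore") as f:
    words = re.findall(r"[A-Za-z']+", f.read())

def concordance(term: str, window: int = 4) -> None:
    """Print each occurrence of `term` with a few words of surrounding context."""
    term = term.lower()
    for i, w in enumerate(words):
        if w.lower() == term:
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            print(f"{left:>40} [{w}] {right}")

concordance("whale")
```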

The problem in developing such new forms of publication as these databases, however, is what Jerome McGann has referred to as one of the crises facing the digital humanities: such “scholarship — even the best of it — is all more or less atomized”; the various digital texts and collections that have been created are “idiosyncratically designed and so can’t talk to each other,” and there are no authoritative, systematic, searchable bibliographies of these projects that enable scholars to find the digital objects they’d like to reuse (McGann 112). In response to these problems, McGann and the Applied Research in ‘Patacriticism group at the University of Virginia began developing NINES, the Networked Infrastructure for Nineteenth-century Electronic Scholarship, as “a three-year undertaking initiated in 2003 by myself and a group of scholars to establish an online environment for publishing peer-reviewed research in nineteenth-century British and American studies” (McGann 116). NINES has since become an aggregator for peer-reviewed digital objects published in a range of venues. The project, which has received significant funding from the Mellon Foundation, was established as a means of averting atomization in the digital humanities by bringing separate projects into dialogue with one another. The NINES goals, as described on the site, are:

* to serve as a peer-reviewing body for digital work in the long 19th-century (1770-1920), British and American;

* to support scholars’ priorities and best practices in the creation of digital research materials;

* to develop software tools for new and traditional forms of research and critical analysis. (“What is NINES?”)

Among the tools that NINES has developed are Juxta, a system for online textual collation and analysis, and Collex, which forms the core of the NINES site today. Collex is an aggregator that searches multiple scholarly databases and archives, with 58 federated sites currently represented, including library and special collection catalogs, repositories, journals, and other projects; it allows a user to find objects across this wide range of locations and then to “collect” and tag those items, structuring them into exhibits [see screenshot 3.2].
screenshot 3.2

The tagging function of Collex serves to add user-generated metadata to the expert-created metadata already contained within the various collections and archives that NINES draws together, but the key aspect of this “folksonomy” emerges when users then re-share the tagged objects; as Kim Knight has argued, “Collex’s folksonomical characteristics only take on interpretive importance as the community of users develops and collections and exhibits are shared” (Knight). As NINES/Collex developer Bethany Nowviskie has noted, however, one of the project’s primary focuses is on precisely such an “expansion of interpretive methods in digital humanities,” through the connection and juxtaposition of digital objects and the production of commentary on and around them. The potential impact of such curatorial work could be enormous, as scholars find new ways to discover, manipulate, connect, and comment upon digital research objects. One problem facing the system, however, is that, as Elish and Trettien point out, “in reality, the information that NINES aggregates is quite shallow, most of it only metadata, or information about information” (Elish and Trettien 6). The majority of the “objects” that NINES can currently retrieve in a search are in fact citations or catalog entries rather than the objects themselves. As access to primary objects alongside this metadata increases, however, the functionality of Collex as a research and publishing tool will no doubt grow.
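
One way to picture the layering Knight describes, in which user-generated tags accumulate around expert-created records, is as a simple merge of the two kinds of metadata. The sketch below is illustrative only; the field names are not Collex’s actual schema.

```python
from collections import defaultdict

# Expert-created record, of the kind an archive or library catalog might expose.
record = {
    "id": "whitman-leaves-1855",
    "title": "Leaves of Grass (1855)",
    "creator": "Walt Whitman",
    "archive": "The Walt Whitman Archive",
}

# User-generated tags accumulate alongside the expert metadata; their interpretive
# value emerges only as many users' tags are aggregated and re-shared.
user_tags = defaultdict(set)
user_tags["whitman-leaves-1855"].update({"free verse", "first edition"})     # user A
user_tags["whitman-leaves-1855"].update({"print history", "first edition"})  # user B

def merged_view(rec, tags):
    """Combine an expert record with the folksonomy that has grown around it."""
    return {**rec, "community_tags": sorted(tags[rec["id"]])}

print(merged_view(record, user_tags))
```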

Other such collection- and exhibit-building projects are in production as well; the Center for History and New Media, most notably, is developing Omeka, a simple but extensible open-source platform that, once installed, enables the creation, organization, and publication of archival materials in a wide range of formats, producing sophisticated narratives through the combination of digital objects with text about them. Omeka’s ease of use and granular publishing structure resemble those of a blog engine, leading Dan Cohen to describe the project as “WordPress for your exhibits and collections” (Cohen). Were an engine like Omeka able to access already existing repositories of digital texts and objects, the platform could enable scholars to repurpose those objects in engaging ways, creating new forms of networked arguments driven by the interaction of their constituent elements and thus bringing together NINES’s database access with a rich networked structure for publishing new projects.
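
A hedged sketch of the scenario imagined above: pulling item metadata from an existing repository’s JSON feed and weaving it into a simple exhibit narrative. The endpoint, field names, and output format here are hypothetical placeholders, not Omeka’s actual API.

```python
import json
from urllib.request import urlopen

# Hypothetical repository endpoint exposing item metadata as JSON; a real
# integration would use whatever API the repository actually provides.
REPOSITORY_URL = "https://example.org/repository/api/items?tag=whitman"

def fetch_items(url):
    """Retrieve a list of item records (assumed to be JSON) from the repository."""
    with urlopen(url) as response:
        return json.load(response)

def build_exhibit(title, items, commentary):
    """Interleave repository objects with new scholarly commentary."""
    lines = [f"# {title}", ""]
    for item in items:
        lines.append(f"## {item['title']}")
        lines.append(f"Source: {item.get('archive', 'unknown')}")
        lines.append(commentary.get(item["id"], ""))
        lines.append("")
    return "\n".join(lines)

items = fetch_items(REPOSITORY_URL)
exhibit = build_exhibit(
    "Whitman in Manuscript",
    items,
    commentary={"item-1": "This draft shows the poem's earliest recoverable state."},
)
print(exhibit)
```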

Beyond such collection and exhibit software, however, a wide range of tools is being developed to support what has been called “data-driven scholarship” in the humanities. These tools include SEASR, which allows scholars to perform sophisticated forms of textual analysis, to process the results of that analysis, and to create rich visualizations of the data that the analysis returns. Other tools, such as Pliny, allow scholars to create rich annotations for the objects they are studying and then to organize those annotations in ways that highlight the relationships among the objects. Annotation, organization, analysis, and visualization represent new, computer-native modes of academic work, all of which permit scholars to find and analyze patterns at a scale previously impossible. One problem such tools face, however, is uptake; as a report from a meeting entitled “Tools for Data-Driven Scholarship: Past, Present, Future” notes, “the vast majority of scholars who are not directly involved with the creation of digital tools and collections are not adopting these new applications and resources in the number one might anticipate this far into the digital revolution” (Cohen et al.). To some extent, the report indicates, failures in uptake have to do with lapses in communication; scholars are too often unaware that such tools exist.[3.14] But even once scholars find these tools, there’s a lingering uncertainty about what exactly one might do with them: what they’ll accomplish, what the resulting project will look like. The goal of scholarship, after all, is to communicate an idea, and it’s often less than clear how these tools will help scholars achieve that goal.
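
As a modest, generic illustration of the kind of analysis and visualization such tools automate at far greater scale (this is not SEASR’s or Pliny’s actual workflow, and the corpus file name is hypothetical):

```python
import re
from collections import Counter
import matplotlib.pyplot as plt

# Hypothetical corpus file; any plain-text document would do.
with open("corpus.txt", encoding="utf-8", errors="ignore") as f:
    tokens = re.findall(r"[a-z']+", f.read().lower())

# Analysis: the twenty most frequent words, a crude proxy for thematic emphasis.
counts = Counter(tokens).most_common(20)
words, freqs = zip(*counts)

# Visualization: a simple bar chart of the results.
plt.figure(figsize=(10, 4))
plt.bar(words, freqs)
plt.xticks(rotation=60)
plt.title("Most frequent words in the corpus")
plt.tight_layout()
plt.savefig("frequencies.png")
```

Even a toy pipeline like this makes the communication problem visible: the chart is easy to produce, but deciding what scholarly argument it supports remains the harder task.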

Each of the projects discussed above is focused on the interactions among texts that the modularity, automation, and variability of computer-based media might enable. What hasn’t yet been fully realized in many of these projects, however, is the key aspect of interaction between the reader and the text; despite all of the wonderful work being done on NINES, through Omeka, and in a range of other exciting digital tools, that work remains largely author-centric. Given the discursive purposes of scholarship, it might be useful to explore the ways that, long before the development of the digital network, the circulation of texts operated within and was driven by the social networks of their readers.

  • [3.14] Several excellent resources now exist to help scholars find the right tools for conducting new forms of digital scholarship; most notable among these may be the Digital Research Tools Wiki, which organizes a number of such tools by their potential use. See also the Transliteracies project, which houses a number of extensive reviews of such tools.
