Taking the IsisCB into a Dynamic Interdependent World

Much has been happening in the last few months since we had our bootcamp in February. We’ve made enormous progress toward a new data infrastructure that will carry the IsisCB well into the next decade, and hopefully beyond. Since the bootcamp, I have created a new FileMaker database that will accommodate many more kinds of resources and many more sorts of attributes. Also, I am now using a FileMaker hosting service, which means that soon I will be able to make a beta version of the new system available to users.

Progress has not come without setbacks, however. This was not the system I had been planning to make public, but not long after the Melbourne team left Norman, the eScholarship Research Centre found itself undergoing some unexpected changes. The thing that most affected the IsisCB project was that the Centre’s primary programmer left for a startup elsewhere in Melbourne with “an offer that he couldn’t refuse.” Such is the life of digital projects in the academy.

Indeed, if there is one lesson that I’ve learned in the last couple of years, it is that the world of digital scholarship is not cheap, and because it is seldom as well capitalized as the corporate world, it cannot work at anything like the pace of even small private companies. Of all the resources, the one in the shortest supply is people. Human capital in the form of programmers is hard to keep at the academy because the best ones get quickly swept up by companies that have great offers. This means that we must be especially resourceful. And we need patience.

Since most of us are working in the open source world, the academy does offer a number of advantages. We have access to bright young talent with good ideals, and we are poised to collaborate with others around the globe. Openness and collaboration, in fact, are two of the most widely embraced values within the academy. And these are virtues that cannot be overlooked. The people who work with us may not make as much money, but they do participate in exciting projects and embrace good values.

Needless to say, the changes at the ESRC forced me to shift gears quickly. Already during the February bootcamp, it had become clear that the transforms we had been developing to get from my old FileMaker system to the new XML and HTML formats were not efficient and that the entire input system would ultimately have to be abandoned. The coding was stretching us to our limits. What we needed most was a way of structuring the data so that it could operate within a more flexible, relational database. Indeed, we needed to put the citations into that form from the very beginning. If we entered the data properly, it would be more flexible, complex, and relational.

I was lucky to find myself in a position where I could do a lot of work on my own with my project manager, Sylwester Ratowt. We spent a week together in April in Gainesville, Georgia, for a second bootcamp, during which time we reconceptualized the entire data structure. We created a new informatics that was relational and extensible.

After a couple of grueling months turning this idea into a fully functional database and migrating the data from the old system into the new one, we had a system built on a foundation that was not too different from the one that we had been developing with the ESRC. So when work slowed down in Melbourne, we were able to take over. I don’t see the FileMaker system as the final database solution, but I am very pleased that it now represents the new vision that has governed this project from the outset. It fully embodies the bibliographic infrastructure that we need to have in order to move our data into the new digital environment of the twenty-first century.

The main change was converting a relatively flat database into a highly relational and dynamic one. The best way that I can explain the transformation is to compare the new data structure with the traditional view of a book or journal article. When one thinks of a historical work, what immediately comes to mind is a physical or electronic entity, something self-contained that constitutes a thing that we can hold in our hand or store as a single digital file. We think of it as something that exists independently. The standard bibliographical entry encourages us to think this way by describing each resource in enough detail to differentiate it from all the other entries in the bibliography. These bibliographical citations use conventions like author name, title, publisher, and date to describe this thing. (See the left side of Figure 1.)


Figure 1. An early view of the new FileMaker database. On the left one sees a standard citation format. On the right one can see the faceted lists that are links to the component parts of the citation. In the old system, this structure was simply impossible to create because almost all of the data was entered into fields in a single table. The new relational structure is at the center of this new dynamic interface.

Hidden within all of that description, however, are separate things that have contributed to the creation of that work. Once we separate the citation into its component parts (parts that the Chicago Manual of Style and its ilk describe in precise detail), the object at its center, the work itself, no longer looks like a single independent entity. The work is something that emerges at the intersection of activities of various kinds: personal, institutional, intellectual, and so on. An author writes down her ideas; a publisher edits and disseminates the text; the ideas themselves refer to other things, other ideas, and other texts. Broken apart this way, the citation shows us a different picture of the book: the network of activity that produced it. In other words, the citation becomes a node in a much larger and more complex web of interactions.

In the new system, books, articles, reviews, and other intellectual products are no longer independent objects isolated from the things that make them up. Instead, the data structure represents them as dynamic entities that exist in the midst of a collection of other objects. The world is a network, and each element gains meaning from its dependencies on other elements it is linked to. (See the right side of Figure 1, with its many blue, underlined links to authority objects.)
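For readers who think more easily in code, here is a minimal sketch of the idea in Python. It is not the actual FileMaker schema: the class names (`Citation`, `Authority`), the role labels, the helper `citations_linked_to`, and all of the sample values are hypothetical placeholders chosen only to illustrate how a citation can be stored as a set of links to shared authority objects rather than as text fields in a single table.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Authority:
    """A person, publisher, or subject that many citations can point to."""
    id: int
    kind: str  # hypothetical categories: "person", "publisher", "subject"
    name: str


@dataclass
class Citation:
    """A work described by its links rather than by free-text fields."""
    id: int
    title: str
    year: int
    links: list = field(default_factory=list)  # (role, Authority) pairs

    def link(self, role, authority):
        self.links.append((role, authority))


def citations_linked_to(authority, citations):
    """Follow a link backwards: every citation that points at this authority."""
    return [c for c in citations if any(a == authority for _, a in c.links)]


# Placeholder data, not real bibliography entries.
author = Authority(1, "person", "Author A")
press = Authority(2, "publisher", "Example Press")
topic = Authority(3, "subject", "Example Subject")

book = Citation(10, "Example Book", 2001)
book.link("author", author)
book.link("publisher", press)
book.link("subject", topic)

review = Citation(11, "Review of Example Book", 2002)
review.link("author", Authority(4, "person", "Reviewer B"))
review.link("subject", topic)

# Because both works link to the same subject object, the subject
# becomes a path from one work to the other.
shared = citations_linked_to(topic, [book, review])
print([c.title for c in shared])
```

In a flat table, the subject would be a string typed into each record, and two records would be related only if the strings happened to match. Here the subject exists once, and each work gains part of its meaning from pointing to it.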

The new FileMaker data structure looks like an interdependent web of relationships and very much unlike a list of works published. Of course, we haven’t gotten rid of the citation itself (as you see in the figure, it is still there on the left); we’ve merely reconfigured it to make it do more. The next post will explain more about how the new system works.