The Voyages of a Digital Collections Audit: Episode 1 (Charting Our Course)

[Image: an astronaut in space holding old books, with Earth in the background]

The crew gathers known knowledge about the system, charts a course, and gains valuable shipmates for the adventures ahead

One of my favorite shows is Star Trek: The Next Generation, and like the crews on the show, my department has embarked on an adventure to learn not about distant galaxies but the bits and bytes that have been created over the past few decades (yes, decades) as the University of Michigan Library’s digital collections. Bear with me as I explain a bit of the background, and you’ll see why I am extremely excited (seriously) about our conducting a full audit of our digital collections.

Why Do We Need a Collections Audit?

The Digital Content & Collections (DCC) department is part of the Library Information Technology division. DCC and its sister department, Digital Library Applications (DLA), were both part of the Digital Library Production Service, focused on building the platform for hosting digital collections. DLA now has responsibilities for developing the resources for all of our repositories, and DCC focuses on creating and maintaining digital collections and collection stakeholder relationships. We share responsibility for preserving U-M Library digital collections for the long term and for providing access to these resources, equally important parts of our mission.

We rely upon our partners to curate the content for the digital collections. These partners include the Bentley Historical Library, Michigan Publishing, the Special Collections Library, museums, and community partners. This means that as we proceed with the audit, we may need to form alliances with these curators in order to understand collections, add to metadata, or address other, currently unknown needs.

At this point, we have created and maintain over 280 digital collections that contain texts, images, reference materials, finding aids, and other formats. Conducting an audit will be a large project, but it is an important one for us in DCC and for the University in order to provide the excellent services that we strive for. In 2016, the items in these collections received almost 54 million views, so we have a lot of people relying on these collections!

An initiative is underway to create a new digital collection platform, which we’re currently calling ObjectClass. It would replace the existing digital collection infrastructure (DLXS) and be built on Samvera/Fedora. Once it is implemented, it will vastly change and improve the viewing experience of our digital collections. The new interfaces will also include re-thought functionality and improved accessibility options. This new infrastructure was part of the reason for starting an audit of the collections: understanding what we need from a new platform, and then potentially migrating collections to it, requires that we really dive into where we are, what could be better, and what we need or just want for the collections going forward. DLA and DCC are responsible for building ObjectClass, and hopefully the results of this audit will contribute to those conversations and, later, to potential migrations.

Surveying the Landscape

Before we are able to migrate any current collections to the new platform, though, we need to assess and prioritize them. Are they in a technical condition that even allows them to be migrated, or will they be lost in the transporter beam? Sometimes, too, it may be better to let a collection ride off into the sunset wearing that red shirt rather than to continue consuming resources, both human and technical.

We need to know where we are now before we go to warp drive with ObjectClass.

We decided to send in a landing party to survey the landscape and assess about a dozen collections in a pilot. The collections were selected to try to represent the range of content, infrastructure types, metadata practices, and other aspects that we may encounter in the full audit. This process and the lessons learned will be discussed in Episode 2 (coming soon!).  

Cementing Alliances

Part of initiating the pilot audit included realizing that our rights memos and documentation for collections could stand to be improved. Documentation policies and standards have ranged widely from the beginning of collection creation to the present, and where that documentation resides varies as well. Maintaining the single source of archival material in someone’s email should be against Federation policy, so we’re working to aggregate all rights and related documentation into a single environment that can be accessed by appropriate parties.

We also found a strong ally along our adventures, a friend we didn’t realize would be joining us in such a committed way when we first started. The library’s Copyright Office has committed resources to assisting us in the audit and will be providing a lot of time and energy to helping us, for which we’re very grateful. If they hadn’t been able to assist without more planning, we might have needed to hold off on this phase of the project until a later time, but thankfully we’re able to proceed warp speed ahead.

We are working closely with the Copyright Office not only to make sure that they have copies of the documentation they need, so that they can maintain a separate copy of documents while we also keep our own, but also to make sure that, when we are able, we can apply Creative Commons and other rights language at the collection level as appropriate. Applying Creative Commons statements was not a possibility when many of the collections were created, so standardizing some of the rights language statements at the collection and item level should help us as well as users. While we have had a close working relationship with the Copyright Office for several years and have improved our workflow because of that involvement, we didn’t realize that they would be needed as much for this audit as they are.

To Be Continued…

The adventure has only begun. We have conducted a pilot audit on the collections, assessing technical aspects, copyright documentation, and more. We found ways to gather information that we hadn’t known about, found that we are less informed than we should be about many of our collections, and found strong allies to join us for the rest of the journey. There is much work ahead of us, with new areas to explore and much to learn and to amaze us, but hopefully no new alien life lurking in that digital environment. I’m all for discoveries, but I would rather not locate a tribble in this process.

If there are others who have done this or similar work, I would love to hear about it. We haven’t yet located information about others conducting this kind of full-scale audit, so any feedback or information about other case studies would be very welcome, and if you’re comfortable with it, I would love to share your feedback in future posts as we proceed with this series.

I plan to post again later on progress, stumbling blocks, and wins as we continue. I’m looking forward to the process and welcome any feedback, questions, or suggestions.