RUcore, Digital Video, and the China Boom
Oct 18th, 2012 by Isaiah Beard

Recently, members of the Rutgers University Libraries at both Integrated Information Systems and the Scholarly Communication Center began an auspicious collaboration with the Asia Society in New York City: our first project to digitally preserve, in accordance with preservation standards, their digital video archive for The China Boom Project.  It is the first time that RUcore has ingested a fully born-digital video archive, using the original source content and project files, and creating presentation video from those source files.

The China Boom Project’s goal is to seek an answer to the question, “Why did China Boom?” The site comprises taped interviews with individuals and experts with insights into China’s rapid economic expansion in recent decades. It offers site visitors packaged video content from these interviews, arranged by subject matter and relevant time periods in China’s history, in a very effective and attractive format that is described as a “mosaic explanation.”

But while the China Boom site itself provides snippets and prepackaged commentary, an ancillary goal of the project has been to partner with educational institutions to make the full-length content available to researchers, and to have the video archived and preserved.  This is where Rutgers University Libraries, and RUcore, come into the picture.

The Cranberry Genome: RUcore’s first foray into research data sharing
Aug 23rd, 2011 by Isaiah Beard

Cranberry Harvest in New Jersey. Source: USDA


A few months back, I wrote about our efforts to leverage RUcore for the benefit of the academic research community at Rutgers. The result is RUresearch, a place for Rutgers researchers to share their data with the global scholarly community.  This data sharing is particularly important in light of a National Science Foundation mandate to openly share research data that has been funded through them.

Over the summer, the RUcore team has been working with a few researchers to better understand their needs, and to work on preserving and sharing our first samples of actual research data.  In collaboration with the Philip E. Marucci Center for Blueberry and Cranberry Research and Extension, our efforts – if you’ll pardon the pun – have begun to bear fruit.

As part of a project funded by the U.S. Department of Agriculture Specialty Crop Research Initiative, Marucci Center researchers have extracted a genome for a cultivar of the cranberry, a fruit for which New Jersey is the third-largest producer in the US, devoting some 3,600 acres to its cultivation.

The genome research is part of a study of the genetics of fruit rot resistance, and the data generated (using Applied Biosystems’ SOLiD 3 Plus System) takes up over 60GB of storage when compressed.  Sharing this data with researchers who would find it useful obviously requires a system that can not only spare the storage, but also be robust enough to permit open access.  Enter RUcore.

Although further refinements are in progress, the result of our collaboration is one of our first research data records in RUcore, located at this link.  The PDF attached to that record provides the link to the download point for the data sets.

While the data itself isn’t something the general public will easily recognize and interpret, the ability to share this information with other researchers can benefit all of us, through continued study into which genetic factors can make certain fruits resistant to rotting.  And it’s also a learning experience for us, in how to make that sharing among researchers a little bit easier.

The Economics of Digital Preservation, Analyzed and Digested
Mar 26th, 2010 by Isaiah Beard

If there is one thing that every organization, institution, and individual curator learns as they delve into digitally preserving their collections, it’s that digital preservation isn’t cheap.  While there are very compelling reasons for digitizing, sometimes including cost-effectiveness, there are still significant startup costs and an ongoing financial commitment required to keep your digital preservation projects viable.  Planning out the initial capital outlay and budgeting the ongoing maintenance costs requires a very different funding model from that of traditional physical and analog collections.

In light of this, an NSF and Mellon Foundation-sponsored Blue Ribbon Task Force on Sustainable Digital Preservation and Access was convened in 2007 to explore the problem of economic sustainability of digital preservation platforms.  Their goal is to issue “specific recommendations that are economically viable and of use to a broad audience, from individuals to institutions and corporations to cultural heritage centers.”

Their final report has been issued and is publicly available on their site.  I highly recommend the report to any curator, business, library, or educational or heritage institution that is considering a long-term preservation project and needs to get a grasp on the economic realities of such an endeavor.

They also have a complete listing of their publications, including preliminary and interim reports.  And, on April 1, a Symposium to celebrate the report’s release and open discussion is being held in Washington, DC.

Designing and Implementing a Center for Digital Curation Research
Nov 17th, 2009 by Isaiah Beard

The facility I work in at Rutgers, known as the Scholarly Communication Center (SCC), has a fairly short history in the grand scheme of academia, and yet a fairly long one when it comes to the rapid changes in technology it has seen in its lifetime.  It was originally started in 1996, and meant to be a location for university students and faculty to access a growing, then-nascent collection of digital content.

Back then, the internet still wasn’t very fast and wasn’t nearly as media-rich as it seems today.  And so, most of the data-heavy reference materials arriving in digital form came to the SCC as CD-ROMs (and later, DVDs).  To accommodate this, the SCC had a lab of ten desktop computers (known as the Datacenter), dedicated solely to accessing this type of material.

But the times changed, and so did the way people accessed digital material.  As the ‘net grew in size and capacity, it no longer made sense to ship reference material on disc, and so the access moved online.  Students migrated from visiting computer labs to bringing their own laptops (and later, netbooks and handheld mobile devices).  Traffic at the datacenter dropped to virtually nothing.  The space had to be re-tooled to continue to be relevant and useful.

And so, with my taking on the newly-minted role of Digital Data Curator, and in collaboration with my colleagues, a new plan for the former datacenter was developed.  Instead of being a place to merely access content, we would be a place to create it.  Analog items that needed to be digitized would be assessed and handled here.  New born-digital content would be edited, packaged, and prepared for permanent digital archiving in our repository.  We would be a laboratory where students getting into the field – and even faculty and staff who have been here a good while – would learn, hands-on, how to triage and care for items of historical significance, both digital and analog, and prepare them for online access.

The concept for a new facility was born.  And we call it the Digital Curation Research Center.

The center is still in “beta,” as we plug along with some internal projects for testing purposes along with a couple of willing test subjects within the university and surrounding community.  This is so we can test out the workflow of the space and make tweaks and optimizations as needed.  Our plan is to officially launch the space in the Spring of 2010, with a series of workshops and how-to sessions for the various things that make digital curation vital (e.g. digital photography, video editing, audio and podcasting, and scanning).

The plan is that this will be a continual, evolving learning experience for all involved.  People who have never really used cameras and recording equipment in a historical context will learn just how increasingly valuable the content they create, and the stories it will tell, can become over time.  And those of us in the DCRC day in and day out will encounter things that we’ve never run into before, and will have to wrap our heads around the issue of preserving it effectively.

Below are related documents that provide additional information about the DCRC.  More information will be coming up as we get closer to the official launch:

The case for improved large file support in digital repositories
Nov 2nd, 2009 by Isaiah Beard

As the person responsible for handling the various file formats in RUcore, the digital library repository for Rutgers University Libraries, I’ve been looking with trepidation at the increasing sizes of the digital assets people are starting to create.  In 2004, when the repository’s architecture was first envisioned, very few digital items grew past the hundred-megabyte point.

How things have changed!  Video and even audio files are routinely pushing into the gigabytes, now that technology has progressed to the point where high-definition video and audio can be originated on ubiquitous mobile devices.  And as RUcore and other large repositories seek to preserve this content, we are finding ourselves running into a hurdle we did not anticipate: the ability of our architectures to handle these very large digital files.  In particular, files larger than 2 gigabytes have posed some problems for FEDORA, our infrastructure of choice, and this is a very big deal for video content.  Consider that 2 gigabytes can comprise less than 5 minutes of HD content, and you can see our dilemma.
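To put the "less than 5 minutes" figure in perspective, here is a rough back-of-the-envelope sketch in Python.  It assumes the threshold sits at exactly 2**31 bytes and that the video has a constant bitrate; both are simplifications, the function names are made up for illustration, and the sample bitrates are arbitrary round numbers rather than measurements from our collections.

```python
# Illustrative arithmetic only.  Assumption: the "2 gigabyte" threshold is
# treated as 2**31 bytes, and video is treated as constant-bitrate.
LIMIT_BYTES = 2**31


def minutes_under_limit(bitrate_mbps: float, limit_bytes: int = LIMIT_BYTES) -> float:
    """Minutes of video at a constant bitrate (megabits/s) that fit in limit_bytes."""
    total_bits = limit_bytes * 8
    seconds = total_bits / (bitrate_mbps * 1_000_000)
    return seconds / 60


def implied_bitrate_mbps(minutes: float, limit_bytes: int = LIMIT_BYTES) -> float:
    """Constant bitrate (megabits/s) at which `minutes` of video fills limit_bytes."""
    total_bits = limit_bytes * 8
    return total_bits / (minutes * 60) / 1_000_000


if __name__ == "__main__":
    # Bitrate at which exactly 5 minutes of video reaches the threshold.
    print(f"5 minutes fills the limit at ~{implied_bitrate_mbps(5):.1f} Mbit/s")
    # Hypothetical sample bitrates, chosen only to show the trend.
    for rate in (25, 50, 100):
        print(f"{rate} Mbit/s -> {minutes_under_limit(rate):.1f} minutes under the limit")
```

In other words, under these assumptions any source video recorded at more than roughly 57 megabits per second crosses the threshold in under five minutes, and higher bitrates get there sooner.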

Added mechanisms to support these large items have been slow in coming, and have presented some difficulties of their own in implementation.  For this reason, I’ve drafted a document which explains our position on why we need uniform large file support in digital repositories.  Feel free to have a look and provide feedback.

With any luck, developers will heed the call presented here and at other institutions, and work to make better support for big files a reality.

