In the Cloud, you can’t choose your neighbors
Jun 23rd, 2011 by Isaiah Beard

Some recent, high-profile security-related events are adding another wrinkle of complexity for those who are trusting the cloud for their data storage and content delivery: who your neighbors are, and what they might be doing.

On June 21, the FBI raided a Reston, Virginia-based server farm for Swiss hosting provider Digital One.  While the agency isn’t commenting, the speculation is that it was looking for data related to a single hacker group, LulzSec, responsible for numerous recent high-profile security breaches waged against Sony Corporation and several law enforcement agencies.

Unfortunately, that raid entailed the physical removal of multiple pieces of server hardware that, among other things, served as the virtual, cloud-based home for dozens of other websites.  Most of these affected parties are presumed to be legitimate customers that were storing data or serving web content… conducting real business that wasn’t running afoul of any laws.

As a result, several high-profile corporate content developers, including Instapaper, Curbed Network, and Digital One’s own website and support system, either suffered degraded service or were taken completely offline for more than a day.  Without a backup, the data could have been lost indefinitely while the FBI conducts its investigation of whichever client captured its interest.

The ramifications of this event are clear: Cloud services are shared services.  One of the big advantages of the Cloud is the notion that multiple entities can share the same large datacenter and resources without necessarily having to buy it all themselves.  Unfortunately, it’s rare in a public Cloud setting that you are allowed to choose who you’re sharing your resources with.  Often, this isn’t a big deal, but if your “neighbor” happens to be attracting a lot of attention (from hackers or law enforcement agencies), then your data and operations may also be affected as a result.

This is yet another reason to consider having a backup plan, and not totally entrusting all of your data to a single Cloud vendor.

Japan’s Crises, and its ramifications on digital preservation
Mar 23rd, 2011 by Isaiah Beard

A Sony HVR-Z1U camera. This device is a digital video workhorse at the SCC, and relies heavily on digital video tape... something which could be rather hard to come by in the near future.

My heart, thoughts, and a donation go to those affected by the earthquake, tsunami, and now radiological crisis that Japan must grapple with.  It’s no exaggeration to say this turn of events is truly unprecedented.  Sitting thousands of miles away, and only observing the events through websites and television screens, I’m aware that I cannot possibly grasp the ordeal that survivors now face.

With that preface, it’s difficult to even think at this point of how the disaster will inconvenience those of us far removed.  However, there will be a rather significant impact for quite some time, given our technological dependencies in a digital world, the number of electronic components and supplies that are produced in Japan, and how we use those components to capture our current history and cultural heritage.

Our first hints of trouble came with an advisory issued to consumers of magnetic tape media. Sony, a major manufacturer of many kinds of tape media as well as semiconductors, optical discs such as DVD and Blu-ray, and electronic components, has been hit hard, and was forced to shut down a number of factories in the region while recovery efforts continue. The earthquake has halted production at various manufacturing facilities in Japan, including those of magnetic media manufacturers, and suppliers are now warning of an impending shortage and possible price spikes:

“Our industry has already been affected by a halt in media manufacturing operations – professional media supply shortages are evident, namely HDCam SR,” explained a post on the Comtel Pro Media web site. “Worldwide stock shortages present a realistic threat to our industry and the immediate needs of the television and motion picture production.”

Of particular note is a shutdown of the Sony Corporation Sendai Technology Center, currently the only facility in the world producing HDCAM-SR tapes.


Dataset sharing and preservation strategies at Rutgers
Mar 17th, 2011 by Isaiah Beard

As of January 18 of this year, the National Science Foundation has enacted policies that ensure researchers take seriously the need for data sharing and dissemination.  According to the new mandate:

Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. See Award & Administration Guide (AAG) Chapter VI.D.4.

To that end, researchers are now required to submit a Data Management Plan with their grant requests, detailing how the project will comply with research sharing guidelines set forth by the NSF.

These requirements leave researchers with a choice: either come up with a plan on their own, or seek help from their institutions on a comprehensive data sharing and preservation model.  Fortunately, the resources and tools exist at Rutgers for its researchers to easily take the latter route.

In anticipation of these data sharing requirements, the university has set up a site to guide researchers through the ins and outs of data sharing.  The Rutgers University Research Data Archive site clearly explains the importance of sharing and preserving research data, and details some of the current offerings for researchers who need a platform to share their research data to comply with NSF guidelines.

It goes without saying that one such option listed on the site (and the platform I recommend) is the Rutgers University Community Repository.  In anticipation of this need, the RUcore team has developed the RUResearch Data Portal, a section of our digital repository meant specifically for serving research data needs.

Already trusted by faculty members to store their academic publications, and the mandatory platform for Theses and Dissertations in the Graduate School of New Brunswick, RUResearch is a natural extension of RUcore’s mission to preserve and make accessible the university’s academic output from a centralized resource that adheres to established digital preservation standards.  With RUResearch, you can not only be assured of meeting NSF’s requirements on paper, but you will also have the security of knowing your research data is truly safe and preserved.

More information on data preservation services can be found on the Rutgers Libraries Website, including dates for in-person presentations on the services we offer the academic research community.  And, if you are a researcher interested in how RUcore and the RUResearch platform can help you, contact our Data Services Librarian, Ryan Womack, and he will be able to give you the information you need to get started.

Lessons Learned from Google’s temporary Gmail loss
Mar 1st, 2011 by Isaiah Beard

Gmail kept users informed of its ongoing recovery efforts through a status page.

This past week offered up a little dose of panic to an estimated tens of thousands of users of Google’s free Gmail service, when they logged in to discover that all of their e-mail was missing.  According to Google:

We released a storage software update that introduced the unexpected bug, which caused 0.02% of Gmail users to temporarily lose access to their email. When we discovered the problem, we immediately stopped the deployment of the new software and reverted to the old version.


NY Times Article on the realities and costs of Born Digital preservation
Mar 16th, 2010 by Isaiah Beard

Salman Rushdie. Source: Wikipedia. Click on image for link to source.

The New York Times today published an article that reflects some of the challenges of preserving born digital content – that is, documents, data and other content that has been created digitally, on a computer or electronic device, and for which there is no physical original (such as on paper).

In particular, it highlights the efforts of Emory University in preserving Salman Rushdie’s archival materials.

Among the archival material from Salman Rushdie currently on display at Emory University in Atlanta are inked book covers, handwritten journals and four Apple computers (one ruined by a spilled Coke). The 18 gigabytes of data they contain seemed to promise future biographers and literary scholars a digital wonderland: comprehensive, organized and searchable files, quickly accessible with a few clicks.

But like most Rushdian paradises, this digital idyll has its own set of problems. As research libraries and archives are discovering, “born-digital” materials — those initially created in electronic form — are much more complicated and costly to preserve than anticipated.

Electronically produced drafts, correspondence and editorial comments, sweated over by contemporary poets, novelists and nonfiction authors, are ultimately just a series of digits — 0’s and 1’s — written on floppy disks, CDs and hard drives, all of which degrade much faster than old-fashioned acid-free paper. Even if those storage media do survive, the relentless march of technology can mean that the older equipment and software that can make sense of all those 0’s and 1’s simply don’t exist anymore.

Imagine having a record but no record player.

An interesting aspect of this collection and its exhibition is that it emulates the experience Rushdie had in creating the content.  Rather than just viewing the finished documents, you get to see the computer desktop as he saw it and open up the same applications he used, all in their 1980s and 1990s technological context… without the modern Web 2.0, Windows 7 or Mac OS X trappings we’re accustomed to in today’s computers.

I think this article is an excellent read, irrespective of what one’s views may be on the subject matter.  Material of all kinds, in increasing amounts, faces the same perils as this collection every day, and archivists everywhere, including this one, wrestle with how best to retain it all.  So far, the only tried-and-true method for this type of preservation is to obsessively manage and migrate the content.  That requires making tough decisions about how to proceed and what formats to migrate to, and hoping those decisions are the right ones to keep the content viable, at least until the next generation of technology requires that the hard decisions be made again.
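For readers wondering what "obsessively managing" content can look like in practice, one common building block is a fixity check: record a checksum for every file before a copy or migration, then confirm afterwards that nothing went missing or changed.  The Python sketch below is only an illustration under my own assumptions; the function names are hypothetical, SHA-256 is just one reasonable choice of checksum, and this is not a description of how Emory, RUcore, or any other archive actually does it.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict, after: dict) -> list:
    """List files from `before` that are missing or differ in `after`."""
    return sorted(
        name for name, checksum in before.items()
        if after.get(name) != checksum
    )
```

In use, you would call build_manifest() on the original collection, copy the files to their new home, call build_manifest() on the copy, and pass both results to changed_files().  An empty list means every original file arrived intact.  Note that this only confirms the bits survived; it says nothing about whether the format will still be readable later, which is where the hard migration decisions come in.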


»  Substance:WordPress   »  Rights: Creative Commons License