In the Cloud, you can’t choose your neighbors
Jun 23rd, 2011 by Isaiah Beard

Some recent, high-profile security-related events are adding another wrinkle of complexity for those who are trusting the cloud for their data storage and content delivery: who your neighbors are, and what they might be doing.

On June 21, the FBI raided a Reston, Virginia-based server farm for Swiss hosting provider Digital One.  While the agency isn’t commenting, the speculation is that they were looking for data related to a single hacker group – LulzSec – responsible for numerous recent high-profile security breaches waged against Sony Corporation and several law enforcement agencies.

Unfortunately, that raid entailed the physical removal of multiple pieces of server hardware that, among other things, served as the virtual, cloud-based home for dozens of other websites.  Most of these affected parties are presumed to be legitimate customers that were storing data or serving web content… conducting real business that wasn’t running afoul of any laws.

As a result, several high-profile sites and services, including Instapaper, Curbed Network, and Digital One’s own website and support system, either suffered degraded service or were taken completely offline for more than a day.  Without a backup, the data could have been lost indefinitely while the FBI conducts its investigation of whichever client captured its interest.

The ramifications of this event are clear: Cloud services are shared services.  One of the big advantages of the Cloud is the notion that multiple entities can share the same large datacenter and resources without necessarily having to buy it all themselves.  Unfortunately, it’s rare in a public Cloud setting that you are allowed to choose who you’re sharing your resources with.  Often, this isn’t a big deal, but if your “neighbor” happens to be attracting a lot of attention (from hackers or law enforcement agencies), then your data and operations may also be affected as a result.

This is yet another reason to consider having a backup plan, and not totally entrusting all of your data to a single Cloud vendor.
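To make that backup plan a little more concrete, here is a minimal sketch of the idea: keep copies in more than one independent location and confirm each copy is intact. It assumes the destinations are plain directories (for example, a local disk and a mount provided by a second vendor); the function names `sha256_of` and `replicate` are my own illustration, not any vendor's API.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, destinations: list[Path]) -> dict[str, bool]:
    """Copy `source` into each destination directory and verify each copy.

    Returns a mapping of destination directory -> True when the copy's
    checksum matches the original. The destinations are hypothetical
    stand-ins for independent storage locations.
    """
    expected = sha256_of(source)
    results: dict[str, bool] = {}
    for dest_dir in destinations:
        dest_dir.mkdir(parents=True, exist_ok=True)
        copy_path = dest_dir / source.name
        shutil.copy2(source, copy_path)  # copies contents and timestamps
        results[str(dest_dir)] = sha256_of(copy_path) == expected
    return results
```

Calling `replicate(Path("archive.tar"), [Path("/mnt/vendor_a"), Path("/mnt/local_disk")])` (both paths are placeholders) would report which destinations hold a verified copy. The point is less the code than the habit: if one location disappears overnight, another verified copy still exists somewhere else.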

Japan’s Crises, and its ramifications on digital preservation
Mar 23rd, 2011 by Isaiah Beard

A Sony HVR-Z1U camera. This device is a digital video workhorse at the SCC, and relies heavily on digital video tape... something which could be rather hard to come by in the near future.

My heart, thoughts, and a donation go to those affected by the earthquake, tsunami, and now radiological crisis that Japan must grapple with.  It’s no exaggeration to say this turn of events is truly unprecedented.  Sitting thousands of miles away, and only observing the events through websites and television screens, I’m aware that I cannot possibly grasp the ordeal that survivors now face.

With that preface, it’s difficult to even think at this point of how the disaster will inconvenience those of us far removed.  However, there will be a rather significant impact for quite some time, given our technological dependencies in a digital world, the number of electronic components and supplies that are produced in Japan, and how we use those components to capture our current history and cultural heritage.

Our first hints of trouble came with an advisory issued to consumers of magnetic tape media. Sony, a major manufacturer of many varieties of tape media as well as semiconductors, optical discs such as DVD and Blu-ray, and electronic components, has been hit hard, and was forced to shut down a number of factories in the region while recovery efforts continue. The earthquake has halted production at various manufacturing facilities in Japan, including those of magnetic media manufacturers, and suppliers are now warning of an impending shortage and possible price spikes:

“Our industry has already been affected by a halt in media manufacturing operations – professional media supply shortages are evident, namely HDCam SR,” explained a post on the Comtel Pro Media web site. “Worldwide stock shortages present a realistic threat to our industry and the immediate needs of the television and motion picture production.”

Of particular note is a shutdown of the Sony Corporation Sendai Technology Center, currently the only facility in the world producing HDCAM-SR tapes.

Read the rest of this entry »

Lessons Learned from Google’s temporary Gmail loss
Mar 1st, 2011 by Isaiah Beard

Gmail kept users notified of its ongoing recovery efforts through a status page.

This past week offered up a little dose of panic to an estimated tens of thousands of users of Google’s free Gmail service, when they logged in to discover that all of their e-mail was missing.  According to Google:

We released a storage software update that introduced the unexpected bug, which caused 0.02% of Gmail users to temporarily lose access to their email. When we discovered the problem, we immediately stopped the deployment of the new software and reverted to the old version.

Read the rest of this entry »

NY Times Article on the realities and costs of Born Digital preservation
Mar 16th, 2010 by Isaiah Beard

Salman Rushdie. Source: Wikipedia. Click on image for link to source.

The New York Times today published an article that reflects some of the challenges of preserving born digital content – that is, documents, data and other content that has been created digitally, on a computer or electronic device, and for which there is no physical original (such as on paper).

In particular, they highlight the efforts of Emory University in preserving Salman Rushdie’s archival materials.

Among the archival material from Salman Rushdie currently on display at Emory University in Atlanta are inked book covers, handwritten journals and four Apple computers (one ruined by a spilled Coke). The 18 gigabytes of data they contain seemed to promise future biographers and literary scholars a digital wonderland: comprehensive, organized and searchable files, quickly accessible with a few clicks.

But like most Rushdian paradises, this digital idyll has its own set of problems. As research libraries and archives are discovering, “born-digital” materials — those initially created in electronic form — are much more complicated and costly to preserve than anticipated.

Electronically produced drafts, correspondence and editorial comments, sweated over by contemporary poets, novelists and nonfiction authors, are ultimately just a series of digits — 0’s and 1’s — written on floppy disks, CDs and hard drives, all of which degrade much faster than old-fashioned acid-free paper. Even if those storage media do survive, the relentless march of technology can mean that the older equipment and software that can make sense of all those 0’s and 1’s simply don’t exist anymore.

Imagine having a record but no record player.

An interesting aspect of this collection and its exhibition is that it emulates the experience Rushdie had in creating the content.  Rather than just viewing the finished documents, you get to see the computer desktop as he saw it, open up the same applications he used, all in the 1980s and 1990s technological contexts… and not using the modern, Web 2.0, Windows 7 or Mac OS X trappings we’re accustomed to in today’s computers.

I think this article is an excellent read, irrespective of what one’s views may be on the subject matter.  Material of all kinds, in increasing amounts, faces the same perils as this collection every day, and archivists everywhere, including this one, wrestle with how best to retain it all.  So far, the only tried and true method for such types of preservation is to obsessively manage and migrate the content, and that requires making tough decisions as to how to proceed, what formats to migrate to, and hoping the decisions made are the right ones to keep the content viable, at least until the next generation of technology requires that the hard decisions be made again.
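For readers wondering what the "manage" half of that looks like day to day, one common ingredient is a periodic fixity check: record a checksum for every file, then compare against that record later to spot anything that has silently changed or gone missing. The sketch below is only an illustration under my own assumptions (a collection stored as ordinary files under one directory, SHA-256 as the checksum, and the hypothetical helper names `build_manifest` and `compare_manifests`); it is not how Emory or any particular archive does it.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its SHA-256 digest."""
    manifest: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report files that went missing, appeared, or changed between two manifests."""
    shared = old.keys() & new.keys()
    return {
        "missing": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "changed": sorted(p for p in shared if old[p] != new[p]),
    }
```

In practice one would save the first manifest somewhere safe, rebuild it on a schedule, and investigate anything that shows up under "missing" or "changed". It does nothing about obsolete formats, which is where the tough migration decisions come in, but it does tell you whether the bits you are trying to migrate are still the bits you started with.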

Google’s Neglected Archive
Oct 8th, 2009 by Isaiah Beard

Controversial as it may be to archivists of both the digital and analog realms, Google is often seen as the ubiquitous oracle of record by the general online population.  Websites (including libraries) strive to be listed on its search engines; it runs the de facto video archive for the internet at large; in many cases it has made patents and even scholarly material easier to search and access than their respective online custodians have.  Research institutions and businesses even now turn to it to handle their growing mounds of e-mail and electronic documents.

Yet, curators and librarians have been very skeptical of Google, its aims, and even its competency at truly being able to objectively preserve the massive digital content it aims to take on.  The ongoing Google Books drama is one such aspect of this.  But now, others outside of the world of preservationists are seriously calling Google’s competency and motives into question.

Wired.com has posted an article about what may well be an example of Google’s mismanaging of important archives.  Usenet was probably the Internet’s earliest online reference, social network, and historical archive, consisting of terabytes of text articles, files and writings from users of the Internet (and its predecessors) dating back to 1980.  While most of the content can be quite mundane, some of these articles document milestones in online history or contain the writings of influential pioneers of the Internet.  Google, through its acquisitions of various entities over the years, became the world’s de facto curator of this content, and according to Wired contributor Kevin Poulsen, Google is failing in this role:

… visiting Google Groups is like touring ancient ruins.

On the surface, it looks as clean and shiny as every other Google service, which makes its rotting interior all the more jarring — like visiting Disneyland and finding broken windows and graffiti on Main Street USA.

Searching within a newsgroup, even one with thousands of posts, produces no results at all. Confining a search to a range of dates also fails silently, bulldozing the most obvious path to exploring an archive.

Want to find Marc Andreessen’s historic March 14, 1993 announcement in alt.hypertext of the Mosaic web browser? “Your search – mosaic – did not match any documents.”

Wired’s critique points to the exact concern that many archivists have about Google’s aims: that they aren’t motivated to maintain assets they hold if those assets are no longer making them any significant money.  With newer, more visually appealing technologies largely supplanting the once-huge popularity of Usenet, its archive is not as hot a commodity as it used to be, and one can speculate that if it’s not generating enough ad revenue, Google isn’t going to care to maintain the archive or keep it functional.

So, what happens when certain subject matter in their scanned books archive becomes less popular – and thus less visited – within the Internet’s incredibly short attention span?


»  Substance:WordPress   »  Rights: Creative Commons License