On putting old software out to pasture
Apr 7th, 2014 by Isaiah Beard

Three generations of Windows operating system versions.  Upper left: Windows 8.1, the current release from Microsoft. Upper right: Windows 7, its predecessor and likely upgrade candidate for most Windows XP users. Lower left: Windows XP, whose support from Microsoft ends today.  Lower right: the VirtualBox control panel, from which each of these virtual instances is controlled on the host computer, a Mac.

Tomorrow marks an important milestone in the lifecycle of computer software, and should be a day of concern for perhaps hundreds of millions of computer users worldwide.  April 8, 2014 is the final day that Microsoft will provide extended support for its aging Windows XP operating system.  Although Microsoft has not been providing any new features or functionality to this operating system since 2009, tomorrow’s deadline means that the company will also cease to provide important security updates to Windows XP going forward.  This potentially means that users still running the OS could be vulnerable to security risks such as viruses and malware. Although a great many new software titles already require a more recent version of Windows, support for XP among software vendors is expected to drop off sharply after tomorrow.

Arguably, this event will have a greater ripple effect than the end-of-life of any other operating system version. This is because Windows XP has endured in mainstream usage far longer than any other version of the Windows operating system, lasting nearly 13 years and still commanding a market share among desktop and laptop computer users of nearly 30% worldwide, even today. This impressive longevity is despite newer, more advanced versions of Windows being available since 2007, though a lot of this is owing to the fact that its immediate successor, Windows Vista, was met with criticism from many of the users who adopted it.

The immediate concern for these mainstream users is whether and how they should upgrade to newer versions such as Windows 7, or the current release from Microsoft, Windows 8.1. For many, this also means replacing old hardware that will not adequately run more up-to-date versions. However, the same users can also see this as an opportunity to consider alternatives, such as using a Linux-based operating system on their existing hardware, or looking at other vendors such as Apple and its Mac line of desktop and laptop systems.

Posterity in Peril

But the consequences of this end-of-life event for Windows XP will be felt long after most of these users have moved on to different hardware and software. For over a decade, a vast majority of the world’s computer users played video games, used custom software, and created documents and media on a platform that is now no longer supported by its vendor. While the more widely used pieces of software will be updated to run on current operating systems, there will inevitably be some programs and data left behind.  And this leads to serious questions for the preservationists among us. How will we preserve and keep alive old works that carry significant historical and cultural value?

One commonly used option is to preserve vintage hardware along with the software.  There is a burgeoning vintage hardware and retrocomputing community that has done exactly this, preserving specimens of computing’s past that carried significance in earlier decades.  If you are old enough to have used a Commodore 64, TRS-80, Apple II or even older systems in years past and wax nostalgic now and then, it is this community that ensures those historic old computers live on today.

Unfortunately, not everyone has the cash, space, and means to collect and maintain antique computing hardware.  And even for those who do, legacy computers will inevitably age and fail with time, and replacement parts will get more difficult to come by as time passes, resulting in these classic machines becoming increasingly rare.  Fortunately, while the hardware may not live on to eternity, it’s a little easier to keep the software functioning for quite some time, without a lot of expense.

A virtual solution

Windows XP is in a very fortunate position, as obsolete operating systems go.  This is because while the very newest computing hardware you can buy today will have a hard time running Windows XP directly, it can provide a virtualized environment that permits XP to run very well, with no modification.  This stands in stark contrast to the computing environments of the past, where significant forensic work and lots of porting and programming are required to bring them back to life via emulators (a topic I’ll discuss soon in another post).

With a modern computer, Windows XP and the software running under it can be installed as a “guest” operating system in a virtual environment.  In other words, a high-end Windows 8.1 computer can have one of several virtualization suites installed, such as VirtualBox, VMware, or Parallels, and run any operating system the user likes in its own window.  Virtualization is a technology that came into its own while Windows XP was the dominant operating system on computers across the globe, and through it, XP can continue to run in its own sandbox, and with it the countless software programs of the past that should be preserved for their significance.

For those who are interested in trying it, there’s good news: VirtualBox is free, and a user with an existing licensed copy of Windows XP can get started fairly easily.  This tutorial explains how it’s done, step by step.  The only caveat is that virtualized software environments work best on high-end computers with a lot of memory and multi-core processors.  Each virtual machine “borrows” some RAM, disk space and CPU time from its “host” computer and operating system, so it helps to know that the computer you’re using has resources to spare.

With luck, virtualization will help Windows XP become not only one of the longest-deployed desktop operating system environments in recent history, but also the best-preserved.

When copiers aren’t copying as they should…
Aug 12th, 2013 by Isaiah Beard

German researcher D. Kriesel discovered that certain characters are being modified by Xerox copiers when documents are scanned to PDF.  In this example, the meanings of numeric figures were altered when the Xerox system replaced the number “6” with the number “8” in multiple locations. The cause appears to be faulty compression settings, which cause similar-looking characters to be overlaid and repeated in an effort to reduce the size of the scanned files.


Over the past week, there has been a great deal of buzz in the IT community about a discovery by a researcher in Germany that certain Xerox WorkCentre copy/scan stations are altering the content of documents scanned to PDF. In particular, attention has been focused on the Xerox WorkCentre 7535 and 7556 models. Kriesel found that “patches of the pixel data are randomly replaced in a very subtle and dangerous way. In particular, some numbers appearing in a document may be replaced by other numbers when it is scanned.”

According to Xerox, a software update is coming to address the issue.  From their official statement:

We continue to test various scanning scenarios on our office devices, to ensure we fully understand the breadth of this issue.  We’re encouraged by the progress our patch development team is making and will keep you updated on our progress here at the Real Business at Xerox blog.

We’ve been working closely with David Kriesel, the researcher who originally uncovered the scenario, and thank him for his input which we are continuing to investigate.  As we’ve discussed with David, the issue is amplified by “stress documents,”  which have small fonts, low resolution, low quality and are hard to read.  While these are not typical for most scan jobs ultimately, our actions will always be driven by what’s right for our customers.

There are still points of contention, however.  Xerox claims the problem can be resolved by restoring the copiers to factory default settings.  Kriesel, however, has been able to show that documents still get mangled even when following these instructions.  Clearly, Xerox has a lot of work to do… not just to fix the technical issues, but to regain the confidence of its users that they can trust their copiers to make faithful reproductions.

What’s this all about, and why is it important?

A Xerox WorkCentre copy/scan/print station

In the past couple of decades, the office copier has evolved into something a lot more complex. No longer do these machines just spit out paper copies of the documents fed to them.  They now serve as feature-laden printing and scanning workstations.  In addition to just making paper duplicates, you can use the modern copier as a bulk laser printer to blast out hundreds of copies of a document from your computer.  Or, you can take a paper document and scan it into a digital file, often a PDF document, that can then be stored or e-mailed.

The last part is where we have our problem. In many office settings, the idea of reducing the amount of paper lying around is a desirable goal, and being able to replace old paper documents with a PDF scan makes sense. Of course, the expectation is that the PDF will exactly match the original paper document.

There’s just one problem: an absolute, exact copy would mean generating large, uncompressed images, resulting in huge PDF files that would be difficult to pass around in e-mail attachments, and cost a lot of money to store on large hard drives for archival purposes.  For many corporate settings, this would be a deal-breaker.  So, to keep file sizes down, nearly all of these copy systems (not just Xerox) compress the scanned images, using the industry-standard JBIG2 algorithm.

JBIG2 does its “magic” by finding pieces of an image that are identical (or very close to identical), and using the same piece of data to represent all parts of the image that it judges to look similar enough.  This can be especially useful when working with text documents. Letters, numbers and words that repeat often can all share the same small data fragment, rather than every individual letter and number being uniquely described every time it appears on a page.

However, this method of compression can also cause problems if not implemented carefully. If the software encoding the JBIG2-compressed image isn’t configured well enough to tell the difference between two similar-looking characters – such as the numbers “6,” “8” and “9,” for example, or perhaps the uppercase “O” and the number “0” – then it might substitute the wrong data fragment for a character… in effect, changing those subtly different-looking characters, and the meaning of that word or figure.
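
To make that failure mode concrete, here is a toy sketch in Python. It is not the JBIG2 codec, and it says nothing about how any Xerox device is actually implemented; the tiny 3x5 glyph bitmaps, the `max_diff` threshold and every function name are assumptions invented for illustration. It only mimics the general idea described above: build a dictionary of glyph shapes, and reuse an existing shape whenever a new glyph looks "close enough."

```python
# Toy illustration only: hypothetical 3x5 bitmaps, not real scanner output.
SIX = ("###", "#..", "###", "#.#", "###")
EIGHT = ("###", "#.#", "###", "#.#", "###")
ONE = (".#.", "##.", ".#.", ".#.", "###")
NAMES = {SIX: "6", EIGHT: "8", ONE: "1"}


def pixel_diff(a, b):
    """Count the pixels that differ between two same-sized glyphs."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def compress(glyphs, max_diff):
    """Store each distinct shape once; reuse any shape within max_diff pixels."""
    dictionary, indices = [], []
    for glyph in glyphs:
        for i, symbol in enumerate(dictionary):
            if pixel_diff(glyph, symbol) <= max_diff:
                indices.append(i)  # "similar enough": reuse the stored shape
                break
        else:
            dictionary.append(glyph)  # no match: store a new shape
            indices.append(len(dictionary) - 1)
    return dictionary, indices


def decompress(dictionary, indices):
    """Rebuild the page from the stored shapes."""
    return [dictionary[i] for i in indices]


def read_back(glyphs):
    """Turn the glyph bitmaps back into the digits they depict."""
    return "".join(NAMES[g] for g in glyphs)


page = [EIGHT, SIX, ONE, SIX, EIGHT]  # the figure "86168"

strict_dict, strict_idx = compress(page, max_diff=0)  # exact matches only
loose_dict, loose_idx = compress(page, max_diff=1)    # tolerate one stray pixel

strict = read_back(decompress(strict_dict, strict_idx))
loose = read_back(decompress(loose_dict, loose_idx))

print(strict)  # 86168
print(loose)   # 88188
```

With exact matching, the page survives intact but needs three stored shapes. With a tolerance of a single pixel, only two shapes are stored, and every “6” silently comes back as an “8”: the file is smaller, and the figure is wrong.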

This can have huge ramifications when the character changes mean substantial changes in meaning. Financial documents, contact information, or any other type of critical data cannot tolerate having their meaning changed; the result could be all kinds of critical, maybe catastrophic errors.

This is Reason One why, when scanning for archival purposes, preservation masters of digital objects should not use lossy compression algorithms. Now we see that it has ramifications for even day-to-day reproduction of documents, as well.

If you have been using a multifunction scan/copy/print system like this to scan and archive your paper documents, I would strongly recommend checking those documents for accuracy.  Avoid using such a machine for scanning of any critical documents you intend to archive long-term, until you have had a talk with your vendor, and know what algorithms are being used to encode image data.  Further, people using these copiers for even basic copying of documents are well-advised to inspect the accuracy of numbers, mathematical figures, and any other content where accuracy is key.


A teachable moment in personal data preservation
Apr 26th, 2013 by Isaiah Beard



An all-too-common sight: $3,000 worth of stealable student laptops sitting unsecured.

It’s the time of the semester in most universities where nerves are frazzled, sleep is lost, and sadly, lots and lots of laptop thefts happen.  Where I work at the Alexander Library, the end of every semester brings throngs of students cramming for exams and finishing final projects, and they invariably bring their laptops, smartphones, and tablets with them.  Unfortunately, many are tempted to leave those devices sitting unsecured on desks when they step out for a break, despite repeated warnings not to do this. Predictably, we also get the most reports of pricey electronics being stolen around this time of year as a result.

Having your expensive laptop or mobile device stolen is a humbling, stressful experience that even I have fallen victim to. However, the monetary loss of the hardware can pale in comparison to the value of the data inside the device.  Personal data can be stolen, resulting in anything from embarrassing disclosures of personal details, to outright identity theft.

Even worse: if you were working on something highly valuable to you, and you don’t have a backup copy anywhere else, the results can be devastating.

Currently circulating around social media and even local news is a photo of this flyer, posted around the Rutgers campus about a week ago:


My heart goes out to this person. Their entire academic career is now on the line because of a thoughtless criminal act.  And sadly, this isn’t the first time academic data has been lost to a theft: in Oklahoma, a similar “reward” was offered by a researcher wanting her critical data back as well.

Consider also that even if you’re vigilant, and lock down your hardware or never let it leave your sight, theft isn’t the only way you can lose your data.  Laptops and smartphones can be dropped and damaged.  Hardware failures and crashes happen.  Or a slip of the fingers could result in a file being accidentally deleted and lost forever.

But unfortunate incidents like these can also be a teachable moment about how important it is to always have a backup plan.

If you own a mobile device, laptop, or even a desktop computer, and especially if you’re a student or academic who relies on them for your schoolwork or research, take the time right now to make sure your files are secure and backed up.  It may not be a convenient time, but data loss never makes an appointment!

Consider using an external drive, or an inexpensive cloud service, or both.  At the bare minimum, sign up for a free 2GB Dropbox account (or contact me for an invitation which will get you an extra 500MB), and store your work there as added protection.  Taking these simple steps will help ensure that you aren’t forced to try negotiating with a thief on the price to retrieve your data… further rewarding them for what they’ve done.
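
For readers comfortable with a little scripting, the external-drive option can be sketched in a few lines of Python. This is a minimal sketch rather than a finished backup tool: the function names and the example paths in the closing comment are hypothetical, and it assumes the external drive is already mounted as an ordinary folder. It copies every file from a work folder to the backup location and compares SHA-256 checksums, so a corrupted copy is caught at backup time rather than on the day you need it.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destination: Path) -> List[Path]:
    """Copy every file under source into destination and verify each copy."""
    copied = []
    for item in sorted(source.rglob("*")):
        if not item.is_file():
            continue
        target = destination / item.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(item, target)  # copy2 also keeps file timestamps
        if sha256_of(item) != sha256_of(target):
            raise IOError(f"Backup copy of {item} does not match the original")
        copied.append(target)
    return copied


# Hypothetical usage; substitute your own folders:
# back_up(Path.home() / "Documents" / "thesis", Path("/Volumes/Backup/thesis"))
```

A script like this only helps if it is run regularly, and it does not keep older versions of a file, which is one reason to pair it with a cloud service that does.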

If the worst does happen, it may be possible to locate your stolen device if you have the right tools.  Apple devices have location tracking available through iCloud, but it has to be turned on beforehand to work.  Free tools such as GeoSense are available for Windows laptops as well.

One other thing to consider: your assignments, research data and coursework aren’t the only information kept on your devices.  Personal emails, banking data, photos, and info that can be used to steal your identity are also likely stored there.  These are things you don’t want a thief to have access to.  For this reason, you might also want to consider encrypting the storage on your mobile devices, and using strong passwords to prevent unauthorized access.

Easy-to-use, transparent full disk encryption options are built into Windows 7/8 and Mac OS X computers.  iOS devices (iPhones and iPads, starting with the iPhone 3GS and iPad 2) have encryption built in, too: just enable the passcode lock feature, and use a strong passcode to make it effective. Android devices like the Samsung Galaxy S III and IV have similar capabilities.

Using encryption helps prevent thieves from accessing your data, and that’s a good thing.  Even if there’s something irreplaceable on that laptop that tempts you to bargain with its abductor, the potential breach of your personal data probably isn’t worth it!

Evolving Standards: Updating our moving image digital specifications
Apr 8th, 2013 by Isaiah Beard

A 35mm film projector in operation.

As part of preservation-level digital standards, my colleagues and I have worked since 2004 to develop a best practice specification for digitizing moving images.  Our initial standard document was developed for the NJVid Portal, and was very basic in its specification.

Since then, some minor tweaks have periodically been added to the document.  But recently, some major developments have occurred with our campus infrastructure that have resulted in our need to consider slightly more substantial changes to our spec:

  • RUcore is in the process of implementing Wowza as its new streaming server platform, and it is already in operation for the libraries’ reservation streaming media service.  In conversations with the reserves and media teams handling video for those efforts, there has been some real-world testing of video quality improvements for MP4 streaming, and tweaks have been suggested for improving the quality of our streamed videos.
  • RUwireless Wi-Fi campus connections have been upgraded to support higher bitrates (up to 3Mbps per connection).  Previously they were capped at 1Mbps.  This means that we can now push video and other content at up to triple the data rate we were accustomed to over campus Wi-Fi, given proper conditions.

In light of this, coupled with user demand for improvements in video streaming quality, and in preparation for Wowza streaming support on RUcore, we’re proposing changes to the digitization specs for moving images. A draft for comments is available.  Changes are noted in red in the document, but to summarize:

1. MPEG-4 streaming bitrates have been increased to a minimum of 860kbps, with a recommendation of 2.1Mbps for high quality.
2. HD video is now supported, at a minimum resolution of 720p.

Language has also been added to address digitization of motion picture film; it calls for a minimum of DCI 4K resolution, with support for MXF wrappers and Motion JPEG2000 where appropriate. Motion picture film scanning is still a moving target, however, and the document notes that film digitization projects should start with a Digital Curation consult.
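
The streaming figures proposed in the list above can be sanity-checked with a little arithmetic. The Python sketch below is only a back-of-the-envelope estimate under stated assumptions: it treats 1kbps as 1,000 bits per second, assumes a constant bitrate, and ignores audio, container and network overhead, none of which the draft spec is quoted on here. The function names are made up for the example.

```python
def megabytes_per_hour(bitrate_kbps):
    """Approximate size of one hour of video at a constant bitrate (decimal MB)."""
    bits_per_hour = bitrate_kbps * 1000 * 3600
    return bits_per_hour / 8 / 1_000_000


def fits_within(bitrate_kbps, cap_mbps):
    """True if a stream at bitrate_kbps fits under a per-connection cap in Mbps."""
    return bitrate_kbps <= cap_mbps * 1000


for label, kbps in [("minimum", 860), ("recommended", 2100)]:
    print(label, megabytes_per_hour(kbps), "MB/hour,",
          "fits old 1Mbps cap:", fits_within(kbps, 1),
          "fits new 3Mbps cap:", fits_within(kbps, 3))
# minimum 387.0 MB/hour, fits old 1Mbps cap: True fits new 3Mbps cap: True
# recommended 945.0 MB/hour, fits old 1Mbps cap: False fits new 3Mbps cap: True
```

Under these assumptions, the 860kbps minimum would have squeezed under the old 1Mbps cap, while the 2.1Mbps recommendation only becomes practical with the 3Mbps upgrade, at roughly 945 MB of storage per hour of video.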

Library of Congress Announces National Recording Preservation Plan
Feb 21st, 2013 by Isaiah Beard

A Peirce 55-B dictation wire recorder from 1945. Courtesy of Stanford University Libraries. Source: Wikipedia.


A good portion of our nation’s heritage has been immortalized in sound recordings.  From the late 19th century to the present, sound recordings have been used to capture music, speeches, historic events, and the oral histories of people who have lived through important events in our nation’s history.

As with many electronic and mechanical recordings, however, this vast heritage is in danger.  In an effort to save what we can of these timeless recordings, the Library of Congress has put together a blueprint in the form of a National Preservation Plan.  This plan is the result of nearly a decade of work that was mandated by Congress as part of the National Recording Preservation Act of 2000.

As the Library of Congress puts it in its press release:

Experts estimate that more than half of the titles recorded on cylinder records—the dominant format used by the U.S. recording industry during its first 23 years—have not survived. The archive of one of radio’s leading networks is lost. A fire at the storage facility of a principal record company ruined an unknown number of master recordings of both owned and leased materials. The whereabouts of a wire recording made by the crew members of the Enola Gay from inside the plane as the atom bomb was dropped on Hiroshima are unknown. Many key recordings made by George Gershwin no longer survive. Recordings by Frank Sinatra, Judy Garland, and other top recording artists have been lost. Personal collections belonging to recording artists were destroyed in Hurricanes Katrina and Sandy.

The National Preservation Plan for sound recordings is available as a PDF file. In it, multiple recommendations are made, including:

  • Create a publicly accessible national directory of institutional, corporate and private recorded-sound collections and an authoritative national discography that details the production of recordings and the location of preservation copies in public institutions;
  • Develop a coordinated national collections policy for sound recordings, including a strategy to collect, catalog and preserve locally produced recordings, radio broadcast content and neglected and emerging audio formats and genres;
  • Establish university-based degree programs in audio archiving and preservation and continuing education programs for practicing audio engineers, archivists, curators and librarians;
  • Construct environmentally controlled storage facilities to provide optimal conditions for long-term preservation;
  • Establish an Audio-Preservation Resource Directory website to house a basic audio-preservation handbook, collections appraisal guidelines, metadata standards and other resources and best practices;
  • Establish best practices for creating and preserving born-digital audio files;
  • Apply federal copyright law to sound recordings created before February 15, 1972;
  • Develop a basic licensing agreement to enable on-demand secure streaming by libraries and archives of out-of-print recordings;
  • Organize an advisory committee of industry executives and heads of archives to address recorded sound preservation and access issues that require public-private cooperation for resolution.


Rights: Creative Commons License