One of the requirements of any proper data system is having the data properly stored, archived, and readily available. This has led many of us to data warehousing solutions with data stored on various types of media in various locations.
No single location can be considered truly secure. If the data are really important, the storage facilities must not only be geographically separated, but also in dissimilar locations. Why run the risk of using storage facilities in high-profile targets like New York and San Francisco if you had the option of a data warehouse in Eagle Butte, South Dakota? (Where? you ask. That's the point.)
Overkill? Maybe, until you consider that we've just completed a century that spawned almost continuous warfare, not to mention the advent of nuclear weapons, electromagnetic pulses, ethnic and social genocide, and two world wars. And this century has begun to march on a similar path of violence and destruction. Better opt for geographical diversity.
Ideally, your data will be stored on a variety of media simply because various media types have different optimal uses and longevities (see below). And ideally it will be in a format that can be readily accessed and interpreted, both now and in the future. The digital information age is barely 50 years old, and already data has become unusable for a variety of reasons. Simply storing something digitally is no guarantee of being able to access it in the future.
Consider some of the challenges that face us as we protect our digital assets.
Why Digital?
I find it interesting that we have rushed headlong to digitize everything related to information exchange. Yet the world itself is not digital: the universe is not a series of discrete events that, taken as a whole, make up all that we know and imagine. It is a time-space continuum, much more like the analog media we are rapidly destroying. (Yes, yes, quantum mechanics says that things do happen in discrete amounts, or quanta, but that's on the subatomic scale and doesn't directly apply on the Newtonian level we live on.)
I have yet to hear any music from a digital system that can approach the rich clarity and brilliance of music I used to listen to on high-fidelity vacuum tube systems using an analog vinyl recording. Is this just the failed (and enhanced) memories of a middle-aged man? Perhaps, but a musician friend of mine tells me that many instruments (piano, guitar) cannot be reasonably synthesized by even the most sophisticated digital instruments available; there are simply too many subtle indefinable variations that the artist imparts to the music through his instrument. In fact, some recent articles say that, contrary to popular belief, some amount of noise is necessary to enhance the quality of music.
How did we get to digital in the first place? Through our ignorance. Leibniz (that's Baron Gottfried Wilhelm von Leibniz, 1646-1716) wanted to calculate the area under a curve. He reasoned that by filling the space with measurable objects he could approximate the area. By reducing the size of those measurable objects (and increasing their number) he was able to approximate it ever more closely. The calculus was born.
More interestingly, digitization was born. The human mind (and thus machines created by that mind) seems able to process and understand discrete bits of data better than it can comprehend a continuum. The exceptional brilliance of a man like Albert Einstein was not his ability to mathematically describe a multi-dimensional universe; it was his ability to envision that universe. All mathematics is simply a logical system built from certain postulates to aid the human mind in interpreting and working in the world as we know it. There is nothing universal about the truths of mathematics. Likewise, there is nothing inherently better or universal about digital.
I wouldn't be at all surprised if future systems of data processing rely on more of an analog model.
When Digital Isn't
When we think of digitizing something simple like text, we naturally think in terms of the binary number system. Digital data is, after all, just a stream of 1s and 0s (or ons and offs). So we convert our base 10 (decimal) system to base 2 (binary), and end up with 0010 representing the number 2 and 1111 representing the number 15. By giving every letter, number, and punctuation mark a numerical value, we can represent each with a binary value: a series of ons and offs. Any text can thus be represented using a binary stream of data.
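The conversion described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, and the A=1, B=2 letter scheme is a toy assumption, not a real character encoding.

```python
def to_binary(value: int, width: int = 4) -> str:
    """Return `value` as a binary string, zero-padded to at least `width` digits."""
    return format(value, f"0{width}b")


def text_to_bits(text: str) -> str:
    """Encode letters with a naive A=1, B=2, ... scheme, 5 bits per letter.

    Hypothetical toy scheme: it only makes sense for the letters A-Z.
    """
    return " ".join(to_binary(ord(ch) - ord("A") + 1, 5) for ch in text.upper())


print(to_binary(2))         # 0010
print(to_binary(15))        # 1111
print(text_to_bits("CAB"))  # 00011 00001 00010
```

Note that the reader of such a stream has to know both the letter numbering and the 5-bit width before the bits mean anything.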
Sounds simple, doesn't it? It's just a binary version of A=1, B=2, and so on. But our digital systems don't actually work that way. Systems in use now use hexadecimal (base 16) to represent binary data. Using the ASCII system, a lowercase "a" is represented by the hexadecimal number 61, a somewhat arbitrary number.
My point is, there is no universal way of digitizing even the simplest data; it's all open to interpretation. Having access to a digital data stream does not guarantee that it can be correctly deciphered. Egyptian hieroglyphics were pretty much a mystery until the discovery in 1799 of the Rosetta Stone, which included parallel texts in Greek, Egyptian demotic, and hieroglyphs, thus providing a key to their decipherment. But even with that key, most translations of hieroglyphics are guesses and approximations by consensus.
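To make the point concrete, here is a small Python sketch in which the very same byte, hexadecimal 61, reads as a different character depending on which code table the reader assumes. ASCII and the EBCDIC code page 037 are used purely as two examples of real, incompatible conventions.

```python
raw = bytes([0x61])  # the single byte with hexadecimal value 61

as_ascii = raw.decode("ascii")   # ASCII maps 0x61 to lowercase "a"
as_ebcdic = raw.decode("cp037")  # EBCDIC code page 037 maps 0x61 to "/"

print(as_ascii, as_ebcdic)

# Going the other way, the same letter becomes different bytes.
print("a".encode("ascii").hex())  # 61
print("a".encode("cp037").hex())  # 81
```

Neither reading is wrong; each is correct under its own convention, which is exactly why the convention has to travel with the data.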
Approximations are not good enough when the data are critical. Archived data must be stored with a key that will allow proper interpretation of the data. Just because something is digital (is stored in 1s and 0s) doesn't mean it will be read properly by different operating systems, different chipsets, and different computers.
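One modest way to store such a key is a small plain-text manifest kept alongside the archived bytes. The Python sketch below is an assumption about what that might look like, not an established standard: the function name and every field name are hypothetical.

```python
import hashlib
import json


def build_manifest(payload: bytes, encoding: str, description: str) -> str:
    """Return a JSON "key" describing how to interpret `payload`.

    All field names here are hypothetical, chosen only for this sketch.
    """
    manifest = {
        "description": description,
        "encoding": encoding,          # how the bytes map to characters
        "length_bytes": len(payload),  # lets a reader detect truncation
        "sha256": hashlib.sha256(payload).hexdigest(),  # detects corruption
    }
    return json.dumps(manifest, indent=2, sort_keys=True)


payload = "Policy 1234: renewed".encode("utf-8")
manifest = json.loads(build_manifest(payload, "utf-8", "policy note"))

# A later reader checks the bytes, then decodes them as the manifest says.
assert hashlib.sha256(payload).hexdigest() == manifest["sha256"]
print(payload.decode(manifest["encoding"]))
```

The manifest itself is still digital text, of course, so it only pushes the interpretation problem down one level; it is a mitigation, not a cure.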
Is this paranoia? After all, Word XP can read old WordStar files, right? But wait 75 years and try to read a 2002-era ASCII text file on your PDA neural implant.
Future Shocks
That being said, knowing what kind of digital format you're dealing with is only a small part of the problem. You will also need to have a usable version of the appropriate software to read the data. I have probably used over a dozen word processors and as many databases since I first started using PCs. I strongly suspect that my Ami Pro documents are rapidly becoming archaic. Will software still exist in a hundred years that will read VisiCalc files? I don't think so. If we really want our data to be usable for the long haul we have two paths: 1) store not only the data, but also the software to use that data and the hardware to use that software, and include instructions on how to provide power for and use the hardware; or 2) store the data in a format that will be universally readable by data processing machines now and in the future.
Choice number one is, well, stupid. That's the point. The second option is probably doable, but not with the present mindset of hardware and software vendors and their proprietary (and incompatible) data formats. Export from a SQL database to a spreadsheet then into Access and you'll get a bunch of errors, never mind going from dBase to Oracle.
We are currently working on a project that uses images that were created using DOS-based presentation software, but nothing we had could load the images for editing. Fortunately, we were able to find a set of diskettes of the original software. (Good thing they weren't 5.25-inch floppies!) Those files (digital files) were created only six years ago. What will we be able to do with them 60 years from now? On the other hand, if those images were analog (printed, say, on archive-quality paper) they would certainly be usable now and in half a century.
Thinking Ahead
If the digital revolution is going to have a lasting impact on the way we do business, it's essential that we use standard ways to represent data. XML is the (aging) poster child for interoperability, and for some types of data may be your best bet for long-term digital storage. That isn't because it's some sort of universal language, but because it's self-describing and uses current universal data types (ASCII text, Unicode, etc.). So for the immediate future an XML-based format may be your best bet for storing text-based data such as documents and database information.
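What "self-describing" buys you can be seen in a short Python sketch using the standard library's XML module. The record, its element names, and its attribute names are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Build a small record; element and attribute names are hypothetical.
record = ET.Element("policy", attrib={"id": "1234"})
ET.SubElement(record, "holder").text = "Zoë Example"
ET.SubElement(record, "premium", attrib={"currency": "USD"}).text = "850.00"

# Serialize with an explicit encoding declaration in the XML header.
xml_bytes = ET.tostring(record, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))

# A reader that knows nothing about the writer still recovers labeled fields.
parsed = ET.fromstring(xml_bytes)
fields = {child.tag: child.text for child in parsed}
print(fields)
```

The file states its own character encoding and labels every value, so a future reader needs an XML parser and some common sense rather than the original application.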
Unfortunately, businesses depend on many other types of digital data. One of the world's leading financial services companies that started life as an insurance company prides itself on being paperless. That means it has terabytes of data in some sort of binary format, probably scanned. Those images aren't going to translate into XML. We, the business techies, must insist that our vendors cooperate in helping develop standards for storage and retrieval of all types of binary data. While it's possible to encode binary data into a text format and thus make it accessible in XML, it is not a very good solution. The restored binary data will still require some sort of proprietary system to render it useful.
If the above weren't perplexing enough, we need to contend with the actual media on which we store our data. Magnetic tape has a useful lifetime of at most about five years if stored in a controlled environment. Magnetic disks (e.g., hard disks) can probably be trusted for five to 10 years. Optical media (e.g., CDs and DVDs) may have a lifetime of 30 years. So what? These are all very short periods of time. The Dead Sea Scrolls, as analog as you can get, were discovered in a clay jar and were still readable after 2000 years. Do you honestly believe that any of the digital data that you are now working so hard to protect will still be readable in 4002?
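Encoding binary data as text usually means something like Base64. The Python sketch below assumes Base64 and uses the 8-byte PNG file signature as a stand-in for a real scan; the element and attribute names are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for a scanned image: the 8-byte signature that begins every PNG file.
scan = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Wrap the bytes as Base64 text inside an XML element (names are hypothetical).
doc = ET.Element("scan", attrib={"mediaType": "image/png", "encoding": "base64"})
doc.text = base64.b64encode(scan).decode("ascii")
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)

# Restoring the bytes is easy, but they are still just image data:
# some software must understand that format to display the picture.
restored = base64.b64decode(ET.fromstring(xml_text).text)
assert restored == scan
```

The round trip works, yet nothing has been gained in readability: the XML wrapper is self-describing, while the payload inside it remains as opaque as it was before.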
Do You Care? So What?
Maybe that's the real issue. Do we really need to protect our digital assets for millennia? Perhaps the simple process of periodically migrating data to new systems is enough. If we're doing our jobs, we're already keeping multiple backup sets of our data in different, protected locations. We're already doing local daily backups. We have ready access to systems that understand the data we have now.
Chances are that your customer data will not be accessible 200 years from now, and chances are even better that it won't matter. We are information professionals dealing with insurance and financial data that has a fairly limited usefulness in the business world. So from a business perspective, we probably don't need to be concerned about long-term storage of digital data.
Further, it's not as if we'll suddenly be transported to the far future (or even the near future) and be forced to access old data with the machines of that future time. The data we use are constantly updated and upgraded as are the systems used to access them. Certainly, we occasionally bump into an old file that we never thought we'd need and that's stored in an old format, but that's by far the exception. For the most part, as we upgrade the hardware and software we use to access data, we also upgrade the data themselves. Sometimes it's a conscious effort of conversion, other times it's part of the transfer process.
But the end result is the same: Over time, our digital documents are converted to whatever is the format du jour. The process is a slow one, but then again, it happens to be the same speed as life itself. So for the most part, as long as we keep up with our document conversions as we perform any systems conversions, digital storage won't be a problem.
However, on a larger scale we do need to be concerned. Digital assets are not just business data. They encompass works of art as well as historical data. If our society is going to bet its future on digital technology, we need to find uniform, standard ways to store that data. Most modern businesses don't have the luxury of maintaining great research centers like the old Bell Labs. We barely have enough resources to keep the information flowing properly through our own operations. Yet we all need to think seriously about aiding in the creation of digital standards if we have any hope of saving today's knowledge for the future.
Time Lapse
How long will various media last? Under typical conditions, look at the low end of the range below. Stored properly (in temperature- and humidity-controlled rooms), look to the high end. Some products, obviously, have a much broader range.
And "last" means different things for digital and analog media. The Dead Sea Scrolls are still readable, although they are in bad condition; that's an advantage of an analog medium. A digital medium similarly damaged would be virtually useless.
Digital
Compact disks and DVDs: 10-100 years
Magnetic tape (including VHS): 10-30 years
Digital linear tape (DLT): 10-300 years
Analog
Ink-jet photographs on standard paper: 1-6 years
Newspaper and other standard paper: 10-20 years
Standard (silver halide) photographs: 15-20 years (certain Fuji papers can last up to 60 years)
Photographic slides and negatives: 100 years
Ink-jet photographs on archival paper: 2-200 years (Epson claims a 200-year life with its best inks and paper)
Acid-free or archival paper (e.g., books): Up to 500 years
Microfilm and microfiche: 10-500 years
Stone tablets: 2000+ years