Thursday, November 12, 2009

ATR: To the Defense of Archives (and the English Language)

I'm throwing down the gauntlet. I'm tired of people misusing the term "archival" and now I see that the term "archivist" is going to start to be abused. I've engaged in a periodic running battle with Benjamin Wright. He's a lawyer; I'm not. I probably shouldn't single him out here, but unfortunately, he pops up on my Google Reader blogs and his usage gets in my face. Small apology to Mr. Wright for picking on him, because there are undoubtedly many sinners in this space.

Someone tell me when this term became a noun. Checking, "archival" is an adjective. However, I keep seeing the term used as a noun, generally in combination with "email", as in "email archival". What the heck is that? Arguably, you could say "archival email" and that would (sort of) have meaning, but "email archival" has no meaning in the English language. You could also say "email archive", but I'd rather that you didn't.

In this post, Mr. Wright says,
"Many lawyers and archivists recommend, for example, that email systems be configured to purge email, by default, after only . . . 15 days or 30 days."
ARCHIVISTS say this? Whoa! At least in North America, an archivist is generally assigned to be the custodian of historical records. In other parts of the world, an archivist is the de facto records manager for an organization. But I would suggest that at least this records manager (and fallen-away archivist) would never suggest that all email be deleted in short order. Email is a means of transport and storage for information. Paper is also a means of transport and storage for information. The content value of the information drives the retention period, not the means of transport and storage. If you are going to suggest that "email" has a particular retention period, then you also have to set a retention period for "paper".

Now records managers have been known to be accused of being Conan the Shredder from time to time by our colleagues who toil in historical archives. If anything, when conflicts arise regarding destruction of records in the information professions, it's archivists thinking records managers are too quick to destroy records and records managers thinking archivists are too slow.

So my thinking is that Mr. Wright is considering IT storage management professionals to be "archivists" (these are the same folks who have to manage all those email "archivals", I guess, so they must be "archivists"). Let's put a stop to this nonsense right now. We have enough difficulty explaining the difference between "backups" and "archives" to the IT folks. And when you insert a "real" archivist into the mix (because your organization has an historical archive), people get really confused.

I would suggest the following: the storage location for information is a "repository", regardless of the physical form of the information. So when you retain email, it is retained in an email repository. When you retain hard copy, you store it in a hard copy repository. The discipline of managing the repository is "storage management". The discipline of setting the retention period for an organization's records is "records management" or "retention management" or even some flavor of "governance".

If we can agree on those terms, let's look at the issue raised by Mr. Wright. Mr. Wright is rapidly leaning into the camp that suggests that email should be retained in an "immutable archive" (there's that term again, although not used by Mr. Wright) -- which means that you don't delete any email message once it is created or received. His logic in the article seems to indicate that only bad guys delete email because they must have something to hide.
"For good business people, lengthy, complete email and text message records -- including all the metadata -- are a friend.

"The only people who have incentive to destroy their electronic message records quickly are mobsters, the Mafia. They of course are motivated to toss all the smoking guns and dead bodies into the river."

I can understand where this logic comes from, but I would suggest that if you are going to retain all electronic messages, you have to retain all paper messages as well. And that means that you have to retain EVERYTHING that shows up in paper form -- and, to ensure that you are retaining metadata for the paper, you have to keep the envelopes too. So when I get an advertisement in the mail, I'm going to have to keep it, regardless of the fact that in the electronic world, that same content might have been deleted automatically by a spam filter. Those reusable Interoffice envelopes? Use them once and keep them to show when the document was sent. That FedEx box? Better keep it to show the routing stickers and codes used by FedEx to deliver the documents.

Clearly that's nonsense. But it is effectively what is being suggested in the electronic world. And I don't buy the statement, "Well, you can easily tell what is junk and what is a record when it is a paper document." So we're suggesting that human beings are capable of discerning non-record from record in the paper world, but they can't do this in the electronic world? But hard copy records take up too much space and they are costly to retain! Try that one with electronic records. There is a double standard here that simply makes no sense to me.

I would suggest that proper management of information starts with much better training than most organizations are prepared to deliver today. And that training has to be followed with auditing of business practices. If you can train people well, create a simple system to ensure that record-worthy information is retained in appropriate repositories, and validate that the rules are being followed, then you will have a defensible system of record. That is neither easy nor inexpensive. But it is necessary before we start to see case law and precedent built up that demands two systems of records in an organization -- paper records and electronic records.

I am rapidly becoming an advocate of developing systems that separate information into three buckets: first, non-records; second, transitory records; third, declared records.

Non-Records:
Non-records do not have to be retained and, if not disposed of immediately, should be fully discarded within a relatively brief period of time (say, 60 days). So you build into your repository a means to discard the garbage within 60 days of receipt or creation.

Transitory Records:
Transitory records are maintained for a period of time to determine if they should become a declared record or discarded as a non-record. This transition period should be fairly short, perhaps no more than two years. Government retention schedules have most often defined transitory records and often set even shorter retention periods.

Declared Records:
Declared records are assigned a retention period tied to a record series or category on a retention schedule. The repository for declared records will generally have greater controls over deletion and changes to the records as well as provide mechanisms to establish legal holds (arguably, any repository for information -- including those for non-records and transitory records -- should have a mechanism to ensure that information subject to a legal hold is retained until released from hold).
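
For readers who think in code, here is a rough sketch of how a repository might apply the three buckets. It is only an illustration, not a description of any particular product: the names (Bucket, Item, disposal_date, may_discard) are made up for this example, the 60-day and two-year limits are simply the figures suggested above, and I am assuming that a transitory item nobody has declared by the end of its window gets discarded.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Bucket(Enum):
    NON_RECORD = "non-record"
    TRANSITORY = "transitory"
    DECLARED = "declared"


# Illustrative limits only, taken from the figures suggested in the post.
NON_RECORD_LIMIT = timedelta(days=60)
TRANSITORY_LIMIT = timedelta(days=730)  # roughly two years


@dataclass
class Item:
    created: date
    bucket: Bucket
    # Retention period from the retention schedule; used for declared records only.
    retention: Optional[timedelta] = None
    legal_hold: bool = False


def disposal_date(item: Item) -> Optional[date]:
    """Earliest date the item may be discarded, or None if it must be kept for now."""
    if item.legal_hold:
        return None  # a legal hold overrides every bucket
    if item.bucket is Bucket.NON_RECORD:
        return item.created + NON_RECORD_LIMIT
    if item.bucket is Bucket.TRANSITORY:
        return item.created + TRANSITORY_LIMIT
    if item.retention is None:
        return None  # declared but not yet tied to a schedule entry: keep
    return item.created + item.retention


def may_discard(item: Item, today: date) -> bool:
    """True once the disposal date has arrived and no hold applies."""
    due = disposal_date(item)
    return due is not None and today >= due
```

Note that the medium never appears in the sketch. An email and a paper memo in the same bucket get the same answer, and the legal hold check comes first so that it applies to all three buckets.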

To circle back to where I started, I would suggest that all of us who support the management and retention of information come to agreement on more precise use of terms that describe what is being retained and where it is retained, particularly when those terms have the potential to be seen as confusing or conflicting between information professionals.