Tuesday, September 15, 2009

ATR: Records Management as Science?

So the other day, Russell James asks, "Why Not 'Records Science'?" I read Russell's blog from time to time and he does think the deep thoughts. Russell tends to live in the archivist's world more so than the records manager's world, but he's always thinking. It's not a bad thing. He and I both subscribe to Jac Treanor's statement that records management and archives are two sides of the same coin, so he can't be a bad guy.

Problem is, we may have missed the science bus some time ago. One of the biggest challenges to calling what we do a "science" is that most answers to records management questions start with, "It depends..." Yes, the profession is getting more and more standardized, such as is possible when the types of records are nearly infinite. And just when we think we have one system of records nailed down, something else pops up. ARMA posits the "Generally Accepted Records Principles" and a bunch of records managers scream, "I don't accept those!"

This all takes me back to my (still in progress) read of "Everything is Miscellaneous". Now could ARMA have chosen a more antithetical keynote speaker at Conference? We records managers want order. We create order from disorder. That's what we share in common with librarians and archivists, and to some extent, historians. But our order is rarely consistent or standard. Librarians rely on Mr. Dewey and the Library of Congress. Archivists describe to the rigor of MARC formatting. We just hope that people don't call paid invoices "pinks" or "paid bills". We rail at IT-folks who want to declare email a record type. We pale at the thought of deciding, arbitrarily, "If it hasn't been touched in two years, delete it!" We run interference for lawyers and try to thrash the ones who demand that we implement the latest vendor's "email archival" (whatever the heck that usage means -- I'm still trying to figure out if "archival" is used as a noun). We debate at length if a Tweet is a record and if so, how and where do we retain it. Where once we had thousands of line items on our retention schedules and shared those statistics with great pride and no little bit of chest-thumping, we now try to whittle the retention schedule down to 100 or so buckets inside of a half-dozen "big buckets" (and argue about what those really are). Our science is in constant flux and under continuous attack.

And this in a constant state of cost-cutting, insane advances in technology, outsourcing to the lowest bidder, and the anarchy of the records creator. Years ago, I wanted a concession at the ARMA Conference selling t-shirts that said, "Image it all, let god sort it out." Today, we're nearly better off saying, "Archive it all, let god and the lawyers sort it out." We exist in a world demanding governmental and corporate "transparency", where transparency comes from laws and regulations demanding retention of records. Court systems demand accounting for records and explanations for records not produced. But the courts live in the records and technology world of ten years ago. With each passing day, the anarchy of the records creators reigns supreme. Today I learned about an engineering initiative at Google called the "Data Liberation Front", where the mission statement is:

Users own the data they store in any of Google's products. Our team's goal is to give users greater control by making it easier for them to move data in and out.

Now one of the challenges for these guys is understanding who the customer is. While most of Google's customers are individuals, Google also seeks the enterprise / corporate customer (and the link to this was found on Google's Enterprise Blog). Unfortunately, the last thing that you want in your organization's scientific approach to records management is even more user optionality. If you have science, you demand repeatability. An infinite number of independent users with an infinite number of unique records means an infinite number of outcomes. Google's sincere approach to the corporate customer is (and I have heard it almost verbatim from two different Google employees in different venues), "We're a search company. Just keep everything and we'll help you find it when you need it." No need to worry about metadata. No need to get rid of pesky personal emails or spam that sneaks through. No need for deduplication. Don't bother classifying, just keep it all. The term they repeat is "an immutable archive". Uh huh. (Now I could rant about Google's naivete for a couple hundred more words, but I'll stop now.) They are coming around to a more "corporate" way of thinking, but they aren't there yet.

So let me return to Russell's premise. Russell wants us (and our kindred spirits in archives management) to refer to ourselves as "records scientists". Unfortunately, I'm afraid that if we go down that path, the only white coats that we'll wear will have extra long sleeves that tie around back.

Let's take a look at Pat's definition of information and records.... Stuff gets created, lots of it. We like to call this stuff "information". We records managers create governance policies that separate the information into categories of "records" and "non-records". We care about records, so much so that we divide the records into "series", although sometimes we like to call those series "buckets". Because we give each record an assignment into a series, we can easily find what we need by searching for records that are part of the same series. We make sure that records that are stored are associated with metadata that identifies and categorizes the records by series and dates and lots of other bits of data. To each series, we assign a lifespan called a "retention period". At the end of the lifespan, we expect that the records will undergo "disposition". If found worthy, some dispositioned records find their way to the Historical Archives, where they are carefully cataloged and preserved for all eternity (or until the company has to sell them on eBay). Everything else gets deleted, wiped, shredded, pulverized, incinerated, or pulped.

That makes for a nice flow chart, a pretty PowerPoint slide, wonderful documentation, and total chaos in the real world. Why? Because no layperson can understand any step in that process to the same degree that we do. And they really don't care. They create stuff all day. Some of the stuff belongs to them; some of the stuff belongs to their employer, but the creator doesn't understand that, either. It's all "their stuff". They know that sometimes their stuff is needed and sometimes no one ever cares about it again. But it really becomes important only when they can't find their stuff. So they create their own means of organization, in a manner that makes perfect sense to them. And try as we may, our equally arbitrary means of creating taxonomies and series and buckets simply doesn't resonate.
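
For what it's worth, the tidy flow-chart version of that lifecycle really does fit in a few lines of code, which is sort of the point. Here is a minimal Python sketch. Everything in it is made up for illustration: the series names, the retention periods, the archive flags, and the `Item` and `disposition` names are all hypothetical, not anyone's actual schedule or system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical retention schedule: series -> (retention in years, archive at disposition?)
# The numbers are invented for this example, not taken from any real schedule.
RETENTION_SCHEDULE = {
    "paid-invoices": (7, False),
    "board-minutes": (10, True),
}


@dataclass
class Item:
    title: str
    created: date
    series: Optional[str] = None  # None means it was classified as a "non-record"


def disposition(item: Item, today: date) -> str:
    """Return what the flow chart says should happen to this item today."""
    if item.series is None:
        return "non-record"
    years, archive = RETENTION_SCHEDULE[item.series]
    try:
        expires = item.created.replace(year=item.created.year + years)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        expires = item.created.replace(year=item.created.year + years, day=28)
    if today < expires:
        return "retain"
    return "archive" if archive else "destroy"


today = date(2009, 9, 15)
print(disposition(Item("Invoice 1234", date(2000, 1, 1), "paid-invoices"), today))
print(disposition(Item("Lunch on Friday?", date(2009, 9, 1)), today))
```

Notice what the sketch quietly assumes: somebody already put every item into the right series. That one field is where the human element lives, and no amount of code fills it in correctly on its own.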

So where does that leave us? The basic problem, in my opinion, is that we have spent far too much time dealing with records in a hands-on manner. Let's face it: we like to organize all that stuff. We like the comfort of policy and structure and rules. We like to pretend that it is all so very scientific and completely repeatable. So some of us are drawn to a description of records as science. But the problem is that we spend entirely too much time doing the records management equivalent of debating whether or not Pluto is a major planet or a minor planet. It all seems so very important, but it really isn't critical to the big picture. And the problem is that while we focus on the little bits, we miss the bigger issues.

At the Day Job, I don't do as much records management as I used to. But I apply the essential "science" every day. I'm finding that there are many more issues to worry about and that the principles of records management that I learned along the way serve me well on many tasks that I never imagined I'd be doing. Let's look at those...

Litigation and Investigation Support: At the end of the day, it doesn't matter if we spend years organizing stuff into series. If it exists, it is relevant to the litigation, and it's discovered, it's called "evidence". At that point our job will be to explain where it came from, whether or not the court should trust it, and whether or not someone found all of the relevant stuff for the matter. It's real helpful to know where to look and to have mapped the locations of all the stuff in the organization so we can say that we've looked in all the right places.

Data Loss Prevention: This is a biggie. It encompasses privacy and security. It means that someone knows where the "important" stuff (i.e., the stuff that we can get sued for losing) lives and takes steps to protect that stuff and detect when it is "liberated" (to use the Google term) inappropriately.

New Technology Assessment: When someone decides to officially "Tweet", or a bunch of engineers in one part of the world want to stream a live webcam to their colleagues in another part of the world, what do we care about? When the business takes its stuff to the "Cloud", what requirements do we need to draw up?

At the end of the day, my job consists of adhering to two directives:
  1. Prevention of loss of data of concern.
  2. Minimizing disruption to the business.
"But there's so much more in what you do!" Nope. If you focus on records management, Directive 1 means that you can find the stuff that you need (if they need it, it is data of concern) and people or automated processes don't discard stuff that is required to be retained. Directive 2 means that we do whatever is possible to ensure that Directive 1 doesn't cause the business (that would be the people who make / deliver products and services and sell them) to spend a lot of time not being the business. Granted, I work in our company's Asset Protection group, but I don't think that these concepts are really foreign to anyone. Why do we classify records? So people can find them again. Why do we care about retention periods? Because we don't want records to be dispositioned too soon. Why don't people always follow our rules? Because we've somehow gotten in the way of them doing what they were hired to do.

So if it makes you feel better to call yourself a "records scientist" who practices the "science of records", be my guest. If your identity gets wrapped up in your business card, so be it. But don't be deluded into thinking that systems of records can be reduced to mathematical formulae and lines of computer code. There's a human element involved and that element doesn't appear on the Periodic Table, but sure makes a mess when it gets involved in records management. Our job is to find ways to design and govern systems that limit the chaos introduced by the human element.


David said...

Very interesting read. I have to admit that I could never think of Records Management as a science. To me, it is an art. And, as with all art, you can't really define what is right and wrong, but you know what works for you and what doesn't.

divya said...

Nice post! Records management is the systematic and efficient control of the creation, storage, and maintenance of structured or unstructured information. It is a way to make efficient use of human resources for record-keeping activities. Security of information and data is also a key factor in records management.