November 23, 2009

Information Obfuscation: an Update on the Missing White House Emails

In April 2007, CREW issued a stunning report revealing that the Bush administration had lost over five million emails in a two-year period and, upon discovering the loss in late 2005, failed to take any steps to restore the missing email or to put in place an appropriate electronic record-keeping system.

The Bush administration responded with repeated denials, a stance that continued in response to lawsuits filed by CREW and the National Security Archive against the Executive Office of the President (EOP) and the National Archives and Records Administration (NARA). In January 2009, the White House asked the court to dismiss those lawsuits based on its claim to have taken all appropriate steps to restore the missing email.

Recently, as part of a negotiated standstill in the litigation, the EOP began releasing thousands of pages of documents that call into question the Bush administration’s response to the missing email problem.1 These EOP documents directly disprove public comments and statements the administration made in court filings, and verify the claims CREW has been making for the past two and a half years. Additionally, these documents provide disturbing examples of various ways in which the Bush administration neglected or undermined the preservation of email. In one instance, the White House’s chief information officer cancelled an electronic records management system only a few months after the administration had called its implementation “our number one priority.” In another instance, the White House Counsel was given effective “veto” power to override recommendations about which dates to approve for email restoration.

With this posting CREW begins the process of highlighting and explaining the significance of key documents we have discovered as part of our ongoing review, a process we will update with future posts as we analyze further productions. The key documents outlined here help explain why the restoration of email from the Bush years remains unfinished – some ten months after the EOP claimed to have completed this process – and confirm the failure of the Bush administration to act in the face of knowledge that its ad hoc system for preserving emails did not work. Thousands of pages of the newly released documents are designated as “sensitive,” meaning they are not subject to public disclosure. Many of the sensitive documents describe Bush-era procedures for storing electronic records that are no longer in use and plans for email preservation systems never implemented by the White House.

What Newly Released Documents Reveal

1. The White House was aware of serious problems with preserving and searching for Microsoft email long before it discovered the missing email in October 2005.

When the Office of Administration (OA) had problems locating emails in response to a January 22, 2004 subpoena from Special Counsel Patrick J. Fitzgerald – who was investigating the leak of Valerie Plame Wilson’s covert CIA identity – it called in Microsoft for assistance. Microsoft wrote a “post-mortem” analysis in February 2004 that described several serious problems with preserving and searching for emails.

First, the White House was unable to preserve any email from Microsoft Outlook/Exchange in the electronic record-keeping system then in place – the Automated Records Management System (ARMS) – which was set up by the Clinton Administration to preserve Lotus Notes emails. “There is no current mechanism to transfer Exchange email into ARMS,” and OA’s plan for doing so “is not yet a stable and consistent solution” and “fails to consistently” move data into ARMS, Microsoft stated. “This is a critical issue because each time there is a subpoena EOP is at risk of not meeting the deadline” (emphasis added). Indeed, in response to a subsequent subpoena from Special Counsel Fitzgerald, OA could locate only whatever email still resided in individual mailboxes.

Second, because there was no way to preserve email in ARMS that was created by Microsoft/Exchange, the White House resorted to placing a copy of every such email sent or received in a “journal” mailbox, which it later converted into PST files for archiving. But when Microsoft looked for those PST files to locate documents responsive to the January 22 subpoena, it apparently could not find them. The reason? In October 2003, after the White House had received an earlier subpoena from Special Counsel Fitzgerald, OA stopped moving the journal mailboxes into PST files. This decision left all journaled email to accumulate in the journal mailboxes for some unknown length of time. These mailboxes were very large – by February 2004 there were 11 journal mailboxes on four servers – and searching them “does not provide a complete set of results,” Microsoft concluded (emphasis added).2

Third, the server drives for the EOP’s Record Management (EOPRM) “crashed” on February 5, 2004, the day before the EOP’s response to the January 22 subpoena was due. Microsoft was able to recover the database in which search results apparently were kept, but it had to entirely rebuild the EOPRM system on a different server. It is unknown if this crash had any impact on OA’s search for materials responsive to the Fitzgerald subpoena.

(The Bates numbers for the pages of documents referred to in this section are: OAP00000083, OAP00000087, OAP00000088 and OAP00000089.)

2. The White House inexplicably cancelled a comprehensive electronic records management system it had spent millions of dollars to develop – Electronic Communication Records Management System (ECRMS) – just months after calling this system OA’s “number one priority.”

With only an ad-hoc method for email preservation in place, the Bush White House spent millions of dollars developing a comprehensive electronic records management system called ECRMS. Inexplicably, in the fall of 2006, White House Chief Information Officer Theresa Payton cancelled ECRMS.

Just a few months earlier, a senior official in OA’s Office of the Chief Information Officer (OCIO) called ECRMS “the most important system that we have implemented in a long time, we need to get it right. This is our number one priority” (emphasis added). By September 2006, OA was planning on extending the software support contract for the system through November 2007. Yet, only a few months later, Ms. Payton cancelled ECRMS even though the White House still did not have a robust email preservation system in place.

Indeed, although OA had made some improvements in preserving email after discovering millions of missing email in late 2005, an audit in 2006 found many procedures still “entail manually intensive tasks” that were “applied inconsistently,” at least during contractor and staff turnovers. In August 2007, long after cancelling ECRMS, OA was only just beginning to search for a new vendor for the “Exchange-Inventory Management” contract, OA’s latest proposed solution to its email preservation problems.

(The Bates numbers for the pages of documents referred to in this section are: OAP00000720, OAP00024241, OAP00005033 and OAP00006521.)

3. Internal White House documents confirm the key facts in CREW’s April 2007 report on the missing email problems.

In April 2007, CREW released Without a Trace: The Story Behind the Missing White House E-Mails and the Violations of the Presidential Records Act, which reported that despite the October 2005 discovery of millions of missing emails, the Bush administration failed to take action on a plan presented to then-White House Counsel Harriet Miers to recover the emails. Moreover, the White House did not have an effective email records management system in place. A newly released internal OA presentation, also from April 2007, confirms these facts and provides additional details about the state of email preservation, what the Bush White House knew and when.

After acknowledging “[e]mail messages [were] not properly captured” on 473 component days3, the internal presentation discusses a three-pronged proposal OCIO had developed to address the issues. OCIO proposed restoring tapes to determine if PST files were on “legacy” servers that may have been removed; recovering servers holding journal mailboxes from that time period and creating PST files from them; and restoring individual mailboxes and creating new PST files. As CREW independently reported, this proposal was presented to White House Counsel in November 2005. Despite the obvious problems the millions of missing email posed and the crucial importance of preserving email, a year and a half later the Bush Administration still had failed to take any steps to recover the emails (except for emails from the Office of the Vice President subpoenaed by Special Prosecutor Fitzgerald).

This document also confirms that problems with the White House’s email preservation system continued long after they were supposed to have been fixed. The database of PST files – essential for tracking archived emails – was not “completely maintained” and provided an “incomplete inventory” as of late 2006, the document acknowledges. A 2006 audit showed the then-current email file management and search processes required “manual and complex steps” which had constrained resources and “no dedicated staff.” High turnover among staff and contractors in key roles “negatively impacted procedures and processes.”

(The Bates numbers for the pages of documents referred to in this section are: OAP00005029, OAP00005047, OAP00005040, OAP00005049 and OAP00005035.)

4. The Bush White House falsely claimed to have “no reason to believe” any email was missing.

In January 2008, White House spokesman Tony Fratto claimed the White House had “no reason to believe there’s any data missing at all – and we’ve certainly found no evidence of any data missing.” This is simply false. At that time, the White House was in possession of the 2005 analysis, which concluded that millions of email were missing, as well as a November 2007 evaluation of the 2005 analysis. The White House claimed the 2007 evaluation identified several flaws in the 2005 analysis. Yet OA acknowledged six months later that despite its purported “shortcomings,” the 2005 analysis raised valid concerns about missing email. This analysis offered the best evidence of the scope of this problem, and it was in the White House’s possession when Mr. Fratto falsely claimed the White House had no evidence of any missing data.

(The Bates numbers for the pages of documents referred to in this section are: OAP00004640, OAP00005132 and OAP00005143. The transcript of the Jan. 17, 2008 White House briefing is available at http://www.citizensforethics.org/node/30806.)

5. The Bush administration set up a process for deciding which days of missing email to restore based on a desire to avoid spending money, not on locating all of the missing email.

After discovering and documenting the millions of missing email in late 2005 and early 2006, OA staff recommended a series of steps to recover this missing email that called for using the backup tapes. The Bush White House not only ignored these recommendations, but it also disdained the obvious solution of restoring all the emails on the backup tapes and comparing that collection with the existing emails. Instead, in 2007, after CREW and the National Security Archive had filed their lawsuits, the White House established a lengthy and complex three-phase process to decide which days of missing email to restore. Two years and $10 million later, this process continues.

Ironically, documents released by the White House suggest its approach was driven by the desire to keep costs as low as possible, even if it meant that not all of the email was recovered.

This is demonstrated by an August 2008 document describing the custom-designed software OA’s contractor developed to gather information about the contents of the PST files – the PST Inventory Verification & Investigation Tool (PIVIT). This document admits the focus of PIVIT “was to answer the question” of whether OA could “properly identify” days with zero and low email counts “so as to limit the number of days that we would have to restore from backup tapes[.]” (emphasis added).4 In one copy of the document from the White House’s files this passage is highlighted, and the words “crucial element” are handwritten in the margin. Similarly, in evaluating the results of its “Phase I” analysis, OA noted that “[f]or every zero day we are able to recover without physical data restoration, we are able to save significant time and dollars.”

By contrast, as the PIVIT document correctly notes, the “focus” of the 2005 analysis of missing emails was on the contents of the PST files “with the assumption that every day should have a normal count of email.”

(The Bates numbers for the pages of documents referred to in this section are: OAP00005142 and OAP00005360.)

6. In moving to dismiss CREW’s lawsuit, the EOP ignored serious and persistent flaws in the process used by the Bush White House to narrow the universe of days with missing emails and decide which days to restore.

On January 21, 2009, the EOP moved to dismiss CREW’s lawsuit on the basis that it had taken all necessary steps to restore the missing email. The newly released documents reveal the falsity of this claim and paint a picture of a persistently flawed process that is still unfinished, some ten months after the EOP claimed to have finished with the email restoration process.

First, contrary to the EOP’s assertion in its dismissal motion, the White House did not conduct the critically important independent verification of its 2008 process for counting and sorting email. Instead, an outside contractor conducted a “quality check” that nevertheless revealed several problems, at least one of which the White House ignored. In addition, the process was continuously in flux; the White House and its contractors constantly had to alter their approach because problems in all three phases of the recovery plan kept recurring.

In Phase II of its plan, the White House – through a vendor – used the PIVIT process to count the number of emails in the PST files it had located and assign the emails to different White House “components” by using information about the particular sender or recipient from the emails’ headers. At the end of this process, OA tried to find a third party to conduct an independent verification and validation of PIVIT, and subsequently claimed in its motion to dismiss that “PIVIT was reviewed by a third party contractor, and independently verified as an accurate tool in both methodology and practice.”

On the contrary, according to OA’s director of Federal Records Management, none of the vendors or government offices contacted was interested in conducting this independent verification and validation. “The potential notoriety in the media was likely more than they wanted to deal with and I can’t say I blame them,” the director wrote in an email. The best the White House could do was get a company called NAID that “did not have the skill to review the actual coding and other technical aspects” of PIVIT to conduct a “quality check.” The director of Federal Records Management noted the sharp distinction between a genuine independent verification and validation as opposed to a far less comprehensive “quality check.” These distinctions are critical given that PIVIT is central to the methodology used by the White House to locate and restore missing emails.

Even the quality check turned up problems. NAID found 279 PST files that were not in the PIVIT database. OA agreed some of these files should have been included in PIVIT and added them. In addition, NAID recommended a manual examination of some PST files that appeared empty to see if they actually contained emails. OA rejected this recommendation, saying it would not “invest more resources in Phase 2 due to these findings.” Thus, we will never know whether those apparently empty PST files actually contained missing email.

Second, the PIVIT software, its method of “de-duplicating” the emails, and ARIMA (the statistical model OA used in 2008 to identify potentially low email days) kept changing throughout Phases II and III. As a result, OA had to make multiple “passes” through the data, each time coming up with different results. Documents from September and October 2008 describe some of the changes, which include the addition of found PST files and the subtraction of those accidentally scanned multiple times, changes to the rules for de-duplication (especially with regard to emails sent to multi-person distribution lists), a “reconfiguration” of PIVIT, and changes in how ARIMA would treat certain days that might have low email volume.

Because of these changes, OA analyzed the data with ARIMA three times between June and September 2008. All of these changes caused Dr. Nancy Kirkendall, who created ARIMA, to question PIVIT. In a September memorandum, she asked OA what had been “done to assure that PIVIT is really working as intended?” PIVIT, she said, seemed like “a wonderful, but complex tool,” but all of the changes were impacting email counts. “ARIMA is only as good as the input data,” she warned (emphasis added).

These newly disclosed concerns with all aspects of the EOP’s three-phase plan raise critical issues about the efficacy of the restoration process and, at a minimum, reveal the falsity of the White House’s claim in January 2009 that it had taken all necessary steps to restore the missing emails.

(The Bates numbers for the pages of documents referred to in this section are: OAP00005439, OAP00004908, OAP00004909, OAP00005440, OAP00005301, OAP00005304, OAP00004905-4906, OAP00005289-5306 and OAP00004914.)

7. The Bush Administration abandoned its own statistical model in deciding which days to actually restore email and instead adopted an unscientific process.

CREW continues to have grave questions about ARIMA, the statistical model the Bush Administration used in Phase III to identify candidates for restoration. Even with its flaws, ARIMA initially identified 106 component days that are statistical “outliers” that should be considered for restoration. Instead of applying the results of its own analysis, however, the Bush administration selected only 48 of the 106 component days to restore based at least in part on an unscientific process of elimination.

A series of handwritten and printed charts includes the notes and comments of several Bush administration officials on the reasons for restoring or not restoring particular dates. These officials considered a variety of factors, such as whether a particular day fell on or around a federal or religious holiday or on a day where there was a system-wide problem, and the size of the component showing a low email count. But for some days, an official with the initials “TP” (presumably Theresa Payton) recommended ignoring a date identified by ARIMA simply because it was a Friday or Monday. Yet ARIMA supposedly was designed to take into account the days of the week and whether a particular day was a weekday or weekend.

Even these recommendations were not the final word. In one document, the OCIO said it was awaiting “final decisions from White House Counsel on which dates will be approved for restore.” Why was the White House Counsel making the final decision on which dates to restore, especially given its apparent lack of involvement in designing and implementing a process for identifying low-volume days? Indeed, the White House Counsel did not approve all the dates recommended for restoration.

(The Bates numbers for the pages of documents referred to in this section are: OAP00005207-5209, OAP00005413, OAP00005416, OAP00005418, OAP00005445, OAP00005451, OAP00005456, OAP00005448 and OAP00005162.)

Footnotes:

  1. The documents released by the EOP include: (a) eight of the 38 boxes assembled by the Bush administration as potentially responsive to Freedom of Information Act requests CREW filed in April 2007; (b) several discs containing documents the Bush White House provided previously to the House Oversight and Government Reform Committee; (c) documents related to the current White House email preservation system; and (d) documents related to EOP’s effort to restore some of the missing email.
  2. Letting the journals accumulate for months also contradicts the White House’s contention in CREW’s lawsuit that “[w]hen a Journal reached its storage capacity, a .PST file was then manually created by contractors within OA to archive the messages contained in the Journal.”
  3. At the time, there were 12 “components” at the White House, such as the Office of the Vice President and the National Security Council. Frequently, email was missing from more than one component on a given calendar day.
  4. Of course, it is also possible the Bush White House purposefully implemented a process it knew would identify fewer missing days to minimize any public exposure and adverse publicity.

Citizens for Responsibility and Ethics in Washington (CREW) is a non-profit legal watchdog group dedicated to holding public officials accountable for their actions. For more information, please visit www.citizensforethics.org or contact Garrett Russo at 202.408.5565 or grusso@citizensforethics.org
