Forensic Implications of Metadata in Electronic Files
By John Ruhnka and John W. Bagby
JUNE 2008 -
In this digital age, most business activities are transacted and
recorded using networked information systems. Business and accounting
records are prepared, reviewed, audited, and preserved in electronic
form, commonly called electronically stored information (ESI). It
is estimated that 94% to 99% of all business records are created
and maintained in electronic form (National Law Journal,
July 17, 2006) and most are never transformed into hard copy. A
unique characteristic of electronic records is that they include
hidden metadata that comprise extensive information about the creation
of a file, including “MAC dates” (the dates a file was
modified, accessed, and created), the date last printed, and, if
deleted, when it was deleted and by whom. Metadata can also reveal
the location of a file on a computer or network, the computer on
which it was created, the name of the person who last saved the
document, the number of revisions made, and any document ID or properties
added to the document. E-mail metadata indicates the
sender’s address book information; the dates a message was
sent, received, replied to, or forwarded; to whom copies were sent;
and the existence of attachments. Metadata
has been called “the electronic equivalent of DNA”
and it can shed light on the origins, context, authenticity, and
distribution of electronic evidence (Craig Ball, Beyond Data
about Data: The Litigator’s Guide to Metadata, 2005).
CPAs—especially those involved in forensic accounting and
litigation support—should be aware of how metadata is generated
by software, and the potential significance of metadata in electronic
business records and communications.
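To make the e-mail metadata described above concrete, the following is a minimal sketch that reads header fields from a raw message using Python's standard `email` package. The message text, the addresses, and the `summarize_headers` helper are hypothetical and exist only for illustration; real mail systems record many more fields than are shown here.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw message, invented for this illustration.
RAW_MESSAGE = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Cc: Carol Example <carol@example.com>
Subject: Q3 forecast
Date: Mon, 17 Jul 2006 09:30:00 -0400
Message-ID: <1234@example.com>
In-Reply-To: <1200@example.com>

See the figures below.
"""


def summarize_headers(raw):
    """Collect the header metadata most often examined in discovery."""
    msg = message_from_string(raw)
    return {
        # getaddresses returns (display name, address) pairs.
        "from": getaddresses(msg.get_all("From", [])),
        "to": getaddresses(msg.get_all("To", [])),
        "cc": getaddresses(msg.get_all("Cc", [])),
        "sent": parsedate_to_datetime(msg["Date"]),
        # An In-Reply-To header suggests the message answers an earlier one.
        "is_reply": msg["In-Reply-To"] is not None,
        # A part that carries a filename is treated as an attachment.
        "has_attachments": any(part.get_filename() for part in msg.walk()),
    }


summary = summarize_headers(RAW_MESSAGE)
for field, value in summary.items():
    print(field, value)
```

Header fields can be altered or forged, so a summary like this is a starting point for review and not proof of what happened.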
Metadata Production, Location, and Access
Metadata
can be broadly divided into two categories. Application metadata
is automatically created by an application and is embedded in
every file created or edited using that software. Operating systems
that control individual computers, servers, and other devices
create systems metadata, which assigns file allocation
table fields (file name, creation, length, and use) to all files
stored on the system so that the operating system can identify
and locate that file. Systems metadata resides in the file system
of the computer system or server used to access and store that
file, not in the file itself.
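As an illustration of systems metadata, the sketch below asks the operating system for the timestamps and size it keeps for a file, using Python's standard `os.stat` call. The `mac_times` helper and the throwaway demonstration file are assumptions made for this example. Note that the meaning of `st_ctime` differs by platform, as the comment in the code explains.

```python
import os
import tempfile
from datetime import datetime, timezone


def mac_times(path):
    """Return the file-system timestamps and size recorded for `path`."""
    info = os.stat(path)

    def to_utc(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)

    return {
        "modified": to_utc(info.st_mtime),
        "accessed": to_utc(info.st_atime),
        # st_ctime has traditionally meant creation time on Windows, but it
        # is the time of the last metadata change on Unix-like systems.
        "created_or_changed": to_utc(info.st_ctime),
        "size_bytes": info.st_size,
    }


# Demonstration on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"ledger draft")
    demo_path = handle.name

times = mac_times(demo_path)
os.remove(demo_path)

for field, value in times.items():
    print(field, value)
```

Querying these values does not open the file's contents, which is one reason examiners prefer to record them before any review begins.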
Many CPAs
use Microsoft Office programs, including Word, Excel, PowerPoint,
and Outlook. All of these applications automatically produce dozens
of fields (types) of application metadata for each file they create.
Application and systems metadata fields are created and updated
for Word, Excel, and PowerPoint files each time a file is created,
opened, or used. These files can also carry optional information about
changes or versions that a user intentionally tracks in a file. Adobe
Acrobat software creates detailed metadata path information that
can provide forensic information on PDF files.
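The application metadata described above can be inspected programmatically. The sketch below assumes the newer Office Open XML formats (such as .docx and .xlsx, introduced with Office 2007), which are ZIP packages that store core properties in a part named `docProps/core.xml`; older binary formats such as .doc store their properties differently and are not handled here. The `read_core_properties` helper and the sample values are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

FIELDS = {
    "author": "dc:creator",
    "last_saved_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "last_printed": "cp:lastPrinted",
}


def read_core_properties(source):
    """Read core properties from a package given as a path or file object."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    # Fields that are absent from the file are reported as None.
    return {name: root.findtext(tag, default=None, namespaces=NS)
            for name, tag in FIELDS.items()}


# Build a tiny in-memory package with invented values for demonstration.
SAMPLE_CORE = (
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'xmlns:dcterms="http://purl.org/dc/terms/">'
    '<dc:creator>A. Preparer</dc:creator>'
    '<cp:lastModifiedBy>B. Reviewer</cp:lastModifiedBy>'
    '<cp:revision>7</cp:revision>'
    '<dcterms:created>2007-03-01T14:05:00Z</dcterms:created>'
    '<dcterms:modified>2007-03-09T09:12:00Z</dcterms:modified>'
    '</cp:coreProperties>'
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("docProps/core.xml", SAMPLE_CORE)
buffer.seek(0)

properties = read_core_properties(buffer)
print(properties)
```

Reading the package through `zipfile` does not rewrite the document, but it should still be done on a working copy and not on the preserved original.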
Significance of Metadata in Litigation
A 2007 survey
of litigation activities of 253 U.S. corporations revealed that
83% of respondents had new lawsuits filed against them in 2006
(Fulbright & Jaworski, Fourth Annual Litigation Trends
Survey Findings, October 2007). The most common subjects
of these lawsuits were labor/employment, contract enforcement,
and personal injury. Litigation
was also significant at the smaller companies surveyed: 17% had
at least one lawsuit claiming $20 million or more, and 98% of
mid-sized companies reported one or more lawsuits of $20 million
or larger. After a lawsuit is filed, a pre-trial discovery phase
occurs during which the litigants are required to identify and
disclose (produce) all information in their possession that is
requested by the opposition as potentially relevant to the subject
of the litigation. Because most settlements in litigation occur
before a trial is held, electronic records and e-mails disclosed
and evaluated by the parties during the discovery phase can often
determine the outcome.
Once a lawsuit
is filed or a party has been served with a document preservation
request, a “litigation hold” prevails that requires
the parties to preserve all evidence under their control that
is potentially relevant to the subject of the litigation. In some
circumstances, a legal duty to preserve potentially relevant evidence
can even arise before a lawsuit is filed. The watershed 2003 Zubulake
discovery ruling imposed legal duties to preserve potentially
relevant evidence as soon as litigation is “reasonably anticipated”
(Zubulake v. UBS Warburg, 2003 WL 22410619 at 4, S.D.N.Y.
2003).
Because metadata
in electronic files reveals forensic information about the creation,
authorship, history, and even intent of a document, it can play
a potentially critical role in litigation outcomes. In the Vioxx
product liability litigation that resulted in a large judgment
against its producer, Merck, the New England Journal of Medicine
reported that residual “tracked changes” accidentally
left in a Merck internal document indicated that Merck knew of
potential dangerous side effects of Vioxx (including heart attacks)
two years before placing the drug on the market (Forbes,
Dec. 8, 2005). The general rule on disclosing the metadata associated
with files demanded in litigation has been stated as follows:
“[W]hen
a party is ordered to produce electronic documents as they are
maintained in the ordinary course of business, the producing
party should produce the electronic documents with their
metadata intact, unless that party timely objects to production
of metadata, the parties agree that the metadata need not be
produced, or the producing party requests a protective order”
(Williams v. Sprint/United Mgmt. Co., 230 F.R.D. 640,
D. Kan. Sept. 29, 2005) [emphasis added].
Preserving Metadata when Reviewing or Producing Files
Larger organizations
are often involved in frequent litigation, which requires the
entity to identify, preserve, and disclose potentially relevant
electronic files in response to successive legal discovery demands.
To manage this complicated and costly process effectively, given its
significant implications for liability, it is sound practice to
institute an enterprisewide “ESI discovery team.” An ESI
discovery team includes key decision-makers who need to be involved
in the on-going process of planning for and responding to discovery
requests. This typically includes the CIO, IT system managers,
in-house legal counsel, representatives from administrative units
most closely involved in the litigation (e.g., an HR director
in a wrongful termination lawsuit), as well as outside counsel
and any third-party legal and electronic discovery consultants
and forensic experts who will be involved. The ESI discovery team
designs an organization’s “litigation hold”
procedures and deploys litigation holds, which often involve multiple
overlapping lawsuits, across all enterprise locations.
A business
subject to a litigation hold must act quickly to prepare a written
“preservation plan” identifying all potentially relevant
information at all enterprise locations. The identification process
can use keywords describing the subject matter of the litigation;
identify specific users whose e-mails, instant messaging, and
voicemails may be relevant; and notify identified users to preserve
all data on desktop computers, laptops, and messaging devices.
The 2006 Federal Rules of Civil Procedure (FRCP) require both
accessible and “inaccessible” ESI, such as network
and server back-up tapes, to be preserved. Any over-writing or
reuse of back-up tapes that may include e-mails potentially relevant
to the litigation must be immediately halted. The 2006 FRCP Rule
26(f) requires a “meet-and-confer” to occur early
in the litigation to negotiate the scope of discovery by and for
each side. This conference should decide which files are to be
collected, reviewed for potential relevance, and, if relevant,
produced; the format in which files are to be produced;
whether they will include metadata; and a timetable
for discovery.
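The identification step described above, which combines subject-matter keywords with a list of specific users, can be sketched as a simple filter. The `Message` record, the `flag_for_hold` helper, and the sample data are all hypothetical. A real preservation plan would rely on dedicated e-discovery tools and would be reviewed by counsel.

```python
from dataclasses import dataclass


@dataclass
class Message:
    custodian: str  # the user whose mailbox holds the message
    subject: str
    body: str


def flag_for_hold(messages, keywords, custodians):
    """Return messages from a named custodian or that mention a keyword."""
    wanted_words = [word.lower() for word in keywords]
    wanted_users = {user.lower() for user in custodians}
    flagged = []
    for message in messages:
        text = f"{message.subject} {message.body}".lower()
        from_custodian = message.custodian.lower() in wanted_users
        mentions_keyword = any(word in text for word in wanted_words)
        if from_custodian or mentions_keyword:
            flagged.append(message)
    return flagged


# Invented sample data for a wrongful termination dispute.
mailbox = [
    Message("hr.director", "Lunch", "Noon on Friday?"),
    Message("clerk", "Termination memo", "Draft attached"),
    Message("clerk", "Parking", "Lot B closed"),
]
flagged = flag_for_hold(mailbox, keywords=["termination"], custodians=["hr.director"])
print([message.subject for message in flagged])
```

The filter is deliberately over-inclusive: everything from a named custodian is kept, because at the preservation stage the cost of keeping too much is usually lower than the cost of losing relevant evidence.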
The FRCP
contains a default preference for delivery of electronic files
in “native file format” (the format in which the data
is ordinarily preserved), including all associated metadata. Thus,
potentially relevant ESI needs to be preserved in native file
formats with metadata intact before the multiple steps
involved in collecting and reviewing files for relevance are initiated.
Opening a file for review alters its metadata and could be viewed
as “tampering” with evidence. Parties producing files
requested for discovery need to be able to show an unbroken chain
of custody to assure their admissibility as evidence and to avoid
judicial sanctions. To ensure this, it is advisable to make a
secure “snapshot” digital record of all potentially
relevant enterprise server systems and files, and to archive it
separately before any review for potential relevance is
conducted, so that original files and metadata remain intact.
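One common way to support a chain of custody is to record a cryptographic hash of every preserved file, so that any later change to a file's contents can be detected. The sketch below is one possible approach using Python's standard `hashlib`; the helper names and the sample file are assumptions. Because reading a file can update its last-accessed date on some systems, hashing is normally run against the archived snapshot and not the live originals.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hash."""
    manifest = {}
    for folder, _subfolders, files in os.walk(root):
        for name in sorted(files):
            full_path = os.path.join(folder, name)
            manifest[os.path.relpath(full_path, root)] = sha256_of(full_path)
    return manifest


def changed_files(before, after):
    """List files whose hash differs from, or is missing in, a later manifest."""
    return sorted(path for path in before if after.get(path) != before[path])


# Demonstration with an invented file in a temporary folder.
with tempfile.TemporaryDirectory() as root:
    memo = os.path.join(root, "memo.txt")
    with open(memo, "w") as handle:
        handle.write("original text")
    baseline = build_manifest(root)

    with open(memo, "w") as handle:
        handle.write("edited text")
    altered = changed_files(baseline, build_manifest(root))

print(altered)
```

A content hash covers only the bytes inside the file. File-system dates are stored outside the file, so they need to be recorded separately when the snapshot is taken.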
If the parties
disagree about the format in which files are to be produced or
whether file and system metadata are to be included, the federal
courts will look at the potential relevance of metadata to the
issues in dispute. In an options-backdating case, for example,
metadata showing the dates of successive entries contained in
options documents could be critical. A second consideration is
the cost of producing metadata. If metadata already exists in
the native file formats, it is more likely to be required, whereas
if it is not present in the native file format and must be reconstituted
from other sources, it is less likely to be required.
Forensic Uses of Metadata
Forensic
accounting provides an evidentiary basis for economic transactions
and reporting events by identifying the process of capturing,
using, storing, and transmitting business and financial data.
This can involve manual processes, such as data entry, computations,
verifications, and interpersonal communications, in conjunction
with a company’s IT and network systems. Metadata can help
to identify the human and system actions in information systems;
can be used to investigate and verify fraud, abuse, mistakes,
or system failures; and can help to establish elements such as
causation, timing, and the extent of knowledge or mens rea
(guilty knowledge)—all of which are at issue in criminal
or civil litigation. An example of the forensic use of metadata
is in stock options-backdating investigations, where the integrity
of the dates entered on written option documents is often the
crux of the dispute. In Ryan v. Gifford (2007 Del. Ch.
Lexis 168), the Delaware Chancery Court ordered respondents to
produce the disputed stock option documents in an electronic format
that would permit examination of all metadata associated with
the documents, noting that “Maxim’s special committee
as well as Deloitte & Touche undoubtedly reviewed metadata
as part of their investigation into the backdating problems at
Maxim.”
In Williams
v. Sprint/United Management Co. [230 F.R.D. 640 (D. Kan.
2005)], the plaintiff in an age discrimination lawsuit sought
an Excel workbook in its native file format. But the defendant,
Sprint, stripped out all metadata from the Excel spreadsheet
files that it produced, arguing that the metadata could reveal
privileged information that the company had a right to withhold
(the formulas and calculations used to derive information in the
Excel spreadsheets that were linked to spreadsheet cells). The
court held that blanket withholding of metadata from the requested
accounting records went too far, and ordered Sprint to produce
all of the metadata in its accounting files as maintained in the
ordinary course of business, except for specific metadata that
it claimed was protected by attorney-client privilege.
Confidentiality and Malpractice Implications: Client Files
Some potentially
relevant information demanded in litigation, such as attorney-client
communications and litigation work product including associated
metadata, may be withheld from disclosure as “privileged”
information, subject to judicial review. Claims of privilege must
be identified in a “privilege log” that identifies
the author, recipients, subject matter, and dates of all withheld
files. The privilege log alerts opposing parties to the fact that
potentially relevant information has been withheld, and dates
and the identity of participants enable opponents to review and
challenge these claims. The 2006 FRCP amendments impose a faster
pace for discovery, which increases the risk of accidental disclosure
of privileged documents, but also provide that parties may request
a “claw back” (a court-ordered return) of privileged
files or trade secrets in the event of inadvertent disclosure.
Litigants may also request court protective orders that prohibit
the disclosure of proprietary, confidential, or private information
that has been accessed by an adversary or its experts.
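The privilege log described above is essentially a structured table. The sketch below shows one possible layout using the fields named in the text (author, recipients, subject matter, and dates). The `document_id` and `privilege_claimed` columns, the `write_privilege_log` helper, and the sample entry are assumptions added for illustration and do not represent a required format.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class PrivilegeLogEntry:
    document_id: str        # assumed internal reference number
    author: str
    recipients: str
    subject_matter: str
    document_date: date
    privilege_claimed: str  # assumed column, e.g. attorney-client


def write_privilege_log(entries):
    """Render log entries as CSV text with one header row."""
    output = io.StringIO()
    columns = [field.name for field in fields(PrivilegeLogEntry)]
    writer = csv.DictWriter(output, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return output.getvalue()


# Invented sample entry.
log_text = write_privilege_log([
    PrivilegeLogEntry(
        document_id="PRIV-0001",
        author="In-house counsel",
        recipients="HR director",
        subject_matter="Legal advice regarding a termination decision",
        document_date=date(2006, 11, 2),
        privilege_claimed="Attorney-client",
    )
])
print(log_text)
```

The subject-matter description has to be specific enough to let an opponent evaluate the claim, but not so detailed that it reveals the privileged content itself.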
CPA firms
play an important role in providing electronic discovery and forensic
services in litigation. The 2007 Socha-Gelbmann Electronic Discovery
survey (www.sochaconsulting.com/2007survey.htm)
indicates that $2.6 billion was spent on electronic discovery
services in 2006, and that CPA firms are increasingly significant
vendors in this arena. (Ernst & Young and KPMG were ranked
in the top 10 electronic discovery service providers in 2007.)
CPAs who provide forensic information and damage calculations
for clients need to be aware of the liability implications of
metadata contained in client files. Inadvertent disclosure of
metadata in client files could result in a waiver of subsequent
client claims of legal privilege for the metadata, or enable opponents
to use metadata against clients’ interests.
CPAs are
increasingly being held to the same malpractice standard as lawyers.
Mattco-Forge, Inc. v. Arthur Young & Co. (6 Cal.Rptr.2d
281; Cal. Ct. App. 1992) involved a suit by a client against a
CPA firm. Arthur Young & Co. was hired as an expert witness
and damages consultant to assist Mattco in a lawsuit against General
Electric. Mattco claimed that Arthur Young negligently provided
unsubstantiated calculations for the profits allegedly lost because
GE had struck Mattco from its supplier list. Because original
estimate sheets were not available for all contracts, Arthur Young
had asked Mattco to prepare noncontemporaneous estimates for the
missing estimate sheets. These estimates, not identified as being
noncontemporaneous, were turned over to GE, which used them to
have Mattco’s legal claims dismissed. The
California Appellate Court noted that in technologically driven
litigation, the engineers, physicians, real estate appraisers,
and other professionals—including accountants—hired
to assist a party in preparing and presenting a legal case can
play as great a role in shaping and evaluating their clients’
case as do lawyers. Accordingly, the court said, they should be
held to the same malpractice standards.
Managing Metadata
While metadata
should not, without prior judicial approval, be intentionally
altered or removed from documents subject to a litigation hold
or demanded in litigation, metadata may be removed in the ordinary
course of business as necessary to preserve enterprise and client
confidentiality, as well as to safeguard proprietary information.
The AICPA Code of Professional Conduct, Rule 301, Confidential
Client Information, provides that: “A member in public
practice shall not disclose any confidential client information
without the specific consent of the client.” If a CPA assisting
a client with bid calculations were to send an amended version
of a bid proposal to the opposing side which included metadata
that revealed that the client had initially approved much higher
bid amounts, the CPA could be liable for breach of client confidentiality
or even a malpractice claim for jeopardizing the contract. Potential
liability for the disclosure of metadata harmful to client interests
means that metadata confidentiality policies that will pass muster
with both legal discovery rules and AICPA ethics rules, including
pre-release metadata viewing and “scrubbing” (intentional
metadata removal) of security-sensitive files, should be established
and enforced on a company-wide basis and not be left to individual discretion.
CPA firm
personnel should possess the necessary technical skills both
view and to remove metadata from electronic files. Metadata is
viewable in several ways. Basic metadata in Microsoft Office documents
is viewable from the “File” menu, under “Properties.”
There are tabs for “General,” “Statistics,”
and “Contents” information. Word will reveal to any
user a Word document’s authors, the date of creation, the
date last modified, the number of revisions, and where the document
is stored. If optional user-added features such as “Track
Changes” or “Comments” were enabled when a Word
document was created or edited, any user can see which other users
made specific edits to a document and when.
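Residual tracked changes can also be detected without opening a document in Word. The sketch below assumes the Office Open XML .docx format, in which the main text is stored in a part named `word/document.xml` and tracked insertions and deletions appear as `w:ins` and `w:del` elements carrying author and date attributes. The `list_tracked_changes` helper and the sample bid figures are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML vocabulary.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_tracked_changes(source):
    """Report tracked insertions and deletions in a .docx path or file object."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    changes = []
    for kind, tag in (("insertion", "ins"), ("deletion", "del")):
        for element in root.iter(f"{{{W}}}{tag}"):
            changes.append({
                "kind": kind,
                "author": element.get(f"{{{W}}}author"),
                "date": element.get(f"{{{W}}}date"),
                # Inserted text sits in w:t and deleted text in w:delText.
                "text": "".join(element.itertext()),
            })
    return changes


# Minimal invented document: a bid amount that was lowered during editing.
SAMPLE_DOCUMENT = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:t>Bid amount: </w:t></w:r>'
    '<w:del w:id="1" w:author="A. Preparer" w:date="2007-03-01T14:05:00Z">'
    '<w:r><w:delText>$950,000</w:delText></w:r></w:del>'
    '<w:ins w:id="2" w:author="B. Reviewer" w:date="2007-03-09T09:12:00Z">'
    '<w:r><w:t>$780,000</w:t></w:r></w:ins>'
    '</w:p></w:body></w:document>'
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("word/document.xml", SAMPLE_DOCUMENT)
buffer.seek(0)

changes = list_tracked_changes(buffer)
for change in changes:
    print(change)
```

A pre-release check along these lines could flag any outgoing file for which the list is not empty, so that the changes are accepted or removed before the file leaves the firm.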
In addition,
commercially available metadata viewers can be used to access
a much larger array of metadata. For examples, see www.payneconsulting.com/products
and www.docscrubber.com.
Payne Group also produces Metadata Assistant, a widely used metadata
“scrubber.” Detailed instructions on removing metadata
from electronic files are beyond the scope of this article; nonetheless,
a Microsoft Office 2003/XP add-in called “Remove Hidden
Data” can remove most—but not all—metadata from
Office 2003 documents. Microsoft also offers “Office Document
Inspector” for Office 2007, which can remove most metadata
from Word, Excel, and PowerPoint files. For a whitepaper on the
uses and limitations of MS Office Document Inspector, see
esqinc.com/Content/WhitePapers/Document-Inspector.php.
John
Ruhnka, JD, LLM, is the Bard Family Term Professor of Entrepreneurship
at the business school of the University of Colorado at Denver.
John W. Bagby, JD, is a professor and co-director
of the Institute for Information Policy in the college of information
sciences and technology at the Pennsylvania State University, University
Park, Pa.