Digital Discovery & e-Evidence®

Reproduced with permission from Digital Discovery & e-Evidence, 16 DDEE 19 and 40, 01/21/2016. Copyright © 2016 by The Bureau of National Affairs, Inc. (800-372-1033) http://www.bna.com

BEST PRACTICES

iDiscovery Solutions’ Dan Regard and Charlie Platt explore the world of litigation, timestamps and metadata. It is a twilight zone of litigation where everything, and nothing, makes sense; where time zones, epochs and obscure digital formats mark the dividing line between critical insight and falling prey to the unexpected complexity of the simplest of questions: ‘When?’

‘You are about to enter another dimension, a dimension not only of sight and sound but of mind. A journey into a wondrous land of imagination. Next stop, the Twilight Zone!’

A Brief Legal History of Time

BY DAN REGARD AND CHARLIE PLATT

The clock ticks past midnight. It’s late and the documents you’ve been reviewing for the past several hours have all blurred together. You pull up a key email for what seems like the hundredth time this evening and, rubbing your eyes, you stare at it, knowing you’ve read it a dozen times already but feeling that somehow an important piece of the litigation puzzle is buried there.

As your eyes start to glaze, it hits you. Excitement washes away exhaustion, and you pull up another email and another. One of the emails appears to be part of a chain, but according to the date and timestamps, it was sent two hours before the first part of the chain was received. On another, one of the key witnesses seems to be sending email at highly unusual hours, but review of the content appears to indicate normal business hours.

Upon further scrutiny, you discover that the email record conflicts with the text message record and the telephone call records. Calendar entries titled ‘Friday Afternoon Project Status Update’ appear to be consistently scheduled for the morning. How concerned should you be?
Are these times and dates correct? If not, where did they go wrong and who altered them? If they are incorrect, what are the correct times and dates and how do they change your arguments?

Time and date values can prove tricky: they are easily misinterpreted, and when that occurs, the consequences can be serious. Depending on the case, a few hours’ difference on a critical email or document can easily make the difference between litigating and bringing opposing counsel to the settlement table.

Order Versus Chaos

The universe is a messy place, and despite our best efforts to fit it into tidy little buckets, it simply refuses to comply. Physics tells us that the chair you’re sitting on is mostly empty space, and yet it works perfectly well as a chair. Historians tell us that it should be impossible to build Stonehenge with the technology of that time, and yet there it stands. We are all taught that it takes 365 days for the Earth to orbit the sun and 24 hours for it to complete a single rotation, but that isn’t necessarily the whole truth.

One area where we expect logic, order and consistency, but where reality proves exceptionally chaotic, is time. Ever since the dawn of civilization, we have attempted to measure and classify time. From Stonehenge to Incan calendars to atomic clocks, we categorize our existence into years, our years into seasons, seasons into months, months into days and days into hours, minutes and seconds. We group time into centuries, eras, millennia and eons, and break it down into milliseconds, microseconds, nanoseconds (one billionth of a second) and beyond. As with other systems of measure, our methods of time measurement have grown more complex as we’ve developed a need for greater precision.

How Many Days are in a Year?

The Egyptians are credited with the introduction of the 365-day calendar. Prior to that, calendars were commonly based on 360-day years broken down into 12 months of 30 days each.
Such calendars were based on the lunar cycle, with the full moon coinciding with the middle of the month. As civilization became increasingly agrarian, such calendars became less useful as they would fall out of synchronization with the seasons, with spring months eventually shifting to occur in the winter. This made the calendar ineffective for agricultural planning purposes, such as determining when to plant and harvest crops. To correct for this, many civilizations would insert leap months to periodically catch the calendars back up to the seasons.

Although the Egyptians were correct and 365 days is more accurate than 360 days, it still isn’t quite right. A year is closer to 365¼ days, and every four years we add a leap day to account for this discrepancy. But even that isn’t quite correct, so every 100 years we skip adding a leap day (unless the year is divisible by 400, in which case we do).

To compound this complexity, it turns out a day isn’t actually 24 hours long and every so often we have to add in a leap second. In fact, a day in 2015 was about 1.7 milliseconds longer than a day was in 1915 (Dennis D. McCarthy, 2009, p. 232), so not only are days not 24 hours long, but they aren’t even consistently inconsistent.

Time Zones and Trains

Fast forward from ancient Egypt to the late 19th century in the United States. The prevailing method for keeping time was ‘solar time’, with local time based on noon coinciding with the sun being directly overhead. This generally wasn’t a problem, until the trains arrived.
Trains opened up the country, allowing people to travel rapidly between cities; so rapidly that the combination of train travel and solar time led to passengers setting their watches in New York and, upon arriving in Philadelphia, finding that their watches were no longer correctly reporting the local time.[1] This lack of consistency between cities grew increasingly problematic, so on October 1, 1884, the International Meridian Conference met in Washington, D.C. and established Greenwich Mean Time (GMT) as a standard. As a result of this conference, and the adoption of GMT, the time zones we know and use today evolved.

[1] This was particularly troublesome when various railway companies kept timetables according to their own home office. A traveler going from Washington, D.C. to Boston might have to change his watch four times or else show a discrepancy of 24 minutes. Bianculli, Anthony J., Iron Rails in the Garden State: Tales of New Jersey Railroading (2008), p. 94.

Complications of Time Zones and Timestamps
• 41 time zones on Earth (+14 UTC to -12 UTC) that change over time;
• 13 current different methods or dates for Daylight Saving Time (DST) that change over time;
• 5 different DST offsets (20, 30, 40, 60 or 120 minutes);
• 15 different major computer epochs;
• Multiple measurement increments for some epochs;
• Multiple storage formats;
• Varying degrees of precision;
• One document may commingle methods, formats, epochs, and DST.

This wasn’t the only occasion that our means of timekeeping failed to keep pace with evolving technology. Advances in timekeeping, specifically the atomic clock, led to the creation of Coordinated Universal Time (UTC) by the International Radio Consultative Committee in 1960. UTC linked the universal clock to the atomic clock, allowed for the introduction of leap seconds and, most importantly, provided a standard notation for representing time zones, such as ‘UTC -0500’ for Eastern Standard Time (EST).
And this wouldn’t be the last time that timekeeping was forced to evolve to meet advances in other technologies.

Relativity and GPS

Consider a modern example of evolving technology and the increasing need for accuracy: the Global Positioning System (GPS). We rely on GPS every day, on our phones and in our cars, to know where we are and tell us how to get where we want to go. Without proper understanding of time and the means of calculating it accurately, GPS would be an impossibility. The system relies upon hyper-accurate clocks, clocks that must be accurate to within nanoseconds. These clocks reside in satellites travelling at close to 8,700 m.p.h. and orbiting at an altitude of roughly 12,500 miles above the earth.

According to Einstein, such clocks fall prey to both the Special Theory of Relativity and the General Theory of Relativity. Under the Special Theory, speed affects time. The faster you go, the slower time elapses. Under the General Theory, gravity affects time. The stronger the force of gravity, the slower time elapses.

Thus, under the Special Theory, the 8,700 m.p.h. speed proves fast enough to make the GPS clocks tick seven microseconds slower per day than ground-based clocks (Pogge, 2013). Conversely, under the General Theory, the 12,500 mile altitude is far enough removed from gravity to make the GPS clocks tick 45 microseconds faster per day (Pogge, 2013). The net effect is that over the period of a day, the satellite’s clocks are 38 microseconds faster than ground-based clocks.

While this may sound inconsequential, GPS requires nanosecond accuracy. Thirty-eight microseconds is 38,000 nanoseconds, and would cause GPS to be off by over five miles after only a single day and the effect would compound each day thereafter. The system would be useless after only a few hours of operation.
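The arithmetic behind these figures can be checked with a short Python sketch. It takes the two per-day drift values quoted above from Pogge and, as a simplifying assumption that is not part of the article, treats an uncorrected clock offset as a straight-line ranging error equal to the offset multiplied by the speed of light. The function and constant names are illustrative only.

```python
# Drift figures quoted in the article (Pogge, 2013), in microseconds per day.
SPECIAL_RELATIVITY_US_PER_DAY = -7.0   # orbital speed slows the satellite clock
GENERAL_RELATIVITY_US_PER_DAY = 45.0   # weaker gravity speeds the satellite clock

SPEED_OF_LIGHT_M_PER_S = 299_792_458   # exact by definition
METERS_PER_MILE = 1609.344


def net_drift_us_per_day(special_us: float, general_us: float) -> float:
    """Combine the two relativistic effects into one daily clock drift."""
    return special_us + general_us


def ranging_error_miles(clock_offset_us: float) -> float:
    """Assumption: error = clock offset x speed of light (a simplification)."""
    offset_seconds = clock_offset_us * 1e-6
    return offset_seconds * SPEED_OF_LIGHT_M_PER_S / METERS_PER_MILE


drift_us = net_drift_us_per_day(
    SPECIAL_RELATIVITY_US_PER_DAY, GENERAL_RELATIVITY_US_PER_DAY
)
error_miles = ranging_error_miles(drift_us)

print(f"Net drift: {drift_us:.0f} microseconds per day")
print(f"Equivalent ranging error after one day: {error_miles:.1f} miles")
```

Under that simplification, one day of uncorrected drift works out to roughly seven miles, which is consistent with the statement that the error would exceed five miles.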
Just as we saw with the arrival of the railroads, where we started moving fast enough and far enough to show the limitations of our timekeeping, there is a modern parallel with GPS. As our communications get faster and our systems increasingly integrate with each other, the world gets conceptually smaller and inadequacies begin to appear in our timekeeping technology. Yet we rely on time and date values regularly in litigation: Who knew what, and when did they know it; when was the email sent; when did the stock trade occur; where was the defendant at a specific time? All of these questions are based on time and location and, increasingly, that location is reported via GPS, which is itself reliant upon accurate timekeeping.

Time is Relative

And so we come to the problem of time as it applies to litigation. A primary source of confusion regarding time in litigation stems from the need to represent time as both a relative value (local time) and an absolute value (universal time). We use local time to understand events as they unfold for a specific individual, e.g., ‘When did plaintiff receive the email?’ When asking this question, we’re not interested in what time it was in Greenwich or Tokyo or Hong Kong, but rather what time it was locally for plaintiff. Conversely, we use universal time to understand how events in different time zones relate to each other. Multiple events occurring in different time zones can be converted to UTC so that a proper sequence of events can be observed.

To complicate this, there are presently 41 time zones in effect around the globe. There are instances where two time zones exist within a single city. There are instances of half-hour time zones (e.g., Australian Central Standard Time [ACST], which is UTC +0930). There are even instances of quarter-hour time zones (e.g., Nepal Time [NPT], which is UTC +0545). And, in what would seem to be a complete defiance of logic, there are instances where today, tomorrow and yesterday are all occurring at the same time.

This is owing to the oddity that time zones span a total period of 26 hours. Christmas Island is 14 hours ahead of UTC, while Howland and Baker islands are 12 hours behind. This means that three people having a hypothetical phone conversation could actually be talking on three different dates, the first at 11:00 p.m. Monday, January 1 on Baker Island; the second at 11:00 a.m. Tuesday, January 2 in Greenwich, England; and the third at 1:00 a.m. Wednesday, January 3 on Christmas Island. From a local time viewpoint for each individual, this single conversation occurs on three completely different days. Conversely, from a universal time viewpoint, the conversation is a single event. If later asked when the conversation took place, one individual might say late at night on Monday; another could disagree and say it was very early morning on Wednesday; and the third, disagreeing altogether, would say it was in the middle of the day on Tuesday. The important thing is that all three would be correct.

Works Cited

Dennis D. McCarthy, K. P. (2009). Time: From Earth Rotation to Atomic Physics. Weinheim, Germany: John Wiley & Sons.

Pogge, R. W. (2013, April 10). Real-World Relativity: The GPS Navigation System. Retrieved September 30, 2015, from Ohio State University: Astronomy 162: http://www.astronomy.ohio-state.edu/~pogge/Ast162/Unit5/gps.html

Various. (2015, September 24). Coordinated Universal Time. Retrieved from Wikipedia: https://en.wikipedia.org/wiki/Coordinated_Universal_Time

Various. (2015, August 23). List of Time Zone offsets. Retrieved from Wikipedia: https://en.wikipedia.org/wiki/List_of_UTC_time_offsets

Time Zones Within Litigation

The confusion around time zones described in Part One of this article (16 DDEE 19, 1/7/16) is often magnified by the discovery process.
This is partly due to the wide variety of systems and time zones potentially encountered during the life of any given document or email, including when and where it is created, received, and archived. That is further compounded by the impact of where and when the document or email is collected, processed or produced. Even further confusion can be introduced by identical files being collected, processed and/or produced by different teams, at different times, using different systems and for different purposes, each of which can easily lead to varied interpretations of the same date and time values. Due to this complexity, eDiscovery software typically allows for the normalization of time-stamped files to a single, particular time zone.

The Electronic Discovery Reference Model (EDRM) recently published draft guidelines which state:

‘One of the fundamental characteristics of Electronically Stored Information (ESI) is time zone. Most electronic data is stored in UTC (Coordinated Universal Time). The user’s operating system uses regional settings on the user’s system to convert the UTC time to the user’s local time zone. In order to avoid discrepancies caused by custodians who travel between multiple time zones, or projects with custodians in multiple time zones, normalization is needed.’[1]

[1] The current March 2015 draft is available for review at the referenced URL, retrieved August 16, 2015 from: http://www.edrm.net/resources/standards/processing

The target time zone, usually a particular offset from UTC, is used to interpret the times found within the documents, in theory providing a coordinated presentation across the collection. In practice, additional factors need to be taken into consideration, including periodic seasonal adjustments to local time and the myriad ways in which time can be digitally stored within electronically stored information.
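As a rough illustration of what normalization accomplishes, the Python sketch below takes a hypothetical three-message email chain whose timestamps were recorded in each custodian’s local time, converts them to a single target zone (UTC here), and sorts them by absolute time. The custodians, dates, and offsets are invented for the example, and the fixed winter offsets are an assumption; this is not a description of how any particular eDiscovery product works.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical chain: labels, times, and UTC offsets are invented for illustration.
NEW_YORK_WINTER = timezone(timedelta(hours=-5))
TOKYO = timezone(timedelta(hours=9))

events = [
    ("reply (New York)", datetime(2016, 1, 12, 10, 20, tzinfo=NEW_YORK_WINTER)),
    ("original (London)", datetime(2016, 1, 12, 15, 0, tzinfo=timezone.utc)),
    ("forward (Tokyo)", datetime(2016, 1, 13, 0, 35, tzinfo=TOKYO)),
]


def order_by_wall_clock(items):
    """Sort by the printed local time only, ignoring each event's time zone."""
    return [label for label, ts in sorted(items, key=lambda e: e[1].replace(tzinfo=None))]


def normalize(items, target):
    """Convert every timestamp to one target zone and sort by absolute time."""
    converted = [(label, ts.astimezone(target)) for label, ts in items]
    return sorted(converted, key=lambda e: e[1])


naive_order = order_by_wall_clock(events)
normalized = normalize(events, timezone.utc)

print("Wall-clock order:", naive_order)
for label, ts in normalized:
    print(ts.isoformat(), label)
```

Read at face value, the New York reply appears to precede the London original by several hours; once both are expressed in the same zone, the reply follows the original by 20 minutes.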
Practice Tips for Timestamps
• Address date stamps and timestamps in your case management order (CMO)
• Discuss date stamps and timestamps with your litigation support team
• Identify timestamp relevance as soon as possible
• Do not make time zone assumptions about printed date stamps and timestamps
• Do not over-rely on unverified timestamps
• Do not over-rely on verified timestamps until you know the purpose for which the timestamp was recorded (relative time, absolute time, sequencing)

Spring Forward, Fall Back and Wreak Havoc

Unfortunately, the practice of using UTC adjusted by a normalizing offset can often fail to represent local time accurately. The primary reason for this discrepancy is the use of daylight saving time (DST) during summer months. DST, another agrarian holdover, is a series of idiosyncratic national, regional and local rules designed to adjust local time to take advantage of earlier sunrises during summer months. These rules vary from country to country (e.g., the U.S. recognizes different dates for DST than the U.K.); from year to year (e.g., the U.S. changes the start and end dates of DST each year); and from locality to locality (e.g., Arizona does not observe DST, but some areas within Arizona do, except for areas within those that do not, except for areas within those that do: see Jeddito, Arizona).

Interpretation of DST is dependent upon specific knowledge of these national, regional and local rules. The rules vary based upon geography, time of year, specific year in question, and the 20, 30, 40, 60 or 120-minute increment in use for that locality. Due to challenges inherent in interpreting DST, especially for historical events, most eDiscovery software ignores DST when converting, normalizing and displaying timestamps. Unfortunately, such an omission in the conversion methodology can result in evidentiary inconsistencies to the uninformed.
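The gap between a fixed normalizing offset and rule-based local time can be sketched in Python using the standard-library `zoneinfo` module (Python 3.9 or later). The sketch assumes a time zone database is available on the machine, and the two sample dates are hypothetical. It converts the same UTC clock time on a winter day and a summer day, once with a fixed -0500 offset and once with the `America/New_York` rules.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # requires a tz database on the system

FIXED_MINUS_5 = timezone(timedelta(hours=-5))  # fixed normalizing offset, no DST
NEW_YORK = ZoneInfo("America/New_York")        # rule-based zone that applies DST


def both_views(utc_ts: datetime):
    """Return the same instant under the fixed offset and under the zone rules."""
    return utc_ts.astimezone(FIXED_MINUS_5), utc_ts.astimezone(NEW_YORK)


def wall_clock(ts: datetime) -> str:
    """Format only the hour and minute, as a printed timestamp might show."""
    return ts.strftime("%H:%M")


# Hypothetical sample instants, both recorded in UTC.
winter = datetime(2015, 1, 15, 23, 5, tzinfo=timezone.utc)
summer = datetime(2015, 7, 15, 23, 5, tzinfo=timezone.utc)

winter_fixed, winter_local = both_views(winter)
summer_fixed, summer_local = both_views(summer)

print("Winter:", wall_clock(winter_fixed), "fixed vs", wall_clock(winter_local), "local")
print("Summer:", wall_clock(summer_fixed), "fixed vs", wall_clock(summer_local), "local")
```

The two conversions agree in January and differ by one hour in July. Both results describe the same instant; only the displayed local time changes, which is why a mismatch tends to surface only when the output is compared against a DST-aware source.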
For example, when converting timestamps to local time for New York City (NYC), this arbitrary approach would subtract five hours (-0500) when calculating the creation time of any document or email. This would be fine for documents created during the winter, but for documents composed during the summer, the conversion would be an hour off. Confusion and contradiction can result when comparing dates and times from this data to other, external dates and times where the external events are properly adjusted for DST (e.g., cell phone, swipe card or historical GPS data).

To Illustrate . . .

A specific example might include a garage swipe card, GPS and email sent at times significant to a defendant’s actions on a summer day in NYC. The access card shows defendant’s car exiting the garage at 5:45 p.m., and GPS records from the car and cell phone show defendant driving home from 5:45 p.m. to 6:30 p.m. In seeming contradiction to this data, a critical email appears to have been sent from defendant’s home desktop computer at 6:05 p.m., raising the obvious question, “Who sent the email?”

Upon further forensic review of the ESI, it is discovered that the email metadata was processed using the arbitrary offset method (-5 hours), while the access card and GPS records accurately report DST (-4 hours). After correcting for this processing error, the home email is determined to have been sent at 7:05 p.m., a time when defendant was clearly at home. A potential alibi has fallen apart.

Obscure Digital Formats and Metadata

One common characteristic of all timekeeping systems, both digital and analog, is the use of a fixed anchor point, an “epoch”, from which the system originates. This epoch can range from the accession of the current emperor (Japan), to the birth of the leader (North Korea), to the incarnation of Christ (Christianity) or to the beginning of Creation (Hebraism).
Just as traditional calendars have various epochs, computer timekeeping systems have their own epochs. There are currently over a dozen major epochs used in digital timekeeping, including January 1, 1601 (Windows FILETIME); November 17, 1858 (OpenVMS VAX); and January 1, 1970 (Unix Epoch Time). There are even a few artificial dates, such as January 0, 1900 (Microsoft Excel).[2]

Questions for Litigators About Timestamps
• How important are timestamps for this litigation and/or this document?
• Does a 24-hour variance matter?
• Does the CMO address timestamps?
• Do your tools handle timestamps and if so, how?
• Has the timestamp been verified and if not, can it be?
• For what purpose was the timestamp originally recorded?
• Has the timestamp been migrated, reformatted or changed over time?
• Can the timestamp be correlated within the document, with other documents and/or through deposition?

Besides being anchored to diverse epochs, each system can count using differing increments. A given digital date and time value may be recorded as the number of 100-nanosecond intervals that have occurred since January 1, 1601, or as the number of seconds since January 1, 1970.

In addition to using different increments and epochs, computer-encoded timestamps may be physically stored using different formats: textually or digitally. When stored textually, the data can usually be viewed using a standard text editor, but when stored digitally, it can require forensic tools to extract. When stored as text, the value can be written in plain text, decimal, hexadecimal, or even octal notation. Within each format, the particular sequencing can vary, and the formats themselves can vary for the same epoch and increment system, even within the same file.

Some Examples

For example, Microsoft Exchange emails contain many timestamps.
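To make the epoch and increment differences concrete, here is a minimal Python sketch that decodes one instant stored two ways: as a count of 100-nanosecond intervals since January 1, 1601, and as a count of seconds since January 1, 1970. The function names and the sample values are illustrative assumptions (the values were chosen to represent midnight UTC on January 21, 2016), and both counts are assumed to be expressed in UTC.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_filetime(ticks: int) -> datetime:
    """Decode a count of 100-nanosecond intervals since 1601-01-01 UTC.

    Python datetimes resolve to microseconds, so the final decimal digit
    of the tick count (the 100 ns place) is dropped by integer division.
    """
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def from_unix(seconds: int) -> datetime:
    """Decode a count of whole seconds since 1970-01-01 UTC."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


# Illustrative values: two very different numbers for the same instant.
sample_filetime = 130_978_080_000_000_000
sample_unix = 1_453_334_400

decoded_filetime = from_filetime(sample_filetime)
decoded_unix = from_unix(sample_unix)

print(decoded_filetime.isoformat())
print(decoded_unix.isoformat())
print("Same instant:", decoded_filetime == decoded_unix)
```

Neither stored number resembles a date, and the two look nothing alike, yet both decode to the same moment once the right epoch and increment are applied. Applying the wrong pairing produces a date that is off by centuries, which is usually easy to spot; subtler errors arise from the time zone and DST assumptions layered on top.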
One timestamp might be in FILETIME format, represented by the number of 100-nanosecond increments elapsed since January 1, 1601. FILETIME entries, when stored as text, are typically written in hexadecimal format. In this format, a single timestamp may appear in multiple locations in the email, some with the least significant byte first (little-endian), some with the most significant byte first (big-endian), and some even in mixed combinations. Another timestamp might be in plain text, written in standard notation easily read and understood without special tools, while yet another may also be in plain text, but written as a FILETIME value in hexadecimal, which requires decoding prior to reading.

Moreover, a given timestamp in a given format (e.g., floating point) may be more precise when it is close in time to its day zero, and less precise when it is far removed in time from day zero. This means you may have nanosecond precision at January 1, 2000, but only seconds or minutes at another date and time.

How many timestamps can appear in a single Microsoft email? The number can vary considerably. In a recent matter, we identified as many as 17 different timestamps in the header of a single email, which included various server routing times, anti-virus and spam scanning times, archiving and journaling times, MIME encoding times, and sent and arrival times (in multiple different formats and locations). Of these 17, FILETIME appears multiple times, in both big-endian and little-endian format, and plain text appears multiple times, both with and without UTC and DST designations.

Why Is This Important?

When all of these different formats, created by different systems, match up in sequence, they increase the integrity and reliability of the date and time represented. In contrast, if some of the common, easily identifiable and accessible dates and times (e.g., plain text versions) don’t match the encoded timestamps (e.g., FILETIME), it may be an indication that the timestamps have been modified and no longer accurately represent the historical record.

[2] This artificial date can be observed by opening Excel, entering “0” in a cell, and then formatting the cell as a Long Date, at which point Excel will report the date as “Saturday January 0, 1900”.

In a recent case, due to eDiscovery software using an arbitrary UTC offset and incorrect assumptions about the time zone for embedded metadata, the “sent” times for several critical email messages were incorrectly reported. When this was discovered, and corrected, the existence of these additional embedded timestamps added significant credibility to the revised times. Reporting that 17 different time values, not just a single timestamp, were relied upon for our analysis, and that all timestamps were appropriately coordinated with each other, added weight to our argument that this was now the correct interpretation. Several of the values appeared in multiple locations and in multiple different formats, again increasing the weight of the argument and providing a strong and difficult-to-refute basis for our conclusions. This is especially important when the conclusions being drawn contradict prior statements and are relevant and substantial.

Interpreting Dates, Times Presented On Documents
• Where and when (date) the document was written;
• The configuration of the server and workstation;
• The epoch and increments of the timestamp;
• The method and format used for storing the timestamp;
• The chain of custody of the document and the conversion or application of timestamps during custody;
• The method of collection and preparation for both review and production.

What Does This All Mean?

Luckily, in most litigation, the content of documents is more important than the precise date and times.
Normalized timestamps that get it right most of the time seem to strike a good balance between accuracy and cost. It would be time consuming and costly to verify the timestamp of each and every document. More often, the appropriate approach is to wait until litigation boils down to relatively few documents, and only then, assuming a concern related to dates or times exists, take steps to verify timestamps.

This concern can arise when sequencing of events is critical, and documents and emails have dates and times in close proximity to each other, especially when litigation involves multiple time zones or even continents. When litigation involves Europe, Asia and/or North America, it can be very common for some normalization shortcuts to result in improperly expressed timestamps which, in turn, affect the apparent sequence of documents.

When the date and time values are of concern and do need to be verified, a background in math, computer science and forensics is helpful to properly identify and decipher the various timestamps embedded within a document or message. (Embedded? Aren’t they simply displayed? Sometimes.) But the more reliable method of determining timestamps is often to decipher computer-encoded timestamps. To the extent that these were captured correctly (another issue for another day), they’re more likely to travel through a chain of custody unaltered, and they’re less likely to be ambiguous as most computer-encoded timestamps are set in UTC. However, there are challenges, and when these issues arise it can require specialized knowledge and experience to address them.

Conclusion

While time can be messy, it doesn’t necessarily follow that time is inherently inaccurate, or that time values in ESI can’t be relied upon. It does mean that a given timestamp shouldn’t be taken at face value without properly understanding the source and the assumptions relied upon to extract and process that timestamp.
For the vast majority of documents, the specific accuracy of the date and time value is not the focus. But when timestamps are critical, don’t assume that the timestamp on a TIFF reflects the time zone or local time you think it reflects. A review of embedded metadata might be in order to confirm and clarify the situation.

Dan Regard, the co-founder and CEO of iDiscovery Solutions, is an electronic discovery and computer science consultant with 25 years’ experience in consulting to legal and corporate entities. Mr. Regard is a member of the Sedona Conference WG1 and WG6, as well as a board member of Georgetown Advanced Institute for e-Discovery.

Charlie Platt, a Senior Managing Consultant at iDiscovery Solutions, has over 25 years’ experience consulting with corporations and clients on information systems development, infrastructure and analysis, digital forensics, cybersecurity and incident response, database administration, eDiscovery cases, software analysis and development, and project management. Mr. Platt is a Certified Ethical Hacker, a Microsoft Certified DBA, and holds certifications in C/C++, Infrastructure and Networking.