Electronic Investigations Workshop
Ian M. MacDonald, Ph.D.
Associate Professor and Chair
Department of Computer Science
The College of Saint Rose

Workshop Description
• “As computers become more advanced, so do criminal activities. Therefore, the computer forensics niche is in constant progression along with the technological advancements of computers,”
  ▫ Frederick Gallegos
• “Cyberspace is an indefinite place where individuals transact and communicate. It is the place between places”
  ▫ Bruce Sterling (1994)
• “Electronic evidence is likely to exist for most crimes!”
  ▫ OAS-REMJA Working Group on Cybercrime (2009)
• In mid-2010, over 1.8 BILLION users were on the Internet exchanging information
  ▫ www.internetworldstats.com
“This workshop addresses the issues faced by investigators in an increasingly high-tech world. The workshop will focus on the investigative aspect of electronic evidence; the use of software to assist in investigations, hardware and operating system fundamentals, defining computer and computer-related crime, document management, searching techniques, identifying possible fraud, and collecting and documenting electronic evidence.”

Disclaimer
• This is an introductory workshop in which we will survey important topics in the field and explore a few of the freely available software tools.
  ▫ The actual tools used in the field are cost-prohibitive for a workshop (>$1500 per license)
• Realistically speaking, obtaining proficiency with respect to the workshop goals would take several semesters (if not years) of study and experience.
  ▫ The good news is that many of these skills can be self-taught!
  ▫ This workshop should give you a jump start and/or help you determine if this is something you would like to learn more about

Disclaimer (continued)
• I am not a lawyer, nor am I qualified to lecture on any of the legal policies and/or procedures involved in the collection of digital evidence (such as search warrants, etc.).
Therefore, I will not be covering any such topics.
• I am not an auditor or investigator, nor do I represent any law enforcement agency. Therefore, I will provide limited coverage with respect to the “applied” aspect of electronic evidence collection and cyber-forensics.

Crime Terminology
• Computer Crime (these are the SAME)
  ▫ Any criminal act committed via a computer
• Computer-related Crime
  ▫ Any criminal act in which a computer is directly or indirectly involved
• Cybercrime
  ▫ Any criminal act in which the Internet is involved (and therefore, multiple computer systems)

What is computer forensics?
• AKA electronic investigations, system forensics, digital forensics, electronic discovery, etc.
• Term coined in 1991 at the first training session held by IACIS (International Association of Computer Investigative Specialists)
• Refers to the tools and area(s) of expertise required to effectively collect, investigate and analyze evidence within the digital realm

Where is digital forensics used?
• Law enforcement agencies
• Government agencies / military
• Law firms, criminal prosecutors
• Academic institutions (research)
• Corporations
• Insurance companies
• Individuals

Criminal Behavior
Profiles, Statistics and Challenges of Computer Forensic Investigations

Concerns
• Computer/Cyber crime often lacks physical boundaries
  ▫ “cross borders” without a passport (virtually, of course)
  ▫ Difficult to coordinate law enforcement in multiple countries
  ▫ Some countries do not cooperate at all!
• Lack of physical evidence
  ▫ Criminal does not need to acquire expensive “tools”

Perceived Insignificance
• More than 1/3 of polled officers believe that the investigation of computer crime is not necessary
  ▫ i.e. “interferes with the ability to focus on traditional crime”
  ▫ Many investigators focus exclusively on child pornography
• Cyber criminals themselves are viewed as harmless “geeks” or “nerds”
  ▫ This is seldom the case, however!

Ability to Investigate?
• >34% of agencies surveyed had at least 1 individual who had “received training” in computer crime investigations
  ▫ <19% feel this person is competent
  ▫ 12% feel this person can actually do computer forensic examinations
• 70% of those “trained” claim the training was “basic, general, introductory, etc.”
  ▫ Long story short... Many computer/cyber criminals are outpacing law enforcement.
  ▫ Fortunately, many of the criminals are non-specialist users!

Ability to Prosecute?
• Prosecutors lack sufficient knowledge to prosecute computer crime!
  ▫ Remember, in order to be prosecuted, you first have to be caught!
• Often prosecutors place low priority on computer crime
  ▫ Violent crimes attract much more attention

The vicinage problem
• The vicinage refers to the location of the physical act (of crime, presumably)
• Identifying the vicinage is very difficult when the crime occurs in cyberspace!
• There are no international guidelines for cyberactivity
• Example from the text: Suppose an individual in Washington, DC uses a server in Canada to send a threatening e-mail to the president of the United States. To complicate matters, let’s assume that this individual utilizes an anonymizer located in Germany, although the perpetrator and the victim are in the same area.
  ▫ What authorities must cooperate with the USA in order to locate the individual?

Other Complications
• The degree of anonymity is closely related to the amount of inter-jurisdictional communication required to locate an individual
• More advanced offenders use:
  ▫ encryption
  ▫ steganography
  (More on these later!)
• Legislation has been proposed that would make encryption keys “discoverable” under court order
  ▫ How in the world will they pull this one off?
Lack of Reporting
• CSI/FBI Computer Crime and Security Survey (Goroshko, Ludmila, 2004) report:
• The majority of Fortune 500 companies are electronically compromised each year
  ▫ At least $10 Billion in losses per year
• 75% of all businesses have been compromised at some point
  ▫ 45% from “insiders”
• Yet, only 17% of such victimizations are reported to law enforcement!!!!

What does it mean to be anonymous?
• Anonymous e-mail account registration
• Anonymous forum membership
• Anonymizers
  ▫ Sites that mask the IP address of a user
  ▫ Accomplished through rerouting, deletion/re-encapsulation of packet header information
• Re-mailers
  ▫ A form of anonymizer that strips the source address and other header information from an e-mail message, then re-sends it with alternate information
    ◦ Some re-mailers then send these e-mails out to other re-mailers
    ◦ The forwarding of e-mail messages is often intentionally delayed by random time intervals

Who is most at risk?
• According to a Department of Justice study:
  1. Businesses
  2. Individuals
  3. Financial Institutions
• Typical criminals:
  ▫ “Insiders”, i.e. long-term employees (ages 20 – 45)
  ▫ Most trusted (i.e. most authority/access)
• Motives:
  ▫ Revenge
  ▫ Greed
  ▫ Resentment

What is most investigated?
• A study (Hinduja, 2004) reports that agencies investigated the following (most common first):
  1. harassment / stalking
  2. child pornography
  3. forgery
  4. identity theft
  5. e-commerce fraud
  6. solicitation of minors

Offender Profiles
• Ages 16 – 57
  ▫ Most prevalent in upper 30’s to 40’s
• Minimum of high school diploma
  ▫ Many with college degrees
• Moderate to high technical ability
• Few (or no) prior arrests
• Possession of highly capable computer equipment (and mass storage)
All of the above describes the majority of all Internet users!!!
Based on this information, do you think it is possible to use profiling to identify potential offenders?
Federal Resources
• FBI, Secret Service, CEPTF (Child Exploitation and Pornography Task Force)
• Very capable of investigating computer crime
• However, physically impossible to help all state/local agencies
• In order for the feds to get involved, the crime must:
  ▫ Threaten public safety
  ▫ Involve exploitation of children
  ▫ etc.

Real Case: “Maxus”
• Hacker by the name of “Maxus” gained access to almost a half-million credit card numbers from CD Universe.
• Maxus demanded $100,000 in blackmail to keep the numbers from being released to the public
• CD Universe alerted the FBI
  ▫ 25,000 credit card numbers were compromised
  ▫ “Maxus” still at-large

Extent of Victimization Experience by American Corporations
The SCARY Truth!
• 25% detected external system penetration
• 27% detected denial of service
• 79% detected employee abuse of Internet privileges
• 85% detected viruses
• 19% suffered unauthorized use
• 19% reported 10 or more incidents
• 35% reported 2 to 5 incidents
• 64% acknowledging an attack reported web site vandalism (60% reported denial of service)
• Over 260 million dollars in damages were reported by those with documentation
  ▫ Unreported money loss estimated to be much higher!

Real Case: Western Union
• September, 2000
• Western Union is the world’s leading money transfer agency
• Over 15,000 credit/debit numbers were obtained by intruders
  ▫ This caused WU to temporarily shut down its website!!!
  ▫ Unknown how much money was lost due to the downtime.

Hardware Theft
• Generally speaking, computer hardware is seldom secured
  ▫ Often available in public areas without any security!
• Small is a big problem
  ▫ Components (e.g. memory) are extremely small and can fit in a pocket, wallet, etc.
• Difficult to trace missing components
• There is a market for this stuff!
  ▫ Many online auctions selling stolen merchandise
  ▫ Black market dealers
  ▫ Market values differ
    ◦ For example, at one time a $1K CPU in the US was worth $3K in the UK

Intellectual Property Theft
• AKA software theft
• Real Case: August 2001:
  ▫ FBI arrested a group of men possessing $10 Million in counterfeit Microsoft software
  ▫ DVD install discs had mock hologram
• Data Piracy
  ▫ Also referred to as software piracy
  ▫ The reproduction, distribution and use of software without the permission of the copyright owner.
  ▫ Very, very difficult to prevent

Software Piracy
• Shareware
  ▫ Freely distributed software, but different from “freeware”
  ▫ Sharing with friends/colleagues is encouraged
  ▫ Authors ask for a voluntary donation from users (but it is not required)
    ◦ Occasionally “registered” versions of the software may have enhanced functionality to encourage donations
• WareZ
  ▫ Commercial programs that have been made publicly available through “wareZ sites”
  ▫ Owners/administrators of wareZ sites are highly elusive and well-educated, and therefore often avoid prosecution

Electronic Evidence

Electronic Evidence
Def’n: data and information of some investigative value that are stored on or transmitted by an electronic device (usually in digital form)
• What does it mean to be in “digital form”?

Electronic Evidence
• Circumstantial
  ▫ Indirect. Obtained by synthesizing an idea from seemingly unrelated facts
• Physical
  ▫ Factual, undeniable evidence.
  ▫ Interpretation may be prone to error.
• Hearsay
  ▫ Statements made out of court by someone not giving testimony (generally not admissible)
• Repeatability and Reproducibility
  ▫ Repeatability – the ability to get the same results in the same testing environment
  ▫ Reproducibility – the ability to get the same test results in a different testing environment

Chain-of-Custody (Chain-of-Evidence)
• The route the evidence takes from initial possession until final disposition.
• Very important with respect to computer forensics!
  ▫ Careful record keeping and procedures will help to ensure a valid chain-of-custody
  ▫ Failure to do so can get the evidence you have collected dismissed!

The Four-Step Process
Acquisition → Identification → Evaluation → Presentation
• Acquisition
  ▫ Gathering evidence/data from a crime either currently in progress or, more commonly, one that has already occurred.
• Identification
  ▫ Evidence classified with physical (e.g. a hard drive) and logical relationships (e.g. the location of evidence on the drive)
• Evaluation
  ▫ Is the computer evidence valid/relevant?
  ▫ Quality of evidence, not quantity!
• Presentation
  ▫ Filter out non-critical evidence and decide on the most profound exhibits to use for legal proceedings

Discussion Points
• Why are individual victims reluctant to report computer crime?
• Why are private corporations reluctant to report computer crime?
• What can be done by a company to help prevent computer crime (both internal and external)?
• Why do you think bulletin boards (and chat rooms) are favored by some deviant subcultures?

Computer Terminology
Hardware, Software, File System Structure

Hardware Basics
• Three basic computer components:
  ▫ Hardware
  ▫ Software
  ▫ Firmware

Data & Storage Basics
• The structure of data is extremely simple: Binary
  ▫ Note: If you do not understand the binary number system, please read up on this (you learned this in CIS111)
• Bit = Binary Digit (0 or 1)
  ▫ Maybe 0 = off, 1 = on
  ▫ Maybe 0 = false, 1 = true
• Byte = 8 bits

Bigger Bytes!
• All larger byte representations are in blocks of size 2^N
  ▫ Kilobyte = 2^10 bytes (or 1,024 bytes)
  ▫ Megabyte = 2^20 bytes (or 1,048,576 bytes)
  ▫ Gigabyte = 2^30 bytes (or 1,073,741,824 bytes)
  ▫ Terabyte = 2^40 bytes (or 1,099,511,627,776 bytes)
• Abbreviations are KB, MB, GB, TB, respectively.
• Roughly speaking, the entire Library of Congress could fit (uncompressed!) on 10 TB

Take the bus!
• Buses
  ▫ Sets of parallel wires connecting various components
  ▫ Parallel wiring allows several bits to be transmitted simultaneously
• USB (universal serial bus)
  ▫ Allows quick connection to system bus

Nibbles & Bytes
[Figure: one box = 1 byte; Word and Double Word are drawn as groups of boxes]
A single character can be represented by a single byte (spaces and newlines are considered characters)
Therefore, how many bytes would it take to store the phrase Forensic Computing? How many words?

Hardware
• Motherboard:
  ▫ Primary circuit board to which all other components attach
• PC cards – “expansion” cards. Not so common anymore (what used to be optional components are now standard on the motherboard)...
  ▫ Connected to PCI (peripheral component interconnect) express bus
• CPU
  ▫ Hz = Cycles per second
  ▫ MHz = 1 Million Hz
  ▫ GHz = 1 Billion Hz

RAM
• Random Access Memory
• Temporary storage of application(s) and some data
• Volatile

Hard Drives (HDD) & Mass Storage Devices
• “permanent” storage solution
• death of the floppy disk!
  ▫ 5.25” arrived in the 1970’s, then 3.5” in the 1980’s
  ▫ Many people still have boxes full of floppy disks containing valuable data
• CD/DVD/Blu-ray RW Drives becoming standard
• Storage:
  ▫ CD storage ≈ 650 – 850 MB
  ▫ DVD storage ≈ 4.7 GB (dual layers ≈ 8.5 GB)
  ▫ Blu-ray storage ≈ 25 GB (dual layers ≈ 50 GB)

Other Mass Storage Media
• Memory cards (AKA flash memory)
  ▫ SD
  ▫ Micro-SD
  ▫ Flash drives (AKA jump drives, dongles, etc.)
  ▫ Compact Flash (CF)
  ▫ xD

Multi-Format Card Readers
• Commonly installed in most desktops (also becoming common in laptops)
• Necessary for digital camera / video enthusiasts.
• Necessary for digital forensic lab

Software
• Three major types:
  ▫ Boot sequence
  ▫ Operating system (OS)
  ▫ Application software

Handheld Devices
• Thousands of handheld devices have been released over the years
• Many of these devices mimic the capabilities of standard computers
• Frequently, these devices contain crucial digital evidence

Boot sequence
• Series of steps taken by the computer starting immediately after it is powered on
• “pulling itself up by its bootstraps”
• Initial boot sequence loads low level software/data from CMOS
• Once completed, the OS begins to load
[Figure labels: Battery; CMOS (Complementary metal-oxide semiconductor): small memory chip on the motherboard]

Operating Systems (OS)
• A “layer” of software that provides:
  ▫ A level of abstraction from the hardware
  ▫ An interface between the application programs and the hardware
  ▫ A method of visually accessing the file system (and other components)
  ▫ GUI
• Most popular:
  ▫ Microsoft Windows (1985)
  ▫ Mac (1980’s) (new systems are Unix-based)
  ▫ Unix (1969)
  ▫ Linux (1991) – Unix-like & freely distributed

Elementary Networking
• Routers
  ▫ Special-purpose computers to handle connections between 2 or more networks
  ▫ Kind of like “traffic cops” for packets
• Hubs
  ▫ Central switching devices
  ▫ “Dumb routing”
• Packets
  ▫ Relatively small chunks of data labeled for delivery somewhere on a network
  ▫ Consist of control information (header) and data

IPs / Domains / DNS
• IP Address
  ▫ 32-bit (4-byte) logical numerical identifier for a machine (must be globally unique*)
    ◦ Kind of like a “phone number”
    ◦ IPv6 uses 128-bit addresses
    ◦ Ex: 36.231.98.53
• Domain
  ▫ A group of IP addresses identified by a domain name (e.g.
strose.edu)
• DNS (Domain Name System)
  ▫ In short, this is a mapping from domain names to IP addresses

Internet
• Started: ARPANet (September, 1969)
• network
  ▫ The interconnection of 2 or more communicating devices
• internet
  ▫ The interconnection of 2 or more networks
  ▫ “The” Internet is the most well-known example
• Note: Internet ≠ WWW

Cookies
• Used by HTTP
• Pieces of information sent from a web server to a local host machine (web browser)
• Saved on local machine
• Sent back to the server when requested
• Examples:
  ▫ Login/registration information
  ▫ Shopping cart contents & info
  ▫ User preferences for a site

WWW
• World Wide Web
  ▫ The layer of software & applications that “sit” on the physical Internet.
• Contents:
  ▫ Web pages / sites
  ▫ Newsgroups / Bulletin Boards
  ▫ IRC (Internet relay chat): “chat rooms”
    ◦ Individuals use “nicknames” that must be unique for each system. Therefore, an individual belonging to several chat sites may have multiple nicknames

Connections to the Internet
• Digital Subscriber Line (DSL)
  ▫ Various types: ADSL, HDSL, RADSL
  ▫ Use standard phone line (if communications hardware is available within your area)
• Cable Modem
  ▫ Use standard co-ax cable (usually cable TV provider offers this service)
• Dial-up Modem
  ▫ Uses phone line (very slow connection speed)
• Satellite
  ▫ For those in the “boonies”

Microsoft File System
Very Brief Overview

FAT & NTFS
• AKA Windows/DOS
• FAT (file allocation table)
  ▫ Older MS file system, but frequently used as the file system in removable media
  ▫ Locations of data on disk stored in table(s)
  ▫ History: FAT12, FAT16, then FAT32
    ◦ Ex: FAT32 records each file’s size in a 32-bit field (max file size is therefore just under 2^32 bytes, or about 4.3 GB)
• NTFS (new technology file system)
  ▫ Replacement for FAT
• FAT/NTFS determine how/where files can be “hidden” on the disk

Partitions
• A portion of the HDD separated from others
• Example:
  ▫ You can install 2 OS’s on the same machine (but each usually resides on its own partition)
• Partition gaps
  ▫ Data
can be hidden between partitions
  ▫ A utility can be used to remove references to hidden locations on the drive
    ◦ Utilities include Norton Disk Edit, WinHex, Hex Workshop

Registry
• Hardware / Software configuration database
• Access to registry through REGEDIT application

Windows File System Structure
A Brief Overview

Hard Disks: Sector Format
• A sector is the basic 512-byte unit of storage on a HDD.
• In addition to data, each sector stores some control information:
  ▫ ID: sector number that identifies it on disk (and contains status info)
  ▫ Synchronization fields: help guide the read process
  ▫ Error-Checking Code (ECC): for data integrity
  ▫ Gaps: Spaces provided to allow enough time for the drive controller to continue the read process

Clusters
• Smallest logical storage unit on a HDD
• Contiguous “chunks” of space managed by the file system for efficient storage
• Clusters can range in size from 4 sectors (2,048 bytes) to 64 sectors (32,768 bytes)
• Example: Consider a 3,000-byte (about 2.9 KB) file
  ▫ If the system uses 2,048-byte clusters, how many clusters would this file occupy?
  ▫ How much space is wasted?
• Slack Space
  ▫ The area between the end of a file and the end of a cluster.
  ▫ This unused space is still assigned to the file
[Figure: sector layout, control information followed by 512 bytes of data]

Windows - NTFS
• New Technology File System
• Capable of self-repair and high-performance
• Supports:
  ▫ Large volume storage
  ▫ File-level security (encryption, decryption)
  ▫ Compression
  ▫ Auditing
• NTFS Master File Table (MFT)
  ▫ Stores information regarding file attributes
  ▫ NTFS reserves space for the MFT to maintain it as it expands
• When the number of files on an NTFS volume increases, the size of the MFT increases
• Utilities that defragment NTFS volumes on Windows systems cannot move MFT entries

NTFS – Deleting Files
• Files deleted within Windows are moved to the Recycle Bin
  ▫ All information about the original file (and original location) is maintained
• Files deleted from a command prompt are not moved to the Recycle Bin
  ▫ However, we can still recover all or part of the file using Windows forensic tools

NTFS – Data Streams
    C:\Java>dir
     Volume in drive C has no label.
     Volume Serial Number is 1E9E-1A20

     Directory of C:\Java

    03/02/2010  01:57 PM    <DIR>          .
    03/02/2010  01:57 PM    <DIR>          ..
    02/06/2010  10:03 PM         8,680,549 drjava.jar
    03/02/2010  01:49 PM            40,960 pmdump.exe
                   2 File(s)      8,721,509 bytes
                   2 Dir(s)  82,108,739,584 bytes free

    C:\Java>echo text_message > myfile.txt:stream1

    C:\Java>dir
     Volume in drive C has no label.
     Volume Serial Number is 1E9E-1A20

     Directory of C:\Java

    03/16/2010  10:41 AM    <DIR>          .
    03/16/2010  10:41 AM    <DIR>          ..
    02/06/2010  10:03 PM         8,680,549 drjava.jar
    03/16/2010  10:41 AM                 0 myfile.txt
    03/02/2010  01:49 PM            40,960 pmdump.exe
                   3 File(s)      8,721,509 bytes
                   2 Dir(s)  82,108,731,392 bytes free

    C:\Java>more < myfile.txt:stream1
    text_message

What happened? myfile.txt is listed with 0 bytes, yet the free space dropped by 8,192 bytes.

NTFS File Streams
• A file stream (AKA alternate data stream) is the ability to hide data “behind a file”
  ▫ This can be text, images, etc… or a virus!
• Example:
  ▫ You could create a simple file 1KB in size, like resume.txt
  ▫ Then… attach a 3MB executable file to it (hidden, of course)
  ▫ The file system would report this file as a 1KB resume.txt file
• Deleting the file deletes the stream
• As an exercise:
  ▫ Next time you are on your laptop, try hiding an executable file behind a simple text file. Look at the file size both at the command prompt and with the Windows GUI

WinHex
• A hexadecimal editor capable of viewing all types of files, including hidden files, streams and slack space on virtually any media.
• The tool allows you to examine the contents of RAM as well.
• Free version has just enough capability for personal use. A commercial version has many more features (and fewer annoying limitations)
[Figure: WinHex Screen Shot]

Text Files
• By far, the simplest of all file formats
• Usually contain simple ASCII text. In other words, the file is simply a set of contiguous bytes (one for each character)
  ▫ Special characters, such as line-feed, carriage return, space, tab, etc. exist in there as well
• Several types of files are plain-text:
  ▫ HTML files
  ▫ Source-code files
  ▫ Log files
  ▫ Etc.
[Figure: the same file in a plain-text editor (Notepad++) and in WinHex. File Size: 120 bytes]

Word Processor and Spreadsheet Documents
• As we know, these contain much more data than just the text.
• In fact, the file size can be several thousand times larger than the equivalent text-only version!
• Much of the “extra” information can be parsed using WinHex (or similar tool)
  ▫ Locate user, group (company), author, revision history, etc.
[Figure: File Size: 26,112 bytes!!! I had to scroll all the way down to here just to see the document contents]
[Figure: File Size: 9,542 bytes. Notice the lines and lines of data showing up within WinHex for a simple document! Author (user). WOW!!!]
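To make the "contiguous bytes, one per character" idea concrete, here is a minimal Python sketch of the kind of listing a hex editor displays (offset, hex bytes, printable characters). It is only an illustration and not how WinHex itself works; the function name `hex_dump` and the sample text are assumptions made for this example.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return a hex-editor style listing: offset, hex bytes, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Two hex digits per byte, separated by spaces
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes (carriage return, line-feed, tab, ...) show as "."
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Hypothetical sample: the phrase from the Nibbles & Bytes slide plus a
# Windows-style line ending (carriage return + line-feed).
sample = "Forensic Computing\r\n".encode("ascii")
print(len(sample), "bytes")
print(hex_dump(sample))
```

Each character, including the space and the two line-ending characters, occupies exactly one byte, which is why a plain-text file's size matches its character count. To inspect a real file, the same function could be given the result of `open(path, "rb").read()`.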
Image Files
• Most image files have a header record (at the very beginning of the file) that indicates that it is, in fact, an image
  ▫ JPEG files have “JFIF” within the header
  ▫ PNG files have “PNG” within the header
• Therefore, opening a suspected image file within a hex editor can be an easy way to determine the actual file type.
Note: Time permitting, there will be a more detailed discussion regarding digital images

Image Files
• JPEG (by far, most popular)
  ▫ Joint Photographic Experts Group (1986)
• Two compression modes:
  ▫ Lossless (compression ratio ≈ 3:1)
  ▫ Lossy (compression ratio ≈ 10:1)
• “Q number” controls the quality
  ▫ Q = 100 denotes highest quality, minimal compression
  ▫ The lower the Q number, the lower the quality, and hence the higher the compression (and quantization)
[Figures: JPEG compression process (Lyu, S., 2010)]

Steganography
• Steganography
  ▫ Practice of embedding hidden messages within a carrier medium
• Modern steganography works by replacing bits of useless or unused data in regular computer files with bits of different, invisible information
• Steganography can also be used to supplement encryption

Application of Steganography
• Purposes:
  ▫ Medical records
  ▫ Workplace communication
  ▫ Digital music
  ▫ Terrorism
  ▫ The movie industry

Detecting Steganography
• Indicators include:
  ▫ Software clues on the computer
  ▫ Other program files
  ▫ Multimedia files
• Detection Techniques

Classification of Steganography (cont’d.)
• LSB (least significant bit) substitution replaces the rightmost bit in the binary notation of a carrier byte with a bit from the embedded message.
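The LSB idea described above can be sketched in a few lines of Python. This is a toy illustration on a plain sequence of bytes standing in for pixel values, not the method used by any particular steganography tool; the names `embed_lsb` and `extract_lsb` and the sample data are assumptions made for this example.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Hide message bits in the rightmost bit of successive carrier bytes."""
    # Break the message into individual bits, most significant bit first
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the rightmost bit, then set it
    return bytes(out)


def extract_lsb(carrier: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of hidden data from the rightmost bits of the carrier."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for byte in carrier[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(200, 232))    # 32 stand-in "pixel" bytes (hypothetical data)
stego = embed_lsb(cover, b"Hi!")  # 3 bytes of message need 24 carrier bytes
print(extract_lsb(stego, 3))
```

Because only the rightmost bit of each carrier byte can change, every byte differs from its original by at most 1, which is why the change is normally invisible in an image. The receiver has to know how many bytes to read back; real tools typically store a length or marker as well.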
Tools
• Tools include:
  ▫ 2Mosaic, FortKnox, BlindSide, S-Tools, StegHide, Snow, Camera/Shy, Steganos, Pretty Good Envelope, Gifshuffle, JPHS, wbStego, OutGuess, Invisible Secrets 4, Masker, Data Stash, Hydan, Cloak, StegaNote
  ▫ Stegomagic, Hermetic Stego, StegParty, StegoSuite, StegoWatch, StegoAnalyst, StegoBreak, StegSpy, Stego Hunter
  ▫ WNSTORM, Xidie, CryptArkan, Info Stego, Stealth Files, InPlainView, EzStego, Jpegx, Camouflage
  ▫ Scramdisk, CryptoBola JPEG, Steganosaurus, ByteShelter I, appendX, Z-File, MandelSteg and GIFExtract
• Detection techniques (from the Detecting Steganography slide):
  ▫ Statistical tests
  ▫ Stegdetect
  ▫ Stegbreak
  ▫ Visible noise
  ▫ Appended spaces and invisible characters
Here’s an example using an online tool “Steganografie”
http://www.kwebbel.net/stega/enindex.php

Cryptography
• Cryptography
  ▫ Art of writing text or data in a secret code
• Three types of cryptographic schemes used:
  ▫ Secret-key (or symmetric) cryptography
  ▫ Public-key (or asymmetric) cryptography
  ▫ Hash function
• Steganography Versus Cryptography
  ▫ Steganography replaces bits of unused data from various media files with other bits that, when assembled, reveal a hidden message
  ▫ In cryptography an encrypted message that is communicated can be detected but cannot be read

Watermarking
• Digital watermarks
  ▫ Digital stamps embedded into digital signals
• Application of Watermarking
  ▫ Embedding copyright statements
  ▫ Monitoring and tracking copyright material
  ▫ Providing automatic audits of radio transmissions
  ▫ Supporting data augmentation
  ▫ Supporting fingerprint applications
• Steganography Versus Watermarking
  ▫ Main goal of steganography is to protect the data from detection, while that of watermarking is to protect data from distortion by others

Now, let’s go through tutorial 1
http://academic2.strose.edu/math_and_science/macdonai/EEWorkshop

What is Identity Theft / Fraud
• What is identity?
• Affects over 10 million Americans each year.
• Methods:
  ▫ Eavesdropping
  ▫ Postal mail theft
  ▫ Dumpster diving
  ▫ Computer theft / hacking
  ▫ … and more!
• Motives include:
  ▫ Economic gain
  ▫ Access to secure information
  ▫ Revenge (rare)

Defining & Identifying Fraud
Focus on Identity Theft & Fraud

Implications
• Financial loss (and inconvenience)
• Tarnished reputation
• Unauthorized access
• National security!
  ▫ Border crossings
  ▫ Immigration
  ▫ Airline & public transportation security

Refining Our Definition of Identity Theft vs. Identity Fraud
• Identity Theft
  “the illegal use or transfer of a third party’s personal identification information with unlawful intent”
• Identity Fraud
  “a vast array of illegal activities based on fraudulent use of identifying information of a real or fictitious person”
• Main types of identity theft/fraud in the US:
  1. Assumption of identity
  2. Theft for employment and/or border entry
  3. Criminal record identity theft/fraud
  4. Virtual identity theft/fraud
  5. Credit or financial theft

Assumption of Identity
• Rare, difficult to pull off
• Criminal assumes the identity of the victim
  ▫ Personal histories
  ▫ Friendships / relationships
  ▫ Job
  ▫ Etc.

Theft for Employment and/or Border Entry
• Illegal immigration is a serious problem in the US!
• Real case: INS has seized/intercepted tens of thousands of fraudulent documents, such as:
  ▫ Alien registration cards
  ▫ Visas
  ▫ Passports
  ▫ Citizenship documents
  ▫ Employment eligibility documents

Criminal Record Identity Theft/Fraud
• In this case, an innocent victim may appear to have a criminal record
• Often goes unnoticed by victim for a long period of time
• The burden is often on the victim:
  ▫ First, the victim must clear his/her name and prove innocence
  ▫ Second, must obtain a court order to expunge the record(s)
  ▫ This could all take years to resolve
  ▫ This typically costs the victim a substantial amount of money in legal fees

Credit Identity Theft/Fraud
• Most common
• Identity theft/fraud to facilitate the creation of fraudulent accounts / credit cards
• Does not include stolen credit cards, rather just someone’s credit
• In 2006, the FTC reported 3.2 million Americans fell victim to credit identity theft/fraud.

Virtual Identity Theft/Fraud
• The development of a fraudulent virtual personality
• Easy to construct:
  ▫ e.g. I could create a virtual identity that I am 6’9” and bald with a historic college basketball career.
• Often used for online dating, flirtations
  ▫ Old pictures, fake age, etc.
• May be used as a method for stalking/harassing or financial fraud

Reporting
• Again, we are faced with inaccuracies in the statistics for identity theft/fraud
  ▫ Delay in reporting/awareness
  ▫ Private companies want to protect their own interests
  ▫ Lack of mandatory reporting to federal agencies
  ▫ Lack of national measurement standards
• Information sources
  ▫ Credit reporting agencies
  ▫ Software companies
  ▫ Popular and trade media
  ▫ Government agencies
• Often identity theft/fraud is simply a component in a larger crime and is therefore not separately reported

Some Statistics
• 2002 General Accounting Office – first study of ID theft/fraud
  ▫ Identity theft/fraud cases are increasing
  ▫ Most common consumer complaint to FTC
  ▫ ~5% of those surveyed had been victims of identity theft/fraud
  ▫ ~6% of Americans had seen unauthorized purchases on credit cards
  ▫ ~13% had discovered the misuse of their personal information
  ▫ Of FTC complaints:
    ◦ 42% identity theft used in conjunction with credit card fraud
    ◦ 20% unauthorized telecommunications/utility services
    ◦ 13% bank fraud
    ◦ 9% personal info used for employment purposes
    ◦ 7% fraudulent loans
    ◦ 6% falsifying government documents or fraudulent receipt of benefits

Methods of ID Theft
• Mail theft
  ▫ Personal mail contains personal/sensitive info
  ▫ Often mailboxes are not secure storage for mail!
  ▫ “popcorning”
    ◦ Targeting mailboxes with the red flag up (i.e. outgoing mail)
    ◦ Often contain credit card payments, information, etc.
  ▫ Consider using the post office for sending sensitive mail
• Insiders
  ▫ Yes, again!
  ▫ Can be either intentional or accidental
  ▫ Example: In 2005 Citigroup reported that UPS lost the personal financial information of about 4 million customers
    ◦ Talk about blaming someone else!!!
• Fraudulent or Fictitious Companies
  ▫ Collect and process information either voluntarily from naïve customers or without the victim’s knowledge at all
  ▫ Examples:
    ◦ Choicepoint - A US company that collects all sorts of information about people across the country
• Dumpster Diving
  ▫ Sifting through commercial/residential trash looking for sensitive documents
  ▫ Combating this:
    ◦ Paper shredders
    ◦ Disk-wiping software
• Physical Computer Theft
  ▫ Very common
  ▫ Personal information can be retrieved very quickly

The Cost of Victimization
• Disclaimer: Lack of accurate / mandatory reporting makes cost estimation difficult
• Cost should be measured by both:
  ▫ Loss of money (may be recoverable)
  ▫ Loss of time (not recoverable)
• Average time before awareness of identity theft/fraud activity is 12 – 14 months (Stuart et al., 2004)
• Costs to economy exceeded 50 billion in 2007 (America)
  ▫ Much less in other countries
• Common victim profile:
  ▫ White male, early 40s, living in metropolitan areas

Methods of ID Theft (continued)
• Bag Operations
  ▫ Sneaking into a hotel room to obtain information from computers, paperwork, removable media, etc.
  ▫ Entire hard drives can be copied onto removable media in a short period of time.

Danger at the ATM
• Reading / recording information from the magnetic strip on an ATM or credit card
  ▫ This information can then be “re-printed” on a secondary card and used for fraudulent purchases or cash withdrawals
• Card skimmers
  ▫ Mini cameras/copiers mounted near ATMs (or near point-of-sale machines)
  ▫ See images on next slide…

[Images: card skimming equipment installed over an existing card reader; PIN # camera]

Virtual/Internet Methods
• Many Internet protocols are not secure
  ▫ SMTP (application-level e-mail protocol)
  ▫ FTP (application-level file transfer protocol)
  ▫ HTTP (application-level protocol for transferring web pages from server to client)
• Phishing
  ▫ Solicitation of information via e-mail
  ▫ Redirection to fake websites
  ▫ Stats 2004-05
      73 million Americans received at least 50 phishing e-mails
      2.4 million reported losing money
• Popular phishing examples
• Categories of phishing
  ▫ spoofing
      Using company trademarks/logos so that the e-mail appears to be valid (30% linked to eBay/PayPal!)
  ▫ pharming
      redirects the IP address from a legitimate site to a phishing site (accomplished through DNS modification, virus, etc.)
  ▫ redirectors
      redirect network traffic to undesired sites
      often redirects to fraudulent DNS servers (most DNS look-ups are “good” to delay user detection)
  ▫ advance-fee fraud (or 419 fraud)
      promise of large financial windfall if personal information is given
  ▫ phishing Trojans & spyware
      executable malicious code
      typically attached to e-mail, but can also be executed remotely
• Keyloggers
  ▫ Devices/software which record keystroke data
  ▫ Inexpensive ($20 - $300)
  ▫ USB loggers now available
  ▫ Recorded data can be stored locally then “collected”, or transmitted to a remote location periodically
  ▫ May also capture screen shots
  ▫ Goal is most often to retrieve usernames/passwords

Protecting Yourself
• Monitor the use of your cards/accounts!!!
• Check with the major credit reporting agencies:
  ▫ Equifax (www.equifax.com) 1-800-685-1111
  ▫ Experian (www.experian.com) 1-888-EXPERIAN
  ▫ TransUnion (www.tuc.com) 1-800-916-8800
• FDIC can answer some FAQs about regulations and banking practices
  ▫ Google “Fair and Accurate Credit Transactions Act (FACTA)” under the FTC / FDIC
• Check-verification services
  ▫ If you suspect someone is writing checks under your name, contact the merchant’s check-verification company
      Ex: www.checkrite.com, www.crosscheck.com, www.equifax.com, www.telecheck.com, etc.

Tips to Avoid ID Fraud
• From Newman text:
  ▫ Be suspicious of contests
  ▫ Beware of imposters
  ▫ Beware of suspicious downloads
  ▫ Do not respond to unsolicited e-mails
  ▫ Guard personal information
  ▫ Look for complaints and complaint processes
  ▫ Pay using the safest method
  ▫ Remember that easy money does not exist!!!!
  ▫ Research the dealer or vendor
  ▫ Resist pressure
  ▫ Understand the offer!

Contacts
• Government
  ▫ Consumer.gov – www.consumer.gov
  ▫ FBI – www.fbi.gov
  ▫ FDIC – www.fdic.gov
  ▫ Federal Trade Commission – www.ftc.gov
  ▫ US Postal Service – www.usps.com/postalinspector
• Nongovernment
  ▫ Better Business Bureau – www.bbb.org
  ▫ National Association of Attorneys General – www.naag.org
  ▫ National Consumers League – www.nclnet.org
  ▫ National Fraud Information Center – www.fraud.org

Collecting and Documenting Electronic Evidence

Initial Steps
• If you are the “first responder”
  ▫ Make every effort to not move the electronic device(s)
• Wherever and whenever possible obtain a detailed record using:
  ▫ Video
  ▫ Photography
  ▫ Notes/sketches

Documentation – Initial Steps
• Immediately record:
  ▫ The type of device
  ▫ Location
  ▫ Position of computer(s)/device(s)
  ▫ List of connected peripherals
  ▫ Any wireless access points (WAPs) that may be capable of connecting to other devices
      The presence of such a device may indicate the existence of evidence external to the initial scene

Documentation – Initial Steps (cont.)
• If the monitor is on
  ▫ Immediately photograph and take notes of the display contents
• If the monitor is off (i.e. suspended) or if a screen-saver is presently running
  ▫ Slightly move the mouse (do not press any keys on the keyboard)
  ▫ Photograph/record display contents

Documentation – Label Everything!
• Be sure to label all cables and connections so that they can be referred to later in your report

Volatile vs Non-Volatile Data
• Volatile data
  ▫ Must be retrieved while the computer is still on (and in the state left by the alleged perpetrator)
  ▫ Stored in RAM, cache or some temporary file
  ▫ A shutdown (or re-boot) may destroy or distort the data
• Non-volatile data
  ▫ May be retrieved at any time (i.e. off-site or after collection)
  ▫ Permanent storage within a system or data file

Collecting Physical Evidence
• Often, the device(s) used to generate the evidence in question must be confiscated.
• If the computer is OFF:
  ▫ Carefully package all evidence devices, power cables, etc. AFTER you have fully photographed and documented the scene.
• If the computer is ON:
  ▫ Removing the power supply is usually the safest option, but proceed according to your company’s policy
  ▫ If running Windows, pulling the power cord will preserve the last user’s login information and many other recently performed actions
  ▫ Again, this should be done only after thoroughly collecting written and photographic evidence

Collecting Physical Evidence (cont.)
• When NOT to pull the plug:
  ▫ If any of the following are actively in use on the machine:
      Chat rooms
      Instant message windows
      Open documents
      Remote data storage connections
      Obvious illegal activity
      Etc.
• Refer to the handout entitled “Collecting Digital Evidence Flow Chart”

Example: Evidence to Collect When Online/Economic Fraud is Expected
• According to the First Responders Guide, potential electronic evidence may exist in:
  ▫ Computers
  ▫ Removable media
  ▫ Mobile communication devices
  ▫ External storage devices
  ▫ Online auction sites/account data
  ▫ Databases
  ▫ PDAs, address books, contact lists
  ▫ Any printed e-mail, notes, letters, etc.
  ▫ Calendars or journals
  ▫ Financial asset records
  ▫ Accounting or recordkeeping software
  ▫ Photos and/or image files

Quick Note…
• The process of collecting volatile and nonvolatile data from a computer system requires a rather large set of software tools and a great deal of experience and study.
• Further complications arise with:
  ▫ Different operating systems (and versions): Mac, Linux, Windows, etc.
  ▫ Multiple software solutions for every possible need in the evidence collection process

Communicating with IT Professionals

Tip #1
• If you are investigating a case involving electronic evidence, be sure to involve qualified IT professionals as much as possible
  ▫ You may not be able to retrieve the information you are looking for without help
      Ex: system administrators may be able to access archived e-mail records (end-users typically delete e-mails that contain self-incriminating evidence)
  ▫ IT staff usually have the tools/utilities to assist you in an investigation

Tip #2
• Use e-mail whenever possible to seek the help of an IT professional
  ▫ Maintain a record of questions and answers for future use
  ▫ Written queries allow you to take the time to explain your problem clearly
  ▫ Most IT professionals tend to respond more quickly to e-mail inquiries (as opposed to voice messages).

Tip #3
• It may not be possible to maintain anonymity while investigating an individual.
  ▫ Be sure to use only the IT professionals with the necessary clearance to assist you.
  ▫ Keep all communication confidential

Tip #4
• Keep a record of every action performed by both you and the IT professional(s).
  ▫ It is often helpful to request that the IT professional keep their own record (you can never have too much documentation!)
  ▫ Be there while the IT professional is assisting in the case. Don’t hesitate to ask questions!

Discussion Points
• What are some of the things to look for when identifying fraudulent electronic documents (e-mail, web pages, correspondence, etc.)?
  ▫ I’m interested in hearing your ideas specific to your occupation/needs (hypothetical cases, of course)
• For non-IT professionals, what are some of the challenges faced when communicating with IT professionals?
  ▫ i.e. what do you need from your IT staff?
• For IT professionals, what are some of the challenges faced when communicating with non-IT professionals?
  ▫ i.e. what do you need from your non-IT staff in order to do your job?

Keyword v. Advanced Conceptual Searching

What is a Search Engine?
• There are millions and millions of web pages with interesting content out there!
• Imagine the number of potential words or phrases one could search!!!
• A search engine is a web-based application that specializes in information retrieval.

How does a Search Engine Work?
• Input:
  ▫ Keywords & phrases from end user
• Output:
  ▫ Hyperlinks to web content, usually sorted based on relevancy.
      We call this the search engine result page (or SERP)
• Mechanism:
  ▫ Complex internal algorithms
  ▫ Most search engines are “always working”, crawling through the web 24-7 in an effort to better optimize the engine

Search Engine Example - Google
• Google searches for keywords that appear in web pages.
• Rankings are done based on a number of factors, including:
  ▫ The number of times the keyword appears on a page
• Crawlers (or “spiders”) continuously traverse link-to-link through the Internet to build index pages for certain keywords.

Search Engines – Tips & Tricks
• Search for key words, not entire sentences
• Leave out connective words, such as “the”, “a”, “of”, “on”, etc.
  ▫ Ex: “fiddler roof”, not “fiddler on the roof”
• Try not to be too specific! Try being general and digging deeper from the SERP.
  ▫ Ex: “Saint Rose computer science”, not “College Saint Rose computer science department faculty artificial intelligence publications recent”

Search Engines – Tips & Tricks (2)

Keyword Search - Limitations
• Keyword searches work pretty well… however, they do have some significant limitations
• Fundamental Problem: index term synonymy
  ▫ That is, not all documents use the same words to refer to similar concepts
• Polysemy
  ▫ Many words have multiple meanings
  ▫ This can create loads of irrelevant query results
  ▫ For example: “Axe” image results include: [images]

Keyword Search Example
• From (Kiryakov, A., et al., 2007):
  ▫ Take the following query:
      “telecom company” Europe “John Smith” director
  ▫ Keyword search would NOT return:
      “At its meeting on the 10th of May, the board of London-based O2 appointed John Smith as CTO”
  ▫ Why not? Well, the search engine would need to know the following relationships:
      1. O2 is a kind of telecom company
      2. London is located in Europe
      3. CTO is a type of “director”

Keyword Search – Limitations (cont.)
• Keyword searches are destined for text-based results
• Images, videos, audio files, etc. must be explicitly labeled (or “tagged”) in order to show up in a query result set.
• In short, we need to model “ideas”, not simply keywords

Thesaurus Expansion
• Easy method for expanding the keyword search
  ▫ Expands the search using several different keywords within the same concept
  ▫ When not available, can be implemented manually
  ▫ Still may return non-relevant query results

What is Concept-Search?
• A concept-based search takes a query string and expands it using relevant terms based on a defined lexicon (the words and expressions of a language)
• For example (Taylor, C., 2009):
  ▫ The search phrase: “bank account”
      May be easy to modify to return bank, banks, account, accounts, banking, etc.
  ▫ A concept search can return related results:
      “bank” and “account”, but also “deposit”, “funds”, “withdrawal”, “transaction”, etc.

Google Concept
• Uses Google’s search engine technology
• Adds a feature called mind-mapping
  ▫ Think of this as a tree-like structure with the central topic at the center
  ▫ Each “branch” leads to a subtopic which can be a:
      word
      phrase, or
      image
• The central topic and all subtopics are treated as keywords and fed into the Google search engine
  ▫ All search results are broken down into the central topic and all subtopics

TheBrain Software
• Tool for generating a mind map
• Watch video here.

Hardware Solutions for Searching
• Several companies provide hardware to better facilitate and optimize the search process.
• Example: “Search Appliance” by Thunderstone
  ▫ None of these solutions are cheap…
• Google also sells a hardware search appliance, i.e. Google Mini

Mind Mapping
• Start with the central idea and place this in the direct center of your page
• Here is a decent video tutorial to explain the general concept.
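
The keyword-search example and the thesaurus/concept expansion idea from the slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Google or any commercial concept-search product works: the `LEXICON` dictionary and all function names are hypothetical, and the three relationships are hand-coded from the Kiryakov example (O2 is a telecom company, London is in Europe, CTO is a type of director).

```python
import re

# Hypothetical hand-built lexicon (an assumption for this sketch): each query
# concept maps to related terms that should also count as a match.
LEXICON = {
    "telecom company": {"o2"},
    "europe": {"london"},
    "director": {"cto"},
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains_phrase(tokens, phrase):
    """Return True if the phrase's tokens appear consecutively in tokens."""
    words = tokenize(phrase)
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))

def keyword_match(document, query_terms):
    """Plain keyword search: every query term must appear literally."""
    tokens = tokenize(document)
    return all(contains_phrase(tokens, term) for term in query_terms)

def concept_match(document, query_terms, lexicon=LEXICON):
    """Expanded search: a term matches literally or through a related term."""
    tokens = tokenize(document)
    for term in query_terms:
        candidates = {term} | lexicon.get(term.lower(), set())
        if not any(contains_phrase(tokens, c) for c in candidates):
            return False
    return True

QUERY = ["telecom company", "Europe", "John Smith", "director"]
DOCUMENT = ("At its meeting on the 10th of May, the board of London-based O2 "
            "appointed John Smith as CTO")

if __name__ == "__main__":
    print("keyword search:", keyword_match(DOCUMENT, QUERY))
    print("concept search:", concept_match(DOCUMENT, QUERY))
```

With this toy lexicon, the plain keyword search misses the sentence while the expanded search finds it. As the Thesaurus Expansion slide warns, a broader lexicon would also pull in more irrelevant results, so the related terms need to be chosen with care.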

Concept Search Products for Windows
• conceptSearching (conceptsearching.com) has released a product called conceptClassifier for Microsoft SharePoint
  ▫ The product automatically generates metadata and extracts concepts from content as it is created
      Therefore, the solution may need to be installed and active on a machine before any potential “evidence” is generated
• Not free!

Concept Search Example
• Real Case (Taylor, C., 2009): The NJ Law Journal reported that a company (not named) was conducting an internal investigation looking for insiders involved in embezzlement.
• Keyword searches related to banks, accounts, deposits, etc. turned up nothing useful
• A concept-based search was then run on clustered and threaded terms. It turned up “A large number of baseball-related discussions between two men who were not sports fans”
  ▫ The company matched terms, e-mail dates, etc. with bank transfers, deposits, etc.

WordNet (Princeton University)
• Contains a large database of cognitive synonym sets (synsets) that group synonymous or similar words together as concepts.
• This could be used to expand the keyword search.
• Access the web interface here:
  ▫ http://wordnet.princeton.edu/
• Unix-based downloadable product available
  ▫ Windows version coming soon…

WordNet (cont.)
• Antonymy
  ▫ E.g. rich and poor are antonyms
  ▫ A relationship between word forms, not between word meanings
      i.e. {rise, ascend} and {fall, descend} are conceptual opposites, but not antonyms
• Hyponymy
  ▫ A semantic relationship between word meanings
      i.e. maple is a hyponym of tree
      tree is a hyponym of plant
• Meronymy
  ▫ The part-whole relation
  ▫ A y has an x (as a part) or x is a part of y

Now, let’s go through tutorial 2
http://academic2.strose.edu/math_and_science/macdonai/EEWorkshop

Thank you!!!
• Please feel free to contact me at any time if you have any questions!
• My on-campus phone is (518)454-5163; however, I am usually much easier to reach via e-mail ([email protected])

References
• Britz, M., “Computer Forensics and Cyber Crime: An Introduction”, Pearson (2009)
• Newman, R., “Computer Forensics: Evidence Collection and Management”, Auerbach Publications (2007)
• EC-Council, “Computer Forensics: Investigating Hard Disks, File and Operating Systems”, Cengage (2010)
• Lyu, S., “Digital Image Forensics”, University at Albany, SUNY (2010)
• EC-Council, “Computer Forensics: Investigating Image and Data Files”, Cengage (2010)
• U.S. Department of Justice / NIJ, “Electronic Crime Scene Investigation: A Guide for First Responders”, 2nd ed. (2008)
• Kiryakov, A., et al., “Concept Searching”, Information Society public document (2007)
• Woods, D., “Introducing Google Concept”, http://ezinearticles.com/?Introducing-GoogleConcept&id=129641
• Marcella, A. J. & Menendez, D., “Cyber Forensics: A Field Manual for Collecting, Examining, and Preserving Evidence of Computer Crimes”, Auerbach Publications (2008)
• Taylor, C., “A Quick Look at Concept Search”, NetworkComputing.com, http://www.networkcomputing.com/e-discovery/a-quick-look-at-concept-search.php (2009)
• Miller, G. A., “WordNet - About Us”, WordNet, Princeton University, http://wordnet.princeton.edu (2009)
• Vacca, J. & Rudolph, K., “System Forensics, Investigation, and Response”, Jones & Bartlett Learning (2010)