IDENTIFYING PASSWORDS STORED ON DISK
This chapter presents a solution to the problem of identifying passwords on storage media. Because of the proliferation of websites for finance, commerce and entertainment, the typical user today often has to store passwords on a computer hard drive. The identification problem is to find strings on the disk that are likely to be passwords. Automated identification is very useful to digital forensic investigators who need to recover potential passwords when working on cases. The problem is nontrivial because a hard disk typically contains numerous strings. The chapter describes a novel approach that determines a good set of candidate strings in which stored passwords are very likely to be found. This is accomplished by first examining the disk for tokens (potential password strings) and applying filtering algorithms to winnow down the tokens to a more manageable set. Next, a probabilistic context-free grammar is used to assign probabilities to the remaining tokens. The context-free grammar is derived via training with a set of revealed passwords. Three algorithms are used to rank the tokens after filtering. Experiments reveal that one of the algorithms, the one-by-one algorithm, returns a password-rich set of 2,000 tokens culled from more than 49 million tokens on a large-capacity drive. Thus, a forensic investigator would only have to test a small set of tokens that would likely contain many of the stored passwords.
KeywordsDisk examination stored passwords password identification
Unable to display preview. Download preview PDF.