Checking 600,000 emails is very fast if many of them are identical to those already found in the first analysis of Hillary's email. Each message is "hashed." A hash is a short, fixed-length identifier computed from the message by a mathematical routine. Identical messages always produce the same hash, and with a good cryptographic hash function the chance that two different messages share a hash is so small that it can be ignored in practice.
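To make the idea concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hash function (nothing here says which function the FBI actually uses), and the helper name message_hash and the sample messages are made up for illustration.

```python
import hashlib

def message_hash(message: bytes) -> str:
    """Return the SHA-256 digest of a message as a 64-character hex string."""
    return hashlib.sha256(message).hexdigest()

# Hypothetical sample messages, for illustration only.
a = message_hash(b"Meeting moved to 3pm.")
b = message_hash(b"Meeting moved to 3pm.")  # identical content
c = message_hash(b"Meeting moved to 4pm.")  # one character differs

print(a == b)  # identical messages give identical hashes
print(a == c)  # a one-character change gives a different hash
print(len(a))  # the digest length does not depend on the message length
```

Because the function takes raw bytes, the same routine applies to attachments or any other file, not just text.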

So, they hash all of the 600,000 or so emails. This should only take a minute or so given current computers' processing power. Then, they compare these hashes to the existing hashes and toss those that are identical. This is actually very, very fast, and the entire process will take at most a few minutes. Note that one need not compare a hash, say, h1, against every stored hash one by one to see if it has a match. A simple technique is a binary search on a sorted list of the stored hashes (look it up on Wikipedia). Even with 600,000 stored hashes, it takes at most 20 comparisons per message, because 2^20 is a little over a million, and the average is only slightly lower than that.
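The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for whatever hash is really used, the function names (email_hash, build_index, is_known) and the sample emails are hypothetical, and the standard-library bisect module does the binary search.

```python
import bisect
import hashlib
import math

def email_hash(message: bytes) -> str:
    """Hash one message (SHA-256 is an assumption, not a known detail)."""
    return hashlib.sha256(message).hexdigest()

def build_index(messages):
    """Hash the previously reviewed messages and sort the hashes once."""
    return sorted({email_hash(m) for m in messages})

def is_known(index, message):
    """Binary-search the sorted hash list for this message's hash."""
    h = email_hash(message)
    i = bisect.bisect_left(index, h)
    return i < len(index) and index[i] == h

# Hypothetical data standing in for the two batches of email.
previously_reviewed = [b"email A", b"email B", b"email C"]
newly_found = [b"email B", b"email D", b"email A", b"email E"]

index = build_index(previously_reviewed)
unseen = [m for m in newly_found if not is_known(index, m)]
print(unseen)  # only "email D" and "email E" still need review

# Upper bound on comparisons for a sorted list of 600,000 hashes.
print(math.ceil(math.log2(600_000)))
```

In practice a hash table (a Python set, for example) would do the lookup in roughly constant time, but the binary search shown here is already fast enough that the comparison step is negligible next to reading the emails off disk.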

The FBI almost certainly hashed all of the previous messages. This is quite standard for enabling quick data searches. And note that hashing works not only on text but on any kind of data, such as attachments.


