CPA Journal

Search
Software
Personal
Help

Security by the numbers

# Encryption

By George T. Friedlob, Franklin J. Plewa, Lydia L.F. Schleifer, and Corey D. Schou

While not all auditors are computer security experts, all should be aware of the controls a company might initiate to protect its physical and intellectual assets. In addition, accounting professionals should recognize basic security deficiencies that might create undue exposure to loss.

In its simplest form, an encryption scheme might represent each letter by a number--called a substitution alphabet--for example, a=1, b=2, c=3 etc., through z=26. An encryption of the simple message "Save the Kingdom" would then yield 19-1-22-5 // 20-8-5 // 11-9-14-7-4-15-13.

This cipher is easy to compromise, even if we make the number sequence less obvious. Statistically, "e" is the most common letter in the English language. There are two in our message. The remaining 25 letters in the alphabet occur in descending frequency as follows: t o a n i r s h d l u c m p f y w g b v k j x z and q. In the English language, there are also frequently occurring triplets and double letter combinations. With a not-too-large block of text and judging letters from their frequency and words from their context, this cipher is no challenge.

To complicate a cipher and make cracking it vastly more difficult, most messages undergo a second encryption process using an encryption key. To illustrate, suppose our key is the word "bird." Using the same substitution alphabet, bird becomes 2-9-18-4. Now we write our key repeatedly--bird-bird-bird, etc.--beneath our encrypted message, both in substitution alphabet. Aligning the letters, the encryption is continued by adding the two number strings, letter by letter as shown in item 1 of the Table.

Now each letter will be represented by different numbers at different places in our messages. Our two e's, both number 5 in the original encrypted message, have become 9 and 23. The frequencies are misleading: t and m are both 22; e and n are both 23. Each letter is represented by four possible numbers, e.g., the letter a can be either 3, 10, 19, or 5.

Plaintext a a a a

Message 1 1 1 1

b i r d +Key 2 9 18 4

Cyphertext 3 10 19 5

Decrypting the message proceeds in exactly the reverse of encryption as shown in item 2 of the table.

Encryption Process

Our encryption process has two distinct parts, identifiable in most ciphers, a key and an algorithm.

The Key. Our key is the four-letter word, bird, but imagine keys that are much longer, perhaps phrases instead of words--"Birds Fly South for the Winter," or even strings of random numbers that have no meaning whatsoever. Where security is extremely important, keys of 300 digits are not uncommon.

The Algorithm. An algorithm is simply a step-by-step procedure for solving a problem or accomplishing some goal, one that is frequently mathematical. Here, our algorithm consists of four operations:

* Creating a substitution alphabet by assigning numbers to each letter,

* Encrypting the message and the key,

* Aligning the letters of the repeated key with the letters of the message, and
finally

* Adding the letters of the encrypted key to the letters of the encrypted message.

Again, we can imagine more complex procedures, perhaps using many more mathematical operations, including combinations of such activities as dividing, multiplying, and logging the encrypted message. For example, the numerical values of the encrypted letters might be multiplied by a random number key, the product squared, and the key subtracted from the product. Whatever combination of operations is applied to the message, security can generally be improved by repeating the procedures a second time, or a third.

In our encryption example, Save the Kingdom, we can improve security by adding the key to the encrypted total again and again. We must, however, start with a different letter in our key each time. Using this procedure, the letter a may be represented by four different numbers if we add the key once, but by ten different numbers if we add it a second time (5, 7, 9, 12, 14, 19, 21, 23, 28, or 37). Item 3 in the table shows the calculation.

Data Encryption Standard (DES), an internationally standardized cipher, performs 16 iterations of the same series of operations. One software incorporation of DES, called Triple DES, uses three applications of DES, one after the other, for a total of 48 iterations.

A critical feature of the algorithm is that the message must decrypt mathematically to only one solution so the receiver sees exactly the message that was sent. There is a variety of supposedly uncrackable algorithms available commercially and in the public domain. The math in these algorithms--the operations, their sequence, the use made of constants and the key--is quite complex, but the people and techniques dedicated to cracking them are quite sophisticated.

A person must have both the correct key and the correct algorithm to decrypt a message. Most algorithms are publicly available, yet they can be used with confidence because each requires a key. The public availability of an algorithm increases the confidence of cryptographers, because thousands, perhaps millions of attempts will be made to crack an algorithm if it is available. If no one succeeds, cryptographers use the algorithm with confidence. Exhibit 1 provides a survey of several popular encryption programs in current use.

How Many Keys? There is a problem that we have not discussed. What if a CEO wants encrypted, private communication with each of the company's 300 operating managers? If no operating manager is to be able to decrypt the CEO's communication with the others, the CEO must have 300 encryption keys. How many keys must the CEO generate if the CEO wants each manager to have secure communication with each of the others? The first manager must have 299 keys to communicate with the remaining managers. The second must have 298 (299 minus the key shared with the first manager), the third 297, and so on--a total of 44,850 keys. This is an unmanageable number of keys--each key must be securely transmitted to each of the parties who will use it. If an unencrypted key is transmitted to one of the managers, how does the CEO know it was not intercepted by hackers, wiretappers, or mail thieves? Since most algorithms are publicly available and the security of a cipher is completely in the key, any effort expended in selecting an algorithm is wasted if the user is then careless with the key(s).

There is a simple solution to this seemingly impossible problem. Some algorithms use a pair of numerical keys; key A for encryption, key B for decryption. Once a message to John Doe is encrypted with key A, it cannot be decrypted with key A, even by the person who encrypted it. The message can be decrypted to the original text only with the second key.

Key A, used for encryption, is called a "public key" and John makes it readily available to his associates, for use by any person who wishes to send John an encrypted communication. Key B, used for decryption, is John's "private key," known only to him. Only John can decrypt communications encrypted with his public key.

The two keys, A and B, are related to each other and are produced as a pair. Cracking the cipher requires solving a type of math problem known as a nondeterministic polynomial-time (NP) problem. These problems require vast amounts of computer time to solve, the amount of time increasing exponentially as the size of the NP problem increases. The cipher can be broken and the private key discovered, but it requires perhaps a million or so years of computer time. So the 44,850 keys required in the previous example are reduced to 300 necessary key pairs.

RSA Public Key Cipher. One commonly used public key cipher is RSA, named after its creators, Rivest, Shamir, and Adleman. The NP problem that an attacker must solve to crack RSA is that of factoring a number produced by multiplying two prime numbers together. A prime number is evenly divisible only by 1 and itself; factors are numbers multiplied together to get a third number. For example, 23 and 19 are prime numbers and 23 X 19 = 437. It is easy to produce a number that is the product of two primes. The reverse, factoring 437 into its two prime factors, is more difficult.

This illustrates another characteristic of NP problems: Despite the large amount of time required to generate a solution, a possible solution can be easily tested. A person might spend a while with a hand calculator searching for the prime factors of 437, but anyone can test 23 times 19 in an instant. The current key length standard for commercial activities is 154 digits, and for military communications is 308 digits. With existing technology and algorithms, factoring a 150 digit number requires 1,000,000 years of 1,000,000 instructions per minute computer time; factoring a 300 digit number requires 100,000,000,000,000 years of 1,000,000 instructions per minute computer time.

In addition to using a transformation that is difficult to reverse, RSA and other public key ciphers use a type math called modular arithmetic to confuse a would-be cracker. Modular arithmetic relates two numbers by the remainder left when one number is divided by the other. The expression "17 modulo 5 = 2," means 17 divided by 5 leaves a remainder of 2 (17/5 = 3, remainder 2). Modular arithmetic makes cracking a cipher more difficult because it turns a smooth, continuous function into an erratic, discontinuous series. For example, let x = 10, 20, 30, 40, 50. Similarly, x2 = 100, 400, 900, 1,600, 2,500. For the same progression of x, "x2 modulo 85" yields a haphazard pattern that is difficult to predict: 15, 60, 50, 70, 35. Item four in the table shows the calculations.

Because the algorithms used with public-key/private-key pairs are so complex, they consume vast amounts of computer time, and people do not use them to encrypt entire messages. Instead, if Marsha wishes to send John an encrypted communication, Marsha generates a random number encryption key, called a one-time session key, to be used for that one message only. Marsha uses the one-time session key to encrypt her message, then she encrypts the one-time session key with John's public key, and sends John both the encrypted one-time session key and the encrypted message. John receives Marsha's message, decrypts the random number one-time session key with his private key, and uses the decrypted one-time session key to decrypt Marsha's message.

When Marsha uses John's public key, she doesn't worry about transmitting a secure key to John for use in their communications. Marsha's computer simply generates a new random-number one-time session key each time she sends John a message. A curious person trying to decrypt Marsha and John's message traffic has a difficult task since each randomly generated key is used for only one message.

Encryption is complex, but the software does everything. It generates a random-number, one-time session key, uses it with a standard encryption algorithm to encrypt Marsha's message, then switches to a public-key/private-key algorithm and uses John's public key to encrypt the random number key. The software then concatenates (links) the two encryptions and sends the package off to John. On John's end, his software uses his private key and the same public-key/private-key algorithm to decrypt the one-time session key, then uses that key and the standard algorithm to decrypt Marsha's message.

Cryptanalytic Attacks

The best advice on encryption is to use a public-key algorithm available to hackers, that has been attacked and tested by many people, but not yet cracked. Use the algorithm, but be aware that one day it will be cracked. The students who cracked Netscape's encryption software analyzed the software's random number encryption key generator and found a flaw that let them successfully guess the key. This approach, while unlikely to succeed, did succeed because the keys generated were far from random. Most cryptanalytic attacks fall into one of several classes.

Brute Force Attack. The French student who cracked Netscape's encryption used a brute force attack employing two supercomputers and over one hundred computer workstations for a period of eight days. A brute force attack involves trying one key combination after another until one eventually works. A brute
force attack requires no particular logic
or cleverness, but will succeed if
continued long enough.

Known and Chosen Plain Text Attacks. Another type of cryptanalysis involves either a known-plaintext attack or a chosen-plaintext attack. In a known-plaintext attack, the analyst has obtained an encrypted message and the same message in plaintext. Plaintext messages are often available in deleted files, perhaps saved to disk when the word processing package created temporary copies during processing. Deleted files are not erased, but are merely noted as deleted so the space can be used later. All or part of the information in deleted files may be still on disk and recoverable years later. With increasing frequency, attorneys seek to subpoena disk drives and floppies to search for text regarding personnel actions or criminal activities.

To stop an intruder from recovering deleted files, the deleted files must be overwritten. Some encryption software programs have an option that overwrites plaintext files after you encrypt and delete them. This option overwrites the deleted file only once and does not overwrite text your word processing software might have saved to disk when you created and edited the plaintext. The Federal government recommends three separate write-overs to obscure sensitive data and even then special equipment may recover faint magnetic traces of an old message.

If a cryptanalyst knows that a company encrypts all messages to and from a particular file server, the analyst can send a chosen plaintext to someone in the company and later capture the encrypted message. This approach provides the analyst an encrypted copy of a known plaintext. Because some text combinations are more useful for cryptanalysis than others, a cryptanalyst prefers a chosen plaintext attack to simply a known plaintext attack. This is a variation on the method used during World War II to determine if the message traffic associated with a planned attack on Midway Island was correct. The U.S. Navy sent a message in the clear mentioning that the water purification facility was broken. The Navy watched enemy message traffic until a message about the broken facility appeared using the same location code for Midway.

Key Management Attacks. Most successful cryptanalytic attacks are not directed at cracking the encryption, but rather at stealing the encryption key. This strategy has been far more successful than a cryptanalytic attack. For years, the Russians bought encryption keys from James Walker and his associates for the secret communications of the United States' military. There are a number of precautions that can be taken to protect an encryption key.

Physical Security. Your private key must be on a disk that is physically protected. The key should not be stored in your computer if you are linked to any network or if your computer is not physically secure. In essence, you need to keep your private key on a disk in your possession or in the company's secure key distribution center.

If your computer is not secure, an intruder might install software that captures a copy of each key generated or corrupts your random number key generator. So-called random number keys generated by computer are produced by an algorithm and are only pseudo-random numbers. By altering your algorithm, an intruder might limit the number of keys generated to a nonrandom number series easily crackable by the intruder.

Protecting Stored Files. Although Federal standards specify that information should be protected in the storage state, senior researchers at the AT&T Bell Labs maintain that encryption should be used only to safeguard messages during transmission. They feel that if your computer is safe enough for you to trust that the encryption software has not been corrupted, you do not need to encrypt your files. If your computer is not secure enough to trust the encryption software, why encrypt? This is reasonable if it is assumed that all individuals with access to the system have the same need for access to the data. If not, encrypting stored files allows for compartmentalization of data within the organization.

The one case in which the Bell Labs researchers do recommend encrypting stored data is backup tapes. Because backup tapes often receive poor security, they should be encrypted if they contain sensitive data and if there is a secure mechanism for storing the encryption key.

Spoofing Attacks. Spoofing is computerized impersonation. An attacker sends a message pretending to be someone else, using the receiver's public key. This attacker might manage to delay a shipment to an important customer, or cancel a scheduled production run.

In one type of spoof, the attacker attempts to convince John that he is Marsha, and Marsha that he is John. If this spoof is successful, all communication flows through the attacker. John encrypts his message with Marsha's fake public-key supplied by the attacker and sends the message to the attacker, believing it sent to Marsha. The attacker decrypts the message, reads it, encrypts it with Marsha's real public key, and sends it along to Marsha, who thinks it came directly from John. Most messages are passed along intact but occasionally one may be changed by the attacker to gain additional advantage. Because public keys are available to anyone, the person receiving a message must be able to establish that the message is authentic before relying on it. This can be accomplished using a digital signature.

Audit Concerns

Auditors concerned about a company's encryption data security might build an audit program using the questions below. In most cases, more information is obtained when questions are phrased to require an explanation rather than a yes or no response.

1. What is the cryptography experience and education of the person who selects and approves encryption software?

2. How does the person who selects and approves encryption software keep abreast of developments in cryptanalysis, so that he or she will know when the encryption algorithm is broken?

3. Who decides what information will be encrypted and what will not?

4. How is encryption used to secure transmission of information and stored files, including backup tapes?

5. What use does the company make of private-key algorithms, such as DES that require exchanging keys in secret?

6. What are the procedures for storing, exchanging, and protecting encryption keys?

7. Does the company use encryption software that compresses messages before encryption to eliminate recurring blocks?

8. What physical security measures are in place for computers that contain encryption software and/or private keys?

9. To what extent are computers that contain encryption software and/or private keys linked to networks of any type?

10. How are plaintext files of sensitive information obscured after they are
deleted?

11. Does the company encrypt all communication to and from a particular server, or backup files from a remote location--exposing plaintext files to interception in route?

12. What procedures does the company use to authenticate each message and the sender of each message to avoid spoofing?

Auditors should seek to provide leadership in information security. In most organizations they are the only individuals that have access to the entire information asset picture. Auditors should recommend the establishment of sound policy and practice and the implementation of rigorous employee education and training. Education and training of both management and employees is perhaps the most inexpensive and effective security measure.

George T. Friedlob, PhD, CPA, CMA, is a professor and IIA Faculty Fellow, at Clemson University, Franklin J. Plewa, PhD, is a professor at Idaho State University, Lydia L.F. Schleifer, PhD, is associate professor at School of Accountancy Clemson University, and Corey D. Schou, PhD, is a professor and chairman of the Department of Information Systems, Idaho State University.

For an expanded discussion of encryption, see the authors' monograph, An Auditor's Introduction to Encryption, 1997, The Institute of Internal Auditors.

# W

## In Brief

It Starts So Simple...

The authors present a comprehensive how-it-works story related to encryption and computer security. From the simplest form, building up to the more complicated, encryption and how it works is explained by example. The fundamentals of keys and algorithms are brought into play in understandable terms. The use of numerical values for letters is explained in detail.

Included is a significant discussion of the many ways in which encryption can be attacked. There is also a survey of popular encryption software, as well as the authors' recommendations on what questions to ask when building a related audit program.

The CPA Journal is broadly recognized as an outstanding, technical-refereed publication aimed at public practitioners, management, educators, and other accounting professionals. It is edited by CPAs for CPAs. Our goal is to provide CPAs and other accounting professionals with the information and news to enable them to be successful accountants, managers, and executives in today's practice environments.

©2009 The New York State Society of CPAs. Legal Notices

Visit the new cpajournal.com.