RC4 Encryption in Excel: Technical Deep Dive and Security Analysis

Summary
Complete technical breakdown of Excel RC4 encryption from password entry to encrypted bytes. Learn how legacy .xls encryption works, why it is weak, and how it compares to modern AES protection.
Excel's RC4-based encryption belongs to the legacy era of Office security. It still appears in older .xls files and occasionally in compatibility modes. This analysis walks through the pipeline step by step, with concrete examples and security implications.
This deep dive focuses strictly on file-level "Open Password" encryption, not worksheet or workbook structure protection.
🔹 What Is RC4?
RC4 is a symmetric stream cipher developed by Ron Rivest that encrypts data by XORing plaintext with a pseudorandom keystream derived from a secret key. It was widely adopted in legacy software, including early Microsoft Excel versions, due to its high performance and low computational cost.
Today, RC4 is considered cryptographically weak and deprecated: its keystream has known statistical biases, and the way legacy software derives RC4 keys makes brute-force attacks fast. Modern security standards prohibit its use; RFC 7465, for example, bans RC4 cipher suites in TLS.
| Property | Details |
|---|---|
| Cipher Type | Stream cipher |
| Key Size | 40-128 bits (Excel 97-2003) |
| Designer | Ron Rivest (RSA Security) |
| Year Introduced | 1987 |
| Current Status | Deprecated / Broken |
1️⃣ Where RC4 Fits in Excel Encryption Timeline
Understanding when RC4 was used helps contextualize its security weaknesses. Microsoft gradually phased out RC4 as computational attacks became feasible.
| Excel Version | File Format | Encryption Method | Security Level |
|---|---|---|---|
| Excel 97-2003 | .xls | RC4 (40-128 bit) | Critical Risk |
| Excel 2007 (early) | .xls | RC4 (optional) | Moderate Risk |
| Excel 2007+ | .xlsx | AES-128/256 (default) | Secure |
RC4 is used only for workbook "Open Password" encryption, not worksheet protection. Once a file is saved in .xls format, the encryption method remains RC4 unless explicitly converted to a newer format.
2️⃣ High-Level Encryption Flow
The RC4 encryption process in Excel follows a specific sequence. Here is the conceptual pipeline:
User Password
↓
Password Hashing (SHA-1)
↓
Key Derivation (with salt + block index)
↓
RC4 Key Schedule Algorithm (KSA)
↓
RC4 Keystream Generator (PRGA)
↓
XOR with Workbook Bytes
↓
Encrypted .xls File
RC4 is a stream cipher, which means encryption and decryption are symmetric operations. The same keystream XORed with ciphertext produces the original plaintext.
3️⃣ Step-by-Step Internal Mechanics
🔹 Step 1: User Password Input
When a user sets a password, Excel receives plaintext input. For example:
Summer2024!
Excel never stores this password directly. The plaintext password is immediately transformed through hashing and never persists in readable form.
🔹 Step 2: Password Hashing
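In the RC4 CryptoAPI variant described here, Excel hashes the password using SHA-1, a cryptographic hash function that produces a fixed 160-bit output. (The older Office 97/2000-compatible variant uses MD5 instead.) Schematically, with a placeholder digest shown for illustration:
SHA1("Summer2024!") =
9d1a4f5c1f0e5c8b1d6d8fbbf7c6a0b2d8c2f4a1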
This hash becomes the root secret for all further cryptographic operations. Any change to the password produces a completely different hash.
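As a rough illustration of this step, the sketch below hashes a password with Python's hashlib. The UTF-16LE encoding is an assumption (Office formats generally treat passwords as UTF-16 text), and the function name `password_hash` is made up for this example; it is not Excel's actual routine.

```python
import hashlib

def password_hash(password: str) -> bytes:
    """Return the 20-byte SHA-1 digest of the password (UTF-16LE encoding assumed)."""
    return hashlib.sha1(password.encode("utf-16-le")).digest()

digest = password_hash("Summer2024!")
print(len(digest) * 8)   # 160 (bits)
print(digest.hex())      # 40 hex characters
```

Changing a single character of the input yields an unrelated digest, which is the property the next step relies on.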
🔹 Step 3: Key Derivation (Weak KDF)
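Excel combines three inputs to derive each RC4 key:
- Password hash (from Step 2)
- Workbook salt (random 16-byte value stored in the file)
- Block index (a counter that changes the key for each data block)
In simplified form, the logic looks like this:
RC4Key = SHA1(PasswordHash + Salt + BlockIndex)
The digest is then truncated to the configured key length (40-128 bits). Because this amounts to a couple of fast hash calls with no iteration, the derivation adds almost no cost per password guess, which is why it counts as a weak KDF.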
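The formula above can be sketched in Python as follows. This follows the article's simplified model only: the concatenation order, the 4-byte little-endian block counter, the UTF-16LE password encoding, and the name `derive_block_key` are all assumptions for illustration, and a real parser should follow the file format specification instead.

```python
import hashlib
import os
import struct

def derive_block_key(password: str, salt: bytes, block_index: int,
                     key_bits: int = 128) -> bytes:
    """Simplified model: SHA1(PasswordHash + Salt + BlockIndex), truncated to key_bits."""
    pw_hash = hashlib.sha1(password.encode("utf-16-le")).digest()
    block = struct.pack("<I", block_index)   # 4-byte little-endian counter (assumed)
    full = hashlib.sha1(pw_hash + salt + block).digest()
    return full[: key_bits // 8]             # keep only the leading key bytes

salt = os.urandom(16)                        # stands in for the workbook salt
k0 = derive_block_key("Summer2024!", salt, 0)
k1 = derive_block_key("Summer2024!", salt, 1)
print(len(k0), k0 != k1)
```

Note that nothing here is slow: each key costs two SHA-1 calls, so an attacker can evaluate it at full hashing speed.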
🔹 Step 4: RC4 Key Schedule Algorithm (KSA)
RC4 initializes a 256-byte state array S through the Key Scheduling Algorithm. This permutation is the foundation of the keystream.
S = [0, 1, 2, 3, ..., 255]
j = 0
for i in 0..255:
    j = (j + S[i] + Key[i mod key_length]) mod 256
    swap(S[i], S[j])
This step permutes the state array S using the derived key. The quality of this permutation directly affects keystream predictability and security.
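A direct Python transcription of the KSA pseudocode might look like this. The function name `ksa` is chosen for this example; the logic is the standard RC4 key schedule, not anything Excel-specific.

```python
def ksa(key: bytes) -> list[int]:
    """RC4 Key Scheduling Algorithm: return the key-dependent 256-entry state."""
    if not key:
        raise ValueError("key must not be empty")
    state = list(range(256))              # S = [0, 1, ..., 255]
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state

state = ksa(b"Key")
print(sorted(state) == list(range(256)))  # True: still a permutation of 0..255
```

The KSA only ever swaps entries, so the output always contains each value from 0 to 255 exactly once; only the order depends on the key.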
🔹 Step 5: Keystream Generation (PRGA)
RC4 generates a pseudorandom byte stream using the Pseudo-Random Generation Algorithm (PRGA):
i = 0
j = 0
for each output byte:
    i = (i + 1) mod 256
    j = (j + S[i]) mod 256
    swap(S[i], S[j])
    keystream_byte = S[(S[i] + S[j]) mod 256]
Example keystream output:
A3 F1 9C 77 2B 8E D4 5A ...
This keystream is combined with file data byte by byte using XOR operations.
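To make the generator concrete, here is a small self-contained sketch that runs the key schedule and then the PRGA. The helper name `rc4_keystream` is invented for this example; the check at the end uses the widely published RC4 test vector for the ASCII key "Key".

```python
def rc4_keystream(key: bytes, length: int) -> bytes:
    """Return `length` RC4 keystream bytes for `key` (KSA followed by PRGA)."""
    # KSA: permute the 256-entry state with the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # PRGA: each iteration updates the state and emits one keystream byte.
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)

print(rc4_keystream(b"Key", 8).hex())   # eb9f7781b734ca72
```

The same key always yields the same keystream, which is why re-using a key across different data is dangerous for any stream cipher.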
🔹 Step 6: XOR Encryption
Plaintext bytes are XORed with keystream bytes to produce ciphertext. This is the actual encryption step:
| Component | Hex Values |
|---|---|
| Plaintext (sample bytes) | 50 4B 03 04 |
| Keystream | A3 F1 9C 77 |
| Ciphertext (XOR result) | F3 BA 9F 73 |
Decryption repeats the same XOR operation. XOR symmetry is why the identical process decrypts the file.
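The table values can be reproduced with a few lines of Python. The helper name `xor_bytes` is made up for this sketch; the byte values are the ones from the table above.

```python
def xor_bytes(data: bytes, keystream: bytes) -> bytes:
    """XOR each data byte with the keystream byte at the same position."""
    if len(keystream) < len(data):
        raise ValueError("keystream is shorter than data")
    return bytes(d ^ k for d, k in zip(data, keystream))

plaintext = bytes.fromhex("504B0304")
keystream = bytes.fromhex("A3F19C77")
ciphertext = xor_bytes(plaintext, keystream)

print(ciphertext.hex(" ").upper())                     # F3 BA 9F 73
print(xor_bytes(ciphertext, keystream) == plaintext)   # True
```

Applying the same function twice with the same keystream returns the original bytes, because (p XOR k) XOR k = p.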
4️⃣ Block-Based Re-Keying
Excel does not use a single continuous keystream for the entire file. Instead, it re-derives the RC4 key per data block:
Block 0 → Key₀
Block 1 → Key₁
Block 2 → Key₂
...
This approach avoids certain stream cipher vulnerabilities associated with long keystreams, but:
- Still uses SHA-1 for key derivation
- Still computationally fast
- Still brute-force friendly
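The re-keying idea can be sketched as follows. The 1024-byte block size, the key formula (the simplified SHA1(PasswordHash + Salt + BlockIndex) model from Step 3), and the names `rc4`, `block_key`, and `crypt_blocks` are assumptions for illustration rather than a faithful .xls implementation.

```python
import hashlib
import struct

BLOCK_SIZE = 1024  # assumed block length for this sketch

def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt `data` with RC4 under `key` (the operation is symmetric)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def block_key(pw_hash: bytes, salt: bytes, index: int) -> bytes:
    """Simplified per-block key: SHA1(PasswordHash + Salt + BlockIndex), 128 bits."""
    return hashlib.sha1(pw_hash + salt + struct.pack("<I", index)).digest()[:16]

def crypt_blocks(pw_hash: bytes, salt: bytes, data: bytes) -> bytes:
    """Process data block by block, restarting RC4 with a fresh key for each block."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        key = block_key(pw_hash, salt, offset // BLOCK_SIZE)
        out += rc4(key, data[offset:offset + BLOCK_SIZE])
    return bytes(out)

pw_hash = hashlib.sha1("Summer2024!".encode("utf-16-le")).digest()
salt = bytes(16)            # fixed all-zero salt, for a reproducible demo only
secret = b"A" * 3000        # spans three blocks
encrypted = crypt_blocks(pw_hash, salt, secret)
print(crypt_blocks(pw_hash, salt, encrypted) == secret)   # True
```

Each block costs one extra SHA-1 call and one RC4 key schedule, so re-keying limits keystream length without making password guessing meaningfully slower.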
5️⃣ Practical Example: Office Hash Format
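Password recovery tools like Hashcat do not work on the workbook itself. A helper script first extracts the salt and verifier fields into a compact hash string for offline cracking. Schematically (the exact prefix and field lengths vary by tool and encryption variant; Hashcat's legacy Office modes, for instance, use an $oldoffice$ prefix):
$office$*2003*128*16*SALTHEX*ENCRYPTEDVERIFIER*VERIFIERHASH
Illustrative example:
$office$*2003*128*16*13b7cf2d2a5a436f*d048c5880a286c7d*99909f3bb868d81b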
Attackers brute-force passwords and verify correctness via decrypted verifier bytes. This verification step avoids decrypting the full workbook for each guess, dramatically accelerating attacks.
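The verifier check can be modeled roughly like this: a random verifier and its hash are encrypted under the block-0 key at save time, and a guess is accepted if decrypting both fields yields a matching pair. The key formula, field sizes, and function names (`block0_key`, `make_verifier`, `check_password`) are assumptions based on this article's simplified model, not a parser for real files.

```python
import hashlib
import os
import struct

def rc4(key: bytes, data: bytes) -> bytes:
    """RC4 encrypt/decrypt (symmetric): KSA, then PRGA XORed over `data`."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def block0_key(password: str, salt: bytes) -> bytes:
    """Simplified block-0 key: SHA1(PasswordHash + Salt + 0), truncated to 128 bits."""
    pw_hash = hashlib.sha1(password.encode("utf-16-le")).digest()
    return hashlib.sha1(pw_hash + salt + struct.pack("<I", 0)).digest()[:16]

def make_verifier(password: str, salt: bytes) -> tuple[bytes, bytes]:
    """Create (encrypted verifier, encrypted verifier hash) as stored at save time."""
    verifier = os.urandom(16)
    blob = rc4(block0_key(password, salt), verifier + hashlib.sha1(verifier).digest())
    return blob[:16], blob[16:]

def check_password(guess: str, salt: bytes, enc_verifier: bytes, enc_hash: bytes) -> bool:
    """Decrypt both fields with the guessed key and compare SHA1(verifier) to the stored hash."""
    plain = rc4(block0_key(guess, salt), enc_verifier + enc_hash)
    return hashlib.sha1(plain[:16]).digest() == plain[16:]

salt = os.urandom(16)
enc_verifier, enc_hash = make_verifier("Summer2024!", salt)
print(check_password("Summer2024!", salt, enc_verifier, enc_hash))  # True
print(check_password("Winter2023?", salt, enc_verifier, enc_hash))  # False
```

Each guess touches only 36 bytes of ciphertext, so the cost per candidate is a few hash calls plus one short RC4 run, regardless of workbook size.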
6️⃣ Why RC4 Excel Encryption Is Weak
| Weakness | Explanation |
|---|---|
| ❌ RC4 is deprecated | Known statistical biases in keystream output |
| ❌ No key stretching | Millions of password guesses per second possible |
| ❌ SHA-1 only | Fast hashing algorithm, GPU-friendly for attacks |
| ❌ Short passwords common | Human password patterns easily exploited |
| ❌ No memory hardness | Cheap parallel attacks on commodity hardware |
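Depending on hardware and the exact variant, modern GPUs can test hundreds of millions to billions of candidate passwords per second. Security therefore depends almost entirely on password entropy and length. With 40-bit keys not even a strong password helps, because the entire key space (2^40, about 1.1 trillion keys) can be searched directly.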
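A back-of-the-envelope calculation shows what such guess rates mean in practice. The rate of one billion guesses per second is an assumed round number, not a benchmark, and the helper name `seconds_to_exhaust` is made up for this sketch.

```python
def seconds_to_exhaust(keyspace: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate in `keyspace` at the given rate."""
    return keyspace / guesses_per_second

RATE = 1e9  # assumed guesses per second; real throughput depends on hardware and tool

candidates = {
    "8 lowercase letters": 26 ** 8,
    "10 printable ASCII characters": 94 ** 10,
    "every 40-bit RC4 key": 2 ** 40,
}

for label, space in candidates.items():
    hours = seconds_to_exhaust(space, RATE) / 3600
    print(f"{label}: {hours:,.2f} hours")
```

Under this assumption the short password and the full 40-bit key space both fall in well under an hour, while the long random password remains out of reach; the cipher contributes nothing to that difference.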
7️⃣ Comparison With Modern Excel AES
| Feature | RC4 Excel (Legacy) | AES Excel (2007+) |
|---|---|---|
| Cipher Algorithm | RC4 stream cipher | AES-128/256 block cipher |
| Key Derivation | SHA-1 (1 round) | Iterated salted hashing (PBKDF-style; SHA-1 in 2007-2010, SHA-512 in 2013+) |
| Iterations | ❌ 1 | ✅ 50,000-100,000 |
| GPU Resistance | ❌ None | ⚠️ Partial (slowed) |
| Security Status | Broken | Acceptable |
AES significantly raises the computational cost of brute-force attacks through key stretching and stronger cryptographic primitives.
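The effect of key stretching can be illustrated with a simple iterated-hash loop. This is a simplified stand-in for the key derivation listed in the table, not a drop-in implementation of what modern Excel does: the counter layout, the UTF-16LE encoding, and the names `stretched_hash` and `single_hash` are assumptions for this sketch.

```python
import hashlib
import struct

def single_hash(password: str, salt: bytes) -> bytes:
    """Legacy-style derivation: one fast SHA-1 call per guess."""
    return hashlib.sha1(salt + password.encode("utf-16-le")).digest()

def stretched_hash(password: str, salt: bytes, spin_count: int = 100_000) -> bytes:
    """Iterated SHA-512: each round hashes a counter plus the previous digest."""
    h = hashlib.sha512(salt + password.encode("utf-16-le")).digest()
    for i in range(spin_count):
        h = hashlib.sha512(struct.pack("<I", i) + h).digest()
    return h

salt = bytes(16)  # fixed salt for a reproducible demo only
print(len(single_hash("Summer2024!", salt)))      # 20
print(len(stretched_hash("Summer2024!", salt)))   # 64
```

With a spin count of 100,000, every password guess costs roughly 100,000 times more hashing work than the single-hash version, and that multiplier applies to the attacker's hardware as well.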
8️⃣ Security Takeaways 🔒
✓ What RC4 Protects
RC4 Excel encryption protects against casual snooping only. It prevents average users from opening locked files without tools.
✗ What RC4 Does Not Protect
It does not meet modern security standards. Any sensitive .xls file should be considered recoverable by motivated attackers.
Best Practices
- Convert legacy .xls files to .xlsx format
- Use AES-256 encryption for sensitive workbooks
- Store passwords in a dedicated password manager
- Assume legacy RC4-encrypted files are already compromised
- Implement regular security audits for document protection