Design for the encryption system:

Encryption in AFF will be implemented by the AFF Base Encryption
services. Either Passphrase Encryption or Public Key Encryption may
be layered on top of Base Encryption.

AFF Base Encryption:
---------------------
Currently we'll be doing this with AES-256, but the system can be
evolved to accommodate other encryption schemes as needed.

Today AFF data pages are stored in segments named page%d --- page0, page1, etc.
The segment's flag indicates whether the page is compressed.

Encrypted pages will be stored in segments named page%d/aes --- i.e.,
page0/aes, page1/aes, etc.
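
The naming convention can be sketched as a small formatting helper.
This is only an illustration: aff_page_segname() is a hypothetical
name, not an existing AFFLIB function.

```c
#include <stdio.h>

/* Hypothetical helper: format the segment name for a page.
 * If encrypted is nonzero, the "/aes" suffix is appended.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small. */
static int aff_page_segname(char *buf, size_t buflen, long pagenum, int encrypted)
{
    int n = snprintf(buf, buflen, encrypted ? "page%ld/aes" : "page%ld", pagenum);
    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}
```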

Restrictions:

* A single "affkey" is used to encrypt every page.
* The AES-256 key cannot be changed.

Encryption will be done with AES-256 in CBC mode.
The IV is the name of the segment, padded with NULs.
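
A minimal sketch of building such an IV follows. The helper name
aff_iv_from_segname() is hypothetical, and truncating names longer
than 16 bytes is an assumption; this note does not say how long names
are handled.

```c
#include <string.h>

#define AFF_AES_BLOCK 16  /* AES block size in bytes */

/* Hypothetical helper: build a CBC IV from a name.
 * The name is copied into a zero-filled 16-byte buffer. Names longer
 * than 16 bytes are truncated (an assumption of this sketch). */
static void aff_iv_from_segname(unsigned char iv[AFF_AES_BLOCK], const char *segname)
{
    size_t n = strlen(segname);
    if (n > AFF_AES_BLOCK) n = AFF_AES_BLOCK;
    memset(iv, 0, AFF_AES_BLOCK);
    memcpy(iv, segname, n);
}
```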

AES-256 requires that all buffers be padded to the AES block size,
which is 16 bytes.  For performance we don't want to add padding if
the page is already a multiple of the block size, so here is the
algorithm:

* If len%16==0, do not pad
* If len%16!=0, let
      extra = len%16
      pad   = 16-extra
      Append pad NUL bytes
      Encrypt
      Append extra NUL bytes.
      Write

Now, when a segment is read:
 extra = len%16
 pad = 16-extra

* extra==0, it wasn't padded
* Otherwise
     Remove extra NUL bytes
     Decrypt
     Remove pad NUL bytes

In this way, the length does not need to be explicitly coded.
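
The length arithmetic can be checked without any cryptography. The
sketch below computes only the stored and recovered lengths; the
function names are hypothetical. It relies on the fact that a padded
segment is always exactly 16 bytes longer than the plaintext, so the
stored length modulo 16 equals the original extra.

```c
#include <stddef.h>

#define AFF_AES_BLOCK 16  /* AES block size in bytes */

/* Length written to the file for a plaintext of len bytes:
 * the padded ciphertext (len + pad) plus extra trailing NUL bytes. */
static size_t aff_stored_len(size_t len)
{
    size_t extra = len % AFF_AES_BLOCK;
    if (extra == 0) return len;          /* already block-aligned */
    size_t pad = AFF_AES_BLOCK - extra;
    return len + pad + extra;
}

/* Inverse: recover the plaintext length from the stored length. */
static size_t aff_plain_len(size_t stored)
{
    size_t extra = stored % AFF_AES_BLOCK;
    if (extra == 0) return stored;       /* it wasn't padded */
    size_t pad = AFF_AES_BLOCK - extra;
    return stored - extra - pad;
}
```

For example, a 20-byte page has extra=4 and pad=12, is encrypted as 32
bytes, and is stored as 36 bytes; on read, 36%16 gives extra=4 again.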

On decryption, the key can be "validated" by attempting to decrypt
page0/aes and seeing if page0_md5 matches (because that's the MD5
for the unencrypted, uncompressed page.) A new API call will be
created for this purpose. 

If a key is set, then pages that are written are automatically encrypted first. 

If both an encrypted page and an unencrypted page are present in the
file, the unencrypted page is returned (because the software never
looks for the encrypted page.)

If an unencrypted page is updated and encryption is turned on, the
encrypted page is first written, then the unencrypted page is deleted.

It is an error to change the affkey encryption key once it has been set.



Advantages:
* Simple to implement & test.
* It's real encryption, not a "password" like E01 format uses. 
* Works transparently with S3 implementation.
* Allows an unencrypted file to be encrypted in-place. 
* We can push this down into a lower layer to provide for encryption
  of all metadata, although that won't be done in the initial
  implementation. 

Disadvantages:
* Only encrypts the page data, not the metadata, in the initial implementation.
* Only way to change the key is to copy to a new AFF file. 
* Encryption key is cached in memory in the AF structure.


Proposed API:
 af_set_aes_key(af,key,keysize) - sets the key; use alg=0 to turn off encryption.
                                - key is unsigned char.
                                - keysize is in bits.
 af_validate_key(af) - returns 0 if the key that was set can be used
                       to validate a page
 af_validate_key_page(af,pagenum) - Specifically checks to see if pagenum
                        can be validated with the key that was set.
      returns 0 = validates; -1 = doesn't validate; -2 = page doesn't
                   exist; -3 = page md5 doesn't exist.
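
One way a caller might handle the af_validate_key_page() return codes
is sketched below. The enum and function names are hypothetical; only
the numeric values come from the list above.

```c
/* Hypothetical names for the proposed return codes. */
enum af_validate_result {
    AF_VALIDATE_OK       =  0,  /* key validates the page */
    AF_VALIDATE_MISMATCH = -1,  /* key does not validate the page */
    AF_VALIDATE_NO_PAGE  = -2,  /* page does not exist */
    AF_VALIDATE_NO_MD5   = -3   /* page md5 segment does not exist */
};

/* Map a return code to a short message for error reporting. */
static const char *af_validate_strerror(int rc)
{
    switch (rc) {
    case AF_VALIDATE_OK:       return "key validates page";
    case AF_VALIDATE_MISMATCH: return "key does not validate page";
    case AF_VALIDATE_NO_PAGE:  return "page does not exist";
    case AF_VALIDATE_NO_MD5:   return "page md5 does not exist";
    default:                   return "unknown result";
    }
}
```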


AFF Passphrase Encryption
--------------------------
This approach builds upon the Base Encryption, but allows the user to
store a passphrase. Instead of using SHA256 to generate the encryption
key directly, the encryption key is a random 256 bit string. This
string is then encrypted with the passphrase and stored in the AFF
file. 

The scheme could easily support multiple passphrases on each file,
although that may not be useful.

The encrypted encryption key is stored in a new segment: affkey-aes256

The contents of affkey_aes256 are a 68-byte structure:
    bytes 0-3    - Version number. This is version 1. Stored in network byte order.
    bytes 4-35   - The affkey, encrypted with AES in codebook (ECB) mode
                   using SHA-256 of the passphrase as the encryption key.
    bytes 36-67  - The SHA-256 of the affkey (so you can tell when
                   decryption produced the right key).
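
As a sketch, the 68-byte structure could be declared as below. This
assumes a 4-byte version, a 32-byte (256-bit) encrypted affkey, and a
32-byte SHA-256 digest, which together give the stated 68 bytes. The
struct and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: 4-byte version + 32-byte encrypted key + 32-byte digest. */
struct affkey_seg {
    uint8_t version[4];        /* version number, network byte order */
    uint8_t enc_affkey[32];    /* affkey encrypted under SHA-256(passphrase) */
    uint8_t affkey_sha256[32]; /* SHA-256 of the plaintext affkey */
};

_Static_assert(sizeof(struct affkey_seg) == 68, "affkey segment must be 68 bytes");

/* Store the version number big-endian, independent of host byte order. */
static void affkey_set_version(struct affkey_seg *s, uint32_t v)
{
    s->version[0] = (uint8_t)(v >> 24);
    s->version[1] = (uint8_t)(v >> 16);
    s->version[2] = (uint8_t)(v >> 8);
    s->version[3] = (uint8_t)v;
}

/* Read the big-endian version number back. */
static uint32_t affkey_get_version(const struct affkey_seg *s)
{
    return ((uint32_t)s->version[0] << 24) | ((uint32_t)s->version[1] << 16) |
           ((uint32_t)s->version[2] << 8)  |  (uint32_t)s->version[3];
}
```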

With this scheme the passphrase can be changed without requiring the
entire disk image to be re-encrypted---just rewrite affkey-aes256
with a new passphrase.

Advantages:
* Easy to change the key
* The passphrase is not cached in memory.

Disadvantages:
* If you can encrypt, you can decrypt (it's a passphrase).


Proposed API:
af_use_passphrase(af,char *phrase)
    - Tries to use an existing passphrase from an AES-encrypted AFFILE
    - errors if there is no AES-encrypted data to decrypt or if passphrase is wrong.

af_establish_passphrase(af,char *phrase)
    - If no encryption has been used yet, makes a random key and
      stores it encrypted with the passphrase.
    - fails if encryption has been used

af_establish_passphrase_key(af,char *passphrase,char *key,int keylen)
    - Verifies that the key is good (by decrypting existing encrypted data)

af_change_passphrase(af,char *oldphrase,char *newphrase)
    - Validates that oldphrase is correct, then changes it to new phrase.


AFF Public Key Encryption:
--------------------------
This approach is similar to AFF Passphrase Encryption, except that
instead of encrypting the affkey with a passphrase, we encrypt it with
an X.509 public key.

To ease integration, our plan is to use the S/MIME standard and
existing software. S/MIME encrypts messages. The "message" that is
encrypted is a hexadecimal representation of the affkey, e.g.:

     To: nobody@afflib.org

     affkey: 00112233445566778899AABBCCDDEEFF
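
A plaintext message of this shape could be built as sketched below.
The function name is hypothetical, and the use of bare "\n" line
endings is an assumption; an S/MIME library may expect or produce
CRLF.

```c
#include <stdio.h>

/* Hypothetical helper: write the header, a blank line, and the
 * "affkey:" line with the key in uppercase hex.
 * Returns the message length, or -1 if buf is too small. */
static int affkey_format_message(char *buf, size_t buflen,
                                 const unsigned char *key, size_t keylen)
{
    static const char hex[] = "0123456789ABCDEF";
    int n = snprintf(buf, buflen, "To: nobody@afflib.org\n\naffkey: ");
    /* Need room for 2 hex digits per byte, a newline, and the NUL. */
    if (n < 0 || (size_t)n + 2 * keylen + 2 > buflen) return -1;
    size_t pos = (size_t)n;
    for (size_t i = 0; i < keylen; i++) {
        buf[pos++] = hex[key[i] >> 4];
        buf[pos++] = hex[key[i] & 0x0F];
    }
    buf[pos++] = '\n';
    buf[pos] = '\0';
    return (int)pos;
}
```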

This message is then "encrypted" using a standard S/MIME library and
the resulting "email message" is stored in the segment
"smimeaffkey". 

Eventually we will expand this to the signing of individual segments.
This may be done with a single segment with the SHA256 codes of all
the other segments, or it may be done with a signing segment for each
segment. I don't know yet.


Advantages:
  * Easy to implement with existing cryptographic tools.
  * Cleanly handles multiple recipients using S/MIME facilities.

Disadvantages:
  * Use of S/MIME appears silly and off-topic.

API:
For writing:
  af_set_smime_recepients(af,key)
  af_set_smime_signer(af,key)

For reading:
  af_set_smime_reader(af,key)

================================================================
Performance Notes:

When reading encrypted AFF files, specify read buffers that are at least 16
bytes larger than you expect.  This gives the internal routines space
to do the decryption in place. Otherwise additional memory needs to be
allocated and data needs to be copied. 


================================================================

IMPLEMENTATION
==============

AFFLIB encryption will continue to use the cryptographic primitives
provided by the OpenSSL library.

The AFFILE Structure will be modified to include these additional fields:
  AES_KEY ekey    - The OpenSSL AES256 encryption key
  AES_KEY dkey    - The OpenSSL AES256 decryption key


Reading:

Getting pages is currently implemented with a chain of functions:

  af_get_page() - gets the page and decompresses it if necessary.
  af_get_page_raw() - gets raw pages (without compression)
  af_get_seg()      - gets the actual segment


Proposed modification:
 
  If af_get_seg(s1) fails AND if a symmetric encryption key has been
  set, the function will then look for s1/aes. If this is found the
  segment will be decrypted and returned.
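
The lookup order can be sketched against an in-memory stand-in for
the segment store. Everything here is hypothetical (the struct, the
function names, the fixed name buffer); decryption itself is left to
the caller and signalled through the encrypted flag.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the segment store. */
struct seg { const char *name; const char *data; };

static const struct seg *find_seg(const struct seg *segs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(segs[i].name, name) == 0) return &segs[i];
    return NULL;
}

/* Try the plain name first; on a miss, and only if a key is set,
 * try name + "/aes". *encrypted is set to 1 when the returned
 * segment still has to be decrypted. */
static const struct seg *lookup_with_fallback(const struct seg *segs, size_t n,
                                              const char *name, int key_is_set,
                                              int *encrypted)
{
    char aesname[64];
    const struct seg *s = find_seg(segs, n, name);
    *encrypted = 0;
    if (s != NULL || !key_is_set) return s;
    if (snprintf(aesname, sizeof aesname, "%s/aes", name) >= (int)sizeof aesname)
        return NULL;  /* name too long for this sketch's buffer */
    s = find_seg(segs, n, aesname);
    if (s != NULL) *encrypted = 1;
    return s;
}
```

Because the plain name is tried first, an unencrypted page wins when
both forms are present in the file.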

Writing:


Currently pages are written with these functions:

  af_update_page(af,pagenum,data,datalen)
  af_update_seg()

Procedure for writing encrypted pages:

 - Modify af_update_page() to call a new function,
   af_update_page_raw(), which does the encryption.


Other work that needs to be done:

 - Make sure that pages are only written with this function. In
    particular, check out afconvert, aimage, and atest


