An InfinityDB Encrypted Java NoSQL Database is 100% encrypted and 100% authenticated 100% of the time, with hashing, password changing, and signing with multiple certificates. It is otherwise identical to the InfinityDB Embedded Database.
This encryption and security edition is in beta testing – please email firstname.lastname@example.org. Please check out the Sample Code, which shows all of the API.
Transparent Data Encryption for Data at Rest
InfinityDB Encrypted uses ‘Transparent Data Encryption’ or TDE for data at rest to minimize the impact of security. It is necessary only to provide a password on file creation and opening, and all the rest is handled internally. You can combine this TDE with other security measures to increase safety of all data at rest. Because InfinityDB Database uses a single file for all data in a given database, the encryption covers all data at once reliably, even while in use. The encrypted files can be used as backups with no encryption step, and with no decryption on restore. There is no point in time when any unencrypted data reaches storage.
- All stored block data is 100% encrypted 100% of the time according to a password using AES-128 or AES-256
- All stored block data and header is 100% authenticated 100% of the time according to the password using HMAC-SHA256
- Passwords may be changed at any time
- Encryption levels may be selected – strong or regular for export compliance. Future versions will provide more
- File content is fully dynamic – there is no ‘encrypted’ vs ‘unencrypted’ state and open() is fast
- Fast full-file hashing of either encrypted or unencrypted blocks for content and integrity checking is provided
- Optional signing guarantees the overall file content; backups are therefore authenticated and safe from external modification
- Signing algorithms are selectable, including SHA256, SHA3, MD5, or any other hash, combined with RSA, DSA, or ECDSA
- Multiple X509 certificates and their trust chains or bare public keys are stored in the file and organized automatically
- Partial signing by different processes eventually reaches a fully signed state; not all private keys are needed at once
- Custom validation strategies can be implemented by client code – N of M, only validated, distinguished-name based
Compatible with InfinityDB Embedded
- The API is a superset of the InfinityDB Embedded Database API
- There is no performance hit
- Compression is preserved at 1 to 10x or more
- Unencrypted files are compatible; they can still be opened for read or write and stay unencrypted
These features of InfinityDB Encrypted provide vital security for the entire set of Items in the database, which means all of the database content is protected.
Encryption of course prevents unauthorized reading of the data based on a secret ‘Password-Based-Encryption’ or ‘PBE’ password. This password can be short or long to provide ease of memorization or strong security. A PBE password can be changed at any time without needing to re-create the file. The industry standard encryption is used – AES-128 or AES-256.
Authentication and Integrity Protection
The password is also used to protect the integrity of all of the content by means of an ‘HMAC’ hash, which combines the password with a regular secure hash function to identify accidental or intentional corruption of the data every time the relevant portion of the file is read. Because the HMAC is dependent on the PBE password, it cannot be calculated by someone not having the password, preventing impersonation by an attacker. Without the HMAC, encrypted data could be modified externally, and the decrypted data would change in some unpredictable way, with unpredictable results. The HMAC is calculated on each write of a data block and on reading it back, and it is simply verified to have remained unchanged. The industry standard HMAC algorithm is used – HMAC-SHA256.
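The write-then-verify cycle described above can be sketched with the standard Java `Mac` API. This is a minimal illustration, not InfinityDB's internal code: the class name `BlockHmacSketch` and its methods are hypothetical, and the demo key stands in for the real HMAC key that the product derives from the PBE password.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: not part of the InfinityDB API.
final class BlockHmacSketch {
    /** Computes an HMAC-SHA256 tag over the given block bytes. */
    static byte[] tag(byte[] hmacKey, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(hmacKey, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recomputes the tag on read and compares it with the stored one. */
    static boolean verify(byte[] hmacKey, byte[] data, byte[] storedTag) {
        return MessageDigest.isEqual(tag(hmacKey, data), storedTag);
    }

    public static void main(String[] args) {
        byte[] key = "demo-key-not-for-production".getBytes(StandardCharsets.UTF_8);
        byte[] block = "block contents".getBytes(StandardCharsets.UTF_8);
        byte[] stored = tag(key, block);   // computed when the block is written
        block[0] ^= 1;                     // simulate external modification
        System.out.println("modified block accepted: " + verify(key, block, stored));
    }
}
```

Anyone without the key cannot produce a matching tag for altered data, which is why the check detects both accidental and intentional changes.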
A hash algorithm is available with or without the PBE password to quickly identify the contents of the database. This is like a standard file hash but it avoids being dependent on the part of the encrypted file that does not encode the ItemSpace, which is the set of encrypted blocks. If the hash is stored separately, it is easy to recompute it and check that it has not changed. Either the encrypted or the plaintext data blocks can be hashed. Hashing the plaintext blocks is slower and it requires the PBE password, but it verifies the HMAC of every block as a side-effect.
Passwords can be changed even after the file is created. Unlike most file encryption systems, InfinityDB uses a standard two-step ‘AES key wrapping’ mechanism to convert the PBE password into the final data encryption and HMAC keys. With this feature, one can better isolate and secure production, backup, test, or transmitted databases. Passwords can be changed regularly, for example. Changing the PBE password to a large random number and ‘forgetting’ it is a way to effectively ‘delete’ the database.
Signing can be used to verify that the entire database has a good, trusted state. For example, it can be used with backups, so that a database to be restored by being copied into the active system was not corrupted in any way, and was last modified by a trusted client. The trusted client not only had access to the PBE password, thereby proving that they were authorized, but also that client left the database in a state that they wanted to preserve, not an intermediate, experimental, incomplete, or suspected incorrect state. A third-party attacker cannot corrupt the signed database either in a random shotgun attack or in a ‘backup attack’ by using blocks read from pairs of backup databases, or even by obtaining the PBE password and altering the database normally. Inadvertent non-malicious file modifications are detected as well. The signing or signature verification processes read the entire database, checking every byte.
Signatures also avoid the weakness of PBE passwords in that the PBE passwords must be provided to all parties that need read or write access to the database, so they are distributed widely. Instead, private/public key pairs can be used for ‘asymmetrical’ cryptography to make key handling far safer. This is a standard route that uses the X509 certificates also used in SSL/TLS (https) security. The private keys are kept safe by individual participants in their own ways. Each signing participant has a private/public key pair, and their public keys are broadcast directly or indirectly to all participants who wish to ‘verify’ the data to determine whether the data is trustworthy. The public keys may either be used alone (‘bare’ public keys) or they can be vouched for with certificates, and a chain of such certificates signing each other can lead to a ‘root’ certificate that is commonly available and trusted by everyone. With such a ‘trust chain’ it is not necessary for trustors to have direct access to any public keys at all, and the trust rules can depend on the set of certificates in the signature in client-implemented custom ways. There can be multiple signatories, with their certificates persisted in the file itself, some signed, some not. This is based on standard technology, and can provide vital, flexible security.
Here are the implementation features.
Encryption and Integrity Checking
Each underlying file block is separately encrypted with a secure random initialization vector using AES-128 or AES-256. Each block is independently integrity checked with HMAC-SHA256 that covers all other block data. The encryption and HMAC keys are independent and securely randomly generated. The block numbers are encrypted and authenticated per-block as well. Every write of a block changes its stored data completely and seemingly-randomly, even for partial block changes or identical block data. Corruption or truncation of a file is immediately detected on read of a corrupted block.
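As a rough illustration of the per-block scheme described above, the following sketch encrypts a block with a fresh random IV and then authenticates it with HMAC-SHA256. Several details are assumptions, since the text does not specify them: the cipher mode (AES/CTR here), the stored layout (IV, then ciphertext, then tag), and the way the block number is bound in (prepended to the plaintext before encryption). The class `EncryptedBlockSketch` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: mode and layout are assumptions, not InfinityDB's format.
final class EncryptedBlockSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final int IV_LEN = 16, TAG_LEN = 32;

    /** Returns IV || AES(blockNumber || plaintext) || HMAC(IV || ciphertext). */
    static byte[] seal(byte[] encKey, byte[] macKey, long blockNumber, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_LEN];
            RANDOM.nextBytes(iv); // fresh IV on every write
            byte[] body = ByteBuffer.allocate(Long.BYTES + plaintext.length)
                    .putLong(blockNumber).put(plaintext).array();
            byte[] ciphertext = aes(Cipher.ENCRYPT_MODE, encKey, iv, body);
            byte[] tag = hmac(macKey, iv, ciphertext);
            return ByteBuffer.allocate(IV_LEN + ciphertext.length + TAG_LEN)
                    .put(iv).put(ciphertext).put(tag).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks the tag and the block number, then returns the plaintext. */
    static byte[] open(byte[] encKey, byte[] macKey, long expectedBlockNumber, byte[] stored) {
        if (stored.length < IV_LEN + Long.BYTES + TAG_LEN)
            throw new SecurityException("block is truncated");
        try {
            byte[] iv = Arrays.copyOfRange(stored, 0, IV_LEN);
            byte[] ciphertext = Arrays.copyOfRange(stored, IV_LEN, stored.length - TAG_LEN);
            byte[] tag = Arrays.copyOfRange(stored, stored.length - TAG_LEN, stored.length);
            if (!MessageDigest.isEqual(tag, hmac(macKey, iv, ciphertext)))
                throw new SecurityException("block failed integrity check");
            ByteBuffer body = ByteBuffer.wrap(aes(Cipher.DECRYPT_MODE, encKey, iv, ciphertext));
            if (body.getLong() != expectedBlockNumber)
                throw new SecurityException("block is not at its original position");
            byte[] plaintext = new byte[body.remaining()];
            body.get(plaintext);
            return plaintext;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] aes(int mode, byte[] key, byte[] iv, byte[] in)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(in);
    }

    private static byte[] hmac(byte[] key, byte[] iv, byte[] ciphertext)
            throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        mac.update(iv);
        return mac.doFinal(ciphertext);
    }

    public static void main(String[] args) {
        byte[] encKey = new byte[16], macKey = new byte[32];
        RANDOM.nextBytes(encKey);
        RANDOM.nextBytes(macKey);
        byte[] stored = seal(encKey, macKey, 42L, new byte[] {1, 2, 3});
        System.out.println(Arrays.toString(open(encKey, macKey, 42L, stored)));
    }
}
```

Because the IV is regenerated on each call, sealing identical data twice yields different stored bytes, matching the behavior described for rewritten blocks.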
A global SHA256-based hash can be calculated quickly on demand, dependent on only the block data and the file’s logical length, i.e. the ItemSpace content. Either the encrypted or plaintext blocks may be hashed. The encrypted block hash is what is actually signed. Databases that start out different will have different hashes, but a given database keeps the same hash for as long as its content is not modified. The encrypted hash is very fast, and does not require the password. The unencrypted hash requires the password and is slower, but it checks the HMAC of every block. Both hashes will detect file truncation.
The hash algorithm is not guaranteed to remain unchanged: if, for example, SHA256 is compromised, or for other reasons, future InfinityDB Encrypted versions will include additional algorithms.
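A content hash of this kind can be approximated with `MessageDigest`. The sketch below is an assumption about the general shape only: it feeds the logical length and then each block into SHA-256, whereas the actual InfinityDB algorithm and ordering are not documented here. `ContentHashSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the real hash layout is not specified in this document.
final class ContentHashSketch {
    /** Hashes the logical length followed by every block, in order. */
    static byte[] hash(List<byte[]> blocks, long logicalLength) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            sha.update(ByteBuffer.allocate(Long.BYTES).putLong(logicalLength).array());
            for (byte[] block : blocks) {
                sha.update(block);
            }
            return sha.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> blocks = Arrays.asList(new byte[] {1, 2}, new byte[] {3, 4});
        byte[] full = hash(blocks, 4);
        byte[] truncated = hash(blocks.subList(0, 1), 2);
        System.out.println("truncation changes hash: " + !Arrays.equals(full, truncated));
    }
}
```

Including the logical length in the digest is one simple way to make truncation visible even when the remaining blocks are untouched.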
The password technology used is the well-established standard ‘key wrapping’, in which the PBE password is converted to a ‘key-encryption key’ or ‘KEK’ internally. The AES-128-based KEK encrypts the final data encryption and HMAC keys producing a ‘wrapped’ key stored in the file. The actual data encryption and HMAC keys are permanent long secure random numbers but do not occur anywhere in the file. They are derived when needed from the PBE password and some data in the file: a 32-byte random salt plus the wrapped key. The PBE password is only kept in memory momentarily before being zeroed after it is used to determine the data encryption key and HMAC key, which then remain in memory while the file is open. Java cannot guarantee that data in memory will not be copied by the garbage collector, so the PBE password should be zeroed as quickly as possible. In principle, the longer-lived encryption and HMAC keys are at risk of being exposed by a memory dump as well, as with almost any crypto system, so such dumps must be kept secret or zeroed. These internal ‘hidden’ keys can be destroyed by destroySensitiveData() and destroyAllPrivateKeys().
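The key-wrapping flow above, including a password change, can be sketched with standard JCE classes. The key-derivation function (PBKDF2 with HMAC-SHA256) and the iteration count are assumptions, since the text names only the salt size and the AES-128 KEK; `KeyWrapSketch` and its methods are hypothetical and are not the InfinityDB API.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: KDF choice and iteration count are assumptions.
final class KeyWrapSketch {
    private static final int ITERATIONS = 100_000;

    /** Derives a 128-bit AES key-encryption key, then zeroes the password array. */
    static SecretKey deriveKek(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 128);
        try {
            byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
            return new SecretKeySpec(raw, "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } finally {
            spec.clearPassword();
            Arrays.fill(password, '\0'); // keep the password in memory only briefly
        }
    }

    /** Wraps the permanent data key; only this wrapped form would be stored. */
    static byte[] wrap(SecretKey kek, SecretKey dataKey) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.WRAP_MODE, kek);
            return cipher.wrap(dataKey);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recovers the data key; fails if the KEK (that is, the password) is wrong. */
    static SecretKey unwrap(SecretKey kek, byte[] wrapped) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.UNWRAP_MODE, kek);
            return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("wrong password or corrupt wrapped key", e);
        }
    }

    /** Changes the password by re-wrapping; the data key and the blocks are untouched. */
    static byte[] changePassword(char[] oldPassword, byte[] oldSalt,
                                 char[] newPassword, byte[] newSalt, byte[] wrapped) {
        SecretKey dataKey = unwrap(deriveKek(oldPassword, oldSalt), wrapped);
        return wrap(deriveKek(newPassword, newSalt), dataKey);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] salt = new byte[32];
        new SecureRandom().nextBytes(salt);
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        byte[] wrapped = wrap(deriveKek("old password".toCharArray(), salt), generator.generateKey());
        byte[] rewrapped = changePassword("old password".toCharArray(), salt,
                "new password".toCharArray(), salt, wrapped);
        System.out.println("wrapped key bytes: " + rewrapped.length);
    }
}
```

The design point is that a password change rewrites only the small wrapped key, which is why it does not require re-encrypting or re-creating the file.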
A database file can be signed in order to ensure that its entire contents are as expected and not corrupted. Each time the file is signed, a hash of the database content is computed, then a certificate or bare public key along with a private key is used to compute a signature over that hash and then the signature is written into the file in a header. Later, signature verification uses the certificate or public key again to verify the header data, and then the header hash is compared with a re-computed hash.
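The sign-then-verify sequence can be illustrated with `java.security.Signature`. This sketch assumes SHA256 with RSA and a bare key pair generated on the spot; `HashSigningSketch` is hypothetical, and header storage is represented only by passing the stored hash and signature as arguments.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;

// Hypothetical sketch: not the InfinityDB signing API.
final class HashSigningSketch {
    /** Signs the content hash with the signer's private key. */
    static byte[] sign(PrivateKey privateKey, byte[] contentHash) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(contentHash);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Verifies the stored signature, then compares stored and recomputed hashes. */
    static boolean verify(PublicKey publicKey, byte[] storedHash, byte[] signature,
                          byte[] recomputedHash) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(storedHash);
            return verifier.verify(signature)
                    && MessageDigest.isEqual(storedHash, recomputedHash);
        } catch (SignatureException e) {
            return false; // malformed signature bytes
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Generates a throwaway RSA key pair for demonstration. */
    static KeyPair newDemoKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newDemoKeyPair();
        byte[] hash = new byte[32]; // stand-in for a real content hash
        byte[] signature = sign(pair.getPrivate(), hash);
        System.out.println("verified: " + verify(pair.getPublic(), hash, signature, hash.clone()));
    }
}
```

Verification needs only the public key and a recomputed hash, which is consistent with signature checking not requiring the encryption password.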
Multiple certificate chains and bare public keys can co-exist in the file header in ‘SignatureInfo’s. Certificate organization features like duplicate certificate path elimination, trust chain signing sequence checking, and chain sorting on signing sequence are provided. Each SignatureInfo also designates a signing hashing algorithm that further identifies it, such as SHA256 or MD5.
Signing and signature verification hash the full set of encrypted data blocks at high speed. SignatureInfos can be in either signed or unsigned state, and the state persists until block data actually changes, so multiple signers do not need to have the file open at once. Signing requires only providing the signer’s private key or keys and then invoking sign(), and the private key or keys are automatically matched to the public keys of certificates that become signed. If multiple private keys are provided, the signing process shares a single hash computation. Signature verification does not require the encryption password. Signature verification by default requires that all certificates are signed.
Signature Certificate Validation Strategies
Certificate paths in the SignatureInfos can be validated based on a set of trusted certificates. External storage or availability of signing certificate paths after they are put in the file is not necessary: only private keys for signing are needed, and for signature verification, only trusted public keys or trusted intermediate or root certificates are needed.
Signature verification by default requires that all certificates are signed. However, the full set of SignatureInfos or the signed SignatureInfos can be retrieved from the file and enumerated by client code, so verification can use client-implemented strategies like ‘any signature based on this public key is enough’, ‘any N signatures are enough’, or ‘any validated signature certificates having certain distinguished name patterns are enough’. Signature certificates can be validated without the password.
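A client-implemented strategy such as ‘any N signatures with matching distinguished names’ might look like the sketch below. The `SignerEntry` type is a hypothetical stand-in for whatever client code extracts from each SignatureInfo; it is not the InfinityDB class, and the real accessor names are not shown in this document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: SignerEntry is a stand-in, not the InfinityDB SignatureInfo class.
final class ValidationStrategySketch {
    /** Minimal view of one signature entry as enumerated by client code. */
    static final class SignerEntry {
        final String distinguishedName;
        final boolean signed;

        SignerEntry(String distinguishedName, boolean signed) {
            this.distinguishedName = distinguishedName;
            this.signed = signed;
        }
    }

    /** Accepts the file if at least n signed entries match the name pattern. */
    static boolean atLeastN(List<SignerEntry> entries, int n, Pattern namePattern) {
        long matching = entries.stream()
                .filter(entry -> entry.signed)
                .filter(entry -> namePattern.matcher(entry.distinguishedName).find())
                .count();
        return matching >= n;
    }

    public static void main(String[] args) {
        List<SignerEntry> entries = Arrays.asList(
                new SignerEntry("CN=Alice,O=Example", true),
                new SignerEntry("CN=Bob,O=Example", false),
                new SignerEntry("CN=Carol,O=Example", true));
        System.out.println(atLeastN(entries, 2, Pattern.compile("O=Example")));
    }
}
```

In a real deployment the `signed` flag would only be trusted after the corresponding certificate path has been validated against the trusted set.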
Other databases have tried to retrofit security, but there are so many remaining issues that the security is still weak or non-existent. Applications, users, and DBAs have to pay considerable attention in order to provide credible protection. It is necessary to pay attention to text log files, transaction logs, backups, slaves, temporary files like sort areas, index content, and any other kind of dangling dumps or copies. In fact, in most environments the data can become so distributed and duplicated that credible security is almost impossible. Encrypted disks and file systems are helpful, but there is no enforcement for their use, and data can ‘leak’ out, even just because files are not zeroed or ‘bleached’ before deletion. Rewriting applications to handle security at the application level can be very complex and can impose burdens on everyone. In any case, security can be improved by adding or switching to InfinityDB Encrypted.
- Enveloping. This allows the database to be accessible only to selected accessors i.e. database recipients based on public/private key pairs. A set of Envelope certificates are kept in the file, and for each, there is a copy of the PBE password in the file encrypted with the Envelope certificate’s public key. The PBE password can be decrypted with the private key for an envelope certificate. So, instead of keeping track of potentially many PBE passwords, one per file, a given private key can be used to access a whole set of files that have the proper envelope certificate. Since the PBE passwords are no longer necessarily exposed externally, they can be very long and strong, even non-user readable. Because a given file can have multiple envelope certificates, access control is flexible and tight, yet simple. The private keys can be kept tightly secure within each recipient, while the PBE password would tend to be more widely available, since it is a symmetric secret key.
- Multi-threaded hashing of encrypted data blocks, also used for signing and signature verification, to reach very high speeds.
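The enveloping idea in the list above, where a copy of the PBE password is encrypted under each recipient's public key, can be sketched with RSA-OAEP from the standard JCE. The padding choice, the key size, and the `EnvelopeSketch` class are all assumptions made for illustration; the document does not specify how InfinityDB implements enveloping.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical sketch: padding and key size are assumptions, not InfinityDB's design.
final class EnvelopeSketch {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Encrypts the PBE password for one recipient's public key. */
    static byte[] sealPassword(PublicKey recipient, byte[] password) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, recipient);
            return cipher.doFinal(password);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recovers the PBE password with the recipient's private key. */
    static byte[] openPassword(PrivateKey recipient, byte[] envelope) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, recipient);
            return cipher.doFinal(envelope);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("not a recipient of this envelope", e);
        }
    }

    /** Generates a throwaway RSA key pair for demonstration. */
    static KeyPair newDemoKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair recipient = newDemoKeyPair();
        byte[] password = "a long machine-generated password".getBytes(StandardCharsets.UTF_8);
        byte[] envelope = sealPassword(recipient.getPublic(), password);
        byte[] opened = openPassword(recipient.getPrivate(), envelope);
        System.out.println(new String(opened, StandardCharsets.UTF_8));
    }
}
```

One envelope per certificate would be stored, so adding or removing a recipient changes only the set of envelopes and not the encrypted blocks.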
The implementation uses an underlying ‘shim’ called EncryptedRandomAccessFile that provides its overlying InfinityDB database with a logical GeneralizedRandomAccessFile, while physically storing the data as encrypted blocks in a normal RandomAccessFile. The InfinityDB-specific GeneralizedRandomAccessFile is necessary instead of a subclass of RandomAccessFile, because the latter cannot be subclassed (this is considered an original mistake in Java – InputStream and OutputStream are OK though).
The EncryptedRandomAccessFile also contains a ‘header’ before the encrypted blocks that describes the file state, and which contains structure for future extensions, signature information and eventually information for ‘enveloping’. The header itself is variable-length but has a limited fixed space at the front of a particular file – if an attempt is made to write too much data into that space, an IOException is thrown, but the file is still usable in its previous state. Currently the size is fixed at 100K, which should be plenty, but later it will be settable on create(). The header can change without the hash being changed.
InfinityDB Encrypted has been tested with the Sun security provider as well as Bouncy Castle. Bouncy Castle is the main alternative to Sun, and it adds many features, such as a complete certificate generation capability.
For info and suggestions please email email@example.com.