The notices below describe the operating procedures required for absolute safety.
1. Avoid uncontrolled power loss during the (usually short) periods while disk writes are occurring. Just observe these simple rules:
- Enterprise systems: do not run your storage arrays in write-back mode unless they have battery backup with write-through fallback. Without that protection, use an uninterruptible power supply and never pull the plug without doing a clean shutdown.
- Consumer systems: do not pull the plug on a desktop system that lacks an uninterruptible power supply while any application, including InfinityDB, is in the short period when it is writing data to disk. (You can reconfigure the drives for write-through to make the system completely safe.)
2. Use cold backups. Do not try to back up an InfinityDB file during the (usually short) period while data is being written to disk during a commit or a spill of dirty data. Simply tell your application to stop doing commits during the backup. A hot backup is not guaranteed to be usable.
The sections below discuss these issues in full; reading them is optional.
Write Through vs Write Back — The Disk Caching Options Provided by Hardware
There are important drive hardware settings to understand in order to make sure that InfinityDB, or indeed almost any other application, can write data to disk absolutely safely. Boiler Bay recommends that servers either be set to Write Through mode or use battery-backed Write Back mode with write-through fallback, but a fuller explanation is warranted. The problem is related only to power failure; InfinityDB is safe against every other kind of failure.
Disk Write Through mode directs data to be written to disk immediately when the software requests it (for example via fsync). This means that when your InfinityDB client code calls the InfinityDB Commit function, the pending changes are forced to disk at once, and InfinityDB considers the Commit complete only after that has occurred. However, if the storage hardware of your Windows, Unix, Linux, or other system is set up to “Write Back” data and you ask InfinityDB to Commit, you are not guaranteed that your data will be written to disk immediately, even though InfinityDB will be told that the Commit (an fsync) has succeeded. Instead, write-back mode may cause the data to be written to disk a bit later, depending on how the particular disk or RAID array is being used. If you recommend or endorse the use of Write Back mode and you want absolute reliability, you must warn your customers that battery backup hardware for the storage system must be in place and operational.
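To make the fsync step concrete, here is a minimal Java sketch of a durable write using only the standard java.nio API. It is not InfinityDB's own commit code; the class and method names (DurableWrite, writeDurably) are hypothetical and chosen for illustration. The point is that force(true) can only ask the operating system to flush: with a non-battery-backed write-back cache, the call may return successfully while the data is still in volatile cache.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of InfinityDB.
final class DurableWrite {

    /** Writes data to path and requests a flush to the storage device before returning. */
    static void writeDurably(Path path, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            // A single write() call may be partial, so loop until the buffer is drained.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // force(true) flushes file content and metadata, analogous to fsync.
            // If the drive or controller is in write-back mode without battery
            // backup, this may still return before the data is on stable media.
            channel.force(true);
        }
    }
}
```

An application would call DurableWrite.writeDurably(path, bytes) and treat the data as committed only after the call returns, which is only as trustworthy as the storage hardware's handling of the flush.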
However, if you have battery backup for the disk storage in place and the hardware determines that the battery is no longer functional, it will normally be set to revert to Write Through mode. That way, data that your client application asks InfinityDB to commit is directed to disk immediately rather than entrusted to the Write Back cache for later writing, and safety is ensured. One sign that a storage system has reverted to write-through mode because of battery failure is a drop in overall performance, because write-through is much slower than write-back. Normally, however, an enterprise system will alert the system administrator to the battery backup failure.
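The performance drop described above can be observed with a rough timing probe. The following Java sketch measures the mean time of a small write followed by a flush; the class and method names (SyncLatencyProbe, meanSyncNanos) are hypothetical, and the 4 KiB block size is an arbitrary assumption. It is only an indicator: absolute numbers vary widely between systems, so a result is meaningful only when compared with an earlier baseline taken on the same hardware.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical probe for illustration; not part of InfinityDB.
final class SyncLatencyProbe {

    /** Returns the mean nanoseconds taken by one 4 KiB write plus force(true). */
    static long meanSyncNanos(Path directory, int rounds) throws IOException {
        if (rounds <= 0) {
            throw new IllegalArgumentException("rounds must be positive");
        }
        // The probe file lives in the directory (and so on the volume) being tested.
        Path probe = Files.createTempFile(directory, "sync-probe", ".tmp");
        try (FileChannel channel = FileChannel.open(probe, StandardOpenOption.WRITE)) {
            ByteBuffer block = ByteBuffer.allocate(4096);
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                block.clear();            // reset position so the whole block is written
                channel.write(block, 0);  // overwrite the same block at offset 0
                channel.force(true);      // request a flush, analogous to fsync
            }
            return (System.nanoTime() - start) / rounds;
        } finally {
            Files.deleteIfExists(probe);  // runs after the channel has been closed
        }
    }
}
```

A sharp rise in the value returned for the same directory, compared with earlier runs, is consistent with a fallback to write-through, although other causes such as heavy disk load produce the same symptom.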
The problem occurs only on a hard power failure while data is actually being written, because even a write-back system forces data to disk in the background as quickly as possible. A power failure with non-battery-backed write-back is therefore a catastrophe for any system that is in the process of writing to disk, including any write to the file system, because software has no way to control which subset of the dirty blocks in the storage block cache actually gets written. The storage system takes advantage of its cache by reordering block writes to maximize performance. (Low-level SAS and SATA features called ‘tagging’, accessible to the file system layer, can prevent some reordering so that journaling file systems are more reliable.)
Note that any part of the operating system, file system, or application software is subject to these power-failure considerations. Because hard power failures are so dangerous, operating systems such as Linux provide system commands for orderly shutdown, and if these are not used, not only InfinityDB and application data but the file system itself may in principle be corrupted. These operating systems, file systems, and DBMSs do try hard to avoid the problem, but they cannot guarantee success. For example, the fsck Unix/Linux command may find errors even on file systems that use logging or journaling. Note that consumer computer disk systems are natively write-back for speed and normally have no battery-backed disk hardware. For desktop consumers, an uninterruptible power supply (UPS) mitigates the problem. The window of vulnerability is normally very short, but it may be extended if the application performs continuous updates and commits over a long period.
Create Cold Backups of the InfinityDB Database File
It is suggested that you back up your InfinityDB file frequently, as you would any other data file. However, you must be sure that no writes are in progress while the InfinityDB database file is being backed up. If you attempt to back up a database while it is writing, such as during a commit or a spill of dirty data to disk, you will not be copying valid data, and the copy of the database may be corrupted. Such corruption of the copy may not show up immediately; it may appear only after an unpredictable length of time, depending on which data is accessed later, at which point an Exception may be thrown. The Exception is not guaranteed and may be very confusing. A full read of the database will reveal some errors, and it is easy to write a simple program that scans all Items in the database. The original database from which the copy was made will never be corrupted.
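One way an application might stop its writes during a backup is sketched below in Java. This is an assumption about application design, not an InfinityDB feature: the class ColdBackupGate and its methods are hypothetical, and the sketch works only if the application routes every commit through the gate (and, to be conservative, any update that could trigger a spill of dirty data). Writers share a read lock so they can proceed concurrently, while the backup takes the exclusive write lock and therefore runs only when no gated write is in progress.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical application-level gate for illustration; not part of InfinityDB.
final class ColdBackupGate {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /**
     * Runs a commit (or other database-modifying action) supplied by the
     * application. Blocks while a backup is in progress.
     */
    void commit(Runnable commitAction) {
        lock.readLock().lock();   // shared: many writers may hold this at once
        try {
            commitAction.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Copies the database file while all gated writes are held off. */
    void backup(Path databaseFile, Path backupFile) throws IOException {
        lock.writeLock().lock();  // exclusive: waits for in-flight gated writes to finish
        try {
            Files.copy(databaseFile, backupFile, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The application would wrap its commit call in gate.commit(...) and schedule gate.backup(dbPath, backupPath) whenever a backup is due. Any write that bypasses the gate defeats the protection, so the copy should still be checked with a full read before it is relied on.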