wiki3227: BlockRetirementInETFS (Version 1) |
Block Retirement in ETFS

Overview

The ETFS filesystem is most commonly paired with NAND flash. One characteristic of NAND flash is that blocks may wear out over time. This new feature is intended to allow ETFS to "retire" blocks from service before they fail completely. Because the ETFS filesystem, and the flash driver that it is paired with, have a direct view into the health and activity of the NAND flash, they are uniquely suited to identify blocks as they wear out and to move data elsewhere so that the soon-to-be-bad block is no longer used.

Requirements
Design

Because ETFS is hardware-agnostic, it is up to the "devio" layer to tell ETFS when a block is failing. How the devio layer determines that a block is failing is outside the scope of this change.

Once ETFS receives notice from the devio layer that a block is failing, it will copy data from the failing block to other locations within the partition. Some conditions may prevent all of the data from being saved:
In the above cases, filesystem damage is unavoidable. After data has been successfully saved from the unhealthy block, the ETFS filesystem will:
In the current design, the ETFS library is notified of a failing block, and retires it, synchronously with a media operation. The media operation is blocked while the block retirement is in progress. If the media operation was part of a client I/O request, the client request is also blocked during this time. If an unhealthy block is detected during the startup scan, the retirement procedure is deferred until the startup scan is complete. Any blocks flagged for retirement in this way are processed before the partition is mounted. |
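The control flow described above can be illustrated with a minimal, self-contained C sketch. It is not the ETFS or devio implementation: every name here (`part_t`, `notify_failing`, `retire_block`, `finish_scan_and_mount`) is hypothetical, the partition is an in-memory array, and the choice to leave a block in service when no free block is available is an assumption made only to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names are hypothetical; this is not the ETFS or devio API. */
#define NUM_BLOCKS 8
#define BLOCK_SIZE 16

typedef enum { BLK_FREE = 0, BLK_IN_USE, BLK_RETIRED } blk_state_t;
typedef enum { RETIRE_OK = 0, RETIRE_NO_SPACE, RETIRE_BAD_ARG } retire_rc_t;

typedef struct {
    blk_state_t   state;
    unsigned char data[BLOCK_SIZE];
} blk_t;

typedef struct {
    blk_t  blocks[NUM_BLOCKS];
    bool   scanning;             /* true while the startup scan is running */
    bool   mounted;
    int    pending[NUM_BLOCKS];  /* blocks flagged during the startup scan */
    size_t npending;
} part_t;

/* Copy any live data out of a failing block, then take it out of service. */
static retire_rc_t retire_block(part_t *p, int bad)
{
    assert(p != NULL);
    if (bad < 0 || bad >= NUM_BLOCKS)
        return RETIRE_BAD_ARG;
    if (p->blocks[bad].state == BLK_RETIRED)
        return RETIRE_OK;                    /* already out of service */

    if (p->blocks[bad].state == BLK_IN_USE) {
        int dst = -1;
        for (int i = 0; i < NUM_BLOCKS; i++) {
            if (i != bad && p->blocks[i].state == BLK_FREE) {
                dst = i;
                break;
            }
        }
        if (dst < 0)
            return RETIRE_NO_SPACE;          /* sketch-only policy: data cannot be saved */
        memcpy(p->blocks[dst].data, p->blocks[bad].data, BLOCK_SIZE);
        p->blocks[dst].state = BLK_IN_USE;
    }
    p->blocks[bad].state = BLK_RETIRED;
    return RETIRE_OK;
}

/* Stand-in for the devio layer reporting a failing block during a media
 * operation. The caller stays blocked until this function returns. */
static void notify_failing(part_t *p, int blk)
{
    assert(p != NULL);
    if (p->scanning) {
        /* Defer: remember the block once, retire it after the scan. */
        for (size_t i = 0; i < p->npending; i++)
            if (p->pending[i] == blk)
                return;
        if (p->npending < NUM_BLOCKS)
            p->pending[p->npending++] = blk;
        return;
    }
    (void)retire_block(p, blk);              /* synchronous retirement */
}

/* End the startup scan, process deferred retirements, then mount. */
static void finish_scan_and_mount(part_t *p)
{
    assert(p != NULL);
    p->scanning = false;
    for (size_t i = 0; i < p->npending; i++)
        (void)retire_block(p, p->pending[i]);
    p->npending = 0;
    p->mounted = true;
}
```

In this sketch, a notification that arrives while `scanning` is set only records the block number. The data is moved, and the block marked retired, inside `finish_scan_and_mount` before `mounted` becomes true. A notification after mounting retires the block before `notify_failing` returns, which mirrors the blocking behavior described for media operations and client I/O requests.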