diff options
Diffstat (limited to 'Documentation/mm/swap-table.rst')
-rw-r--r-- | Documentation/mm/swap-table.rst | 69 |
1 files changed, 69 insertions, 0 deletions
diff --git a/Documentation/mm/swap-table.rst b/Documentation/mm/swap-table.rst new file mode 100644 index 000000000000..da10bb7a0dc3 --- /dev/null +++ b/Documentation/mm/swap-table.rst @@ -0,0 +1,69 @@ +.. SPDX-License-Identifier: GPL-2.0 + +:Author: Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com> + +========== +Swap Table +========== + +Swap table implements swap cache as a per-cluster swap cache value array. + +Swap Entry +---------- + +A swap entry contains the information required to serve the anonymous page +fault. + +Swap entry is encoded as two parts: swap type and swap offset. + +The swap type indicates which swap device to use. +The swap offset is the offset of the swap file to read the page data from. + +Swap Cache +---------- + +Swap cache is a map to look up folios using swap entry as the key. The result +value can have three possible types depending on which stage of this swap entry +was in. + +1. NULL: This swap entry is not used. + +2. folio: A folio has been allocated and bound to this swap entry. This is + the transient state of swap out or swap in. The folio data can be in + the folio or swap file, or both. + +3. shadow: The shadow contains the working set information of the swapped + out folio. This is the normal state for a swapped out page. + +Swap Table Internals +-------------------- + +The previous swap cache is implemented by XArray. The XArray is a tree +structure. Each lookup will go through multiple nodes. Can we do better? + +Notice that most of the time when we look up the swap cache, we are either +in a swap in or swap out path. We should already have the swap cluster, +which contains the swap entry. + +If we have a per-cluster array to store swap cache value in the cluster. +Swap cache lookup within the cluster can be a very simple array lookup. + +We give such a per-cluster swap cache value array a name: the swap table. + +A swap table is an array of pointers. Each pointer is the same size as a +PTE. The size of a swap table for one swap cluster typically matches a PTE +page table, which is one page on modern 64-bit systems. + +With swap table, swap cache lookup can achieve great locality, simpler, +and faster. + +Locking +------- + +Swap table modification requires taking the cluster lock. If a folio +is being added to or removed from the swap table, the folio must be +locked prior to the cluster lock. After adding or removing is done, the +folio shall be unlocked. + +Swap table lookup is protected by RCU and atomic read. If the lookup +returns a folio, the user must lock the folio before use. |