diff options
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/erofs.rst | 12 | ||||
-rw-r--r-- | Documentation/filesystems/ext4/orphan.rst | 44 | ||||
-rw-r--r-- | Documentation/filesystems/f2fs.rst | 2 | ||||
-rw-r--r-- | Documentation/filesystems/fscrypt.rst | 83 | ||||
-rw-r--r-- | Documentation/filesystems/index.rst | 1 | ||||
-rw-r--r-- | Documentation/filesystems/locks.rst | 17 | ||||
-rw-r--r-- | Documentation/filesystems/netfs_library.rst | 2 | ||||
-rw-r--r-- | Documentation/filesystems/proc.rst | 26 |
8 files changed, 107 insertions, 80 deletions
diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst index b97579b7d8fbc..01df283c7d04f 100644 --- a/Documentation/filesystems/erofs.rst +++ b/Documentation/filesystems/erofs.rst @@ -19,9 +19,10 @@ It is designed as a better filesystem solution for the following scenarios: immutable and bit-for-bit identical to the official golden image for their releases due to security and other considerations and - - hope to save some extra storage space with guaranteed end-to-end performance - by using reduced metadata and transparent file compression, especially - for those embedded devices with limited memory (ex, smartphone); + - hope to minimize extra storage space with guaranteed end-to-end performance + by using compact layout, transparent file compression and direct access, + especially for those embedded devices with limited memory and high-density + hosts with numerous containers; Here is the main features of EROFS: @@ -51,7 +52,9 @@ Here is the main features of EROFS: - Support POSIX.1e ACLs by using xattrs; - Support transparent data compression as an option: - LZ4 algorithm with the fixed-sized output compression for high performance. + LZ4 algorithm with the fixed-sized output compression for high performance; + + - Multiple device support for multi-layer container images. The following git tree provides the file system user-space tools under development (ex, formatting tool mkfs.erofs): @@ -87,6 +90,7 @@ cache_strategy=%s Select a strategy for cached decompression from now on: dax={always,never} Use direct access (no page cache). See Documentation/filesystems/dax.rst. dax A legacy option which is an alias for ``dax=always``. +device=%s Specify a path to an extra device to be used together. =================== ========================================================= On-disk details diff --git a/Documentation/filesystems/ext4/orphan.rst b/Documentation/filesystems/ext4/orphan.rst index bb19ecd1b6269..03cca178864bb 100644 --- a/Documentation/filesystems/ext4/orphan.rst +++ b/Documentation/filesystems/ext4/orphan.rst @@ -12,41 +12,31 @@ track the inode as orphan so that in case of crash extra blocks allocated to the file get truncated. Traditionally ext4 tracks orphan inodes in a form of single linked list where -superblock contains the inode number of the last orphan inode (s\_last\_orphan +superblock contains the inode number of the last orphan inode (s_last_orphan field) and then each inode contains inode number of the previously orphaned -inode (we overload i\_dtime inode field for this). However this filesystem +inode (we overload i_dtime inode field for this). However this filesystem global single linked list is a scalability bottleneck for workloads that result in heavy creation of orphan inodes. When orphan file feature -(COMPAT\_ORPHAN\_FILE) is enabled, the filesystem has a special inode -(referenced from the superblock through s\_orphan_file_inum) with several +(COMPAT_ORPHAN_FILE) is enabled, the filesystem has a special inode +(referenced from the superblock through s_orphan_file_inum) with several blocks. Each of these blocks has a structure: -.. list-table:: - :widths: 8 8 24 40 - :header-rows: 1 - - * - Offset - - Type - - Name - - Description - * - 0x0 - - Array of \_\_le32 entries - - Orphan inode entries - - Each \_\_le32 entry is either empty (0) or it contains inode number of - an orphan inode. - * - blocksize - 8 - - \_\_le32 - - ob\_magic - - Magic value stored in orphan block tail (0x0b10ca04) - * - blocksize - 4 - - \_\_le32 - - ob\_checksum - - Checksum of the orphan block. +============= ================ =============== =============================== +Offset Type Name Description +============= ================ =============== =============================== +0x0 Array of Orphan inode Each __le32 entry is either + __le32 entries entries empty (0) or it contains + inode number of an orphan + inode. +blocksize-8 __le32 ob_magic Magic value stored in orphan + block tail (0x0b10ca04) +blocksize-4 __le32 ob_checksum Checksum of the orphan block. +============= ================ =============== =============================== When a filesystem with orphan file feature is writeably mounted, we set -RO\_COMPAT\_ORPHAN\_PRESENT feature in the superblock to indicate there may +RO_COMPAT_ORPHAN_PRESENT feature in the superblock to indicate there may be valid orphan entries. In case we see this feature when mounting the filesystem, we read the whole orphan file and process all orphan inodes found there as usual. When cleanly unmounting the filesystem we remove the -RO\_COMPAT\_ORPHAN\_PRESENT feature to avoid unnecessary scanning of the orphan +RO_COMPAT_ORPHAN_PRESENT feature to avoid unnecessary scanning of the orphan file and also make the filesystem fully compatible with older kernels. diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst index 09de6ebbbdfa2..6f3c6e91346df 100644 --- a/Documentation/filesystems/f2fs.rst +++ b/Documentation/filesystems/f2fs.rst @@ -283,7 +283,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab For other files, we can still enable compression via ioctl. Note that, there is one reserved special extension '*', it can be set to enable compression for all files. -nocompress_extension=%s Support adding specified extension, so that f2fs can disable +nocompress_extension=%s Support adding specified extension, so that f2fs can disable compression on those corresponding files, just contrary to compression extension. If you know exactly which files cannot be compressed, you can use this. The same extension name can't appear in both compress and nocompress diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst index 0eb799d9d05a2..4d5d50dca65c6 100644 --- a/Documentation/filesystems/fscrypt.rst +++ b/Documentation/filesystems/fscrypt.rst @@ -77,11 +77,11 @@ Side-channel attacks fscrypt is only resistant to side-channel attacks, such as timing or electromagnetic attacks, to the extent that the underlying Linux -Cryptographic API algorithms are. If a vulnerable algorithm is used, -such as a table-based implementation of AES, it may be possible for an -attacker to mount a side channel attack against the online system. -Side channel attacks may also be mounted against applications -consuming decrypted data. +Cryptographic API algorithms or inline encryption hardware are. If a +vulnerable algorithm is used, such as a table-based implementation of +AES, it may be possible for an attacker to mount a side channel attack +against the online system. Side channel attacks may also be mounted +against applications consuming decrypted data. Unauthorized file access ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -176,11 +176,11 @@ Master Keys Each encrypted directory tree is protected by a *master key*. Master keys can be up to 64 bytes long, and must be at least as long as the -greater of the key length needed by the contents and filenames -encryption modes being used. For example, if AES-256-XTS is used for -contents encryption, the master key must be 64 bytes (512 bits). Note -that the XTS mode is defined to require a key twice as long as that -required by the underlying block cipher. +greater of the security strength of the contents and filenames +encryption modes being used. For example, if any AES-256 mode is +used, the master key must be at least 256 bits, i.e. 32 bytes. A +stricter requirement applies if the key is used by a v1 encryption +policy and AES-256-XTS is used; such keys must be 64 bytes. To "unlock" an encrypted directory tree, userspace must provide the appropriate master key. There can be any number of master keys, each @@ -1135,6 +1135,50 @@ where applications may later write sensitive data. It is recommended that systems implementing a form of "verified boot" take advantage of this by validating all top-level encryption policies prior to access. +Inline encryption support +========================= + +By default, fscrypt uses the kernel crypto API for all cryptographic +operations (other than HKDF, which fscrypt partially implements +itself). The kernel crypto API supports hardware crypto accelerators, +but only ones that work in the traditional way where all inputs and +outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can +take advantage of such hardware, but the traditional acceleration +model isn't particularly efficient and fscrypt hasn't been optimized +for it. + +Instead, many newer systems (especially mobile SoCs) have *inline +encryption hardware* that can encrypt/decrypt data while it is on its +way to/from the storage device. Linux supports inline encryption +through a set of extensions to the block layer called *blk-crypto*. +blk-crypto allows filesystems to attach encryption contexts to bios +(I/O requests) to specify how the data will be encrypted or decrypted +in-line. For more information about blk-crypto, see +:ref:`Documentation/block/inline-encryption.rst <inline_encryption>`. + +On supported filesystems (currently ext4 and f2fs), fscrypt can use +blk-crypto instead of the kernel crypto API to encrypt/decrypt file +contents. To enable this, set CONFIG_FS_ENCRYPTION_INLINE_CRYPT=y in +the kernel configuration, and specify the "inlinecrypt" mount option +when mounting the filesystem. + +Note that the "inlinecrypt" mount option just specifies to use inline +encryption when possible; it doesn't force its use. fscrypt will +still fall back to using the kernel crypto API on files where the +inline encryption hardware doesn't have the needed crypto capabilities +(e.g. support for the needed encryption algorithm and data unit size) +and where blk-crypto-fallback is unusable. (For blk-crypto-fallback +to be usable, it must be enabled in the kernel configuration with +CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK=y.) + +Currently fscrypt always uses the filesystem block size (which is +usually 4096 bytes) as the data unit size. Therefore, it can only use +inline encryption hardware that supports that data unit size. + +Inline encryption doesn't affect the ciphertext or other aspects of +the on-disk format, so users may freely switch back and forth between +using "inlinecrypt" and not using "inlinecrypt". + Implementation details ====================== @@ -1184,6 +1228,13 @@ keys`_ and `DIRECT_KEY policies`_. Data path changes ----------------- +When inline encryption is used, filesystems just need to associate +encryption contexts with bios to specify how the block layer or the +inline encryption hardware will encrypt/decrypt the file contents. + +When inline encryption isn't used, filesystems must encrypt/decrypt +the file contents themselves, as described below: + For the read path (->readpage()) of regular files, filesystems can read the ciphertext into the page cache and decrypt it in-place. The page lock must be held until decryption has finished, to prevent the @@ -1197,18 +1248,6 @@ buffer. Some filesystems, such as UBIFS, already use temporary buffers regardless of encryption. Other filesystems, such as ext4 and F2FS, have to allocate bounce pages specially for encryption. -Fscrypt is also able to use inline encryption hardware instead of the -kernel crypto API for en/decryption of file contents. When possible, -and if directed to do so (by specifying the 'inlinecrypt' mount option -for an ext4/F2FS filesystem), it adds encryption contexts to bios and -uses blk-crypto to perform the en/decryption instead of making use of -the above read/write path changes. Of course, even if directed to -make use of inline encryption, fscrypt will only be able to do so if -either hardware inline encryption support is available for the -selected encryption algorithm or CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK -is selected. If neither is the case, fscrypt will fall back to using -the above mentioned read/write path changes for en/decryption. - Filename hashing and encoding ----------------------------- diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst index c0ad233963ae9..bee63d42e5eca 100644 --- a/Documentation/filesystems/index.rst +++ b/Documentation/filesystems/index.rst @@ -29,7 +29,6 @@ algorithms work. fiemap files locks - mandatory-locking mount_api quota seq_file diff --git a/Documentation/filesystems/locks.rst b/Documentation/filesystems/locks.rst index c5ae858b1aacb..26429317dbbc3 100644 --- a/Documentation/filesystems/locks.rst +++ b/Documentation/filesystems/locks.rst @@ -57,16 +57,9 @@ fcntl(), with all the problems that implies. 1.3 Mandatory Locking As A Mount Option --------------------------------------- -Mandatory locking, as described in -'Documentation/filesystems/mandatory-locking.rst' was prior to this release a -general configuration option that was valid for all mounted filesystems. This -had a number of inherent dangers, not the least of which was the ability to -freeze an NFS server by asking it to read a file for which a mandatory lock -existed. - -From this release of the kernel, mandatory locking can be turned on and off -on a per-filesystem basis, using the mount options 'mand' and 'nomand'. -The default is to disallow mandatory locking. The intention is that -mandatory locking only be enabled on a local filesystem as the specific need -arises. +Mandatory locking was prior to this release a general configuration option +that was valid for all mounted filesystems. This had a number of inherent +dangers, not the least of which was the ability to freeze an NFS server by +asking it to read a file for which a mandatory lock existed. +Such option was dropped in Kernel v5.14. diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst index 57a6418478186..bb68d39f03b78 100644 --- a/Documentation/filesystems/netfs_library.rst +++ b/Documentation/filesystems/netfs_library.rst @@ -524,3 +524,5 @@ Note that these methods are passed a pointer to the cache resource structure, not the read request structure as they could be used in other situations where there isn't a read request structure as well, such as writing dirty data to the cache. + +.. kernel-doc:: include/linux/netfs.h diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 042c418f40906..8d7f141c6fc75 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -1857,19 +1857,19 @@ For example:: This file contains lines of the form:: 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue - (1)(2)(3) (4) (5) (6) (7) (8) (9) (10) (11) - - (1) mount ID: unique identifier of the mount (may be reused after umount) - (2) parent ID: ID of parent (or of self for the top of the mount tree) - (3) major:minor: value of st_dev for files on filesystem - (4) root: root of the mount within the filesystem - (5) mount point: mount point relative to the process's root - (6) mount options: per mount options - (7) optional fields: zero or more fields of the form "tag[:value]" - (8) separator: marks the end of the optional fields - (9) filesystem type: name of filesystem of the form "type[.subtype]" - (10) mount source: filesystem specific information or "none" - (11) super options: per super block options + (1)(2)(3) (4) (5) (6) (n…m) (m+1)(m+2) (m+3) (m+4) + + (1) mount ID: unique identifier of the mount (may be reused after umount) + (2) parent ID: ID of parent (or of self for the top of the mount tree) + (3) major:minor: value of st_dev for files on filesystem + (4) root: root of the mount within the filesystem + (5) mount point: mount point relative to the process's root + (6) mount options: per mount options + (n…m) optional fields: zero or more fields of the form "tag[:value]" + (m+1) separator: marks the end of the optional fields + (m+2) filesystem type: name of filesystem of the form "type[.subtype]" + (m+3) mount source: filesystem specific information or "none" + (m+4) super options: per super block options Parsers should ignore all unrecognised optional fields. Currently the possible optional fields are: |