Age | Commit message (Collapse) | Author |
|
__ublk_check_and_get_req() only uses its ublk_queue argument to get the
q_id and tag. Pass those arguments explicitly to save an access to the
ublk_queue.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For ublk servers with many ublk queues, accessing the ublk_queue in
ublk_daemon_register_io_buf() is a frequent cache miss. Get the flags
from the ublk_device instead, which is accessed earlier in
ublk_ch_uring_cmd_local().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For ublk servers with many ublk queues, accessing the ublk_queue in
ublk_register_io_buf() is a frequent cache miss. Get the flags from the
ublk_device instead, which is accessed earlier in
ublk_ch_uring_cmd_local().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Avoid repeating the 2 dereferences to get the ublk_device from the
io_uring_cmd by passing it from ublk_ch_uring_cmd_local() to
ublk_register_io_buf().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For ublk servers with many ublk queues, accessing the ublk_queue in
ublk_ch_{read,write}_iter() is a frequent cache miss. Get the flags and
queue depth from the ublk_device instead, which is accessed just before.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For ublk servers with many ublk queues, accessing the ublk_queue to
handle a ublk command is a frequent cache miss. Get the queue depth from
the ublk_device instead, which is accessed just before.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Introduce ublk_device analogues of the ublk_queue flag helpers:
- ublk_support_zero_copy() -> ublk_dev_support_user_copy()
- ublk_support_auto_buf_reg() -> ublk_dev_support_auto_buf_reg()
- ublk_support_user_copy() -> ublk_dev_support_user_copy()
- ublk_need_map_io() -> ublk_dev_need_map_io()
- ublk_need_req_ref() -> ublk_dev_need_req_ref()
- ublk_need_get_data() -> ublk_dev_need_get_data()
These will be used in subsequent changes to avoid accessing the
ublk_queue just for the flags, and instead use the ublk_device.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
__ublk_fail_req() only uses the ublk_queue to get the ublk_device, which
its caller already has. So just pass the ublk_device directly.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ublk_queue_cmd_buf_size() only needs the queue depth, which is the same
for all queues. Get the queue depth from the ublk_device instead so the
q_id parameter can be dropped.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ublk_get_queue() never returns a NULL pointer, so there's no need to
check its return value in ublk_check_and_get_req(). Drop the check.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When building for 32-bit platforms, there are several link (if builtin)
or modpost (if a module) errors due to dividends of type 'sector_t' in
DIV_ROUND_UP:
arm-linux-gnueabi-ld: drivers/md/md-llbitmap.o: in function `llbitmap_resize':
drivers/md/md-llbitmap.c:1017:(.text+0xae8): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/md/md-llbitmap.c:1020:(.text+0xb10): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/md/md-llbitmap.o: in function `llbitmap_end_discard':
drivers/md/md-llbitmap.c:1114:(.text+0xf14): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/md/md-llbitmap.o: in function `llbitmap_start_discard':
drivers/md/md-llbitmap.c:1097:(.text+0x1808): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/md/md-llbitmap.o: in function `llbitmap_read_sb':
drivers/md/md-llbitmap.c:867:(.text+0x2080): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/md/md-llbitmap.o:drivers/md/md-llbitmap.c:895: more undefined references to `__aeabi_uldivmod' follow
Use DIV_ROUND_UP_SECTOR_T instead of DIV_ROUND_UP, which exists to
handle this exact situation.
Fixes: 5ab829f1971d ("md/md-llbitmap: introduce new lockless bitmap")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ublk_mark_io_ready() tracks whether all the ublk_device's I/Os have been
fetched by incrementing ublk_queue's nr_io_ready count and incrementing
ublk_device's nr_queues_ready count if the whole queue is ready.
Simplify the logic by just tracking the total number of fetched I/Os on
each ublk_device. When this count reaches nr_hw_queues * queue_depth,
the ublk_device is ready to receive I/O.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, raid0_make_request() will remap the original bio to underlying
disks to prevent reordered IO. Now that bio_submit_split_bioset() will put
original bio to the head of current->bio_list, it's safe converting to use
this helper and bio will still be ordered.
CC: Jan Kara <jack@suse.cz>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unify bio split code, prepare to fix reordered split IO.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unify bio split code, prepare to fix ordering of split IO.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unify bio split code, prepare to fix ordering of split IO, the error path
is modified a bit, however no functional changes are intended:
- bio_submit_split_bioset() can fail the original bio directly
by split error, set R10BIO_Uptodate in this case to notify
raid_end_bio_io() that the original bio is returned already.
- set R10BIO_Uptodate and set error value to -EIO is useless now,
for r10_bio without R10BIO_Uptodate, -EIO will be returned for
original bio.
And discard is not handled, because discard is only split for
unaligned head and tail, and this can be considered slow path, the
reorder here does not matter much.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The new helper bio_submit_split_bioset() can failed the orginal bio on
split errors, prepare to handle this case in raid_end_bio_io().
The flag name is refer to the r1bio flag name.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unify bio split code, and prepare to fix ordering of split IO.
Noted that bio_submit_split_bioset() can fail the original bio directly
by split error, set R1BIO_Returned in this case to notify raid_end_bio_io()
that the original bio is returned already.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unify bio split code, and prepare to fix ordering of split IO
Noted commit 319ff40a5427 ("md/raid0: Fix performance regression for large
sequential writes") already fix ordering of split IO by remapping bio to
underlying disks before resubmitting it, with the respect
md_submit_bio() already split it by sectors, and raid0_make_request()
will split at most once for unaligned IO. This is a bit hacky and we'll
convert this to solution in general later.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If bio is split by internal handling like chunksize or badblocks, the
corresponding trace_block_split() is missing, resulting in blktrace
inability to catch BIO split events and making it harder to analyze the
BIO sequence.
Cc: stable@vger.kernel.org
Fixes: 4b1faf931650 ("block: Kill bio_pair_split()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/mdraid/linux into for-6.18/block
Pull MD changes from Yu Kuai:
"Redundant data is used to enhance data fault tolerance, and the storage
method for redundant data vary depending on the RAID levels. And it's
important to maintain the consistency of redundant data.
Bitmap is used to record which data blocks have been synchronized and
which ones need to be resynchronized or recovered. Each bit in the
bitmap represents a segment of data in the array. When a bit is set,
it indicates that the multiple redundant copies of that data segment
may not be consistent. Data synchronization can be performed based on
the bitmap after power failure or readding a disk. If there is no
bitmap, a full disk synchronization is required.
Due to known performance issues with md-bitmap and the unreasonable
implementations:
- self-managed IO submitting like filemap_write_page();
- global spin_lock
I have decided not to continue optimizing based on the current bitmap
implementation, this new bitmap is invented without locking from IO fast
path and can be used with fast disks.
Key features for the new bitmap:
- IO fastpath is lockless, if user issues lots of write IO to the same
bitmap bit in a short time, only the first write has additional
overhead to update bitmap bit, no additional overhead for the
following writes;
- support only resync or recover written data, means in the case
creating new array or replacing with a new disk, there is no need to
do a full disk resync/recovery;"
* tag 'md-6.18-20250909' of gitolite.kernel.org:pub/scm/linux/kernel/git/mdraid/linux: (24 commits)
md/md-llbitmap: introduce new lockless bitmap
md/md-bitmap: make method bitmap_ops->daemon_work optional
md: add a new recovery_flag MD_RECOVERY_LAZY_RECOVER
md/md-bitmap: add a new method blocks_synced() in bitmap_operations
md/md-bitmap: add a new method skip_sync_blocks() in bitmap_operations
md/md-bitmap: delay registration of bitmap_ops until creating bitmap
md/md-bitmap: add a new sysfs api bitmap_type
md: add a new mddev field 'bitmap_id'
md/md-bitmap: support discard for bitmap ops
md: factor out a helper raid_is_456()
md: add a new parameter 'offset' to md_super_write()
md/md-bitmap: introduce CONFIG_MD_BITMAP
md: check before referencing mddev->bitmap_ops
md/dm-raid: check before referencing mddev->bitmap_ops
md/raid5: check before referencing mddev->bitmap_ops
md/raid10: check before referencing mddev->bitmap_ops
md/raid1: check before referencing mddev->bitmap_ops
md/raid1: check bitmap before behind write
md/md-bitmap: handle the case bitmap is not enabled before end_sync()
md/md-bitmap: handle the case bitmap is not enabled before start_sync()
...
|
|
We can now safely provide a block device when extracting user pages for
driver and user passthrough commands. Set the bdev so the caller doesn't
have to do that later. This has an additional benefit of being able to
extract P2P pages in the passthrough path.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We only need to consider data and metadata dma mapping types separately.
The request and bio integrity payload have enough flag bits to
internally track the mapping type for each. Use these so the caller
doesn't need to track them, and provide separete request and integrity
helpers to the common code. This will make it easier to scale new
mappings, like the proposed MMIO attribute, without burdening the caller
to track such things.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This patch adds a new WQ_PERCPU flag to explicitly request the use of
the per-CPU behavior. Both flags coexist for one release cycle to allow
callers to transition their calls.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
All existing users have been updated accordingly.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Adding system_dfl_wq to encourage its use when unbound work should be used.
queue_work() / queue_delayed_work() / mod_delayed_work() will now use the
new unbound wq: whether the user still use the old wq a warn will be
printed along with a wq redirect to the new one.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Adding system_dfl_wq to encourage its use when unbound work should be used.
queue_work() / queue_delayed_work() / mod_delayed_work() will now use the
new unbound wq: whether the user still use the old wq a warn will be
printed along with a wq redirect to the new one.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Replace kmalloc() followed by copy_from_user() with memdup_user() to
improve and simplify raw_cmd_copyin().
No functional changes intended.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Bios are embedded into other structures, and at least spare is unhappy
about embedding structures with variable sized arrays. There's no
real need to the array anyway, we can replace it with a helper pointing
to the memory just behind the bio, and with the previous cleanups there
is very few site doing anything special with it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Just a simpler wrapper around bio_init for callers that want to
initialize a bio with inline bvecs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Recently, syzbot started to abuse NBD with all kinds of sockets.
Commit cf1b2326b734 ("nbd: verify socket is supported during setup")
made sure the socket supported a shutdown() method.
Explicitely accept TCP and UNIX stream sockets.
Fixes: cf1b2326b734 ("nbd: verify socket is supported during setup")
Reported-by: syzbot+e1cd6bd8493060bd701d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/CANn89iJ+76eE3A_8S_zTpSyW5hvPRn6V57458hCZGY5hbH_bFA@mail.gmail.com/T/#m081036e8747cd7e2626c1da5d78c8b9d1e55b154
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Yu Kuai <yukuai1@huaweicloud.com>
Cc: linux-block@vger.kernel.org
Cc: nbd@other.debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When executing modinfo null_blk, there is an error in the description
of module parameter mbps, and the output information of cache_size is
incomplete.The output of modinfo before and after applying this patch
is as follows:
Before:
[...]
parm: cache_size:ulong
[...]
parm: mbps:Cache size in MiB for memory-backed device.
Default: 0 (none) (uint)
[...]
After:
[...]
parm: cache_size:Cache size in MiB for memory-backed device.
Default: 0 (none) (ulong)
[...]
parm: mbps:Limit maximum bandwidth (in MiB/s).
Default: 0 (no limit) (uint)
[...]
Fixes: 058efe000b31 ("null_blk: add module parameters for 4 options")
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Redundant data is used to enhance data fault tolerance, and the storage
method for redundant data vary depending on the RAID levels. And it's
important to maintain the consistency of redundant data.
Bitmap is used to record which data blocks have been synchronized and which
ones need to be resynchronized or recovered. Each bit in the bitmap
represents a segment of data in the array. When a bit is set, it indicates
that the multiple redundant copies of that data segment may not be
consistent. Data synchronization can be performed based on the bitmap after
power failure or readding a disk. If there is no bitmap, a full disk
synchronization is required.
Due to known performance issues with md-bitmap and the unreasonable
implementations:
- self-managed IO submitting like filemap_write_page();
- global spin_lock
I have decided not to continue optimizing based on the current bitmap
implementation, this new bitmap is invented without locking from IO fast
path and can be used with fast disks.
For designs and details, see the comments in drivers/md-llbitmap.c.
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-12-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Li Nan <linan122@huawei.com>
|
|
daemon_work() will be called by daemon thread, on the one hand, daemon
thread doesn't have strict wake-up time; on the other hand, too much
work are put to daemon thread, like handle sync IO, handle failed
or specail normal IO, handle recovery, and so on. Hence daemon thread
may be too busy to clear dirty bits in time.
Make bitmap_ops->daemon_work() optional and following patches will use
separate async work to clear dirty bits for the new bitmap.
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-11-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
|
|
This flag is used by llbitmap in later patches to skip raid456 initial
recover and delay building initial xor data to first write.
https://lore.kernel.org/linux-raid/20250829080426.1441678-10-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
|
|
Currently, raid456 must perform a whole array initial recovery to build
initail xor data, then IO to the array won't have to read all the blocks
in underlying disks.
This behavior will affect IO performance a lot, and nowadays there are
huge disks and the initial recovery can take a long time. Hence llbitmap
will support lazy initial recovery in following patches. This method is
used to check if data blocks is synced or not, if not then IO will still
have to read all blocks for raid456.
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-9-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
|
|
This method is used to check if blocks can be skipped before calling
into pers->sync_request(), llbitmap will use this method to skip
resync for unwritten/clean data blocks, and recovery/check/repair for
unwritten data blocks;
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-8-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
|
|
Currently bitmap_ops is registered while allocating mddev, this is fine
when there is only one bitmap_ops.
Delay setting bitmap_ops until creating bitmap, so that user can choose
which bitmap to use before running the array.
Link: https://lore.kernel.org/linux-raid/20250721171557.34587-7-yukuai@kernel.org
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
The api will be used by mdadm to set bitmap_type while creating new array
or assembling array, prepare to add a new bitmap.
Currently available options are:
cat /sys/block/md0/md/bitmap_type
none [bitmap]
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-6-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Li Nan <linan122@huawei.com>
|
|
Prepare to store the bitmap id selected by user, also refactor
mddev_set_bitmap_ops a bit in case the value is invalid.
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-5-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
Use two new methods {start, end}_discard in bitmap_ops and a new field 'rw'
in struct md_io_clone to handle discard IO, prepare to support new md
bitmap.
Since all bitmap functions to hanlde write IO are the same, also add
typedef to make code cleaner.
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-4-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
|
|
There are no functional changes, the helper will be used by llbitmap in
following patches.
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-3-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
|
|
The parameter is always set to 0 for now, following patches will use
this helper to write llbitmap to underlying disks, allow writing
dirty sectors instead of the whole page.
Also rename md_super_write to md_write_metadata since there is nothing
super-block specific.
Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-2-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
|
|
Now that all implementations are internal, it's sensible to add a config
option for md-bitmap, and it's a good way for isolation.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-16-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
Prepare to introduce CONFIG_MD_BITMAP.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-15-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
Prepare to introduce CONFIG_MD_BITMAP.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-14-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
Prepare to introduce CONFIG_MD_BITMAP.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-13-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
Prepare to introduce CONFIG_MD_BITMAP.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-12-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
Prepare to introduce CONFIG_MD_BITMAP.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-11-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
behind write rely on bitmap, because the number of IO are recorded in
bitmap->behind_writes, and callers rely on bitmap_wait_behind_writes()
to wait for IO to be done.
However, currently callers doesn't check if bitmap is enabeld before
calling into behind methods. Hence if behind write start without bitmap,
readers will not wait for slow write IO to be done and old data can be
read in some corner cases.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-10-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
This case can be handled without knowing internal implementation.
Prepare to introduce CONFIG_MD_BITMAP.
Link: https://lore.kernel.org/linux-raid/20250707012711.376844-9-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
|