Age | Commit message (Collapse) | Author |
|
Pick up the CXL DVSEC range register emulation for v6.3, and resolve
conflicts with the cxl_port_probe() split (from for-6.3/cxl-ram-region)
and event handling (from for-6.3/cxl-events).
|
|
Remove the CXL_DECODER_F_LOCK check to be permissive of platform BIOSes
that allow CXL.mem to be remapped.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640370085.935665.13128321011001358077.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
For the case where DVSEC range register(s) are active and HDM decoders are
not committed, use RR to provide emulation. A first pass is done to note
whether any decoders are committed. If there are no committed endpoint
decoders, then DVSEC ranges will be used for emulation.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640369536.935665.611974113442400127.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
CXL rev3 spec 8.1.3
RCDs may not have HDM register blocks. Create a fake HDM with information
from the CXL PCIe DVSEC registers. The decoder count will be set to the
HDM count retrieved from the DVSEC cap register.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640368994.935665.15831225724059704620.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In the case where HDM decoder register block exists but is not programmed
and at the same time the DVSEC range register range is active, populate the
CXL decoder object 'cxl_decoder' with info from DVSEC range registers.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640368454.935665.13806415120298330717.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
With the previous refactoring of DVSEC range registers out of
cxl_hdm_decode_init(), it basically becomes a skeleton function. Squash
__cxl_hdm_decode_init() with cxl_hdm_decode_init() to simplify the code.
cxl_hdm_decode_init() now returns more error codes than just -EBUSY.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640367916.935665.12898404758336059003.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Call cxl_dvsec_rr_decode() in the beginning of cxl_port_probe() and
preserve the decoded information in a local
'struct cxl_endpoint_dvsec_info'. This info can be passed to various
functions later on in order to support the HDM decoder emulation.
The invocation of cxl_dvsec_rr_decode() in cxl_hdm_decode_init() is
removed and a pointer to the 'struct cxl_endpoint_dvsec_info' is passed
in.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640367377.935665.2848747799651019676.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
There are 2 scenarios that requires additional handling. 1. A device that
has active ranges in DVSEC range registers (RR) but no HDM decoder register
block. 2. A device that has both RR active and HDM, but the HDM decoders
are not programmed. The goal is to create emulated decoder software structs
based on the RR.
Move the CXL DVSEC range register decoding code block from
cxl_hdm_decode_init() to its own function. Refactor code in preparation for
the HDM decoder emulation. There is no functionality change to the code.
Name the new function to cxl_dvsec_rr_decode().
The only change is to set range->start and range->end to CXL_RESOURCE_NONE
and skipping the reading of base registers if the range size is 0, which
equates to range not active.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640366839.935665.11816388524993234329.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Pick up the AER unmasking patches for v6.3.
|
|
Pick up some fixes from exposure of for-6.3/cxl-ram-region in
linux-next.
|
|
By default the CXL RAS mask registers bits are defaulted to 1's and
suppress all error reporting. If the kernel has negotiated ownership
of error handling for CXL then unmask the mask registers by writing 0s.
PCI_EXP_DEVCTL capability is checked to see uncorrectable or correctable
errors bits are set before unmasking the respective errors.
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_regs.h
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167639402301.778884.12556849214955646539.stgit@djiang5-mobl3.local
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
With this [1] commit upstream, pci_enable_pci_error_report() is no longer
necessary for the driver to call. Remove call and related cleanups.
[1]: f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167632012093.4153151.5360778069735064322.stgit@djiang5-mobl3.local
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
When device_hmem.o is enabled but dax itself is a loadable module, the
dax_hmem support fails to link because Kbuild never compiles built-in
code under drivers/dax:
ERROR: modpost: "walk_hmem_resources" [drivers/dax/hmem/dax_hmem.ko] undefined!
Make sure that drivers/dax is entered for compiling built-in code
even with CONFIG_DAX=m.
Fixes: 7dab174e2e27 ("dax/hmem: Move hmem device registration to dax_hmem.ko")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230214131913.1431969-1-arnd@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
There is already a dependency on CXL_REGION, which depends on CXL_BUS,
but since CXL_REGION is a 'bool' symbol, it's possible to configure
DAX as built-in even though CXL itself is a loadable module:
x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_probe':
cxl.c:(.text+0xb): undefined reference to `to_cxl_dax_region'
x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_driver_init':
cxl.c:(.init.text+0x10): undefined reference to `__cxl_driver_register'
x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_driver_exit':
cxl.c:(.exit.text+0x9): undefined reference to `cxl_driver_unregister'
Prevent this with another depndency on the tristate symbol.
Fixes: 09d09e04d2fc ("cxl/dax: Create dax devices for CXL RAM regions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230214103054.1082908-1-arnd@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The new cxl_add_to_region() function returns an uninitialized
value on success:
drivers/cxl/core/region.c:2628:6: error: variable 'rc' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if (IS_ERR(cxlr)) {
^~~~~~~~~~~~
drivers/cxl/core/region.c:2654:9: note: uninitialized use occurs here
return rc;
Simplify the logic to have the rc variable always initialized in the
same place.
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230213101220.3821689-1-arnd@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
A loop of the form:
while true; do modprobe cxl_pci; modprobe -r cxl_pci; done
...fails with the following crash signature:
BUG: kernel NULL pointer dereference, address: 0000000000000040
[..]
RIP: 0010:cxl_internal_send_cmd+0x5/0xb0 [cxl_core]
[..]
Call Trace:
<TASK>
cxl_pmem_ctl+0x121/0x240 [cxl_pmem]
nvdimm_get_config_data+0xd6/0x1a0 [libnvdimm]
nd_label_data_init+0x135/0x7e0 [libnvdimm]
nvdimm_probe+0xd6/0x1c0 [libnvdimm]
nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
really_probe+0xde/0x380
__driver_probe_device+0x78/0x170
driver_probe_device+0x1f/0x90
__device_attach_driver+0x85/0x110
bus_for_each_drv+0x7d/0xc0
__device_attach+0xb4/0x1e0
bus_probe_device+0x9f/0xc0
device_add+0x445/0x9c0
nd_async_device_register+0xe/0x40 [libnvdimm]
async_run_entry_fn+0x30/0x130
...namely that the bottom half of async nvdimm device registration runs
after the CXL has already torn down the context that cxl_pmem_ctl()
needs. Unlike the ACPI NFIT case that benefits from launching multiple
nvdimm device registrations in parallel from those listed in the table,
CXL is already marked PROBE_PREFER_ASYNCHRONOUS. So provide for a
synchronous registration path to preclude this scenario.
Fixes: 21083f51521f ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices")
Cc: <stable@vger.kernel.org>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Include the support for enumerating and provisioning ram regions for
v6.3. This also include a default policy change for ram / volatile
device-dax instances to assign them to the dax_kmem driver by default.
|
|
Pick up some final miscellaneous updates for v6.3 including support for
communicating 'exclusive' and 'enabled' state of commands.
|
|
It was pointed out that commands not supported by the device or excluded
by the kernel were being returned in cxl_query_cmd().[1]
While libcxl correctly handles failing commands, it is more efficient to
not issue an invalid command in the first place. This can't be done
without additional information being returned from cxl_query_cmd(). In
addition, information about the availability of commands can be useful
for debugging.
Add flags to struct cxl_command_info which reflect if a command is
enabled and/or exclusive to the kernel.
[1] https://lore.kernel.org/all/63b4ec4e37cc1_5178e2941d@dwillia2-xfh.jf.intel.com.notmuch/
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221222-cxl-misc-v4-3-62f701c1cdd1@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
CXL_CMD_FLAG_NONE is not used, remove it.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221222-cxl-misc-v4-1-62f701c1cdd1@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
While platform firmware takes some responsibility for mapping the RAM
capacity of CXL devices present at boot, the OS is responsible for
mapping the remainder and hot-added devices. Platform firmware is also
responsible for identifying the platform general purpose memory pool,
typically DDR attached DRAM, and arranging for the remainder to be 'Soft
Reserved'. That reservation allows the CXL subsystem to route the memory
to core-mm via memory-hotplug (dax_kmem), or leave it for dedicated
access (device-dax).
The new 'struct cxl_dax_region' object allows for a CXL memory resource
(region) to be published, but also allow for udev and module policy to
act on that event. It also prevents cxl_core.ko from having a module
loading dependency on any drivers/dax/ modules.
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167602003896.1924368.10335442077318970468.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The default mode for device-dax instances is backwards for RAM-regions
as evidenced by the fact that it tends to catch end users by surprise.
"Where is my memory?". Recall that platforms are increasingly shipping
with performance-differentiated memory pools beyond typical DRAM and
NUMA effects. This includes HBM (high-bandwidth-memory) and CXL (dynamic
interleave, varied media types, and future fabric attached
possibilities).
For this reason the EFI_MEMORY_SP (EFI Special Purpose Memory => Linux
'Soft Reserved') attribute is expected to be applied to all memory-pools
that are not the general purpose pool. This designation gives an
Operating System a chance to defer usage of a memory pool until later in
the boot process where its performance properties can be interrogated
and administrator policy can be applied.
'Soft Reserved' memory can be anything from too limited and precious to
be part of the general purpose pool (HBM), too slow to host hot kernel
data structures (some PMEM media), or anything in between. However, in
the absence of an explicit policy, the memory should at least be made
usable by default. The current device-dax default hides all
non-general-purpose memory behind a device interface.
The expectation is that the distribution of users that want the memory
online by default vs device-dedicated-access by default follows the
Pareto principle. A small number of enlightened users may want to do
userspace memory management through a device, but general users just
want the kernel to make the memory available with an option to get more
advanced later.
Arrange for all device-dax instances not backed by PMEM to default to
attaching to the dax_kmem driver. From there the baseline memory hotplug
policy (CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE / memhp_default_state=)
gates whether the memory comes online or stays offline. Where, if it
stays offline, it can be reliably converted back to device-mode where it
can be partitioned, or fronted by a userspace allocator.
So, if someone wants device-dax instances for their 'Soft Reserved'
memory:
1/ Build a kernel with CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n or boot
with memhp_default_state=offline, or roll the dice and hope that the
kernel has not pinned a page in that memory before step 2.
2/ Write a udev rule to convert the target dax device(s) from
'system-ram' mode to 'devdax' mode:
daxctl reconfigure-device $dax -m devdax -f
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167602003336.1924368.6809503401422267885.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In preparation for the CXL region driver to take over the responsibility
of registering device-dax instances for CXL regions, move the
registration of "hmem" devices to dax_hmem.ko.
Previously the builtin component of this enabling
(drivers/dax/hmem/device.o) would register platform devices for each
address range and trigger the dax_hmem.ko module to load and attach
device-dax instances to those devices. Now, the ranges are collected
from the HMAT and EFI memory map walking, but the device creation is
deferred. A new "hmem_platform" device is created which triggers
dax_hmem.ko to load and register the platform devices.
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167602002771.1924368.5653558226424530127.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In preparation for hmem platform devices to be unregistered, stop using
platform_device_add_resources() to convey the address range. The
platform_device_add_resources() API causes an existing "Soft Reserved"
iomem resource to be re-parented under an inserted platform device
resource. When that platform device is deleted it removes the platform
device resource and all children.
Instead, it is sufficient to convey just the address range and let
request_mem_region() insert resources to indicate the devices active in
the range. This allows the "Soft Reserved" resource to be re-enumerated
upon the next probe event.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167602002217.1924368.7036275892522551624.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Empty driver remove callbacks can just be elided.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167602001664.1924368.9102029637928071240.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In preparation for moving more filtering of "hmem" ranges into the
dax_hmem.ko module, update the initcall levels. HMAT range registration
moves to subsys_initcall() to be done before Soft Reservation probing,
and Soft Reservation probing is moved to device_initcall() to be done
before dax_hmem.ko initialization if it is built-in.
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167602001107.1924368.11562316181038595611.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Take two endpoints attached to the first switch on the first host-bridge
in the cxl_test topology and define a pre-initialized region. This is a
x2 interleave underneath a x1 CXL Window.
$ modprobe cxl_test
$ # cxl list -Ru
{
"region":"region3",
"resource":"0xf010000000",
"size":"512.00 MiB (536.87 MB)",
"interleave_ways":2,
"interleave_granularity":4096,
"decode_state":"commit"
}
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167602000547.1924368.11613151863880268868.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Region autodiscovery is an asynchronous state machine advanced by
cxl_port_probe(). After the decoders on an endpoint port are enumerated
they are scanned for actively enabled instances. Each active decoder is
flagged for auto-assembly CXL_DECODER_F_AUTO and attached to a region.
If a region does not already exist for the address range setting of the
decoder one is created. That creation process may race with other
decoders of the same region being discovered since cxl_port_probe() is
asynchronous. A new 'struct cxl_root_decoder' lock, @range_lock, is
introduced to mitigate that race.
Once all decoders have arrived, "p->nr_targets == p->interleave_ways",
they are sorted by their relative decode position. The sort algorithm
involves finding the point in the cxl_port topology where one leg of the
decode leads to deviceA and the other deviceB. At that point in the
topology the target order in the 'struct cxl_switch_decoder' indicates
the relative position of those endpoint decoders in the region.
>From that point the region goes through the same setup and validation
steps as user-created regions, but instead of programming the decoders
it validates that driver would have written the same values to the
decoders as were already present.
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601999958.1924368.9366954455835735048.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Jonathan points out that the shared code between the switch and endpoint
case is small. Before adding another is_cxl_endpoint() conditional,
just split the two cases.
Rather than duplicate the "Couldn't enumerate decoders" error message
take the opportunity to improve the error messages in
devm_cxl_enumerate_decoders().
Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601999378.1924368.15071142145866277623.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add help text and a label so the CXL_REGION config option can be
toggled. This is mainly to enable compile testing without region
support.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/167601998765.1924368.258370414771847699.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In support of the CXL subsystem's use of 'struct range' to track decode
address ranges, add a common range_contains() implementation with
identical semantics as resource_contains();
The existing 'range_contains()' in lib/stackinit_kunit.c is namespaced
with a 'stackinit_' prefix.
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/167601998163.1924368.6067392174077323935.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In preparation for region autodiscovery, that needs all devices
discovered before their relative position in the region can be
determined, consolidate all position dependent validation in a helper.
Recall that in the on-demand region creation flow the end-user picks the
position of a given endpoint decoder in a region. In the autodiscovery
case the position of an endpoint decoder can only be determined after
all other endpoint decoders that claim to decode the region's address
range have been enumerated and attached. So, in the autodiscovery case
endpoint decoders may be attached before their relative position is
known. Once all decoders arrive, then positions can be determined and
validated with cxl_region_validate_position() the same as user initiated
on-demand creation.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601997584.1924368.4615769326126138969.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Jonathan noticed that the target list setup is not unwound completely
upon error. Undo all the setup in the 'err_decrement:' exit path.
Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Link: http://lore.kernel.org/r/20230208123031.00006990@Huawei.com
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601996980.1924368.390423634911157277.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Region autodiscovery is the process of kernel creating 'struct
cxl_region' object to represent active CXL memory ranges it finds
already active in hardware when the driver loads. Typically this happens
when platform firmware establishes CXL memory regions and then publishes
them in the memory map. However, this can also happen in the case of
kexec-reboot after the kernel has created regions.
In the autodiscovery case the region creation process starts with a
known endpoint decoder. Refactor attach_target() into a helper that is
suitable to be called from either sysfs, for runtime region creation, or
from cxl_port_probe() after it has enumerated all endpoint decoders.
The cxl_port_probe() context is an async device-core probing context, so
it is not appropriate to allow SIGTERM to interrupt the assembly
process. Refactor attach_target() to take @cxled and @state as arguments
where @state indicates whether waiting from the region rwsem is
interruptible or not.
No behavior change is intended.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/167601996393.1924368.2202255054618600069.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Expand the region creation infrastructure to enable 'ram'
(volatile-memory) regions. The internals of create_pmem_region_store()
and create_pmem_region_show() are factored out into helpers
__create_region() and __create_region_show() for the 'ram' case to
reuse.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/167601995775.1924368.352616146815830591.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In preparation for a new region mode, do not, for example, allow
'ram' decoders to be assigned to 'pmem' regions and vice versa.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/167601995111.1924368.7459128614177994602.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Shipping versions of the cxl-cli utility expect all regions to have a
'uuid' attribute. In preparation for 'ram' regions, update the 'uuid'
attribute to return an empty string which satisfies the current
expectations of 'cxl list -R'. Otherwise, 'cxl list -R' fails in the
presence of regions with the 'uuid' attribute missing. Force the
attribute to be read-only as there is no facility or expectation for a
'ram' region to recall its uuid from one boot to the next.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601994558.1924368.12612811533724694444.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In preparation for a new region type, "ram" regions, add a mode
attribute to clarify the mode of the decoders that can be added to a
region. Share the internals of mode_show() (for decoders) with the
region case.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/167601993930.1924368.4305018565539515665.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Testing of ram region support [1], stimulates a long standing bug in
cxl_detach_ep() where some cxl_ep_remove() cleanup is skipped due to
inability to walk ports after dports have been unregistered. That
results in a failure to re-register a memdev after the port is
re-enabled leading to a crash like the following:
cxl_port_setup_targets: cxl region4: cxl_host_bridge.0:port4 iw: 1 ig: 256
general protection fault, ...
[..]
RIP: 0010:cxl_region_setup_targets+0x897/0x9e0 [cxl_core]
dev_name at include/linux/device.h:700
(inlined by) cxl_port_setup_targets at drivers/cxl/core/region.c:1155
(inlined by) cxl_region_setup_targets at drivers/cxl/core/region.c:1249
[..]
Call Trace:
<TASK>
attach_target+0x39a/0x760 [cxl_core]
? __mutex_unlock_slowpath+0x3a/0x290
cxl_add_to_region+0xb8/0x340 [cxl_core]
? lockdep_hardirqs_on+0x7d/0x100
discover_region+0x4b/0x80 [cxl_port]
? __pfx_discover_region+0x10/0x10 [cxl_port]
device_for_each_child+0x58/0x90
cxl_port_probe+0x10e/0x130 [cxl_port]
cxl_bus_probe+0x17/0x50 [cxl_core]
Change the port ancestry walk to be by depth rather than by dport. This
ensures that even if a port has unregistered its dports a deferred
memdev cleanup will still be able to cleanup the memdev's interest in
that port.
The parent_port->dev.driver check is only needed for determining if the
bottom up removal beat the top-down removal, but cxl_ep_remove() can
always proceed given the port is pinned. That is, the two sources of
cxl_ep_remove() are in cxl_detach_ep() and cxl_port_release(), and
cxl_port_release() can not run if cxl_detach_ep() holds a reference.
Fixes: 2703c16c75ae ("cxl/core/port: Add switch port enumeration")
Link: http://lore.kernel.org/r/167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com [1]
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/167601992789.1924368.8083994227892600608.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
For ID allocations we want 0-(max-1), ie: smatch complains:
error: Calling ida_alloc_range() with a 'max' argument which is a power of 2. -1 missing?
Correct this and also replace the call to use the max() flavor instead.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230208181944.240261-1-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add the CXL event and interrupt support for the v6.3 update.
|
|
Merge the general CXL updates with fixes targeting v6.2-rc for v6.3.
Resolve a conflict with the fix and move of cxl_report_and_clear() from
pci.c to core/pci.c.
|
|
A passthrough decoder is a decoder that maps only 1 target. It is a
special case because it does not impose any constraints on the
interleave-math as compared to a decoder with multiple targets. Extend
the passthrough case to multi-target-capable decoders that only have one
target selected. I.e. the current code was only considering passthrough
*ports* which are only a subset of the potential passthrough decoder
scenarios.
Fixes: e4f6dfa9ef75 ("cxl/region: Fix 'distance' calculation with passthrough ports")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167564540422.847146.13816934143225777888.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Not all decoders have a reset callback.
The CXL specification allows a host bridge with a single root port to
have no explicit HDM decoders. Currently the region driver assumes there
are none. As such the CXL core creates a special pass through decoder
instance without a commit/reset callback.
Prior to this patch, the ->reset() callback was called unconditionally when
calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
1 Root Port, and one directly attached CXL type 3 device or multiple CXL
type 3 devices attached to downstream ports of a switch can cause a null
pointer dereference.
Before the fix, a kernel crash was observed when we destroy the region, and
a pass through decoder is reset.
The issue can be reproduced as below,
1) create a region with a CXL setup which includes a HB with a
single root port under which a memdev is attached directly.
2) destroy the region with cxl destroy-region regionX -f.
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Cc: <stable@vger.kernel.org>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Link: https://lore.kernel.org/r/20221215170909.2650271-1-fan.ni@samsung.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The IRQ core expects that users of the default hardirq handler specify
IRQF_ONESHOT to keep interrupts disabled until the threaded handler
runs. That meets the CXL driver's expectations since it is an edge
triggered MSI and this flag would have been passed by default using
pci_request_irq() instead of devm_request_threaded_irq().
Fixes: a49aa8141b65 ("cxl/mem: Wire up event interrupts")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
CXL r3.0 section 8.2.9.4.2 "Set Timestamp" recommends that the host sets
the timestamp after every Conventional or CXL Reset to ensure accurate
timestamps. This should include on initial boot up. The time base that
is being set is used by a device for the poison list overflow timestamp
and all event timestamps. Note that the command is optional and if
not supported and the device cannot return accurate timestamps it will
fill the fields in with an appropriate marker (see the specification
description of each timestamp).
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230130151327.32415-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Kernel-doc should be complete, so add documentation for the status
parameter.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230130153437.3153-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Start checking for -mindirect-branch-cs-prefix clang support too now
that LLVM 16 will support it
- Fix a NULL ptr deref when suspending with Xen PV
- Have a SEV-SNP guest check explicitly for features enabled by the
hypervisor and fail gracefully if some are unsupported by the guest
instead of failing in a non-obvious and hard-to-debug way
- Fix a MSI descriptor leakage under Xen
- Mark Xen's MSI domain as supporting MSI-X
- Prevent legacy PIC interrupts from being resent in software by
marking them level triggered, as they should be, which lead to a NULL
ptr deref
* tag 'x86_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Move '-mindirect-branch-cs-prefix' out of GCC-only block
acpi: Fix suspend with Xen PV
x86/sev: Add SEV-SNP guest feature negotiation support
x86/pci/xen: Fixup fallout from the PCI/MSI overhaul
x86/pci/xen: Set MSI_FLAG_PCI_MSIX support in Xen MSI domain
x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- touchpads on HP 15-* laptops switched back to PS/2 emulation mode
- a quirk for Clevo PCX0DX/TUXEDO XP1511 to make sure keyboard is
responding after resume
* tag 'input-for-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: i8042 - add Clevo PCX0DX to i8042 quirk table
Revert "Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fixes from Dan Williams:
"A couple of fixes for bugs introduced during the merge window. One is
a regression, the other was a bug in the CXL AER handler:
- Fix a crash regression due to module load order of cxl_pmem.ko
- Fix wrong register offset read in CXL AER handling path"
* tag 'cxl-fixes-for-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/pmem: Fix nvdimm unregistration when cxl_pmem driver is absent
cxl: fix cxl_report_and_clear() RAS UE addr mis-assignment
|