path: root/drivers/cxl/core/hdm.c
2025-01-10  driver core: Correct API device_for_each_child_reverse_from() prototype  (Zijun Hu)

For API device_for_each_child_reverse_from(..., const void *data, int (*fn)(struct device *dev, const void *data)):

- The type of @data is a const pointer, which means the caller's data @*data is not allowed to be modified, but that is usually not appropriate for a device iterating API that does more than find a device.

- The types of both @data and @fn are not consistent with all the other for_each device iterating APIs: device_for_each_child(_reverse)(), bus_for_each_dev() and (driver|class)_for_each_device().

Correct its prototype by removing const from the parameter types, then adapt the various existing usages. A dedicated typedef device_iter_t will be introduced later as the @fn() type for the various for_each device iterating APIs.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250105-class_fix-v6-6-3a2f1768d4d4@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
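For reference, a sketch of the corrected prototype with the const qualifiers dropped, matching the other for_each iterators (illustrative, not the verbatim header):

    int device_for_each_child_reverse_from(struct device *parent,
                                           struct device *from, void *data,
                                           int (*fn)(struct device *dev, void *data));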
2024-12-02  module: Convert symbol namespace to string literal  (Peter Zijlstra)

Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself.

Scripted using:

    git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
    do
      awk -i inplace '
        /^#define EXPORT_SYMBOL_NS/ {
          gsub(/__stringify\(ns\)/, "ns");
          print;
          next;
        }
        /^#define MODULE_IMPORT_NS/ {
          gsub(/__stringify\(ns\)/, "ns");
          print;
          next;
        }
        /MODULE_IMPORT_NS/ {
          $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
        }
        /EXPORT_SYMBOL_NS/ {
          if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
            if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
                $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
                $0 !~ /^my/) {
              getline line;
              gsub(/[[:space:]]*\\$/, "");
              gsub(/[[:space:]]/, "", line);
              $0 = $0 " " line;
            }
            $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g");
          }
        }
        { print }' $file;
    done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
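The net effect on a driver is the namespace argument becoming a string literal. An illustrative before/after, using the CXL namespace as an example:

    /* before */
    EXPORT_SYMBOL_NS_GPL(cxl_dpa_size, CXL);
    MODULE_IMPORT_NS(CXL);

    /* after */
    EXPORT_SYMBOL_NS_GPL(cxl_dpa_size, "CXL");
    MODULE_IMPORT_NS("CXL");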
2024-11-22  Merge tag 'cxl-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl  (Linus Torvalds)

Pull cxl updates from Dave Jiang:

- Constify range_contains() input parameters to prevent changes
- Add support for displaying RCD capabilities in sysfs to support lspci for CXL device
- Downgrade warning message to debug in cxl_probe_component_regs()
- Add support for adding a printf specifier '%pra' to emit 'struct range' content:
  - Add sanity tests for 'struct resource'
  - Add documentation for special case
  - Add %pra for 'struct range'
  - Add %pra usage in CXL code
- Add preparation code for DCD support:
  - Add range_overlaps()
  - Add CDAT DSMAS table shared and read only flag in ACPICA
  - Add documentation to 'struct dev_dax_range'
- Delay event buffer allocation in CXL PCI code until needed
- Use guard() in cxl_dpa_set_mode()
- Refactor create region code to consolidate common code

* tag 'cxl-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/region: Refactor common create region code
  cxl/hdm: Use guard() in cxl_dpa_set_mode()
  cxl/pci: Delay event buffer allocation
  dax: Document struct dev_dax_range
  ACPI/CDAT: Add CDAT/DSMAS shared and read only flag values
  range: Add range_overlaps()
  cxl/cdat: Use %pra for dpa range outputs
  printf: Add print format (%pra) for struct range
  Documentation/printf: struct resource add start == end special case test
  printf: Add very basic struct resource tests
  cxl: downgrade a warning message to debug level in cxl_probe_component_regs()
  cxl/pci: Add sysfs attribute for CXL 1.1 device link status
  cxl/core/regs: Add rcd_pcie_cap initialization
  kernel/range: Const-ify range_contains parameters
2024-11-08  cxl/hdm: Use guard() in cxl_dpa_set_mode()  (Ira Weiny)

Additional DCD functionality is being added to this call which will be simplified by the use of guard() with the cxl_dpa_rwsem. Convert the function to use guard() prior to adding DCD functionality.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20241107-dcd-type2-upstream-v7-5-56a84e66bc36@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
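The conversion pattern, roughly (a minimal sketch; 'some_error' is a hypothetical condition standing in for the function's real checks):

    /* before: every early return must remember to drop the lock */
    down_write(&cxl_dpa_rwsem);
    if (some_error) {
            up_write(&cxl_dpa_rwsem);
            return -EINVAL;
    }
    /* ... */
    up_write(&cxl_dpa_rwsem);
    return 0;

    /* after: the scope-based guard releases the lock on every return path */
    guard(rwsem_write)(&cxl_dpa_rwsem);
    if (some_error)
            return -EINVAL;
    /* ... */
    return 0;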
2024-10-25  cxl/port: Fix use-after-free, permit out-of-order decoder shutdown  (Dan Williams)

In support of investigating an initialization failure report [1], cxl_test was updated to register mock memory-devices after the mock root-port/bus device had been registered. That led to cxl_test crashing with a use-after-free bug with the following signature:

        cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1
        cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1
        cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0
    1)  cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1
        [..]
        cxld_unregister: cxl decoder14.0:
        cxl_region_decode_reset: cxl_region region3:
        mock_decoder_reset: cxl_port port3: decoder3.0 reset
    2)  mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1
        cxl_endpoint_decoder_release: cxl decoder14.0:
        [..]
        cxld_unregister: cxl decoder7.0:
    3)  cxl_region_decode_reset: cxl_region region3:
        Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI
        [..]
        RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
        [..]
        Call Trace:
         <TASK>
         cxl_region_decode_reset+0x69/0x190 [cxl_core]
         cxl_region_detach+0xe8/0x210 [cxl_core]
         cxl_decoder_kill_region+0x27/0x40 [cxl_core]
         cxld_unregister+0x5d/0x60 [cxl_core]

At 1) a region has been established with 2 endpoint decoders (7.0 and 14.0). Those endpoints share a common switch-decoder in the topology (3.0). At teardown, 2), decoder14.0 is the first to be removed and hits the "out of order reset case" in the switch decoder. The effect though is that region3 cleanup is aborted, leaving it intact and referencing decoder14.0. At 3) the second attempt to teardown region3 trips over the stale decoder14.0 object which has long since been deleted.

The fix here is to recognize that the CXL specification places no mandate on in-order shutdown of switch-decoders, the driver enforces in-order allocation, and hardware enforces in-order commit. So, rather than fail and leave objects dangling, always remove them.

In support of making cxl_region_decode_reset() always succeed, cxl_region_invalidate_memregion() failures are turned into warnings. Crashing the kernel is ok there since system integrity is at risk if caches cannot be managed around physical address mutation events like CXL region destruction.

A new device_for_each_child_reverse_from() is added to cleanup port->commit_end after all dependent decoders have been disabled. In other words, if decoders are allocated 0->1->2 and disabled 1->2->0, then port->commit_end only decrements from 2 after 2 has been disabled, and it decrements all the way to zero since 1 was disabled previously.

Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Cc: stable@vger.kernel.org
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Zijun Hu <quic_zijuhu@quicinc.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
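The reverse walk described above looks roughly like this (a sketch of the reaping callback; note the @data parameter is const here, which the 2025-01-10 driver-core change at the top of this log later relaxed):

    static int commit_reap(struct device *dev, const void *data)
    {
            struct cxl_port *port = to_cxl_port(dev->parent);
            struct cxl_decoder *cxld;

            if (!is_switch_decoder(dev) && !is_endpoint_decoder(dev))
                    return 0;

            cxld = to_cxl_decoder(dev);
            /* only decrement commit_end once the current tail is disabled */
            if (port->commit_end == cxld->id &&
                ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
                    port->commit_end--;
                    dev_dbg(&port->dev, "reap: %s commit_end: %d\n",
                            dev_name(&cxld->dev), port->commit_end);
            }

            return 0;
    }

    /* invoked after disabling a decoder, walking siblings in reverse: */
    device_for_each_child_reverse_from(&port->dev, &cxld->dev, NULL, commit_reap);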
2024-06-25  cxl/region: check interleave capability  (Yao Xingtao)

Since interleave capability is not verified, if the interleave capability of a target does not match the region need, committing the decoder should have failed at the device end. To catch this error as early as possible, the driver needs to check the interleave capability of a target when attaching it to a region.

Per CXL specification r3.1 (8.2.4.20.1 CXL HDM Decoder Capability Register), bits 11 and 12 indicate the capability to establish interleaving in 3, 6, 12 and 16 ways. If these bits are not set, the target cannot be attached to a region utilizing such interleave ways.

Additionally, bits 8 and 9 represent the capability of the bits used for interleaving in the address; Linux tracks this in the cxl_port interleave_mask.

Per CXL specification r3.1 (8.2.4.20.13 Decoder Protection), with eIW the encoded Interleave Ways and eIG the encoded Interleave Granularity, in HPA:

- if eIW is 0 or 8 (interleave ways: 1, 3), all the bits of HPA are used; there are no interleave bits and the following check is skipped.
- if eIW is less than 8 (interleave ways: 2, 4, 8, 16), the interleave bits start at bit position eIG + 8 and end at eIG + eIW + 8 - 1.
- if eIW is greater than 8 (interleave ways: 6, 12), the interleave bits start at bit position eIG + 8 and end at eIG + eIW - 1.

If the interleave mask is insufficient to cover the required interleave bits, the target cannot be attached to the region (see the sketch after this entry).

Fixes: 384e624bb211 ("cxl/region: Attach endpoint decoders")
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240614084755.59503-2-yaoxt.fnst@fujitsu.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
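A minimal sketch of that mask check (illustrative; function names are hypothetical, GENMASK_ULL comes from linux/bits.h):

    /* compute the HPA interleave bits for encoded ways/granularity */
    static u64 interleave_bits(unsigned int eiw, unsigned int eig)
    {
            if (eiw == 0 || eiw == 8)   /* 1-way or 3-way: no interleave bits */
                    return 0;
            if (eiw < 8)                /* 2, 4, 8, 16 ways */
                    return GENMASK_ULL(eig + eiw + 8 - 1, eig + 8);
            return GENMASK_ULL(eig + eiw - 1, eig + 8);     /* 6, 12 ways */
    }

    /* the attach fails unless the port can steer all required bits */
    static bool interleave_capable(u64 port_interleave_mask,
                                   unsigned int eiw, unsigned int eig)
    {
            return (interleave_bits(eiw, eig) & ~port_interleave_mask) == 0;
    }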
2024-04-30  cxl/hdm: Debug, use decoder name function  (Ira Weiny)

The decoder enum has a name conversion function defined now. Use that instead of open coding.

Suggested-by: Navneet Singh <navneet.singh@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230604-dcd-type2-upstream-v2-1-f740c47e7916@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30  cxl/hdm: dev_warn() on unsupported mixed mode decoder  (Alison Schofield)

A mixed mode decoder is programmed with device physical addresses that span both the ram and pmem partitions of a memdev. Linux does not support mixed mode decoders. The driver rejects sysfs writes that try to set a decoder's mode to mixed, and if a resource being allocated is not wholly contained in either the pmem or ram partition of a memdev, it is also rejected. Basically, the CXL region driver is not going to create regions with mixed mode decoders, but the BIOS could.

If the kernel driver sees a mixed mode decoder, it will fail to enable the region and emit a dev_dbg() message. A dev_dbg() is not noisy enough in this case. Change the message to a dev_warn() that explicitly says mixed mode is not supported.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230218013834.31237-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30  cxl/hdm: Add debug message for invalid interleave granularity  (Huang Ying)

There is no debug message for an invalid interleave granularity, which makes related bugs hard to debug. Add one.

Signed-off-by: Huang, Ying <ying.huang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240402061016.388408-1-ying.huang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2023-12-07  cxl/hdm: Fix dpa translation locking  (Dan Williams)

The helper, cxl_dpa_resource_start(), snapshots the dpa-address of an endpoint-decoder after acquiring the cxl_dpa_rwsem. However, it is sufficient to assert that cxl_dpa_rwsem is held rather than acquire it in the helper. Otherwise, it triggers multiple lockdep reports:

1/ Tracing callbacks are in an atomic context that can not acquire sleeping locks:

    BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1525
    in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1288, name: bash
    preempt_count: 2, expected: 0
    RCU nest depth: 0, expected: 0
    [..]
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023
    Call Trace:
     <TASK>
     dump_stack_lvl+0x71/0x90
     __might_resched+0x1b2/0x2c0
     down_read+0x1a/0x190
     cxl_dpa_resource_start+0x15/0x50 [cxl_core]
     cxl_trace_hpa+0x122/0x300 [cxl_core]
     trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core]

2/ The rwsem is already held in the inject poison path:

    WARNING: possible recursive locking detected
    6.7.0-rc2+ #12 Tainted: G        W  OE     N
    --------------------------------------------
    bash/1288 is trying to acquire lock:
    ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_dpa_resource_start+0x15/0x50 [cxl_core]

    but task is already holding lock:
    ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_inject_poison+0x7d/0x1e0 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     dump_stack_lvl+0x71/0x90
     __might_resched+0x1b2/0x2c0
     down_read+0x1a/0x190
     cxl_dpa_resource_start+0x15/0x50 [cxl_core]
     cxl_trace_hpa+0x122/0x300 [cxl_core]
     trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core]
     __traceiter_cxl_poison+0x5c/0x80 [cxl_core]
     cxl_inject_poison+0x1bc/0x1e0 [cxl_core]

This appears to have been an issue since the initial implementation and uncovered by the new cxl-poison.sh test [1]. That test is now passing with these changes.

Fixes: 28a3ae4ff66c ("cxl/trace: Add an HPA to cxl_poison trace events")
Link: http://lore.kernel.org/r/e4f2716646918135ddbadf4146e92abb659de734.1700615159.git.alison.schofield@intel.com [1]
Cc: <stable@vger.kernel.org>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
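The shape of the fix, roughly (a sketch; the helper stops taking the lock and instead enforces the caller's obligation):

    /* before: acquires the rwsem itself (sleeps, and recurses on callers
     * that already hold it) */
    resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled)
    {
            resource_size_t base = -1;

            down_read(&cxl_dpa_rwsem);
            if (cxled->dpa_res)
                    base = cxled->dpa_res->start;
            up_read(&cxl_dpa_rwsem);

            return base;
    }

    /* after: callers, including the trace points, must already hold it */
    resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled)
    {
            resource_size_t base = -1;

            lockdep_assert_held(&cxl_dpa_rwsem);
            if (cxled->dpa_res)
                    base = cxled->dpa_res->start;

            return base;
    }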
2023-11-22  cxl/hdm: Fix a benign lockdep splat  (Dave Jiang)

The new helper "cxl_num_decoders_committed()" added a lockdep assertion to validate that port->commit_end is protected against modification. That assertion fires in init_hdm_decoder() where it is initializing port->commit_end. Given that it is both reading and writing that property, it ostensibly needs the lock. In practice, CXL decoder commit rules (must commit in order) and the in-order discovery of device decoders make the manipulation of ->commit_end in init_hdm_decoder() safe. However, rather than rely on the subtle rules of CXL hardware, just make the implementation obviously correct from a software perspective.

The Fixes: tag is only for cleaning up a lockdep splat; there is no functional issue addressed by this fix.

Fixes: 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/170025232811.2147250.16376901801315194121.stgit@djiang5-mobl3
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-31  cxl/hdm: Remove broken error path  (Dan Williams)

Dan reports that cxl_decoder_commit() potentially leaks a hold of cxl_dpa_rwsem. The potential error case is a "should not" happen scenario; turn it into a "can not" happen scenario by adding the error check to cxl_port_setup_targets() where other setting validation occurs.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: http://lore.kernel.org/r/63295673-5d63-4919-b851-3b06d48734c0@moroto.mountain
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-31  cxl/hdm: Fix && vs || bug  (Dan Carpenter)

If "info" is NULL then this code will crash. || was intended instead of &&.

Fixes: 8ce520fdea24 ("cxl/hdm: Use stored Component Register mappings to map HDM decoder capability")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/60028378-d3d5-4d6d-90fd-f915f061e731@moroto.mountain
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-31  Merge branch 'for-6.7/cxl-commited' into cxl/next  (Dan Williams)

Add the committed decoder sysfs attribute for v6.7.
2023-10-31  Merge branch 'for-6.7/cxl-rch-eh' into cxl/next  (Dan Williams)

Restricted CXL Host (RCH) Error Handling undoes the topology munging of CXL 1.1 to enable some AER recovery, and lands some base infrastructure for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include this long-running series finally for v6.7.
2023-10-27  cxl: Add cxl_decoders_committed() helper  (Dave Jiang)

Add a helper to retrieve the number of decoders committed for the port. Replace all the open coding of the calculation with the helper.

Link: https://lore.kernel.org/linux-cxl/651c98472dfed_ae7e729495@dwillia2-xfh.jf.intel.com.notmuch/
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/169747906849.272156.1729290904857372335.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
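A sketch of such a helper, assuming @commit_end holds the id of the last committed decoder (-1 when none) and that a lockdep assertion backs the locking rule mentioned in the 2023-11-22 entry above:

    /* number of committed decoders on @port (sketch) */
    static inline int cxl_num_decoders_committed(struct cxl_port *port)
    {
            lockdep_assert_held(&cxl_region_rwsem);

            return port->commit_end + 1;
    }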
2023-10-27  cxl/hdm: Use stored Component Register mappings to map HDM decoder capability  (Robert Richter)

Now that the Component Register mappings are stored, use them to enable and map the HDM decoder capabilities. The Component Registers do not need to be probed again for this; remove the probing code.

The HDM capability applies to Endpoints, USPs and VH Host Bridges. The Endpoint's component register mappings are located in the cxlds; otherwise they are in the port's structure. Duplicate the cxlds->reg_map in port->reg_map for endpoint ports.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
[rework to drop cxl_port_get_comp_map()]
Link: https://lore.kernel.org/r/20231018171713.1883517-8-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27  cxl/core/regs: Rename @dev to @host in struct cxl_register_map  (Robert Richter)

The primary role of @dev is to host the mappings for devm operations. @dev is too ambiguous as a name. I.e. when does @dev refer to the 'struct device *' instance that the registers belong to, and when does @dev refer to the 'struct device *' instance hosting the mapping for devm operations? Clarify the role of @dev in cxl_register_map by renaming it to @host. Also, rename local variables to 'host' where map->host is used.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20231018171713.1883517-3-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-06  cxl/memdev: Fix sanitize vs decoder setup locking  (Dan Williams)

The sanitize operation is destructive, and the expectation is that the device is unmapped while it is in progress. The current implementation does a lockless check for decoders being active, but then does nothing to prevent decoders from racing to be committed. Introduce state tracking to resolve this race.

This incidentally stops unprivileged userspace from triggering MMIO read cycles by spinning on reads of the 'security/state' attribute, which at a minimum is a waste since the kernel state machine can cache the completion result.

Lastly, cxl_mem_sanitize() was mistakenly marked EXPORT_SYMBOL() in the original implementation, but an export was never required.

Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
Cc: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
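One way to close such a race, sketched with hypothetical names (the commit's actual state machine may differ): decoder commit and sanitize start serialize on the same lock, and each checks the other's state.

    /* all names hypothetical; illustrates the mutual-exclusion idea */
    static int decoder_commit_checked(struct cxl_dev_state *cxlds)
    {
            guard(rwsem_write)(&cxl_dpa_rwsem);
            if (cxlds->sanitize_active)     /* hypothetical flag */
                    return -EBUSY;          /* refuse to map during sanitize */
            /* ... program and commit the decoder ... */
            return 0;
    }

    static int sanitize_start_checked(struct cxl_dev_state *cxlds)
    {
            guard(rwsem_write)(&cxl_dpa_rwsem);
            if (cxlds->decoders_committed)  /* hypothetical counter */
                    return -EBUSY;          /* device must be unmapped */
            cxlds->sanitize_active = true;  /* blocks commits until done */
            return 0;
    }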
2023-06-25  Merge branch 'for-6.5/cxl-rch-eh' into for-6.5/cxl  (Dan Williams)

Pick up the first half of the RCH error handling series. The back half needs some fixups for test regressions. Small conflicts with the PMU work around register enumeration and setup helpers.
2023-06-25  cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM  (Dan Williams)

In preparation for device-memory region creation, arrange for decoders of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their target type. Revisit this if a device ever shows up that wants to offer mixed HDM-H (Host-Only Memory) and HDM-DB support, or a CXL_DEVTYPE_DEVMEM device that supports HDM-H.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168679261945.3436160.11673393474107374595.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25  cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTONLYMEM, DEVMEM}  (Dan Williams)

In preparation for supporting HDM-D and HDM-DB configuration (device-memory, and device-memory with back-invalidate), rename the current type designators to use HOSTONLYMEM and DEVMEM as suffixes. HDM-DB can be supported by devices that are not accelerators, so DEVMEM is a more generic term for that case. Fix up one location where this type value was open coded.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168679261369.3436160.7042443847605280593.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25  cxl/core/regs: Add @dev to cxl_register_map  (Robert Richter)

The corresponding device of a register mapping is used for devm operations and logging. For operations with struct cxl_register_map the device needs to be tracked separately. To simplify the involved function interfaces, add @dev to cxl_register_map. While at it, also reorder the function arguments of cxl_map_device_regs() and cxl_map_component_regs() to have the object @cxl_register_map first.

As a result, a number of functions become available to be used with a @cxl_register_map object.

This is in preparation for reworking the component register setup code.

Signed-off-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230622205523.85375-7-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18  cxl/hdm: Add more HDM decoder debug messages at startup  (Dan Williams)

A recent debug session yielded a couple of debug messages that were useful for determining the reason why the driver was or was not falling back to CXL range register emulation, and for identifying decoder setting enumeration problems.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/168149845668.792294.11814353796371419167.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18  cxl/port: Scan single-target ports for decoders  (Dan Williams)

Do not assume that a single-target port falls back to a passthrough decoder configuration. Scan for decoders and only fall back after probing that the HDM decoder capability is not present. One user-visible effect of this bug is the inability to enumerate present CXL regions, as the decoder settings for the present decoders are skipped.

Fixes: d17d0540a0db ("cxl/core/hdm: Add CXL standard decoder enumeration to the core")
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/20230227153128.8164-1-Jonathan.Cameron@huawei.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/168149845130.792294.3210421233937427962.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18  cxl/hdm: Use 4-byte reads to retrieve HDM decoder base+limit  (Dan Williams)

The CXL specification mandates that 4-byte registers must be accessed with 4-byte access cycles. CXL 3.0 8.2.3 "Component Register Layout and Definition" states that the behavior is undefined if (2) 32-bit registers are accessed as an 8-byte quantity. It turns out that at least one hardware implementation is sensitive to this in practice. The @size variable results in zero with:

    size = readq(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which));

...and the correct size with:

    lo = readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which));
    hi = readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(which));
    size = (hi << 32) + lo;

Fixes: d17d0540a0db ("cxl/core/hdm: Add CXL standard decoder enumeration to the core")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/168149844056.792294.8224490474529733736.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18  cxl/hdm: Fail upon detecting 0-sized decoders  (Dan Williams)

Decoders committed with 0-size lead to later crashes on shutdown as __cxl_dpa_release() assumes a 'struct resource' has been established in 'cxlds->dpa_res'. Just fail the driver load in this instance since there are deeper problems with the enumeration or the setup when this happens.

Fixes: 9c57cde0dcbd ("cxl/hdm: Enumerate allocated DPA")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/168149843516.792294.11872242648319572632.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
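The defensive check amounts to something like this in the decoder enumeration path (a sketch; message text and error code are illustrative):

    /* a decoder the hardware claims is committed must have a size */
    if (committed && size == 0) {
            dev_err(&port->dev, "decoder%d.%d: committed with zero size\n",
                    port->id, cxld->id);
            return -ENXIO;
    }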
2023-04-04  cxl/hdm: Extend DVSEC range register emulation for region enumeration  (Dan Williams)

One motivation for mapping range registers to decoder objects is to use those settings for region autodiscovery. The need to map a region for devices programmed to use range registers is especially urgent now that the kernel no longer routes "Soft Reserved" ranges in the memory map to device-dax by default. The CXL memory range loses all access mechanisms.

Complete the implementation by marking the DPA reservation and setting the endpoint-decoder state to signal autodiscovery. Note that the default settings of ways=1 and granularity=4096 set in cxl_decode_init() do not need to be updated.

Fixes: 09d09e04d2fc ("cxl/dax: Create dax devices for CXL RAM regions")
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Gregory Price <gregory.price@memverge.com>
Link: https://lore.kernel.org/r/168012575521.221280.14177293493678527326.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-04  cxl/hdm: Limit emulation to the number of range registers  (Dan Williams)

Recall that range register emulation seeks to treat the 2 potential range registers as Linux CXL "decoder" objects. The number of range registers can be 1 or 2, while HDM decoder ranges can include more than 2. Be careful not to confuse DVSEC range count with HDM capability decoder count.

Commit to range register emulation earlier in devm_cxl_setup_hdm(). Otherwise, a device with more HDM decoders than range registers can set @cxlhdm->decoder_count to an invalid value.

Avoid introducing a forward declaration by just moving the definition of should_emulate_decoders() earlier in the file. should_emulate_decoders() is unchanged.

Tested-by: Dave Jiang <dave.jiang@intel.com>
Fixes: d7a2153762c7 ("cxl/hdm: Add emulation when HDM decoders are not committed")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168012574932.221280.15944705098679646436.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-04  cxl/hdm: Skip emulation when driver manages mem_enable  (Dan Williams)

If the driver is allowed to enable memory operation itself, then it can also turn on HDM decoder support at will. With this, the second call to cxl_setup_hdm_decoder_from_dvsec(), when an HDM decoder is not committed, is not needed.

Fixes: b777e9bec960 ("cxl/hdm: Emulate HDM decoder from DVSEC range registers")
Link: http://lore.kernel.org/r/20230220113657.000042e1@huawei.com
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167703068474.185722.664126485486344246.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-04  cxl/hdm: Fix double allocation of @cxlhdm  (Dan Williams)

devm_cxl_setup_emulated_hdm() reallocates an instance of @cxlhdm that was already allocated at the start of devm_cxl_setup_hdm(). Only one is needed and devm_cxl_setup_emulated_hdm() does not do enough to warrant being an explicit helper.

Fixes: 4474ce565ee4 ("cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders")
Tested-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167703067936.185722.7908921750127154779.stgit@dwillia2-xfh.jf.intel.com
Link: https://lore.kernel.org/r/168012574357.221280.5001364964799725366.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14  Merge branch 'for-6.3/cxl-rr-emu' into cxl/next  (Dan Williams)

Pick up the CXL DVSEC range register emulation for v6.3, and resolve conflicts with the cxl_port_probe() split (from for-6.3/cxl-ram-region) and event handling (from for-6.3/cxl-events).
2023-02-14  cxl/hdm: Add emulation when HDM decoders are not committed  (Dave Jiang)

For the case where DVSEC range register(s) are active and HDM decoders are not committed, use the range registers to provide emulation. A first pass is done to note whether any decoders are committed. If there are no committed endpoint decoders, then DVSEC ranges will be used for emulation.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640369536.935665.611974113442400127.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14  cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders  (Dave Jiang)

Per CXL rev 3.0 spec section 8.1.3, RCDs may not have HDM register blocks. Create a fake HDM with information from the CXL PCIe DVSEC registers. The decoder count will be set to the HDM count retrieved from the DVSEC cap register.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640368994.935665.15831225724059704620.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14  cxl/hdm: Emulate HDM decoder from DVSEC range registers  (Dave Jiang)

In the case where the HDM decoder register block exists but is not programmed, and at the same time the DVSEC range register range is active, populate the CXL decoder object 'cxl_decoder' with info from the DVSEC range registers.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640368454.935665.13806415120298330717.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-10  tools/testing/cxl: Define a fixed volatile configuration to parse  (Dan Williams)

Take two endpoints attached to the first switch on the first host-bridge in the cxl_test topology and define a pre-initialized region. This is a x2 interleave underneath a x1 CXL Window.

    $ modprobe cxl_test
    $ # cxl list -Ru
    {
      "region":"region3",
      "resource":"0xf010000000",
      "size":"512.00 MiB (536.87 MB)",
      "interleave_ways":2,
      "interleave_granularity":4096,
      "decode_state":"commit"
    }

Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167602000547.1924368.11613151863880268868.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-10  cxl/region: Add region autodiscovery  (Dan Williams)

Region autodiscovery is an asynchronous state machine advanced by cxl_port_probe(). After the decoders on an endpoint port are enumerated they are scanned for actively enabled instances. Each active decoder is flagged for auto-assembly CXL_DECODER_F_AUTO and attached to a region. If a region does not already exist for the address range setting of the decoder, one is created. That creation process may race with other decoders of the same region being discovered since cxl_port_probe() is asynchronous. A new 'struct cxl_root_decoder' lock, @range_lock, is introduced to mitigate that race.

Once all decoders have arrived, "p->nr_targets == p->interleave_ways", they are sorted by their relative decode position. The sort algorithm involves finding the point in the cxl_port topology where one leg of the decode leads to deviceA and the other deviceB. At that point in the topology the target order in the 'struct cxl_switch_decoder' indicates the relative position of those endpoint decoders in the region.

From that point the region goes through the same setup and validation steps as user-created regions, but instead of programming the decoders it validates that the driver would have written the same values to the decoders as were already present.

Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601999958.1924368.9366954455835735048.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-10  cxl/port: Split endpoint and switch port probe  (Dan Williams)

Jonathan points out that the shared code between the switch and endpoint case is small. Before adding another is_cxl_endpoint() conditional, just split the two cases.

Rather than duplicate the "Couldn't enumerate decoders" error message, take the opportunity to improve the error messages in devm_cxl_enumerate_decoders().

Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601999378.1924368.15071142145866277623.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-05  cxl: update names for interleave ways conversion macros  (Dave Jiang)

Change the names of the interleave ways macros to clearly indicate which variable is encoded and which is the actual ways value.

    ways == interleave ways
    eiw  == encoded interleave ways

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167027516228.3124679.11265039496968588580.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-05  cxl: update names for interleave granularity conversion macros  (Dave Jiang)

Change the names of the granularity macros to clearly indicate which variable is encoded and which is the actual granularity value (see the decode sketch after this entry).

    granularity == interleave granularity
    eig         == encoded interleave granularity

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167027493237.3124429.8948852388671827664.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
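For reference, the CXL encodings behind these names decode roughly as follows (a sketch; per the spec, eiw 0-4 are the power-of-2 ways, eiw 8-10 the 3/6/12-way variants, and granularity is 256 << eig):

    /* decode encoded interleave ways (sketch) */
    static int eiw_to_ways(u8 eiw, unsigned int *ways)
    {
            switch (eiw) {
            case 0 ... 4:
                    *ways = 1 << eiw;        /* 1, 2, 4, 8, 16 */
                    break;
            case 8 ... 10:
                    *ways = 3 << (eiw - 8);  /* 3, 6, 12 */
                    break;
            default:
                    return -EINVAL;
            }
            return 0;
    }

    /* decode encoded interleave granularity (sketch) */
    static int eig_to_granularity(u8 eig, unsigned int *granularity)
    {
            if (eig > 6)                     /* spec caps granularity at 16K */
                    return -EINVAL;
            *granularity = 256 << eig;       /* 256B up to 16KB */
            return 0;
    }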
2022-12-03  cxl/pci: Prepare for mapping RAS Capability Structure  (Dan Williams)

The RAS Capability Structure is a CXL Component register capability block. Unlike the HDM Decoder Capability, it will be referenced by the cxl_pci driver in response to PCIe AER events. Due to this it is no longer the case that cxl_map_component_regs() can assume that it should map all component registers. Plumb a bitmask of capability ids to map through cxl_map_component_regs().

For symmetry, cxl_probe_device_regs() is updated to populate @id in 'struct cxl_reg_map' even though cxl_map_device_regs() does not have a need to map a subset of the device registers per caller.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166974412214.1608150.11487843455070795378.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-03  cxl/port: Limit the port driver to just the HDM Decoder Capability  (Dan Williams)

Update the port driver to use cxl_map_component_registers() so that the component register block can be shared between the cxl_pci driver and the cxl_port driver. I.e. stop the port driver from reserving the entire component register block for itself via request_region() when it only needs the HDM Decoder Capability subset.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166974411625.1608150.7149373371599960307.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05  cxl/hdm: Fix skip allocations vs multiple pmem allocations  (Dan Williams)

Vishal notes that when attempting to define a second pmem region on a device, the DPA allocation fails with a message of the form:

    decoder11.1: failed to reserve skipped space

Recall that the skip setting is used when there is a pmem allocation in the presence of free ram DPA space. The first pmem allocation skips over the free ram and subsequent pmem allocations do not require a skip. The bug is that a skip is still attempted and the DPA reservation code flags the double skip allocation conflict.

Fixes: cf880423b6a0 ("cxl/hdm: Add support for allocating DPA to an endpoint decoder")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973754730.1558392.15466392461645857658.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05  cxl/region: Move HPA setup to cxl_region_attach()  (Dan Williams)

A recent bug fix added the setup of the endpoint decoder interleave geometry settings to cxl_region_attach(). Move the HPA setup there as well to keep all endpoint decoder parameter setting in a central location.

For symmetry, move endpoint HPA teardown to cxl_region_detach(), and for switches move HPA setup / teardown to cxl_port_{setup,reset}_targets().

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973126020.1526540.14701949254436069807.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01  cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime  (Dan Williams)

After adding support for emulating platform firmware established DPA reservations, the cxl-topology.sh [1] unit test started crashing with the following signature:

    general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP
    [..]
    RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     __cxl_dpa_release+0x1b/0xd0 [cxl_core]
     cxl_dpa_release+0x1d/0x30 [cxl_core]
     release_nodes+0x63/0x90
     devres_release_all+0x88/0xc0

...i.e. a use after free of a 'struct cxl_endpoint_decoder' object. This results from the ordering of init_hdm_decoder() before add_hdm_decoder() where, at release time, the decoder is unregistered and released before the DPA reservation. Fix this by extending the life of the object until all DPA reservations have been released, which also preserves platform decoder settings being settled by the time the decoder is published in sysfs (KOBJ_ADD time).

Note that the @len == 0 case in __cxl_dpa_reserve() is avoided in practice as this function is only called for committed decoders and new non-zero DPA allocations.

Link: https://github.com/pmem/ndctl/blob/pending/test/cxl-topology.sh [1]
Fixes: 9c57cde0dcbd ("cxl/hdm: Enumerate allocated DPA")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/165896020625.3546860.12390103413706292760.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25  cxl/hdm: Commit decoder state to hardware  (Dan Williams)

After all the soft validation of the region has completed, convey the region configuration to hardware while being careful to commit decoders in specification mandated order. In addition to programming the endpoint decoder base-address, interleave ways and granularity, the switch decoder target lists are also established.

While the kernel can enforce spec-mandated commit order, it can not enforce spec-mandated reset order. For example, the kernel can't stop someone from removing an endpoint device that is occupying decoderN in a switch decoder where decoderN+1 is also committed. To reset decoderN, decoderN+1 must be torn down first. That "tear down the world" implementation is saved for a follow-on patch.

Callback operations are provided for the 'commit' and 'reset' operations. While those callbacks may prove useful for CXL accelerators (Type-2 devices with memory) the primary motivation is to enable a simple way for cxl_test to intercept those operations.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784338418.1758207.14659830845389904356.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
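The callback hooks described here look roughly like this (a sketch; the real 'struct cxl_decoder' carries many more members):

    /* per-decoder hardware programming hooks */
    struct cxl_decoder {
            struct device dev;
            int id;
            /* ... HPA range, interleave settings, flags ... */
            int (*commit)(struct cxl_decoder *cxld);
            int (*reset)(struct cxl_decoder *cxld);
    };

The core then invokes cxld->commit(cxld) / cxld->reset(cxld) in the spec-mandated order, and cxl_test can substitute mock implementations for its emulated decoders.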
2022-07-25  cxl/region: Enable the assignment of endpoint decoders to regions  (Dan Williams)

The region provisioning process involves allocating DPA to a set of endpoint decoders, and HPA plus the region geometry to a region device. Then the decoder is assigned to the region. At this point several validation steps can be performed to validate that the decoder is suitable to participate in the region.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/165784336184.1758207.16403282029203949622.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21  cxl/port: Move dport tracking to an xarray  (Dan Williams)

Reduce the complexity and the overhead of walking the topology to determine endpoint connectivity to root decoder interleave configurations.

Note that cxl_detach_ep(), after it determines that the last @ep has departed and decides to delete the port, now needs to walk the dport array with the device_lock() held to remove entries. Previously, list_splice_init() could be used to atomically delete all dport entries at once and then perform entry tear down outside the lock. There is no list_splice_init() equivalent for the xarray.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784331647.1758207.6345820282285119339.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21  cxl/hdm: Add support for allocating DPA to an endpoint decoder  (Dan Williams)

The region provisioning flow will roughly follow a sequence of:

1/ Allocate DPA to a set of decoders
2/ Allocate HPA to a region
3/ Associate decoders with a region and validate that the DPA allocations and topologies match the parameters of the region.

For now, this change (step 1) arranges for DPA capacity to be allocated and deleted from non-committed decoders based on the decoder's mode / partition selection. Capacity is allocated from the lowest DPA in the partition and any 'pmem' allocation blocks out all remaining ram capacity in its 'skip' setting. DPA allocations are enforced in decoder instance order. I.e. decoder N + 1 always starts at a higher DPA than instance N, and deleting allocations must proceed from the highest-instance allocated decoder to the lowest.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784329399.1758207.16732038126938632700.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21  cxl/hdm: Track next decoder to allocate  (Dan Williams)

The CXL specification enforces that endpoint decoders are committed in hw instance id order. In preparation for adding dynamic DPA allocation, record the hw instance id in endpoint decoders, and enforce allocations to occur in hw instance id order.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784328827.1758207.9627538529944559954.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
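A sketch of enforcing that ordering, assuming a port-level cursor (here called @hdm_end) that tracks the last decoder instance with an allocation:

    /* only the next decoder in hw instance order may allocate (sketch) */
    static int cxl_dpa_alloc_check(struct cxl_port *port,
                                   struct cxl_endpoint_decoder *cxled)
    {
            lockdep_assert_held_write(&cxl_dpa_rwsem);

            if (cxled->cxld.id != port->hdm_end + 1)
                    return -EBUSY;  /* out of order allocation */
            return 0;
    }

    /* on successful allocation: port->hdm_end++;
     * on release (highest instance first): port->hdm_end--; */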