1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
|
What: /sys/kernel/debug/cxl/memX/inject_poison
Date: April, 2023
KernelVersion: v6.4
Contact: linux-cxl@vger.kernel.org
Description:
(WO) When a Device Physical Address (DPA) is written to this
attribute, the memdev driver sends an inject poison command to
the device for the specified address. The DPA must be 64-byte
aligned and the length of the injected poison is 64-bytes. If
successful, the device returns poison when the address is
accessed through the CXL.mem bus. Injecting poison adds the
address to the device's Poison List and the error source is set
to Injected. In addition, the device adds a poison creation
event to its internal Informational Event log, updates the
Event Status register, and if configured, interrupts the host.
It is not an error to inject poison into an address that
already has poison present and no error is returned. If the
device returns 'Inject Poison Limit Reached' an -EBUSY error
is returned to the user. The inject_poison attribute is only
visible for devices supporting the capability.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
poison injection can result in permanent data loss. Injected
poison may render data permanently inaccessible even after
clearing, as the clear operation writes zeros and does not
recover original data.
SYSTEM STABILITY RISK: For volatile memory, poison injection
can cause kernel crashes, system instability, or unpredictable
behavior if the poisoned addresses are accessed by running code
or critical kernel structures.
What: /sys/kernel/debug/cxl/memX/clear_poison
Date: April, 2023
KernelVersion: v6.4
Contact: linux-cxl@vger.kernel.org
Description:
(WO) When a Device Physical Address (DPA) is written to this
attribute, the memdev driver sends a clear poison command to
the device for the specified address. Clearing poison removes
the address from the device's Poison List and writes 0 (zero)
for 64 bytes starting at address. It is not an error to clear
poison from an address that does not have poison set. If the
device cannot clear poison from the address, -ENXIO is returned.
The clear_poison attribute is only visible for devices
supporting the capability.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
specified address range and removes the address from the poison
list. It does NOT recover or restore original data that may have
been present before poison injection. Any original data at the
cleared address is permanently lost and replaced with zeros.
CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
purposes only and should not be used as a data repair tool.
Clearing poison is fundamentally different from data recovery
or error correction.
What: /sys/kernel/debug/cxl/regionX/inject_poison
Date: August, 2025
Contact: linux-cxl@vger.kernel.org
Description:
(WO) When a Host Physical Address (HPA) is written to this
attribute, the region driver translates it to a Device
Physical Address (DPA) and identifies the corresponding
memdev. It then sends an inject poison command to that memdev
at the translated DPA. Refer to the memdev ABI entry at:
/sys/kernel/debug/cxl/memX/inject_poison for the detailed
behavior. This attribute is only visible if all memdevs
participating in the region support both inject and clear
poison commands.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
poison injection can result in permanent data loss. Injected
poison may render data permanently inaccessible even after
clearing, as the clear operation writes zeros and does not
recover original data.
SYSTEM STABILITY RISK: For volatile memory, poison injection
can cause kernel crashes, system instability, or unpredictable
behavior if the poisoned addresses are accessed by running code
or critical kernel structures.
What: /sys/kernel/debug/cxl/regionX/clear_poison
Date: August, 2025
Contact: linux-cxl@vger.kernel.org
Description:
(WO) When a Host Physical Address (HPA) is written to this
attribute, the region driver translates it to a Device
Physical Address (DPA) and identifies the corresponding
memdev. It then sends a clear poison command to that memdev
at the translated DPA. Refer to the memdev ABI entry at:
/sys/kernel/debug/cxl/memX/clear_poison for the detailed
behavior. This attribute is only visible if all memdevs
participating in the region support both inject and clear
poison commands.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
specified address range and removes the address from the poison
list. It does NOT recover or restore original data that may have
been present before poison injection. Any original data at the
cleared address is permanently lost and replaced with zeros.
CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
purposes only and should not be used as a data repair tool.
Clearing poison is fundamentally different from data recovery
or error correction.
What: /sys/kernel/debug/cxl/einj_types
Date: January, 2024
KernelVersion: v6.9
Contact: linux-cxl@vger.kernel.org
Description:
(RO) Prints the CXL protocol error types made available by
the platform in the format:
0x<error number> <error type>
The possible error types are (as of ACPI v6.5):
0x1000 CXL.cache Protocol Correctable
0x2000 CXL.cache Protocol Uncorrectable non-fatal
0x4000 CXL.cache Protocol Uncorrectable fatal
0x8000 CXL.mem Protocol Correctable
0x10000 CXL.mem Protocol Uncorrectable non-fatal
0x20000 CXL.mem Protocol Uncorrectable fatal
The <error number> can be written to einj_inject to inject
<error type> into a chosen dport.
What: /sys/kernel/debug/cxl/$dport_dev/einj_inject
Date: January, 2024
KernelVersion: v6.9
Contact: linux-cxl@vger.kernel.org
Description:
(WO) Writing an integer to this file injects the corresponding
CXL protocol error into $dport_dev ($dport_dev will be a device
name from /sys/bus/pci/devices). The integer to type mapping for
injection can be found by reading from einj_types. If the dport
was enumerated in RCH mode, a CXL 1.1 error is injected, otherwise
a CXL 2.0 error is injected.
|