Age | Commit message (Collapse) | Author |
|
Make it possible to handle not only single-, but also multi-
segment iterators in copy_oldmem_iter() callback. Change the
semantics of called functions to match the iterator model -
instead of an error code the exact number of bytes copied is
returned.
The swap page used to copy data to user space is adopted for
kernel space too. That does not bring any performance impact.
Suggested-by: Matthew Wilcox <willy@infradead.org>
Fixes: cc02e6e21aa5 ("s390/crash: add missing iterator advance in copy_oldmem_page()")
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Link: https://lore.kernel.org/r/5af6da3a0bffe48a90b0b7139ecf6a818b2d18e8.1658206891.git.agordeev@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Memory buffer used for reading out data from hardware system
area is not protected against concurrent access.
Reported-by: Matthew Wilcox <willy@infradead.org>
Fixes: 411ed3225733 ("[S390] zfcpdump support.")
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Link: https://lore.kernel.org/r/e68137f0f9a0d2558f37becc20af18e2939934f6.1658206891.git.agordeev@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Make sure the uvdevice driver will be automatically loaded when
facility 158 is available.
Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713125644.16121-4-seiden@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Rework cpufeature implementation to allow for various cpu feature
indications, which is not only limited to hwcap bits. This is achieved
by adding a sequential list of cpu feature numbers, where each of them
is mapped to an entry which indicates what this number is about.
Each entry contains a type member, which indicates what feature
name space to look into (e.g. hwcap, or cpu facility). If wanted this
allows also to automatically load modules only in e.g. z/VM
configurations.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713125644.16121-2-seiden@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
This patch implements two new AP driver callbacks:
void (*on_config_changed)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
void (*on_scan_complete)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
The on_config_changed callback is invoked at the start of the AP bus scan
function when it determines that the host AP configuration information
has changed since the previous scan.
The vfio_ap device driver registers a callback function for this callback
that performs the following operations:
1. Unplugs the adapters, domains and control domains removed from the
host's AP configuration from the guests to which they are
assigned in a single operation.
2. Stores bitmaps identifying the adapters, domains and control domains
added to the host's AP configuration with the structure representing
the mediated device. When the vfio_ap device driver's probe callback is
subsequently invoked, the probe function will recognize that the
queue is being probed due to a change in the host's AP configuration
and the plugging of the queue into the guest will be bypassed.
The on_scan_complete callback is invoked after the ap bus scan is
completed if the host AP configuration data has changed. The vfio_ap
device driver registers a callback function for this callback that hot
plugs each queue and control domain added to the AP configuration for each
guest using them in a single hot plug operation.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The matrix of adapters and domains configured in a guest's APCB may
differ from the matrix of adapters and domains assigned to the matrix mdev,
so this patch introduces a sysfs attribute to display the matrix of
adapters and domains that are or will be assigned to the APCB of a guest
that is or will be using the matrix mdev. For a matrix mdev denoted by
$uuid, the guest matrix can be displayed as follows:
cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Let's implement the callback to indicate when an APQN
is in use by the vfio_ap device driver. The callback is
invoked whenever a change to the apmask or aqmask would
result in one or more queue devices being removed from the driver. The
vfio_ap device driver will indicate a resource is in use
if the APQN of any of the queue devices to be removed are assigned to
any of the matrix mdevs under the driver's control.
There is potential for a deadlock condition between the
matrix_dev->guests_lock used to lock the guest during assignment of
adapters and domains and the ap_perms_mutex locked by the AP bus when
changes are made to the sysfs apmask/aqmask attributes.
The AP Perms lock controls access to the objects that store the adapter
numbers (ap_perms) and domain numbers (aq_perms) for the sysfs
/sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes
identify which queues are reserved for the zcrypt default device drivers.
Before allowing a bit to be removed from either mask, the AP bus must check
with the vfio_ap device driver to verify that none of the queues are
assigned to any of its mediated devices.
The apmask/aqmask attributes can be written or read at any time from
userspace, so care must be taken to prevent a deadlock with asynchronous
operations that might be taking place in the vfio_ap device driver. For
example, consider the following:
1. A system administrator assigns an adapter to a mediated device under the
control of the vfio_ap device driver. The driver will need to first take
the matrix_dev->guests_lock to potentially hot plug the adapter into
the KVM guest.
2. At the same time, a system administrator sets a bit in the sysfs
/sys/bus/ap/ap_mask attribute. To complete the operation, the AP bus
must:
a. Take the ap_perms_mutex lock to update the object storing the values
for the /sys/bus/ap/ap_mask attribute.
b. Call the vfio_ap device driver's in-use callback to verify that the
queues now being reserved for the default zcrypt drivers are not
assigned to a mediated device owned by the vfio_ap device driver. To
do the verification, the in-use callback function takes the
matrix_dev->guests_lock, but has to wait because it is already held
by the operation in 1 above.
3. The vfio_ap device driver calls an AP bus function to verify that the
new queues resulting from the assignment of the adapter in step 1 are
not reserved for the default zcrypt device driver. This AP bus function
tries to take the ap_perms_mutex lock but gets stuck waiting for the
waiting for the lock due to step 2a above.
Consequently, we have the following deadlock situation:
matrix_dev->guests_lock locked (1)
ap_perms_mutex lock locked (2a)
Waiting for matrix_dev->gusts_lock (2b) which is currently held (1)
Waiting for ap_perms_mutex lock (3) which is currently held (2a)
To prevent this deadlock scenario, the function called in step 3 will no
longer take the ap_perms_mutex lock and require the caller to take the
lock. The lock will be the first taken by the adapter/domain assignment
functions in the vfio_ap device driver to maintain the proper locking
order.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
When an adapter or domain is unassigned from an mdev attached to a KVM
guest, one or more of the guest's queues may get dynamically removed. Since
the removed queues could get re-assigned to another mdev, they need to be
reset. So, when an adapter or domain is unassigned from the mdev, the
queues that are removed from the guest's AP configuration (APCB) will be
reset.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
When an AP queue device is probed or removed, if the mediated device is
attached to a KVM guest, the mediated device's adapter, domain and
control domain bitmaps must be filtered to update the guest's APCB and if
any changes are detected, the guest's APCB must then be hot plugged into
the guest to reflect those changes to the guest.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Let's hot plug an adapter, domain or control domain into the guest when it
is assigned to a matrix mdev that is attached to a KVM guest. Likewise,
let's hot unplug an adapter, domain or control domain from the guest when
it is unassigned from a matrix_mdev that is attached to a KVM guest.
Whenever an assignment or unassignment of an adapter, domain or control
domain is performed, the APQNs and control domains assigned to the matrix
mdev will be filtered and assigned to the AP control block
(APCB) that supplies the AP configuration to the guest so that no
adapter, domain or control domain that is not in the host's AP
configuration nor any APQN that does not reference a queue device bound
to the vfio_ap device driver is assigned.
After updating the APCB, if the mdev is in use by a KVM guest, it is
hot plugged into the guest to dynamically provide access to the adapters,
domains and control domains provided via the newly refreshed APCB.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The callback functions for probing and removing a queue device must take
and release the locks required to perform a dynamic update of a guest's
APCB in the proper order.
The proper order for taking the locks is:
matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock
The proper order for releasing the locks is:
matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock
A new helper function is introduced to be used by the probe callback to
acquire the required locks. Since the probe callback only has
access to a queue device when it is called, the helper function will find
the ap_matrix_mdev object to which the queue device's APQN is assigned and
return it so the KVM guest to which the mdev is attached can be dynamically
updated.
Note that in order to find the ap_matrix_mdev (matrix_mdev) object, it is
necessary to search the matrix_dev->mdev_list. This presents a
locking order dilemma because the matrix_dev->mdevs_lock can't be taken to
protect against changes to the list while searching for the matrix_mdev to
which a queue device's APQN is assigned. This is due to the fact that the
proper locking order requires that the matrix_dev->mdevs_lock be taken
after both the matrix_mdev->kvm->lock and the matrix_dev->mdevs_lock.
Consequently, the matrix_dev->guests_lock will be used to protect against
removal of a matrix_mdev object from the list while a queue device is
being probed. This necessitates changes to the mdev probe/remove
callback functions to take the matrix_dev->guests_lock prior to removing
a matrix_mdev object from the list.
A new macro is also introduced to acquire the locks required to dynamically
update the guest's APCB in the proper order when a queue device is
removed.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The functions backing the matrix mdev's sysfs attribute interfaces to
assign/unassign adapters, domains and control domains must take and
release the locks required to perform a dynamic update of a guest's APCB
in the proper order.
The proper order for taking the locks is:
matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock
The proper order for releasing the locks is:
matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock
Two new macros are introduced for this purpose: One to take the locks and
the other to release the locks. These macros will be used by the
assignment/unassignment functions to prepare for dynamic update of
the KVM guest's APCB.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The group notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM event must
use the required locks in proper locking order to dynamically update the
guest's APCB. The proper locking order is:
1. matrix_dev->guests_lock: required to use the KVM pointer to
update a KVM guest's APCB.
2. matrix_mdev->kvm->lock: required to update a KVM guest's APCB.
3. matrix_dev->mdevs_lock: required to store or access the data
stored in a struct ap_matrix_mdev instance.
Two macros are introduced to acquire and release the locks in the proper
order. These macros are now used by the group notifier functions.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The vfio_ap device driver registers for notification when the pointer to
the KVM object for a guest is set. Recall that the KVM lock (kvm->lock)
mutex must be taken outside of the matrix_dev->lock mutex to prevent the
reporting by lockdep of a circular locking dependency (a.k.a., a lockdep
splat):
* see commit 0cc00c8d4050 ("Fix circular lockdep when setting/clearing
crypto masks")
* see commit 86956e70761b ("replace open coded locks for
VFIO_GROUP_NOTIFY_SET_KVM notification")
With the introduction of support for hot plugging/unplugging AP devices
passed through to a KVM guest, a new guests_lock mutex is introduced to
ensure the proper locking order is maintained:
struct ap_matrix_dev {
...
struct mutex guests_lock;
...
}
The matrix_dev->guests_lock controls access to the matrix_mdev instances
that hold the state for AP devices that have been passed through to a
KVM guest. This lock must be held to control access to the KVM pointer
(matrix_mdev->kvm) while the vfio_ap device driver is using it to
plug/unplug AP devices passed through to the KVM guest.
Keep in mind, the proper locking order must be maintained whenever
dynamically updating a KVM guest's APCB to plug/unplug adapters, domains
and control domains:
1. matrix_dev->guests_lock: required to use the KVM pointer - stored in
a struct ap_matrix_mdev instance - to update a KVM guest's APCB
2. matrix_mdev->kvm->lock: required to update a guest's APCB
3. matrix_dev->mdevs_lock: required to access data stored in a
struct ap_matrix_mdev instance.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The matrix_dev->lock mutex is being renamed to matrix_dev->mdevs_lock to
better reflect its purpose, which is to control access to the state of the
mediated devices under the control of the vfio_ap device driver.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The current implementation does not allow assignment of an AP adapter or
domain to an mdev device if each APQN resulting from the assignment
does not reference an AP queue device that is bound to the vfio_ap device
driver. This patch allows assignment of AP resources to the matrix mdev as
long as the APQNs resulting from the assignment:
1. Are not reserved by the AP BUS for use by the zcrypt device drivers.
2. Are not assigned to another matrix mdev.
The rationale behind this is that the AP architecture does not preclude
assignment of APQNs to an AP configuration profile that are not available
to the system.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Refresh the guest's APCB by filtering the APQNs and control domain numbers
assigned to the matrix mdev.
Filtering of APQNs:
-----------------
APQNs that do not reference an AP queue device bound to the vfio_ap device
driver must be filtered from the APQNs assigned to the matrix mdev before
they can be assigned to the guest's APCB. Given that the APQNs are
configured in the guest's APCB as a matrix of APIDs (adapters) and APQIs
(domains), it is not possible to filter an individual APQN. For example,
suppose the matrix of APQNs is structured as follows:
APIDs
3 4 5
0 (3,0) (4,0) (5,0)
APQIs 1 (3,1) (4,1) (5,1)
2 (3,2) (4,2) (5,2)
Now suppose APQN (4,1) does not reference a queue device bound to the
vfio_ap device driver. If we filter APID 4, the APQNs (4,0), (4,1) and
(4,2) will be removed. Similarly, if we filter domain 1, APQNs (3,1),
(4,1) and (5,1) will be removed.
To resolve this dilemma, the choice was made to filter the APID - in this
case 4 - from the guest's APCB. The reason for this design decision is
because the APID references an AP adapter which is a real hardware device
that can be physically installed, removed, enabled or disabled; whereas, a
domain is a partition within the adapter. It therefore better reflects
reality to remove the APID from the guest's APCB.
Filtering of control domains:
----------------------------
Any control domains that are not assigned to the host's AP configuration
will be filtered from those assigned to the matrix mdev before assigning
them to the guest's APCB.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
The APCB is a field within the CRYCB that provides the AP configuration
to a KVM guest. Let's introduce a shadow copy of the KVM guest's APCB and
maintain it for the lifespan of the guest.
The shadow APCB serves the following purposes:
1. The shadow APCB can be maintained even when the mediated device is not
currently in use by a KVM guest. Since the mediated device's AP
configuration is filtered to ensure that no AP queues are passed through
to the KVM guest that are not bound to the vfio_ap device driver or
available to the host, the mediated device's AP configuration may differ
from the guest's. Having a shadow of a guest's APCB allows us to provide
a sysfs interface to view the guest's APCB even if the mediated device
is not currently passed through to a KVM guest. This can aid in
problem determination when the guest is unexpectedly missing AP
resources.
2. If filtering was done in-place for the real APCB, the guest could pick
up a transient state. Doing the filtering on a shadow and transferring
the AP configuration to the real APCB after the guest is started or when
AP resources are assigned to or unassigned from the mediated device, or
when the host configuration changes, the guest's AP configuration will
never be in a transient state.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Let's create links between each queue device bound to the vfio_ap device
driver and the matrix mdev to which the queue's APQN is assigned. The idea
is to facilitate efficient retrieval of the objects representing the queue
devices and matrix mdevs as well as to verify that a queue assigned to
a matrix mdev is bound to the driver.
The links will be created as follows:
* When the queue device is probed, if its APQN is assigned to a matrix
mdev, the structures representing the queue device and the matrix mdev
will be linked.
* When an adapter or domain is assigned to a matrix mdev, for each new
APQN assigned that references a queue device bound to the vfio_ap
device driver, the structures representing the queue device and the
matrix mdev will be linked.
The links will be removed as follows:
* When the queue device is removed, if its APQN is assigned to a matrix
mdev, the link from the structure representing the matrix mdev to the
structure representing the queue will be removed. Since the storage
allocated for the vfio_ap_queue will be freed, there is no need to
remove the link to the matrix_mdev to which the queue's APQN is
assigned.
* When an adapter or domain is unassigned from a matrix mdev, for each
APQN unassigned that references a queue device bound to the vfio_ap
device driver, the structures representing the queue device and the
matrix mdev will be unlinked.
* When an mdev is removed, the link from any queues assigned to the mdev
to the mdev will be removed.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Let's move the probe and remove callbacks into the vfio_ap_ops.c
file to keep all code related to managing queues in a single file. This
way, all functions related to queue management can be removed from the
vfio_ap_private.h header file defining the public interfaces for the
vfio_ap device driver.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
This patch refactors the vfio_ap device driver to use the AP bus's
ap_get_qdev() function to retrieve the vfio_ap_queue struct containing
information about a queue that is bound to the vfio_ap device driver.
The bus's ap_get_qdev() function retrieves the queue device from a
hashtable keyed by APQN. This is much more efficient than looping over
the list of devices attached to the AP bus by several orders of
magnitude.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Fix building of out-of-tree kernel modules without a pre-built kernel
in case CONFIG_EXPOLINE_EXTERN=y.
- Fix a reference counting error that could prevent unloading of zcrypt
modules.
* tag 's390-5.19-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ap: fix error handling in __verify_queue_reservations()
s390/nospec: remove unneeded header includes
s390/nospec: build expoline.o for modules_prepare target
|
|
The AP bus's __verify_queue_reservations function increments the ref count
for the device driver passed in as a parameter, but fails to decrement it
before returning control to the caller. This will prevents any subsequent
removal of the module.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reported-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Fixes: 4f8206b88286 ("s390/ap: driver callback to indicate resource in use")
Link: https://lore.kernel.org/r/20220706222619.602094-1-akrowiak@linux.ibm.com
Cc: stable@vger.kernel.org
[agordeev@linux.ibm.com fixed description, added Fixes and Link]
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
When doing device passthrough where interrupts are being forwarded from
host to guest, we wish to use a pinned section of guest memory as the
vector (the same memory used by the guest as the vector). To accomplish
this, add a new parameter for airq_iv_create which allows passing an
existing vector to be used instead of allocating a new one. The caller
is responsible for ensuring the vector is pinned in memory as well as for
unpinning the memory when the vector is no longer needed.
A subsequent patch will use this new parameter for zPCI interpretation.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606203325.110625-7-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
|
|
A subsequent patch will introduce an airq handler that requires additional
TPI information beyond directed vs floating, so pass the entire tpi_info
structure via the handler. Only pci actually uses this information today,
for the other airq handlers this is effectively a no-op.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606203325.110625-6-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
|
|
Detect the Adapter Interruption Suppression Interpretation facility.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606203325.110625-5-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
|
|
Detect the Adapter Event Notification Interpretation facility.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606203325.110625-4-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
|
|
Detect the Adapter Interruption Source ID Interpretation facility.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606203325.110625-3-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
|
|
Detect the zPCI Load/Store Interpretation facility.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606203325.110625-2-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
|
|
There is an unexpected word "the" in the comments that needs to be dropped.
file: ./drivers/s390/scsi/zfcp_diag.h
line: 5
* Definitions for handling diagnostics in the the zfcp device driver.
changed to
* Definitions for handling diagnostics in the zfcp device driver.
Link: https://lore.kernel.org/r/20220621114207.106405-1-jiangjian@cdjrlc.com
Link: https://lore.kernel.org/r/4c02fefa19ab46f0f163990bde3ce10bd9c7caf1.1657122360.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
These attributes are now only accessed through the
zfcp_sysfs_sdev_attr_group.
Link: https://lore.kernel.org/r/0791b9149ebfa39e6b8eab093113cd2527dbf3d3.1657122360.git.bblock@linux.ibm.com
Fixes: d8d7cf3f7d07 ("scsi: zfcp: Switch to attribute groups")
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Part of the confusion that has existed is the FSM lifecycle of
subchannels between the common CSS driver and the vfio-ccw driver.
During configuration, the FSM state goes from NOT_OPER to STANDBY
to IDLE, but then back to NOT_OPER. For example:
vfio_ccw_sch_probe: VFIO_CCW_STATE_NOT_OPER
vfio_ccw_sch_probe: VFIO_CCW_STATE_STANDBY
vfio_ccw_mdev_probe: VFIO_CCW_STATE_IDLE
vfio_ccw_mdev_remove: VFIO_CCW_STATE_NOT_OPER
vfio_ccw_sch_remove: VFIO_CCW_STATE_NOT_OPER
vfio_ccw_sch_shutdown: VFIO_CCW_STATE_NOT_OPER
Rearrange the open/close events to align with the mdev open/close,
to better manage the memory and state of the devices as time
progresses. Specifically, make mdev_open() perform the FSM open,
and mdev_close() perform the FSM close instead of reset (which is
both close and open).
This makes the NOT_OPER state a dead-end path, indicating the
device is probably not recoverable without fully probing and
re-configuring the device.
This has the nice side-effect of removing a number of special-cases
where the FSM state is managed outside of the FSM itself (such as
the aforementioned mdev_close() routine).
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-12-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Use both the FSM Close and Open events when resetting an mdev,
rather than making a separate call to cio_enable_subchannel().
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-11-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Refactor the vfio_ccw_sch_quiesce() routine to extract the bit that
disables the subchannel and affects the FSM state. Use this to form
the basis of a CLOSE event that will mirror the OPEN event, and move
the subchannel back to NOT_OPER state.
A key difference with that mirroring is that while OPEN handles the
transition from NOT_OPER => STANDBY, the later probing of the mdev
handles the transition from STANDBY => IDLE. On the other hand,
the CLOSE event will move from one of the operating states {IDLE,
CP_PROCESSING, CP_PENDING} => NOT_OPER. That is, there is no stop
in a STANDBY state on the deconfigure path.
Add a call to cp_free() in this event, such that it is captured for
the various permutations of this event.
In the unlikely event that cio_disable_subchannel() returns -EBUSY,
the remaining logic of vfio_ccw_sch_quiesce() can still be used.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-10-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Move the process of enabling a subchannel for use by vfio-ccw
into the FSM, such that it can manage the sequence of lifecycle
events for the device.
That is, if the FSM state is NOT_OPER(erational), then do the work
that would enable the subchannel and move the FSM to STANDBY state.
An attempt to perform this event again from any of the other operating
states (IDLE, CP_PROCESSING, CP_PENDING) will convert the device back
to NOT_OPER so the configuration process can be started again.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-9-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
We currently cut a very basic trace whenever the FSM directs
control to the not operational routine.
Convert this to a message, so it's alongside the other configuration
related traces (create, remove, etc.), and record both the event
that brought us here and the current state of the device.
This will provide some better footprints if things go bad.
Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-8-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The vfio_ccw_mdev_(un)reg routines are merely vfio-ccw routines that
pass control to mdev_(un)register_device. Since there's only one
caller of each, let's just call the mdev routines directly.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-7-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The FSM has an enumerated list of events defined.
Use that as the argument passed to the jump table,
instead of a regular int.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-6-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
There are no remaining users of private->mdev. Remove it.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220707135737.720765-5-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The routine vfio_ccw_sch_event() is tasked with handling subchannel events,
specifically machine checks, on behalf of vfio-ccw. It correctly calls
cio_update_schib(), and if that fails (meaning the subchannel is gone)
it makes an FSM event call to mark the subchannel Not Operational.
If that worked, however, then it decides that if the FSM state was already
Not Operational (implying the subchannel just came back), then it should
simply change the FSM to partially- or fully-open.
Remove this trickery, since a subchannel returning will require more
probing than simply "oh all is well again" to ensure it works correctly.
Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The FSM is in STANDBY state when arriving in vfio_ccw_mdev_probe(),
and this routine converts it to IDLE as part of its processing.
The error exit sets it to IDLE (again) but clears the private->mdev
pointer.
The FSM should of course be managing the state itself, but the
correct thing for vfio_ccw_mdev_probe() to do would be to put
the state back the way it found it.
The corresponding check of private->mdev in vfio_ccw_sch_io_todo()
can be removed, since the distinction is unnecessary at this point.
Fixes: 3bf1311f351ef ("vfio/ccw: Convert to use vfio_register_emulated_iommu_dev()")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220707135737.720765-3-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
As vfio-ccw devices are created/destroyed, the uuid of the associated
mdevs that are recorded in $S390DBF/vfio_ccw_msg/sprintf get lost.
This is because a pointer to the UUID is stored instead of the UUID
itself, and that memory may have been repurposed if/when the logs are
examined. The result is usually garbage UUID data in the logs, though
there is an outside chance of an oops happening here.
Simply remove the UUID from the traces, as the subchannel number will
provide useful configuration information for problem determination,
and is stored directly into the log instead of a pointer.
As we were the only consumer of mdev_uuid(), remove that too.
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Michael Kawano <mkawano@linux.ibm.com>
Fixes: 60e05d1cf0875 ("vfio-ccw: add some logging")
Fixes: b7701dfbf9832 ("vfio-ccw: Register a chp_event callback for vfio-ccw")
[farman: reworded commit message, added Fixes: tags]
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Link: https://lore.kernel.org/r/20220707135737.720765-2-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
With new API blk_mq_is_reserved_rq() we can tell if a request is from
the reserved pool, so stop passing 'reserved' arg. There is actually
only a single user of that arg for all the callback implementations, which
can use blk_mq_is_reserved_rq() instead.
This will also allow us to stop passing the same 'reserved' around the
blk-mq iter functions next.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/1657109034-206040-4-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Fix purgatory build process so bin2c tool does not get built
unnecessarily and the Makefile is more consistent with other
architectures.
- Return earlier simple design of arch_get_random_seed_long|int() and
arch_get_random_long|int() callbacks as result of changes in generic
RNG code.
- Fix minor comment typos and spelling mistakes.
* tag 's390-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/qdio: Fix spelling mistake
s390/sclp: Fix typo in comments
s390/archrandom: simplify back to earlier design and initialize earlier
s390/purgatory: remove duplicated build rule of kexec-purgatory.o
s390/purgatory: hard-code obj-y in Makefile
s390: remove unneeded 'select BUILD_BIN2C'
|
|
Remove the repeated word 'and' from comments
Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220622142713.14187-1-jiangjian@cdjrlc.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now,
so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Set the queue dying flag and call blk_mq_exit_queue from del_gendisk for
all disks that do not have separately allocated queues, and thus remove
the need to call blk_cleanup_queue for them.
Rename blk_cleanup_disk to blk_mq_destroy_queue to make it clear that
this function is intended only for separately allocated blk-mq queues.
This saves an extra queue freeze for devices without a separately
allocated queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The use of 'extern' in function prototypes has been disrecommended in
the kernel coding style for several years now, remove them from all vfio
related files so contributors no longer need to decide between style and
consistency.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/165471414407.203056.474032786990662279.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
We try to harden virtio device notifications in 8b4ec69d7e09 ("virtio:
harden vring IRQ"). It works with the assumption that the driver or
core can properly call virtio_device_ready() at the right
place. Unfortunately, this seems to be not true and uncover various
bugs of the existing drivers, mainly the issue of using
virtio_device_ready() incorrectly.
So let's add a Kconfig option and disable it by default. It gives
us time to fix the drivers and then we can consider re-enabling it.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220622012940.21441-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core changes for 5.19-rc1.
Lots of tiny driver core changes and cleanups happened this cycle, but
the two major things are:
- firmware_loader reorganization and additions including the ability
to have XZ compressed firmware images and the ability for userspace
to initiate the firmware load when it needs to, instead of being
always initiated by the kernel. FPGA devices specifically want this
ability to have their firmware changed over the lifetime of the
system boot, and this allows them to work without having to come up
with yet-another-custom-uapi interface for loading firmware for
them.
- physical location support added to sysfs so that devices that know
this information, can tell userspace where they are located in a
common way. Some ACPI devices already support this today, and more
bus types should support this in the future.
Smaller changes include:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten
any linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time-outs"
* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
driver core: fix deadlock in __device_attach
kernfs: Separate kernfs_pr_cont_buf and rename_lock.
topology: Remove unused cpu_cluster_mask()
driver core: Extend deferred probe timeout on driver registration
MAINTAINERS: add Russ Weight as a firmware loader maintainer
driver: base: fix UAF when driver_attach failed
test_firmware: fix end of loop test in upload_read_show()
driver core: location: Add "back" as a possible output for panel
driver core: location: Free struct acpi_pld_info *pld
driver core: Add "*" wildcard support to driver_async_probe cmdline param
driver core: location: Check for allocations failure
arch_topology: Trace the update thermal pressure
kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
export: fix string handling of namespace in EXPORT_SYMBOL_NS
rpmsg: use local 'dev' variable
rpmsg: Fix calling device_lock() on non-initialized device
firmware_loader: describe 'module' parameter of firmware_upload_register()
firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
firmware_loader: Fix configs for sysfs split
selftests: firmware: Add firmware upload selftests
...
|