Age | Commit message (Collapse) | Author |
|
This commit enables basic support for Hopper GPUs, and is intended
primarily as a base supporting Blackwell GPUs, which reuse most of
the code added here.
Advanced features such as Confidential Compute are not supported.
Beyond a few miscellaneous register moves and HW class ID plumbing,
the bulk of the changes implemented here are to support the GSP-RM
boot sequence used on Hopper/Blackwell GPUs, as well as a new page
table layout.
There should be no changes here that impact prior GPUs.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Co-developed-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This is a hack around a bug exposed with the GSP code, I'm not sure
what is happening exactly, but it appears some of our flushes don't
result in proper tlb invalidation for out BAR2 and we get a BAR2
fault from GSP and it all dies.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130010852.4034774-1-airlied@gmail.com
|
|
- Valid VRAM regions are read from GSP-RM, and used to construct our MM
- BAR1/BAR2 VMMs modified to be shared with RM
- Client VMMs have RM VASPACE objects created for them
- Adds FBSR to backup system objects in VRAM across suspend
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-37-skeggsb@gmail.com
|
|
This was cargo-culted from traces of RM when the code was written, but
we probably shouldn't be touching NV_PFB regs while GSP-RM is running.
From traces, it looks like NVIDIA dropped this sometime between 510.54
and 515.48.07, so I guess we can too.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-2-skeggsb@gmail.com
|
|
Fixes some issues when running on top of RM.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Danilo Krummrich <me@dakr.org>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230919220442.202488-5-lyude@redhat.com
|
|
nvkm_subdev.mutex is going away.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
This causes us to invalidate MMU only at the level we made modifications -
ie: if we've only modified PTEs, there's no need to have MMU dump the PDs
it's fetched into L2.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Some GPU units are capable of supporting "replayable" page faults, where
the execution unit will wait for SW to fixup GPU page tables rather than
triggering a channel-fatal fault.
This feature isn't useful (it's harmful, even) unless something like HMM
is being used to manage events appearing in the replayable fault buffer,
so, it's disabled by default.
This commit allows a client to request it be enabled.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Host methods exist to do at least some of what we need, but we are not
currently pushing replay/cancels through a channel like UVM does as it's
not clear whether it's necessary in our case (UVM also updates PTEs with
the GPU).
UVM also pushes a software method for fault cancels on Pascal, seemingly
because the host methods don't appear to be sufficient. If/when we want
to push the replay/cancel on the GPU, we can re-purpose the cancellation
code here to implement that swmthd.
Keep it simple for now, until we figure out exactly what we need here.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
NVKM is currently responsible for managing the allocation of a client's
GPU address-space, but there's various use-cases (ie. HMM address-space
mirroring) where giving a client more direct control is desirable.
This commit allows for a VMM to be created where the area allocated for
NVKM is limited to a client-specified window, the remainder of address-
space is controlled directly by the client.
Leaving a window is necessary to support various internal requirements,
but also to support existing allocation interfaces as not all of the HW
is capable of working with a HMM allocation.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|