summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/nouveau/nouveau_svm.c
AgeCommit message (Collapse)Author
2020-10-30drm/nouveau/nouveau: fix the start/end range for migrationRalph Campbell
The user level OpenCL code shouldn't have to align start and end addresses to a page boundary. That is better handled in the nouveau driver. The npages field is also redundant since it can be computed from the start and end addresses. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-08-05Merge tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "New xilinx displayport driver, AMD support for two new GPUs (more header files), i915 initial support for RocketLake and some work on their DG1 (discrete chip). The core also grew some lockdep annotations to try and constrain what drivers do with dma-fences, and added some documentation on why the idea of indefinite fences doesn't work. The long list is below. I do have some fixes trees outstanding, but I'll follow up with those later. core: - add user def flag to cmd line modes - dma_fence_wait added might_sleep - dma-fence lockdep annotations - indefinite fences are bad documentation - gem CMA functions used in more drivers - struct mutex removal - more drm_ debug macro usage - set/drop master api fixes - fix for drm/mm hole size comparison - drm/mm remove invalid entry optimization - optimise drm/mm hole handling - VRR debugfs added - uncompressed AFBC modifier support - multiple display id blocks in EDID - multiple driver sg handling fixes - __drm_atomic_helper_crtc_reset in all drivers - managed vram helpers ttm: - ttm_mem_reg handling cleanup - remove bo offset field - drop CMA memtype flag - drop mappable flag xilinx: - New Xilinx ZynqMP DisplayPort Subsystem driver nouveau: - add CRC support - start using NVIDIA published class header files - convert all push buffer emission to new macros - Proper push buffer space management for EVO/NVD channels. - firmware loading fixes - 2MiB system memory pages support on Pascal and newer vkms: - larger cursor support i915: - Rocketlake platform enablement - Early DG1 enablement - Numerous GEM refactorings - DP MST fixes - FBC, PSR, Cursor, Color, Gamma fixes - TGL, RKL, EHL workaround updates - TGL 8K display support fixes - SDVO/HDMI/DVI fixes amdgpu: - Initial support for Sienna Cichlid GPU - Initial support for Navy Flounder GPU - SI UVD/VCE support - expose rotation property - Add support for unique id on Arcturus - Enable runtime PM on vega10 boards that support BACO - Skip BAR resizing if the bios already did id - Major swSMU code cleanup - Fixes for DCN bandwidth calculations amdkfd: - Track SDMA usage per process - SMI events interface radeon: - Default to on chip GART for AGP boards on all arches - Runtime PM reference count fixes msm: - headers regenerated causing churn - a650/a640 display and GPU enablement - dpu dither support for 6bpc panels - dpu cursor fix - dsi/mdp5 enablement for sdm630/sdm636/sdm66 tegra: - video capture prep support - reflection support mediatek: - convert mtk_dsi to bridge API meson: - FBC support sun4i: - iommu support rockchip: - register locking fix - per-pixel alpha support PX30 VOP mgag200: - ported to simple and shmem helpers - device init cleanups - use managed pci functions - dropped hw cursor support ast: - use managed pci functions - use managed VRAM helpers - rework cursor support malidp: - dev_groups support hibmc: - refactor hibmc_drv_vdac: vc4: - create TXP CRTC imx: - error path fixes and cleanups etnaviv: - clock handling and error handling cleanups - use pin_user_pages" * tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm: (1747 commits) drm/msm: use kthread_create_worker instead of kthread_run drm/msm/mdp5: Add MDP5 configuration for SDM636/660 drm/msm/dsi: Add DSI configuration for SDM660 drm/msm/mdp5: Add MDP5 configuration for SDM630 drm/msm/dsi: Add phy configuration for SDM630/636/660 drm/msm/a6xx: add A640/A650 hwcg drm/msm/a6xx: hwcg tables in gpulist drm/msm/dpu: add SM8250 to hw catalog drm/msm/dpu: add SM8150 to hw catalog drm/msm/dpu: intf timing path for displayport drm/msm/dpu: set missing flush bits for INTF_2 and INTF_3 drm/msm/dpu: don't use INTF_INPUT_CTRL feature on sdm845 drm/msm/dpu: move some sspp caps to dpu_caps drm/msm/dpu: update UBWC config for sm8150 and sm8250 drm/msm/dpu: use right setup_blend_config for sm8150 and sm8250 drm/msm/a6xx: set ubwc config for A640 and A650 drm/msm/adreno: un-open-code some packets drm/msm: sync generated headers drm/msm/a6xx: add build_bw_table for A640/A650 drm/msm/a6xx: fix crashstate capture for A650 ...
2020-08-05Merge tag 'for-linus-hmm' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "Ralph has been working on nouveau's use of hmm_range_fault() and migrate_vma() which resulted in this small series. It adds reporting of the page table order from hmm_range_fault() and some optimization of migrate_vma(): - Report the size of the page table mapping out of hmm_range_fault(). This makes it easier to establish a large/huge/etc mapping in the device's page table. - Allow devices to ignore the invalidations during migration in cases where the migration is not going to change pages. For instance migrating pages to a device does not require the device to invalidate pages already in the device. - Update nouveau and hmm_tests to use the above" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: mm/hmm/test: use the new migration invalidation nouveau/svm: use the new migration invalidation mm/notifier: add migration invalidation type mm/migrate: add a flags parameter to migrate_vma nouveau: fix storing invalid ptes nouveau/hmm: support mapping large sysmem pages nouveau: fix mapping 2MB sysmem pages nouveau/hmm: fault one page at a time mm/hmm: add tests for hmm_pfn_to_map_order() mm/hmm: provide the page mapping order in hmm_range_fault()
2020-07-28nouveau/svm: use the new migration invalidationRalph Campbell
Use the new MMU_NOTIFY_MIGRATE event to skip GPU MMU invalidations of device private memory and handle the invalidation in the driver as part of migrating device private memory. Link: https://lore.kernel.org/r/20200723223004.9586-5-rcampbell@nvidia.com Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-24drm/nouveau/nvif: give every notify object a human-readable nameBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-24drm/nouveau/nvif: give every vmm object a human-readable identifierBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-24drm/nouveau/nvif: give every object a human-readable identifierBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-10nouveau/hmm: support mapping large sysmem pagesRalph Campbell
Nouveau currently only supports mapping PAGE_SIZE sized pages of system memory when shared virtual memory (SVM) is enabled. Use the new hmm_pfn_to_map_order() function to support mapping system memory pages that are PMD_SIZE. Link: https://lore.kernel.org/r/20200701225352.9649-5-rcampbell@nvidia.com Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10nouveau/hmm: fault one page at a timeRalph Campbell
The SVM page fault handler groups faults into a range of contiguous virtual addresses and requests hmm_range_fault() to populate and return the page frame number of system memory mapped by the CPU. In preparation for supporting large pages to be mapped by the GPU, process faults one page at a time. In addition, use the hmm_range default_flags to fix a corner case where the input hmm_pfns array is not reinitialized after hmm_range_fault() returns -EBUSY and must be called again. Link: https://lore.kernel.org/r/20200701225352.9649-2-rcampbell@nvidia.com Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08drm/nouveau/nouveau: fix page fault on device private memoryRalph Campbell
If system memory is migrated to device private memory and no GPU MMU page table entry exists, the GPU will fault and call hmm_range_fault() to get the PFN for the page. Since the .dev_private_owner pointer in struct hmm_range is not set, hmm_range_fault returns an error which results in the GPU program stopping with a fatal fault. Fix this by setting .dev_private_owner appropriately. Fixes: 08ddddda667b ("mm/hmm: check the device private page owner in hmm_range_fault()") Cc: stable@vger.kernel.org Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-09mmap locking API: use coccinelle to convert mmap_sem rwsem call sitesMichel Lespinasse
This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "Highlights: - Core DRM had a lot of refactoring around managed drm resources to make drivers simpler. - Intel Tigerlake support is on by default - amdgpu now support p2p PCI buffer sharing and encrypted GPU memory Details: core: - uapi: error out EBUSY when existing master - uapi: rework SET/DROP MASTER permission handling - remove drm_pci.h - drm_pci* are now legacy - introduced managed DRM resources - subclassing support for drm_framebuffer - simple encoder helper - edid improvements - vblank + writeback documentation improved - drm/mm - optimise tree searches - port drivers to use devm_drm_dev_alloc dma-buf: - add flag for p2p buffer support mst: - ACT timeout improvements - remove drm_dp_mst_has_audio - don't use 2nd TX slot - spec recommends against it bridge: - dw-hdmi various improvements - chrontel ch7033 support - fix stack issues with old gcc hdmi: - add unpack function for drm infoframe fbdev: - misc fbdev driver fixes i915: - uapi: global sseu pinning - uapi: OA buffer polling - uapi: remove generated perf code - uapi: per-engine default property values in sysfs - Tigerlake GEN12 enabled. - Lots of gem refactoring - Tigerlake enablement patches - move to drm_device logging - Icelake gamma HW readout - push MST link retrain to hotplug work - bandwidth atomic helpers - ICL fixes - RPS/GT refactoring - Cherryview full-ppgtt support - i915 locking guidelines documented - require linear fb stride to be 512 multiple on gen9 - Tigerlake SAGV support amdgpu: - uapi: encrypted GPU memory handling - uapi: add MEM_SYNC IB flag - p2p dma-buf support - export VRAM dma-bufs - FRU chip access support - RAS/SR-IOV updates - Powerplay locking fixes - VCN DPG (powergating) enablement - GFX10 clockgating fixes - DC fixes - GPU reset fixes - navi SDMA fix - expose FP16 for modesetting - DP 1.4 compliance fixes - gfx10 soft recovery - Improved Critical Thermal Faults handling - resizable BAR on gmc10 amdkfd: - uapi: GWS resource management - track GPU memory per process - report PCI domain in topology radeon: - safe reg list generator fixes nouveau: - HD audio fixes on recent systems - vGPU detection (fail probe if we're on one, for now) - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it) - SVM improvements/fixes - NVIDIA format modifier support - Misc other fixes. adv7511: - HDMI SPDIF support ast: - allocate crtc state size - fix double assignment - fix suspend bochs: - drop connector register cirrus: - move to tiny drivers. exynos: - fix imported dma-buf mapping - enable runtime PM - fixes and cleanups mediatek: - DPI pin mode swap - config mipi_tx current/impedance lima: - devfreq + cooling device support - task handling improvements - runtime PM support pl111: - vexpress init improvements - fix module auto-load rcar-du: - DT bindings conversion to YAML - Planes zpos sanity check and fix - MAINTAINERS entry for LVDS panel driver mcde: - fix return value mgag200: - use managed config init stm: - read endpoints from DT vboxvideo: - use PCI managed functions - drop WC mtrr vkms: - enable cursor by default rockchip: - afbc support virtio: - various cleanups qxl: - fix cursor notify port hisilicon: - 128-byte stride alignment fix sun4i: - improved format handling" * tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits) drm/amd/display: Fix potential integer wraparound resulting in a hang drm/amd/display: drop cursor position check in atomic test drm/amdgpu: fix device attribute node create failed with multi gpu drm/nouveau: use correct conflicting framebuffer API drm/vblank: Fix -Wformat compile warnings on some arches drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode drm/amd/display: Handle GPU reset for DC block drm/amdgpu: add apu flags (v2) drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven drm/amdgpu: fix pm sysfs node handling (v2) drm/amdgpu: move gpu_info parsing after common early init drm/amdgpu: move discovery gfx config fetching drm/nouveau/dispnv50: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau/debugfs: fix runtime pm imbalance on error drm/nouveau/nouveau/hmm: fix migrate zero page to GPU drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes() ...
2020-05-22drm/nouveau/svm: map pages after migrationRalph Campbell
When memory is migrated to the GPU, it is likely to be accessed by GPU code soon afterwards. Instead of waiting for a GPU fault, map the migrated memory into the GPU page tables with the same access permissions as the source CPU page table entries. This preserves copy on write semantics. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-11mm/hmm: remove the customizable pfn format from hmm_range_faultJason Gunthorpe
Presumably the intent here was that hmm_range_fault() could put the data into some HW specific format and thus avoid some work. However, nothing actually does that, and it isn't clear how anything actually could do that as hmm_range_fault() provides CPU addresses which must be DMA mapped. Perhaps there is some special HW that does not need DMA mapping, but we don't have any examples of this, and the theoretical performance win of avoiding an extra scan over the pfns array doesn't seem worth the complexity. Plus pfns needs to be scanned anyhow to sort out any DEVICE_PRIVATE pages. This version replaces the uint64_t with an usigned long containing a pfn and fixed flags. On input flags is filled with the HMM_PFN_REQ_* values, on successful output it is filled with HMM_PFN_* values, describing the state of the pages. amdgpu is simple to convert, it doesn't use snapshot and doesn't use per-page flags. nouveau uses only 16 hmm_pte entries at most (ie fits in a few cache lines), and it sweeps over its pfns array a couple of times anyhow. It also has a nasty call chain before it reaches the dma map and hardware suggesting performance isn't important: nouveau_svm_fault(): args.i.m.method = NVIF_VMM_V0_PFNMAP nouveau_range_fault() nvif_object_ioctl() client->driver->ioctl() struct nvif_driver nvif_driver_nvkm: .ioctl = nvkm_client_ioctl nvkm_ioctl() nvkm_ioctl_path() nvkm_ioctl_v0[type].func(..) nvkm_ioctl_mthd() nvkm_object_mthd() struct nvkm_object_func nvkm_uvmm: .mthd = nvkm_uvmm_mthd nvkm_uvmm_mthd() nvkm_uvmm_mthd_pfnmap() nvkm_vmm_pfn_map() nvkm_vmm_ptes_get_map() func == gp100_vmm_pgt_pfn struct nvkm_vmm_desc_func gp100_vmm_desc_spt: .pfn = gp100_vmm_pgt_pfn nvkm_vmm_iter() REF_PTES == func == gp100_vmm_pgt_pfn() dma_map_page() Link: https://lore.kernel.org/r/5-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-11mm/hmm: remove HMM_PFN_SPECIALJason Gunthorpe
This is just an alias for HMM_PFN_ERROR, nothing cares that the error was because of a special page vs any other error case. Link: https://lore.kernel.org/r/4-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-11mm/hmm: make hmm_range_fault return 0 or -1Jason Gunthorpe
hmm_vma_walk->last is supposed to be updated after every write to the pfns, so that it can be returned by hmm_range_fault(). However, this is not done consistently. Fortunately nothing checks the return code of hmm_range_fault() for anything other than error. More importantly last must be set before returning -EBUSY as it is used to prevent reading an output pfn as an input flags when the loop restarts. For clarity and simplicity make hmm_range_fault() return 0 or -ERRNO. Only set last when returning -EBUSY. Link: https://lore.kernel.org/r/2-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-07Merge tag 'drm-next-2020-04-08' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "This is a set of fixes that have queued up, I think I might have another pull with some more before rc1 but I'd like to dequeue what I have now just in case Easter is more eggciting that expected. The main thing in here is a fix for a longstanding nouveau power management issues on certain laptops, it should help runtime suspend/resume for a lot of people. There is also a reverted patch for some drm_mm behaviour in atomic contexts. Summary: core: - revert drm_mm atomic patch - dt binding fixes fbcon: - null ptr error fix i915: - GVT fixes nouveau: - runpm fix - svm fixes amdgpu: - HDCP fixes - gfx10 fix - Misc display fixes - BACO fixes amdkfd: - Fix memory leak vboxvideo: - remove conflicting fbs vc4: - mode validation fix xen: - fix PTR_ERR usage" * tag 'drm-next-2020-04-08' of git://anongit.freedesktop.org/drm/drm: (41 commits) drm/nouveau/kms/nv50-: wait for FIFO space on PIO channels drm/nouveau/nvif: protect waits against GPU falling off the bus drm/nouveau/nvif: access PTIMER through usermode class, if available drm/nouveau/gr/gp107,gp108: implement workaround for HW hanging during init drm/nouveau: workaround runpm fail by disabling PCI power management on certain intel bridges drm/nouveau/svm: remove useless SVM range check drm/nouveau/svm: check for SVM initialized before migrating drm/nouveau/svm: fix vma range check for migration drm/nouveau: remove checks for return value of debugfs functions drm/nouveau/ttm: evict other IO mappings when running out of BAR1 space drm/amdkfd: kfree the wrong pointer drm/amd/display: increase HDCP authentication delay drm/amd/display: Correctly cancel future watchdog and callback events drm/amd/display: Don't try hdcp1.4 when content_type is set to type1 drm/amd/powerplay: move the ASIC specific nbio operation out of smu_v11_0.c drm/amd/powerplay: drop redundant BIF doorbell interrupt operations drm/amd/display: Fix dcn21 num_states drm/amd/display: Enable BT2020 in COLOR_ENCODING property drm/amd/display: LFC not working on 2.0x range monitors (v2) drm/amd/display: Support plane level CTM ...
2020-04-07drm/nouveau/svm: remove useless SVM range checkRalph Campbell
When nouveau processes GPU faults, it checks to see if the fault address falls within the "unmanaged" range which is reserved for fixed allocations instead of addresses chosen by the core mm code. If start is greater than or equal to svmm->unmanaged.limit, then limit will also be greater than svmm->unmanaged.limit which is greater than svmm->unmanaged.start and the start = max_t(u64, start, svmm->unmanaged.limit) will change nothing. Just remove the useless lines of code. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/svm: check for SVM initialized before migratingRalph Campbell
When migrating system memory to GPU memory, check that SVM has been enabled. Even though most errors can be ignored since migration is a performance optimization, return an error because this is a violation of the API. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/svm: fix vma range check for migrationRalph Campbell
find_vma_intersection(mm, start, end) only guarantees that end is greater than or equal to vma->vm_start but doesn't guarantee that start is greater than or equal to vma->vm_start. The calculation for the intersecting range in nouveau_svmm_bind() isn't accounting for this and can call migrate_vma_setup() with a starting address less than vma->vm_start. This results in migrate_vma_setup() returning -EINVAL for the range instead of nouveau skipping that part of the range and migrating the rest. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-03-27mm/hmm: remove HMM_FAULT_SNAPSHOTJason Gunthorpe
Now that flags are handled on a fine-grained per-page basis this global flag is redundant and has a confusing overlap with the pfn_flags_mask and default_flags. Normalize the HMM_FAULT_SNAPSHOT behavior into one place. Callers needing the SNAPSHOT behavior should set a pfn_flags_mask and default_flags that always results in a cleared HMM_PFN_VALID. Then no pages will be faulted, and HMM_FAULT_SNAPSHOT is not a special flow that overrides the masking mechanism. As this is the last flag, also remove the flags argument. If future flags are needed they can be part of the struct hmm_range function arguments. Link: https://lore.kernel.org/r/20200327200021.29372-5-jgg@ziepe.ca Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26mm: simplify device private page handling in hmm_range_faultChristoph Hellwig
Remove the HMM_PFN_DEVICE_PRIVATE flag, no driver has ever set this flag on input, and the only place that uses it on output can be trivially changed to use is_device_private_page(). This removes the ability to request that device_private pages are faulted back into system memory. Link: https://lore.kernel.org/r/20200316193216.920734-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23nouveau: use mmu_interval_notifier instead of hmm_mirrorJason Gunthorpe
Remove the hmm_mirror object and use the mmu_interval_notifier API instead for the range, and use the normal mmu_notifier API for the general invalidation callback. While here re-organize the pagefault path so the locking pattern is clear. nouveau is the only driver that uses a temporary range object and instead forwards nearly every invalidation range directly to the HW. While this is not how the mmu_interval_notifier was intended to be used, the overheads on the pagefaulting path are similar to the existing hmm_mirror version. Particularly since the interval tree will be small. Link: https://lore.kernel.org/r/20191112202231.3856-10-jgg@ziepe.ca Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23nouveau: use mmu_notifier directly for invalidate_range_startJason Gunthorpe
There is no reason to get the invalidate_range_start() callback via an indirection through hmm_mirror, just register a normal notifier directly. Link: https://lore.kernel.org/r/20191112202231.3856-9-jgg@ziepe.ca Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07mm/hmm: remove the page_shift member from struct hmm_rangeChristoph Hellwig
All users pass PAGE_SIZE here, and if we wanted to support single entries for huge pages we should really just add a HMM_FAULT_HUGEPAGE flag instead that uses the huge page size instead of having the caller calculate that size once, just for the hmm code to verify it. Link: https://lore.kernel.org/r/20190806160554.14046-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07mm/hmm: remove superfluous arguments from hmm_range_registerChristoph Hellwig
The start, end and page_shift values are all saved in the range structure, so we might as well use that for argument passing. Link: https://lore.kernel.org/r/20190806160554.14046-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07nouveau: pass struct nouveau_svmm to nouveau_range_faultChristoph Hellwig
We'll need the nouveau_svmm structure to improve the function soon. For now this allows using the svmm->mm reference to unlock the mmap_sem, and thus the same dereference chain that the caller uses to lock and unlock it. Link: https://lore.kernel.org/r/20190806160554.14046-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-26mm/hmm: remove hmm_range vmaRalph Campbell
Since hmm_range_fault() doesn't use the struct hmm_range vma field, remove it. Link: https://lore.kernel.org/r/20190726005650.2566-8-rcampbell@nvidia.com Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-26mm/hmm: replace the block argument to hmm_range_fault with a flags valueChristoph Hellwig
This allows easier expansion to other flags, and also makes the callers a little easier to read. Link: https://lore.kernel.org/r/20190726005650.2566-4-rcampbell@nvidia.com Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-26mm/hmm: replace hmm_update with mmu_notifier_rangeRalph Campbell
The hmm_mirror_ops callback function sync_cpu_device_pagetables() passes a struct hmm_update which is a simplified version of struct mmu_notifier_range. This is unnecessary so replace hmm_update with mmu_notifier_range directly. Link: https://lore.kernel.org/r/20190726005650.2566-2-rcampbell@nvidia.com Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> [jgg: white space tuning] Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-25nouveau: return -EBUSY when hmm_range_wait_until_valid failsChristoph Hellwig
-EAGAIN has a magic meaning for non-blocking faults, so don't overload it. Given that the caller doesn't check for specific error codes this change is purely cosmetic. Link: https://lore.kernel.org/r/20190724065258.16603-6-hch@lst.de Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-25nouveau: unlock mmap_sem on all errors from nouveau_range_faultChristoph Hellwig
Currently nouveau_svm_fault expects nouveau_range_fault to never unlock mmap_sem, but the latter unlocks it for a random selection of error codes. Fix this up by always unlocking mmap_sem for non-zero return values in nouveau_range_fault, and only unlocking it in the caller for successful returns. Link: https://lore.kernel.org/r/20190724065258.16603-5-hch@lst.de Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-25nouveau: remove the block parameter to nouveau_range_faultChristoph Hellwig
The parameter is always false, so remove it as well as the -EAGAIN handling that can only happen for the non-blocking case. Link: https://lore.kernel.org/r/20190724065258.16603-4-hch@lst.de Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-25mm/hmm: move hmm_vma_range_done and hmm_vma_fault to nouveauChristoph Hellwig
These two functions are marked as a legacy APIs to get rid of, but seem to suit the current nouveau flow. Move it to the only user in preparation for fixing a locking bug involving caller and callee. All comments referring to the old API have been removed as this now is a driver private helper. Link: https://lore.kernel.org/r/20190724065258.16603-3-hch@lst.de Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_registerJason Gunthorpe
Ralph observes that hmm_range_register() can only be called by a driver while a mirror is registered. Make this clear in the API by passing in the mirror structure as a parameter. This also simplifies understanding the lifetime model for struct hmm, as the hmm pointer must be valid as part of a registered mirror so all we need in hmm_register_range() is a simple kref_get. Suggested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-02-20drm/nouveau/svm: new ioctl to migrate process memory to GPU memoryJérôme Glisse
This add an ioctl to migrate a range of process address space to the device memory. On platform without cache coherent bus (x86, ARM, ...) this means that CPU can not access that range directly, instead CPU will fault which will migrate the memory back to system memory. This is behind a staging flag so that we can evolve the API. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2019-02-20drm/nouveau/dmem: device memory helpers for SVMJérôme Glisse
Device memory can be use in SVM, in which case we do not have any of the existing buffer object. This commit add infrastructure to allow use of device memory without nouveau_bo. Again this is a temporary solution until a rework of GPU memory management. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2019-02-20drm/nouveau/svm: initial support for shared virtual memoryBen Skeggs
This uses HMM to mirror a process' CPU page tables into a channel's page tables, and keep them synchronised so that both the CPU and GPU are able to access the same memory at the same virtual address. While this code also supports Volta/Turing, it's only enabled for Pascal GPUs currently due to channel recovery being unreliable right now on the later GPUs. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>