Age | Commit message (Collapse) | Author |
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
appropriate
To avoid wasting compression tags when using 64KiB pages, we need to
enable this so we can select between upper/lower comptagline in PTEs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Another transition step to allow finer-grained patches transitioning to
new MMU backends.
Old backends will continue operate as before (accessing nvkm_mem::tag),
and new backends will get a reference to the tags allocated here.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This is a transition step, to enable finer-grained commits while
transitioning to new MMU interfaces.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Upcoming MMU changes use nvkm_memory as its basic representation of memory,
so we need to be able to allocate VRAM like this.
The code is basically identical to the current chipset-specific allocators,
minus support for compression tags (which will be handled elsewhere anyway).
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We need to be able to prevent memory from being freed while it's still
mapped in a GPU's address-space.
Will be used by upcoming MMU changes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We probably don't want to destroy compression data when doing multiple
mappings of a memory object.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We're moving towards having a central place to handle comptag allocation,
and as some GPUs don't have a ram submodule (ie. Tegra), we need to move
the mm somewhere else.
It probably never belonged in ram anyways.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Different sections of VRAM may have different properties (ie. can't be used
for compression/display, can't be mapped, etc).
We currently already support this, but it's a bit magic. This change makes
it more obvious where we're allocating from.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
MMU will need to know this during its constructor, so we can't delay
deciding this until init-time.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
These are used for accesses to sparse mappings, and we want reads of
such mappings to return 0.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This reg has moved on Pascal, and causes a bus fault.
We never use the value anyway, so just remove the read.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
A missing u64 cast causes a 32-Bit wraparound from
4096 MiB to 0 MiB and therefore total 0 MiB VRAM detected
if card has 4096 Mib per FBP.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
GP10B's FB is largely compatible with the GP100 implementation.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This commit reworks the RAM detection algorithm, using RAM-per-LTC to
determine whether a board has a mixed-memory configuration instead of
using RAM-per-FBPA. I'm not certain the algorithm is perfect, but it
should handle all currently known configurations in the very least.
This should fix GTX 970 boards with 4GiB of RAM where the last 512MiB
isn't fully accessible, as well as only detecting half the VRAM on
GF108 boards.
As a nice side-effect, GP10x memory detection now reuses the majority
of the code from earlier chipsets.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
GF108/GM107 implementations will want slightly different functions for
the upcoming RAM detection improvements.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
The issue was recorded in fd.o bug 27501, not 25701.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We never have any need for a double-linked list here, and as there's
generally a large number of these objects, replace it with a single-
linked list in order to save some memory.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
In this situation, we'd have ended up detecting less VRAM than we have.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
gm20b's FB has the same capabilities as gm200, minus the ability to
allocate RAM. Create a device that reflects this instead of re-using the
gk20a device which may be incorrect.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-By: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
gk20a's FB is not special compared to other Kepler chips, besides the
fact it does not have VRAM. Use the regular gf100 hooks instead of the
incomplete versions we rewrote.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-By: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
The gf100 constructor should be called, otherwise we will allocate a
smaller object than expected. This was without effect so far because
gk20a did not allocate a page, but with gf100's page allocation moved
to the oneinit() hook this problem has become apparent.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/nouveau/nvkm/subdev/bios/fan.c:29:1: warning: no previous prototype for 'nvbios_fan_table' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/bios/fan.c:56:1: warning: no previous prototype for 'nvbios_fan_entry' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c:184:1: warning: no previous prototype for 'gt215_clk_info' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:99:1: warning: no previous prototype for 'gt215_link_train_calc' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:153:1: warning: no previous prototype for 'gt215_link_train' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:271:1: warning: no previous prototype for 'gt215_link_train_init' [-Wmissing-prototypes]
....
In fact, both functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/nouveau/nvkm/core/firmware.c:34:1: warning: no previous prototype for 'nvkm_firmware_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/core/firmware.c:58:1: warning: no previous prototype for 'nvkm_firmware_put' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/sddr3.c:69:1: warning: no previous prototype for 'nvkm_sddr3_calc' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/sddr2.c:60:1: warning: no previous prototype for 'nvkm_sddr2_calc' [-Wmissing-prototypes]
....
In fact, these functions are declared in
drivers/gpu/drm/nouveau/include/nvkm/core/firmware.h
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ram.h
drivers/gpu/drm/nouveau/nvkm/subdev/volt/priv.h
drivers/gpu/drm/nouveau/nvkm/engine/gr/nv50.h
drivers/gpu/drm/nouveau/dispnv04/disp.h.
So this patch adds missing header dependencies.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
The 100c08 scratch page is mapped using dma_map_page() before the TTM
layer has had a chance to set the DMA mask. This means we are still
running with the default of 32 when this code executes, and this causes
problems for platforms with no memory below 4 GB (such as AMD Seattle)
So move the dma_map_page() to the .oneinit hook, which executes after the
DMA mask has been set.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
The 100c10 scratch page is mapped using dma_map_page() before the TTM
layer has had a chance to set the DMA mask. This means we are still
running with the default of 32 when this code executes, and this causes
problems for platforms with no memory below 4 GB (such as AMD Seattle)
So move the dma_map_page() to the .oneinit hook, which executes after the
DMA mask has been set.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This enables memory reclocking on Maxwell. Sadly without a PMU firmware it
is useless for gm20x gpus.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Tested-by: Aidan Epstein <aidan@jmad.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
GFxxx/GM1xx support the selection of 64/128KiB big pages globally.
GM2xx supports the same, as well as another mode where the page size
can be selected per-instance.
We default to 128KiB pages (With per-instance for GM200, but the current
code selects 128KiB there already) as the MMU code isn't currently able
to handle otherwise.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Later chipsets require setting this up both in FB and GR, so let's just
move the allocation to FB.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
These are now specified directly in the MC subdev.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This patch uses an approach closer to the nvidia driver to configure
both PLLs for high gddr5 memory clocks (usually above 2400MHz)
Previously nouveau used the one PLL as it was used for the lower clocks
and just adjusted the second PLL to get as close as possible to the
requested clock. This means for my card, that I got a 4050 MHz clock
although 4008 MHz was requested.
Now the driver iterates over a list of PLL configuration also used by
the nvidia driver and then adjust the second PLL to get near the
requested clock. Also it hold to some restriction I found while
analyzing the PLL configurations
This won't fix all gddr5 high clock issues itself, but it should be
fine on hybrid gpu systems as found on many laptops these days. Also
switching while normal desktop usage should be a lot more stable than
before.
v2: move the pll code into ramgk104
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Avoids waiting for VBLANKS that never arrive on headless or otherwise
unconventional set-ups. Strategy taken from MEMX.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|