We separate some of the common source files into different
folders for better maintainability and testability.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
DRAM physical page sizes depend on the number of HBMs available in
the device. This number is device-dependent and may also be subject
to binning when one or more of the DRAM controllers are found to
be faulty. Such a configuration may lead to partitioning the DRAM
into non-power-of-2 pages.
To support this feature we also need to add infrastructure for address
scrambling.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
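
To illustrate why non-power-of-2 pages need extra care, here is a minimal
user-space sketch. It is not the driver's code and does not show the address
scrambling the commit mentions. The names split_addr and struct page_addr are
hypothetical, and the 48 MiB page size in the usage note is only an assumed
example. With a power-of-2 page size an address can be split by shift and
mask; otherwise division and modulo are required.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: page index plus offset inside that page. */
struct page_addr {
	uint64_t index;
	uint64_t offset;
};

/* Returns non-zero when n is a power of 2. */
static int is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: split an address for an arbitrary page size.
 * Shift/mask is only valid for power-of-2 sizes, so fall back to
 * division and modulo for everything else.
 */
static struct page_addr split_addr(uint64_t addr, uint64_t page_size)
{
	struct page_addr pa;

	assert(page_size != 0);

	if (is_power_of_2(page_size)) {
		pa.index = addr / page_size;	/* same result as a right shift */
		pa.offset = addr & (page_size - 1);
	} else {
		pa.index = addr / page_size;
		pa.offset = addr % page_size;
	}
	return pa;
}
```

For example, with an assumed page size of 48 MiB (0x3000000), the address
2 * 0x3000000 + 5 falls in page 2 at offset 5, which a plain mask with
page_size - 1 would not compute correctly.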
|
|
In data centers, WARN is often defined as BUG, and we would like to
avoid hanging the entire server on an internal error of the driver
(important as it might be).
Therefore, use dev_crit instead.
Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
When using a deep learning framework such as TensorFlow or PyTorch, there
are tens of thousands of host memory mappings. When the user frees
all those mappings at the same time, the process of unmapping and
unpinning them can take a long time, which may cause a soft lockup
bug.
To prevent this, we need to free the core to do other things during
the unmapping process. For now, we chose to do it every 32K unmappings
(each unmap is a single 4K page).
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
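
The idea of releasing the core at a fixed interval can be sketched in user
space as below. This is only an illustration under assumptions: unmap_all,
the callback types, and the demo callbacks are hypothetical names, and the
sketch does not say which kernel primitive the driver uses to give up the
core. Only the 32K interval comes from the message above.

```c
#include <assert.h>
#include <stddef.h>

#define YIELD_INTERVAL (32 * 1024)	/* 32K unmappings, as in the message */

/* Hypothetical callbacks: unmap one page, and give up the core. */
typedef void (*unmap_fn)(size_t page_idx, void *ctx);
typedef void (*yield_fn)(void *ctx);

/* Demo bookkeeping so the behaviour can be observed. */
struct counters {
	size_t unmapped;
	size_t yielded;
};

static void demo_unmap(size_t page_idx, void *ctx)
{
	(void)page_idx;
	((struct counters *)ctx)->unmapped++;
}

static void demo_yield(void *ctx)
{
	((struct counters *)ctx)->yielded++;
}

/*
 * Unmap npages pages, calling yield() after every YIELD_INTERVAL
 * unmappings. Returns the number of times yield() was called.
 */
static size_t unmap_all(size_t npages, unmap_fn unmap, yield_fn yield,
			void *ctx)
{
	size_t yields = 0;
	size_t i;

	assert(unmap != NULL && yield != NULL);

	for (i = 0; i < npages; i++) {
		unmap(i, ctx);
		if ((i + 1) % YIELD_INTERVAL == 0) {
			yield(ctx);
			yields++;
		}
	}
	return yields;
}
```

Calling unmap_all(100000, demo_unmap, demo_yield, &c) with a zeroed
struct counters c unmaps all 100000 pages and yields three times, after
pages 32768, 65536 and 98304.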
|
|
We introduce a new wrapper which allows us to MMU-map any size
to any available host va_range. In addition, we remove duplicated
code from various places in the driver and use this new wrapper
instead.
This wrapper supports mapping only contiguous physical
memory blocks and will be used for mappings that are done to the
driver ASID.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
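
A wrapper of this kind can be pictured with the toy model below. It is a
sketch under stated assumptions, not the driver's implementation: the
toy_mmu table, the 4 KiB PAGE_SZ, the MAX_PAGES limit and all function names
are hypothetical. It maps a physically contiguous block of any size page by
page and rolls back on failure.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ   4096ULL	/* assumed page size */
#define MAX_PAGES 64		/* size of the toy table */

/* Toy page table: virtual page number -> physical address, 0 = unmapped. */
struct toy_mmu {
	uint64_t pa_of[MAX_PAGES];
};

static int toy_map_page(struct toy_mmu *m, uint64_t va, uint64_t pa)
{
	uint64_t idx = va / PAGE_SZ;

	if (idx >= MAX_PAGES || m->pa_of[idx] != 0)
		return -1;	/* out of range or already mapped */
	m->pa_of[idx] = pa;
	return 0;
}

static void toy_unmap_page(struct toy_mmu *m, uint64_t va)
{
	m->pa_of[va / PAGE_SZ] = 0;
}

/*
 * Map `size` bytes of contiguous physical memory starting at `pa` to the
 * virtual range starting at `va`. Both addresses are assumed to be page
 * aligned and `pa` non-zero. On failure, pages mapped so far are unmapped.
 */
static int map_contiguous(struct toy_mmu *m, uint64_t va, uint64_t pa,
			  uint64_t size)
{
	uint64_t off, done;

	assert(va % PAGE_SZ == 0 && pa % PAGE_SZ == 0 && pa != 0);

	for (off = 0; off < size; off += PAGE_SZ) {
		if (toy_map_page(m, va + off, pa + off) != 0)
			goto rollback;
	}
	return 0;

rollback:
	for (done = 0; done < off; done += PAGE_SZ)
		toy_unmap_page(m, va + done);
	return -1;
}
```

Having one such entry point means callers no longer repeat the per-page
loop and the error unwinding themselves.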
|
|
After the MMU-code refactoring, the existing MMU debugfs operations
are no longer working so we need to fix them.
In addition, remove the duplicate code that was in the debugfs code
and use the already existing MMU-code.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Multiple locks are usually a source of problems, which in the MMU
case can be avoided since it is relatively rare that both MMU
tables are updated at the same time.
Therefore, use a single shared lock instead of two separate ones.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
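
The single-lock approach can be sketched in user space with a pthread mutex.
Everything here is a hypothetical stand-in (struct mmu_ctx, the two table
structs and the helper names are not taken from the driver). The point is
only that one lock guarding both tables leaves no lock ordering to get wrong.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for one set of MMU tables. */
struct mmu_tables {
	uint64_t entries;
};

struct mmu_ctx {
	pthread_mutex_t lock;		/* one lock shared by both tables */
	struct mmu_tables first;
	struct mmu_tables second;
};

static void mmu_ctx_init(struct mmu_ctx *ctx)
{
	int rc = pthread_mutex_init(&ctx->lock, NULL);

	assert(rc == 0);
	ctx->first.entries = 0;
	ctx->second.entries = 0;
}

/* Update the first table under the shared lock. */
static void mmu_map_first(struct mmu_ctx *ctx, uint64_t n)
{
	pthread_mutex_lock(&ctx->lock);
	ctx->first.entries += n;
	pthread_mutex_unlock(&ctx->lock);
}

/* Update the second table under the same lock. */
static void mmu_map_second(struct mmu_ctx *ctx, uint64_t n)
{
	pthread_mutex_lock(&ctx->lock);
	ctx->second.entries += n;
	pthread_mutex_unlock(&ctx->lock);
}

/* Touch both tables at once: a single acquisition, no ordering rules. */
static uint64_t mmu_total_entries(struct mmu_ctx *ctx)
{
	uint64_t total;

	pthread_mutex_lock(&ctx->lock);
	total = ctx->first.entries + ctx->second.entries;
	pthread_mutex_unlock(&ctx->lock);
	return total;
}
```

The trade-off is that updates to the two tables serialize against each
other, which the message above argues is acceptable because simultaneous
updates are relatively rare.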
|
|
This commit refactors the MMU code to support PCI MMU page tables
residing on the host and DCORE MMU residing on the device DRAM at the
same time.
This is needed for future devices, as on GAUDI and GOYA we have
a single MMU whose page tables always reside on DRAM.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
This commit fixes an incorrect error message.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
As preparation for MMU v2, rework the MMU code to be device-oriented,
instantiated according to the device at hand.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
In the future we will have MMU v2 code, so we need to prepare the
driver for it. The first step is to rename the current MMU file to
mmu_v1.c.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
vmalloc can return a value other than NULL or a valid
pointer. We must validate it in order to avoid dereferencing an invalid
pointer.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
There's no need to try to be cute with the include file locations in the
Makefile, so just specify exactly where the files are.
Bonus is this fixes the problem of building with O= as well as trying to
just build the subdirectory alone.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Omer Shpigelman <oshpigelman@habana.ai>
Cc: Tomer Tayar <ttayar@habana.ai>
Cc: Moti Haimovski <mhaimovski@habana.ai>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Ben Segal <bpsegal20@gmail.com>
Cc: Christine Gharzuzi <cgharzuzi@habana.ai>
Cc: Pawel Piskorski <ppiskorski@habana.ai>
Link: https://lore.kernel.org/r/20200728171851.55842-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
For internal needs of our CI we need to move all the common code into a
common folder instead of putting it in the root folder of the driver.
The same applies to the common header files under include/.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
|