summaryrefslogtreecommitdiff
path: root/drivers/edac/skx_edac.c
AgeCommit message (Collapse)Author
2019-02-02EDAC, skx_edac: Delete duplicated codeQiuxu Zhuo
Delete the duplicated code from skx_edac.c and rename skx_edac.c to skx_base.c. Update the Makefile to build the skx_edac driver from skx_base.c and skx_common.c. Add SPDX to skx_base.c and clean out unnecessary #include lines. [ bp: Drop the license boilerplate - there's an SPDX identifier now. ] Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: https://lkml.kernel.org/r/20190130191519.15393-4-tony.luck@intel.com
2018-11-16EDAC, skx: Let EDAC core show the decoded result for debugfsQiuxu Zhuo
Current debugfs shows the decoded result in its own print format which is inconvenient for analysis/statistics. Use skx_mce_check_error() instead of skx_decode() for debugfs, then the decoded result is showed via EDAC core in a more readable format like "CPU_SrcID#[0-9]_MC#[0-9]_Chan#[0-9]_DIMM#[0-9]". Print a warning the first time this interface is used so the administrator can see the console log that error(s) have been faked. Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Mauro Carvalho Chehab <mchehab@kernel.org> CC: arozansk@redhat.com CC: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1542353705-13531-1-git-send-email-qiuxu.zhuo@intel.com
2018-11-16EDAC, skx: Move debugfs node under EDAC's hierarchyQiuxu Zhuo
The debugfs node is /sys/kernel/debug/skx_edac_test. Rename it and move under EDAC debugfs root directory. Remove the unused 'skx_fake_addr' and remove the 'skx_test' on error. Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Mauro Carvalho Chehab <mchehab@kernel.org> CC: arozansk@redhat.com CC: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1542353684-13496-1-git-send-email-qiuxu.zhuo@intel.com
2018-11-16EDAC, skx: Prepend hex formatting with '0x'Qiuxu Zhuo
Some debug/error strings in hex formatting do not have the '0x' prefix. Prepend hex formatting with '0x' for them, but with one exception: "Couldn't enable %04x:%04x", instead of putting '0x' in this line, add the word 'device'. We commonly use 8086:1234 without the leading '0x' (e.g. as '-d' argument to lspci(8) and setpci(8) commands). Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Mauro Carvalho Chehab <mchehab@kernel.org> CC: Tony Luck <tony.luck@intel.com> CC: arozansk@redhat.com CC: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1542353660-13458-1-git-send-email-qiuxu.zhuo@intel.com
2018-11-16EDAC, skx: Fix function calling order in skx_exit()Qiuxu Zhuo
The order of function calling in skx_exit() is not the reversed order in skx_init(). Fix it. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Mauro Carvalho Chehab <mchehab@kernel.org> CC: Tony Luck <tony.luck@intel.com> CC: arozansk@redhat.com CC: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1542353616-13421-1-git-send-email-qiuxu.zhuo@intel.com
2018-10-25EDAC, skx_edac: Add address translation for non-volatile DIMMsQiuxu Zhuo
Currently, this driver doesn't support address translation for non-volatile DIMMs. The ACPI ADXL DSM method provides address translation for both volatile and non-volatile DIMMs. Enable it to use the ACPI DSM methods if they are supported and there are non-volatile DIMMs populated on the system. Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Mauro Carvalho Chehab <mchehab@kernel.org> CC: arozansk@redhat.com CC: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1540106336-5212-1-git-send-email-qiuxu.zhuo@intel.com
2018-10-09EDAC, skx_edac: Fix logical channel intermediate decodingQiuxu Zhuo
The code "lchan = (lchan << 1) | ~lchan" for logical channel intermediate decoding is wrong. The wrong intermediate decoding result is {0xffffffff, 0xfffffffe}. Fix it by replacing '~' with '!'. The correct intermediate decoding result is {0x1, 0x2}. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Aristeu Rozanski <aris@redhat.com> CC: Mauro Carvalho Chehab <mchehab@kernel.org> CC: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20181009172025.18594-1-tony.luck@intel.com
2018-09-29EDAC, {i7core,sb,skx}_edac: Fix uncorrected error countingTony Luck
The count of errors is picked up from bits 52:38 of the machine check bank status register. But this is the count of *corrected* errors. If an uncorrected error is being logged, the h/w sets this field to 0. Which means that when edac_mc_handle_error() is called, the EDAC core will carefully add zero to the appropriate uncorrected error counts. Signed-off-by: Tony Luck <tony.luck@intel.com> [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Cc: Aristeu Rozanski <aris@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180928213934.19890-1-tony.luck@intel.com
2018-09-22EDAC: Correct DIMM capacity unit symbolQiuxu Zhuo
The {i3200|i7core|sb|skx}_edac drivers show DIMM capacity using the wrong unit symbol: 'Mb' - megabit. Fix them by replacing 'Mb' with 'MiB' - mebibyte. [Tony: These are all "edac_dbg()" messages, so this won't break scripts that parse console logs.] Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac@vger.kernel.org Link: https://lkml.kernel.org/r/20180919003433.16475-1-tony.luck@intel.com
2018-03-15EDAC, skx_edac: Detect non-volatile DIMMsTony Luck
This just covers the topology function of the EDAC driver. We locate which DIMM slots are populated with NVDIMMs and query the NFIT and SMBIOS tables to get the size. Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linux-acpi@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-nvdimm@lists.01.org Link: http://lkml.kernel.org/r/20180312182430.10335-6-tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-28EDAC, skx_edac: Handle systems with segmented PCI bussesTony Luck
Large systems separate their PCI busses into segments since the limit of only 256 PCI busses can be too restrictive. Extend this driver to check whether <segment, bus-number> matches when deciding how to group memory controller PCI devices to CPU sockets. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Charles Rose <charles.rose@dell.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/f58abfd10bf73c8bc5adc1fe4de7408128b00625.1506358467.git.tony.luck@intel.com [ Make skx_dev.seg an int. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-27EDAC, skx_edac: Fix detection of single-rank DIMMsCharles Rose
Single-rank DIMMs did not get detected on Skylake/Kabylake systems due to wrong limit check. The single rank DIMM check is a simple typo. "0" is a legal value in this field meaning single rank. Signed-off-by: Charles Rose <charles.rose@dell.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/66df72d327c265fbf92fe25df96daa228a35f076.1506358467.git.tony.luck@intel.com [ Also fix debug message to print number of ranks. ] Signed-off-by: Tony Luck <tony.luck@intel.com> [ Expand commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25EDAC: Add owner check to the x86 platform driversToshi Kani
Change x86 EDAC platform drivers to verify the module owner at the beginning of their module init functions. This allows them to fail their init immediately when ghes_edac is enabled. Similar change can be made to other edac drivers if necessary. Also, remove ".c" from module names of pnp2_edac, sb_edac, and skx_edac. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Suggested-by: Borislav Petkov <bp@alien8.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170823225447.15608-6-toshi.kani@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-21EDAC: Handle return value of kasprintf()Arvind Yadav
kasprintf() can fail and we must check its return value. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: linux-edac@vger.kernel.org [ Merged into a single patch, small formatting fixups. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-07-17EDAC: Get rid of mci->mod_verBorislav Petkov
It is a write-only variable so get rid of it. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Robert Richter <rric@kernel.org> Acked-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Thor Thayer <thor.thayer@linux.intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: "Sören Brinkmann" <soren.brinkmann@xilinx.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Loc Ho <lho@apm.com> Cc: linux-edac@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@linux-mips.org
2017-04-10EDAC: Rename report status accessorsBorislav Petkov
Change them to have the edac_ prefix. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-24x86/ras, EDAC, acpi: Assign MCE notifier handlers a priorityBorislav Petkov
Assign all notifiers on the MCE decode chain a priority so that they get called in the correct order. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-15edac: rename edac_core.h to edac_mc.hMauro Carvalho Chehab
Now, all left at edac_core.h are at drivers/edac/edac_mc.c, so rename it to edac_mc.h. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-10-22EDAC, skx_edac: Fix non static symbol warningsWei Yongjun
Fix the following sparse warnings: drivers/edac/skx_edac.c:266:25: warning: symbol 'skx_cpuids' was not declared. Should it be static? drivers/edac/skx_edac.c:1040:12: warning: symbol 'skx_init' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1477147098-2842-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2016-10-19EDAC, {sb,skx}_edac: Use Intel model macros instead of open-coding themDave Hansen
We now have symbolic names for a bunch of Intel CPU models via asm/intel-family.h. The original conversion missed the EDAC drivers. Convert them. Signed-off-by: Dave Hansen <dave.hansen@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160929204321.9FAE5F84@viggo.jf.intel.com [ Remove comment, macro name is descriptive enough. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-21EDAC, skx_edac: Add EDAC driver for SkylakeTony Luck
This is an entirely new driver instead of yet another set of patches to sb_edac.c because: 1) Mapping from PCI devices to socket/memory controller is significantly different. Skylake scatters devices on a socket across a number of PCI buses. 2) There is an extra level of interleaving via the "mcroute" register that would be a little messy to squeeze into the old driver. 3) Validation is getting too expensive. Changes to sb_edac need to be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and Knights Landing. Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>