summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-04-25kobject: fix kset_find_obj() race with concurrent last kobject_put()Linus Torvalds
commit a49b7e82cab0f9b41f483359be83f44fbb6b4979 upstream. Anatol Pomozov identified a race condition that hits module unloading and re-loading. To quote Anatol: "This is a race codition that exists between kset_find_obj() and kobject_put(). kset_find_obj() might return kobject that has refcount equal to 0 if this kobject is freeing by kobject_put() in other thread. Here is timeline for the crash in case if kset_find_obj() searches for an object tht nobody holds and other thread is doing kobject_put() on the same kobject: THREAD A (calls kset_find_obj()) THREAD B (calls kobject_put()) splin_lock() atomic_dec_return(kobj->kref), counter gets zero here ... starts kobject cleanup .... spin_lock() // WAIT thread A in kobj_kset_leave() iterate over kset->list atomic_inc(kobj->kref) (counter becomes 1) spin_unlock() spin_lock() // taken // it does not know that thread A increased counter so it remove obj from list spin_unlock() vfree(module) // frees module object with containing kobj // kobj points to freed memory area!! kobject_put(kobj) // OOPS!!!! The race above happens because module.c tries to use kset_find_obj() when somebody unloads module. The module.c code was introduced in commit 6494a93d55fa" Anatol supplied a patch specific for module.c that worked around the problem by simply not using kset_find_obj() at all, but rather than make a local band-aid, this just fixes kset_find_obj() to be thread-safe using the proper model of refusing the get a new reference if the refcount has already dropped to zero. See examples of this proper refcount handling not only in the kref documentation, but in various other equivalent uses of this pattern by grepping for atomic_inc_not_zero(). [ Side note: the module race does indicate that module loading and unloading is not properly serialized wrt sysfs information using the module mutex. That may require further thought, but this is the correct fix at the kobject layer regardless. ] Reported-analyzed-and-tested-by: Anatol Pomozov <anatol.pomozov@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25kref: Implement kref_get_unless_zero v3Thomas Hellstrom
commit 4b20db3de8dab005b07c74161cb041db8c5ff3a7 upstream. This function is intended to simplify locking around refcounting for objects that can be looked up from a lookup structure, and which are removed from that lookup structure in the object destructor. Operations on such objects require at least a read lock around lookup + kref_get, and a write lock around kref_put + remove from lookup structure. Furthermore, RCU implementations become extremely tricky. With a lookup followed by a kref_get_unless_zero *with return value check* locking in the kref_put path can be deferred to the actual removal from the lookup structure and RCU lookups become trivial. v2: Formatting fixes. v3: Invert the return value. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com> [bwh: Backported to 3.2: - Adjust context - Add #include <linux/atomic.h>] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25Btrfs: make sure nbytes are right after log replayJosef Bacik
commit 4bc4bee4595662d8bff92180d5c32e3313a704b0 upstream. While trying to track down a tree log replay bug I noticed that fsck was always complaining about nbytes not being right for our fsynced file. That is because the new fsync stuff doesn't wait for ordered extents to complete, so the inodes nbytes are not necessarily updated properly when we log it. So to fix this we need to set nbytes to whatever it is on the inode that is on disk, so when we replay the extents we can just add the bytes that are being added as we replay the extent. This makes it work for the case that we have the wrong nbytes or the case that we logged everything and nbytes is actually correct. With this I'm no longer getting nbytes errors out of btrfsck. Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25tracing: Fix possible NULL pointer dereferencesNamhyung Kim
commit 6a76f8c0ab19f215af2a3442870eeb5f0e81998d upstream. Currently set_ftrace_pid and set_graph_function files use seq_lseek for their fops. However seq_open() is called only for FMODE_READ in the fops->open() so that if an user tries to seek one of those file when she open it for writing, it sees NULL seq_file and then panic. It can be easily reproduced with following command: $ cd /sys/kernel/debug/tracing $ echo 1234 | sudo tee -a set_ftrace_pid In this example, GNU coreutils' tee opens the file with fopen(, "a") and then the fopen() internally calls lseek(). Link: http://lkml.kernel.org/r/1365663302-2170-1-git-send-email-namhyung@kernel.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> [bwh: Backported to 3.2: ftrace_regex_lseek() is static] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBsNicholas Bellinger
commit 30f359a6f9da65a66de8cadf959f0f4a0d498bba upstream. This patch fixes a bug where a handful of informational / control CDBs that should be allowed during ALUA access state Standby/Offline/Transition where incorrectly returning CHECK_CONDITION + ASCQ_04H_ALUA_TG_PT_*. This includes INQUIRY + REPORT_LUNS, which would end up preventing LUN registration when LUN scanning occured during these ALUA access states. Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25target: Fix MAINTENANCE_IN service action CDB checks to use lower 5 bitsNicholas Bellinger
commit ba539743b70cd160c84bab1c82910d0789b820f8 upstream. This patch fixes the MAINTENANCE_IN service action type checks to only look at the proper lower 5 bits of cdb byte 1. This addresses the case where MI_REPORT_TARGET_PGS w/ extended header using the upper three bits of cdb byte 1 was not processed correctly in transport_generic_cmd_sequencer, as well as the three cases for standby, unavailable, and transition ALUA primary access state checks. Also add MAINTENANCE_IN to the excluded list in transport_generic_prepare_cdb() to prevent the PARAMETER DATA FORMAT bits from being cleared. Cc: Hannes Reinecke <hare@suse.de> Cc: Rob Evers <revers@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metalBoris Ostrovsky
commit 511ba86e1d386f671084b5d0e6f110bb30b8eeb2 upstream. Invoking arch_flush_lazy_mmu_mode() results in calls to preempt_enable()/disable() which may have performance impact. Since lazy MMU is not used on bare metal we can patch away arch_flush_lazy_mmu_mode() so that it is never called in such environment. [ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU updates" may cause a minor performance regression on bare metal. This patch resolves that performance regression. It is somewhat unclear to me if this is a good -stable candidate. ] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1364045796-10720-2-git-send-email-konrad.wilk@oracle.com Tested-by: Josh Boyer <jwboyer@redhat.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updatesSamu Kallio
commit 1160c2779b826c6f5c08e5cc542de58fd1f667d5 upstream. In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops when lazy MMU updates are enabled, because set_pgd effects are being deferred. One instance of this problem is during process mm cleanup with memory cgroups enabled. The chain of events is as follows: - zap_pte_range enables lazy MMU updates - zap_pte_range eventually calls mem_cgroup_charge_statistics, which accesses the vmalloc'd mem_cgroup per-cpu stat area - vmalloc_fault is triggered which tries to sync the corresponding PGD entry with set_pgd, but the update is deferred - vmalloc_fault oopses due to a mismatch in the PUD entries The OOPs usually looks as so: ------------[ cut here ]------------ kernel BUG at arch/x86/mm/fault.c:396! invalid opcode: 0000 [#1] SMP .. snip .. CPU 1 Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1 RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208 .. snip .. Call Trace: [<ffffffff81627759>] do_page_fault+0x399/0x4b0 [<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110 [<ffffffff81624065>] page_fault+0x25/0x30 [<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50 [<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350 [<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60 [<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150 [<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80 [<ffffffff81153e61>] unmap_single_vma+0x531/0x870 [<ffffffff81154962>] unmap_vmas+0x52/0xa0 [<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100 [<ffffffff8115c8f8>] exit_mmap+0x98/0x170 [<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff81059ce3>] mmput+0x83/0xf0 [<ffffffff810624c4>] exit_mm+0x104/0x130 [<ffffffff8106264a>] do_exit+0x15a/0x8c0 [<ffffffff810630ff>] do_group_exit+0x3f/0xa0 [<ffffffff81063177>] sys_exit_group+0x17/0x20 [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the changes visible to the consistency checks. RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737 Tested-by: Josh Boyer <jwboyer@redhat.com> Reported-and-Tested-by: Krishna Raman <kraman@redhat.com> Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com> Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25tracing: Fix double free when function profile init failedNamhyung Kim
commit 83e03b3fe4daffdebbb42151d5410d730ae50bd1 upstream. On the failure path, stat->start and stat->pages will refer same page. So it'll attempt to free the same page again and get kernel panic. Link: http://lkml.kernel.org/r/1364820385-32027-1-git-send-email-namhyung@kernel.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25spinlocks and preemption points need to be at least compiler barriersLinus Torvalds
commit 386afc91144b36b42117b0092893f15bc8798a80 upstream. In UP and non-preempt respectively, the spinlocks and preemption disable/enable points are stubbed out entirely, because there is no regular code that can ever hit the kind of concurrency they are meant to protect against. However, while there is no regular code that can cause scheduling, we _do_ end up having some exceptional (literally!) code that can do so, and that we need to make sure does not ever get moved into the critical region by the compiler. In particular, get_user() and put_user() is generally implemented as inline asm statements (even if the inline asm may then make a call instruction to call out-of-line), and can obviously cause a page fault and IO as a result. If that inline asm has been scheduled into the middle of a preemption-safe (or spinlock-protected) code region, we obviously lose. Now, admittedly this is *very* unlikely to actually ever happen, and we've not seen examples of actual bugs related to this. But partly exactly because it's so hard to trigger and the resulting bug is so subtle, we should be extra careful to get this right. So make sure that even when preemption is disabled, and we don't have to generate any actual *code* to explicitly tell the system that we are in a preemption-disabled region, we need to at least tell the compiler not to move things around the critical region. This patch grew out of the same discussion that caused commits 79e5f05edcbf ("ARC: Add implicit compiler barrier to raw_local_irq* functions") and 3e2e0d2c222b ("tile: comment assumption about __insn_mtspr for <asm/irqflags.h>") to come about. Note for stable: use discretion when/if applying this. As mentioned, this bug may never have actually bitten anybody, and gcc may never have done the required code motion for it to possibly ever trigger in practice. Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: drop sched_preempt_enable_no_resched()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25ASoC: wm8903: Fix the bypass to HP/LINEOUT when no DAC or ADC is runningAlban Bedel
commit f1ca493b0b5e8f42d3b2dc8877860db2983f47b6 upstream. The Charge Pump needs the DSP clock to work properly, without it the bypass to HP/LINEOUT is not working properly. This requirement is not mentioned in the datasheet but has been confirmed by Mark Brown from Wolfson. Signed-off-by: Alban Bedel <alban.bedel@avionic-design.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25can: gw: use kmem_cache_free() instead of kfree()Wei Yongjun
commit 3480a2125923e4b7a56d79efc76743089bf273fc upstream. Memory allocated by kmem_cache_alloc() should be freed using kmem_cache_free(), not kfree(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()Huacai Chen
commit 6f389a8f1dd22a24f3d9afc2812b30d639e94625 upstream. As commit 40dc166c (PM / Core: Introduce struct syscore_ops for core subsystems PM) say, syscore_ops operations should be carried with one CPU on-line and interrupts disabled. However, after commit f96972f2d (kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()), syscore_shutdown() is called before disable_nonboot_cpus(), so break the rules. We have a MIPS machine with a 8259A PIC, and there is an external timer (HPET) linked at 8259A. Since 8259A has been shutdown too early (by syscore_shutdown()), disable_nonboot_cpus() runs without timer interrupt, so it hangs and reboot fails. This patch call syscore_shutdown() a little later (after disable_nonboot_cpus()) to avoid reboot failure, this is the same way as poweroff does. For consistency, add disable_nonboot_cpus() to kernel_halt(). Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25ftrace: Consistently restore trace function on sysctl enablingJan Kiszka
commit 5000c418840b309251c5887f0b56503aae30f84c upstream. If we reenable ftrace via syctl, we currently set ftrace_trace_function based on the previous simplistic algorithm. This is inconsistent with what update_ftrace_function does. So better call that helper instead. Link: http://lkml.kernel.org/r/5151D26F.1070702@siemens.com Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25sched_clock: Prevent 64bit inatomicity on 32bit systemsThomas Gleixner
commit a1cbcaa9ea87b87a96b9fc465951dcf36e459ca2 upstream. The sched_clock_remote() implementation has the following inatomicity problem on 32bit systems when accessing the remote scd->clock, which is a 64bit value. CPU0 CPU1 sched_clock_local() sched_clock_remote(CPU0) ... remote_clock = scd[CPU0]->clock read_low32bit(scd[CPU0]->clock) cmpxchg64(scd->clock,...) read_high32bit(scd[CPU0]->clock) While the update of scd->clock is using an atomic64 mechanism, the readout on the remote cpu is not, which can cause completely bogus readouts. It is a quite rare problem, because it requires the update to hit the narrow race window between the low/high readout and the update must go across the 32bit boundary. The resulting misbehaviour is, that CPU1 will see the sched_clock on CPU1 ~4 seconds ahead of it's own and update CPU1s sched_clock value to this bogus timestamp. This stays that way due to the clamping implementation for about 4 seconds until the synchronization with CLOCK_MONOTONIC undoes the problem. The issue is hard to observe, because it might only result in a less accurate SCHED_OTHER timeslicing behaviour. To create observable damage on realtime scheduling classes, it is necessary that the bogus update of CPU1 sched_clock happens in the context of an realtime thread, which then gets charged 4 seconds of RT runtime, which results in the RT throttler mechanism to trigger and prevent scheduling of RT tasks for a little less than 4 seconds. So this is quite unlikely as well. The issue was quite hard to decode as the reproduction time is between 2 days and 3 weeks and intrusive tracing makes it less likely, but the following trace recorded with trace_clock=global, which uses sched_clock_local(), gave the final hint: <idle>-0 0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80 <idle>-0 0d..30 400269.477151: hrtimer_start: hrtimer=0xf7061e80 ... irq/20-S-587 1d..32 400273.772118: sched_wakeup: comm= ... target_cpu=0 <idle>-0 0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80 What happens is that CPU0 goes idle and invokes sched_clock_idle_sleep_event() which invokes sched_clock_local() and CPU1 runs a remote wakeup for CPU0 at the same time, which invokes sched_remote_clock(). The time jump gets propagated to CPU0 via sched_remote_clock() and stays stale on both cores for ~4 seconds. There are only two other possibilities, which could cause a stale sched clock: 1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic wrong value. 2) sched_clock() which reads the TSC returns a sporadic wrong value. #1 can be excluded because sched_clock would continue to increase for one jiffy and then go stale. #2 can be excluded because it would not make the clock jump forward. It would just result in a stale sched_clock for one jiffy. After quite some brain twisting and finding the same pattern on other traces, sched_clock_remote() remained the only place which could cause such a problem and as explained above it's indeed racy on 32bit systems. So while on 64bit systems the readout is atomic, we need to verify the remote readout on 32bit machines. We need to protect the local->clock readout in sched_clock_remote() on 32bit as well because an NMI could hit between the low and the high readout, call sched_clock_local() and modify local->clock. Thanks to Siegfried Wulsch for bearing with my debug requests and going through the tedious tasks of running a bunch of reproducer systems to generate the debug information which let me decode the issue. Reported-by: Siegfried Wulsch <Siegfried.Wulsch@rovema.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304051544160.21884@ionos Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being ↵Michael Wolf
performed before the ANDCOND test commit 9fb2640159f9d4f5a2a9d60e490482d4cbecafdb upstream. Some versions of pHyp will perform the adjunct partition test before the ANDCOND test. The result of this is that H_RESOURCE can be returned and cause the BUG_ON condition to occur. The HPTE is not removed. So add a check for H_RESOURCE, it is ok if this HPTE is not removed as pSeries_lpar_hpte_remove is looking for an HPTE to remove and not a specific HPTE to remove. So it is ok to just move on to the next slot and try again. Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25alpha: Add irongate_io to PCI bus resourcesJay Estabrook
commit aa8b4be3ac049c8b1df2a87e4d1d902ccfc1f7a9 upstream. Fixes a NULL pointer dereference at boot on UP1500. Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jay Estabrook <jay.estabrook@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25ALSA: usb-audio: fix endianness bug in snd_nativeinstruments_*Eldad Zack
commit 889d66848b12d891248b03abcb2a42047f8e172a upstream. The usb_control_msg() function expects __u16 types and performs the endianness conversions by itself. However, in three places, a conversion is performed before it is handed over to usb_control_msg(), which leads to a double conversion (= no conversion): * snd_usb_nativeinstruments_boot_quirk() * snd_nativeinstruments_control_get() * snd_nativeinstruments_control_put() Caught by sparse: sound/usb/mixer_quirks.c:512:38: warning: incorrect type in argument 6 (different base types) sound/usb/mixer_quirks.c:512:38: expected unsigned short [unsigned] [usertype] index sound/usb/mixer_quirks.c:512:38: got restricted __le16 [usertype] <noident> sound/usb/mixer_quirks.c:543:35: warning: incorrect type in argument 5 (different base types) sound/usb/mixer_quirks.c:543:35: expected unsigned short [unsigned] [usertype] value sound/usb/mixer_quirks.c:543:35: got restricted __le16 [usertype] <noident> sound/usb/mixer_quirks.c:543:56: warning: incorrect type in argument 6 (different base types) sound/usb/mixer_quirks.c:543:56: expected unsigned short [unsigned] [usertype] index sound/usb/mixer_quirks.c:543:56: got restricted __le16 [usertype] <noident> sound/usb/quirks.c:502:35: warning: incorrect type in argument 5 (different base types) sound/usb/quirks.c:502:35: expected unsigned short [unsigned] [usertype] value sound/usb/quirks.c:502:35: got restricted __le16 [usertype] <noident> Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Acked-by: Daniel Mack <zonque@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25hwspinlock: fix __hwspin_lock_request error pathLi Fei
commit c10b90d85a5126d25c89cbaa50dc9fdd1c4d001a upstream. Even in failed case of pm_runtime_get_sync, the usage_count is incremented. In order to keep the usage_count with correct value and runtime power management to behave correctly, call pm_runtime_put_noidle in such case. In __hwspin_lock_request, module_put is also called before return in pm_runtime_get_sync failed case. Signed-off-by Liu Chuansheng <chuansheng.liu@intel.com> Signed-off-by: Li Fei <fei.li@intel.com> [edit commit log] Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25ata_piix: Fix DVD not dectected at some Haswell platformsYouquan Song
commit b55f84e2d527182e7c611d466cd0bb6ddce201de upstream. There is a quirk patch 5e5a4f5d5a08c9c504fe956391ac3dae2c66556d "ata_piix: make DVD Drive recognisable on systems with Intel Sandybridge chipsets(v2)" fixing the 4 ports IDE controller 32bit PIO mode. We've hit a problem with DVD not recognized on Haswell Desktop platform which includes Lynx Point 2-port SATA controller. This quirk patch disables 32bit PIO on this controller in IDE mode. v2: Change spelling error in statememnt pointed by Sergei Shtylyov. v3: Change comment statememnt and spliting line over 80 characters pointed by Libor Pechacek and also rebase the patch against 3.8-rc7 kernel. Tested-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25libata: Set max sector to 65535 for Slimtype DVD A DS8A8SH driveShan Hai
commit a32450e127fc6e5ca6d958ceb3cfea4d30a00846 upstream. The Slimtype DVD A DS8A8SH drive locks up when max sector is smaller than 65535, and the blow backtrace is observed on locking up: INFO: task flush-8:32:1130 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. flush-8:32 D ffffffff8180cf60 0 1130 2 0x00000000 ffff880273aef618 0000000000000046 0000000000000005 ffff880273aee000 ffff880273aee000 ffff880273aeffd8 ffff880273aee010 ffff880273aee000 ffff880273aeffd8 ffff880273aee000 ffff88026e842ea0 ffff880274a10000 Call Trace: [<ffffffff8168fc2d>] schedule+0x5d/0x70 [<ffffffff8168fccc>] io_schedule+0x8c/0xd0 [<ffffffff81324461>] get_request+0x731/0x7d0 [<ffffffff8133dc60>] ? cfq_allow_merge+0x50/0x90 [<ffffffff81083aa0>] ? wake_up_bit+0x40/0x40 [<ffffffff81320443>] ? bio_attempt_back_merge+0x33/0x110 [<ffffffff813248ea>] blk_queue_bio+0x23a/0x3f0 [<ffffffff81322176>] generic_make_request+0xc6/0x120 [<ffffffff81322308>] submit_bio+0x138/0x160 [<ffffffff811d7596>] ? bio_alloc_bioset+0x96/0x120 [<ffffffff811d1f61>] submit_bh+0x1f1/0x220 [<ffffffff811d48b8>] __block_write_full_page+0x228/0x340 [<ffffffff811d3650>] ? attach_nobh_buffers+0xc0/0xc0 [<ffffffff811d8960>] ? I_BDEV+0x10/0x10 [<ffffffff811d8960>] ? I_BDEV+0x10/0x10 [<ffffffff811d4ab6>] block_write_full_page_endio+0xe6/0x100 [<ffffffff811d4ae5>] block_write_full_page+0x15/0x20 [<ffffffff811d9268>] blkdev_writepage+0x18/0x20 [<ffffffff81142527>] __writepage+0x17/0x40 [<ffffffff811438ba>] write_cache_pages+0x34a/0x4a0 [<ffffffff81142510>] ? set_page_dirty+0x70/0x70 [<ffffffff81143a61>] generic_writepages+0x51/0x80 [<ffffffff81143ab0>] do_writepages+0x20/0x50 [<ffffffff811c9ed6>] __writeback_single_inode+0xa6/0x2b0 [<ffffffff811ca861>] writeback_sb_inodes+0x311/0x4d0 [<ffffffff811caaa6>] __writeback_inodes_wb+0x86/0xd0 [<ffffffff811cad43>] wb_writeback+0x1a3/0x330 [<ffffffff816916cf>] ? _raw_spin_lock_irqsave+0x3f/0x50 [<ffffffff811b8362>] ? get_nr_inodes+0x52/0x70 [<ffffffff811cb0ac>] wb_do_writeback+0x1dc/0x260 [<ffffffff8168dd34>] ? schedule_timeout+0x204/0x240 [<ffffffff811cb232>] bdi_writeback_thread+0x102/0x2b0 [<ffffffff811cb130>] ? wb_do_writeback+0x260/0x260 [<ffffffff81083550>] kthread+0xc0/0xd0 [<ffffffff81083490>] ? kthread_worker_fn+0x1b0/0x1b0 [<ffffffff8169a3ec>] ret_from_fork+0x7c/0xb0 [<ffffffff81083490>] ? kthread_worker_fn+0x1b0/0x1b0 The above trace was triggered by "dd if=/dev/zero of=/dev/sr0 bs=2048 count=32768" It was previously working by accident, since another bug introduced by 4dce8ba94c7 (libata: Use 'bool' return value for ata_id_XXX) caused all drives to use maxsect=65535. Signed-off-by: Shan Hai <shan.hai@windriver.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25libata: Use integer return value for atapi_command_packet_setShan Hai
commit d8668fcb0b257d9fdcfbe5c172a99b8d85e1cd82 upstream. The function returns type of ATAPI drives so it should return integer value. The commit 4dce8ba94c7 (libata: Use 'bool' return value for ata_id_XXX) since v2.6.39 changed the type of return value from int to bool, the change would cause all of the ATAPI class drives to be treated as TYPE_TAPE and the max_sectors of the drives to be set to 65535 because of the commit f8d8e5799b7(libata: increase 128 KB / cmd limit for ATAPI tape drives), for the function would return true for all ATAPI class drives and the TYPE_TAPE is defined as 0x01. Signed-off-by: Shan Hai <shan.hai@windriver.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25rt2x00: rt2x00pci_regbusy_read() - only print register access failure onceTim Gardner
commit 83589b30f1e1dc9898986293c9336b8ce1705dec upstream. BugLink: http://bugs.launchpad.net/bugs/1128840 It appears that when this register read fails it never recovers, so I think there is no need to repeat the same error message ad infinitum. Cc: Ivo van Doorn <IvDoorn@gmail.com> Cc: Gertjan van Wingerde <gwingerde@gmail.com> Cc: Helmut Schaa <helmut.schaa@googlemail.com> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: linux-wireless@vger.kernel.org Cc: users@rt2x00.serialmonkey.com Cc: netdev@vger.kernel.org Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25crypto: gcm - fix assumption that assoc has one segmentJussi Kivilinna
commit d3dde52209ab571e4e2ec26c66f85ad1355f7475 upstream. rfc4543(gcm(*)) code for GMAC assumes that assoc scatterlist always contains only one segment and only makes use of this first segment. However ipsec passes assoc with three segments when using 'extended sequence number' thus in this case rfc4543(gcm(*)) fails to function correctly. Patch fixes this issue. Reported-by: Chaoxing Lin <Chaoxing.Lin@ultra-3eti.com> Tested-by: Chaoxing Lin <Chaoxing.Lin@ultra-3eti.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25hrtimer: Don't reinitialize a cpu_base lock on CPU_UPMichael Bohan
commit 84cc8fd2fe65866e49d70b38b3fdf7219dd92fe0 upstream. The current code makes the assumption that a cpu_base lock won't be held if the CPU corresponding to that cpu_base is offline, which isn't always true. If a hrtimer is not queued, then it will not be migrated by migrate_hrtimers() when a CPU is offlined. Therefore, the hrtimer's cpu_base may still point to a CPU which has subsequently gone offline if the timer wasn't enqueued at the time the CPU went down. Normally this wouldn't be a problem, but a cpu_base's lock is blindly reinitialized each time a CPU is brought up. If a CPU is brought online during the period that another thread is performing a hrtimer operation on a stale hrtimer, then the lock will be reinitialized under its feet, and a SPIN_BUG() like the following will be observed: <0>[ 28.082085] BUG: spinlock already unlocked on CPU#0, swapper/0/0 <0>[ 28.087078] lock: 0xc4780b40, value 0x0 .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1 <4>[ 42.451150] [<c0014398>] (unwind_backtrace+0x0/0x120) from [<c0269220>] (do_raw_spin_unlock+0x44/0xdc) <4>[ 42.460430] [<c0269220>] (do_raw_spin_unlock+0x44/0xdc) from [<c071b5bc>] (_raw_spin_unlock+0x8/0x30) <4>[ 42.469632] [<c071b5bc>] (_raw_spin_unlock+0x8/0x30) from [<c00a9ce0>] (__hrtimer_start_range_ns+0x1e4/0x4f8) <4>[ 42.479521] [<c00a9ce0>] (__hrtimer_start_range_ns+0x1e4/0x4f8) from [<c00aa014>] (hrtimer_start+0x20/0x28) <4>[ 42.489247] [<c00aa014>] (hrtimer_start+0x20/0x28) from [<c00e6190>] (rcu_idle_enter_common+0x1ac/0x320) <4>[ 42.498709] [<c00e6190>] (rcu_idle_enter_common+0x1ac/0x320) from [<c00e6440>] (rcu_idle_enter+0xa0/0xb8) <4>[ 42.508259] [<c00e6440>] (rcu_idle_enter+0xa0/0xb8) from [<c000f268>] (cpu_idle+0x24/0xf0) <4>[ 42.516503] [<c000f268>] (cpu_idle+0x24/0xf0) from [<c06ed3c0>] (rest_init+0x88/0xa0) <4>[ 42.524319] [<c06ed3c0>] (rest_init+0x88/0xa0) from [<c0c00978>] (start_kernel+0x3d0/0x434) As an example, this particular crash occurred when hrtimer_start() was executed on CPU #0. The code locked the hrtimer's current cpu_base corresponding to CPU #1. CPU #0 then tried to switch the hrtimer's cpu_base to an optimal CPU which was online. In this case, it selected the cpu_base corresponding to CPU #3. Before it could proceed, CPU #1 came online and reinitialized the spinlock corresponding to its cpu_base. Thus now CPU #0 held a lock which was reinitialized. When CPU #0 finally ended up unlocking the old cpu_base corresponding to CPU #1 so that it could switch to CPU #3, we hit this SPIN_BUG() above while in switch_hrtimer_base(). CPU #0 CPU #1 ---- ---- ... <offline> hrtimer_start() lock_hrtimer_base(base #1) ... init_hrtimers_cpu() switch_hrtimer_base() ... ... raw_spin_lock_init(&cpu_base->lock) raw_spin_unlock(&cpu_base->lock) ... <spin_bug> Solve this by statically initializing the lock. Signed-off-by: Michael Bohan <mbohan@codeaurora.org> Link: http://lkml.kernel.org/r/1363745965-23475-1-git-send-email-mbohan@codeaurora.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: ti_usb_3410_5052: fix use-after-free in TIOCMIWAITJohan Hovold
commit fc98ab873aa3dbe783ce56a2ffdbbe7c7609521a upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: ssu100: fix use-after-free in TIOCMIWAITJohan Hovold
commit 43a66b4c417ad15f6d2f632ce67ad195bdf999e8 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: spcp8x5: fix use-after-free in TIOCMIWAITJohan Hovold
commit dbcea7615d8d7d58f6ff49d2c5568113f70effe9 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: pl2303: fix use-after-free in TIOCMIWAITJohan Hovold
commit 40509ca982c00c4b70fc00be887509feca0bff15 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: oti6858: fix use-after-free in TIOCMIWAITJohan Hovold
commit 8edfdab37157d2683e51b8be5d3d5697f66a9f7b upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: mos7840: fix use-after-free in TIOCMIWAITJohan Hovold
commit a14430db686b8e459e1cf070a6ecf391515c9ab9 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: mos7840: fix broken TIOCMIWAITJohan Hovold
commit e670c6af12517d08a403487b1122eecf506021cf upstream. Make sure waiting processes are woken on modem-status changes. Currently processes are only woken on termios changes regardless of whether the modem status has changed. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: mct_u232: fix use-after-free in TIOCMIWAITJohan Hovold
commit cf1d24443677a0758cfa88ca40f24858b89261c0 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: io_ti: fix use-after-free in TIOCMIWAITJohan Hovold
commit 7b2459690584f239650a365f3411ba2ec1c6d1e0 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: io_edgeport: fix use-after-free in TIOCMIWAITJohan Hovold
commit 333576255d4cfc53efd056aad438568184b36af6 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: ftdi_sio: fix use-after-free in TIOCMIWAITJohan Hovold
commit 71ccb9b01981fabae27d3c98260ea4613207618e upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. When switching to tty ports, some lifetime assumptions were changed. Specifically, close can now be called before the final tty reference is dropped as part of hangup at device disconnect. Even with the ftdi private-data refcounting this means that the port private data can be freed while a process is sleeping on modem-status changes and thus cannot be relied on to detect disconnects when woken up. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: cypress_m8: fix use-after-free in TIOCMIWAITJohan Hovold
commit 356050d8b1e526db093e9d2c78daf49d6bf418e3 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Also remove bogus test for private data pointer being NULL as it is never assigned in the loop. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: ch341: fix use-after-free in TIOCMIWAITJohan Hovold
commit fa1e11d5231c001c80a479160b5832933c5d35fb upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: ark3116: fix use-after-free in TIOCMIWAITJohan Hovold
commit 5018860321dc7a9e50a75d5f319bc981298fb5b7 upstream. Use the port wait queue and make sure to check the serial disconnected flag before accessing private port data after waking up. This is is needed as the private port data (including the wait queue itself) can be gone when waking up after a disconnect. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: serial: fix hang when opening portMing Lei
commit eba0e3c3a0ba7b96f01cbe997680f6a4401a0bfc upstream. Johan's 'fix use-after-free in TIOCMIWAIT' patchset[1] introduces one bug which can cause kernel hang when opening port. This patch initialized the 'port->delta_msr_wait' waitqueue head to fix the bug which is introduced in 3.9-rc4. [1], http://marc.info/?l=linux-usb&m=136368139627876&w=2 Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25USB: serial: add modem-status-change wait queueJohan Hovold
commit e5b33dc9d16053c2ae4c2c669cf008829530364b upstream. Add modem-status-change wait queue to struct usb_serial_port that subdrivers can use to implement TIOCMIWAIT. Currently subdrivers use a private wait queue which may have been released when waking up after device disconnected. Note that we're adding a new wait queue rather than reusing the tty-port one as we do not want to get woken up at hangup (yet). Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10Linux 3.2.43v3.2.43Ben Hutchings
2013-04-10HID: microsoft: do not use compound literal - fix buildJiri Slaby
commit 6b90466cfec2a2fe027187d675d8d14217c12d82 upstream. In patch "HID: microsoft: fix invalid rdesc for 3k kbd" I fixed support for MS 3k keyboards. However the added check using memcmp and a compound statement breaks build on architectures where memcmp is a macro with parameters. hid-microsoft.c:51:18: error: macro "memcmp" passed 6 arguments, but takes just 3 On x86_64, memcmp is a function, so I did not see the error. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10bonding: get netdev_rx_handler_unregister out of locksVeaceslav Falico
commit fcd99434fb5c137274d2e15dd2a6a7455f0f29ff upstream. Now that netdev_rx_handler_unregister contains synchronize_net(), we need to call it outside of bond->lock, cause it might sleep. Also, remove the already unneded synchronize_net(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10smsc75xx: fix jumbo frame supportSteve Glendinning
[ Upstream commit 4c51e53689569398d656e631c17308d9b8e84650 ] This patch enables RX of jumbo frames for LAN7500. Previously the driver would transmit jumbo frames succesfully but would drop received jumbo frames (incrementing the interface errors count). With this patch applied the device can succesfully receive jumbo frames up to MTU 9000 (9014 bytes on the wire including ethernet header). Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10pch_gbe: fix ip_summed checksum reporting on rxVeaceslav Falico
[ Upstream commit 76a0e68129d7d24eb995a6871ab47081bbfa0acc ] skb->ip_summed should be CHECKSUM_UNNECESSARY when the driver reports that checksums were correct and CHECKSUM_NONE in any other case. They're currently placed vice versa, which breaks the forwarding scenario. Fix it by placing them as described above. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10net: add a synchronize_net() in netdev_rx_handler_unregister()Eric Dumazet
[ Upstream commit 00cfec37484761a44a3b6f4675a54caa618210ae ] commit 35d48903e97819 (bonding: fix rx_handler locking) added a race in bonding driver, reported by Steven Rostedt who did a very good diagnosis : <quoting Steven> I'm currently debugging a crash in an old 3.0-rt kernel that one of our customers is seeing. The bug happens with a stress test that loads and unloads the bonding module in a loop (I don't know all the details as I'm not the one that is directly interacting with the customer). But the bug looks to be something that may still be present and possibly present in mainline too. It will just be much harder to trigger it in mainline. In -rt, interrupts are threads, and can schedule in and out just like any other thread. Note, mainline now supports interrupt threads so this may be easily reproducible in mainline as well. I don't have the ability to tell the customer to try mainline or other kernels, so my hands are somewhat tied to what I can do. But according to a core dump, I tracked down that the eth irq thread crashed in bond_handle_frame() here: slave = bond_slave_get_rcu(skb->dev); bond = slave->bond; <--- BUG the slave returned was NULL and accessing slave->bond caused a NULL pointer dereference. Looking at the code that unregisters the handler: void netdev_rx_handler_unregister(struct net_device *dev) { ASSERT_RTNL(); RCU_INIT_POINTER(dev->rx_handler, NULL); RCU_INIT_POINTER(dev->rx_handler_data, NULL); } Which is basically: dev->rx_handler = NULL; dev->rx_handler_data = NULL; And looking at __netif_receive_skb() we have: rx_handler = rcu_dereference(skb->dev->rx_handler); if (rx_handler) { if (pt_prev) { ret = deliver_skb(skb, pt_prev, orig_dev); pt_prev = NULL; } switch (rx_handler(&skb)) { My question to all of you is, what stops this interrupt from happening while the bonding module is unloading? What happens if the interrupt triggers and we have this: CPU0 CPU1 ---- ---- rx_handler = skb->dev->rx_handler netdev_rx_handler_unregister() { dev->rx_handler = NULL; dev->rx_handler_data = NULL; rx_handler() bond_handle_frame() { slave = skb->dev->rx_handler; bond = slave->bond; <-- NULL pointer dereference!!! What protection am I missing in the bond release handler that would prevent the above from happening? </quoting Steven> We can fix bug this in two ways. First is adding a test in bond_handle_frame() and others to check if rx_handler_data is NULL. A second way is adding a synchronize_net() in netdev_rx_handler_unregister() to make sure that a rcu protected reader has the guarantee to see a non NULL rx_handler_data. The second way is better as it avoids an extra test in fast path. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10ks8851: Fix interpretation of rxlen field.Max.Nekludov@us.elster.com
[ Upstream commit 14bc435ea54cb888409efb54fc6b76c13ef530e9 ] According to the Datasheet (page 52): 15-12 Reserved 11-0 RXBC Receive Byte Count This field indicates the present received frame byte size. The code has a bug: rxh = ks8851_rdreg32(ks, KS_RXFHSR); rxstat = rxh & 0xffff; rxlen = rxh >> 16; // BUG!!! 0xFFF mask should be applied Signed-off-by: Max Nekludov <Max.Nekludov@us.elster.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10ipv6: don't accept node local multicast traffic from the wireHannes Frederic Sowa
[ Upstream commit 1c4a154e5253687c51123956dfcee9e9dfa8542d ] Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been verified: http://www.rfc-editor.org/errata_search.php?eid=3480 We have to check for pkt_type and loopback flag because either the packets are allowed to travel over the loopback interface (in which case pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel over a non-loopback interface back to us (in which case PACKET_TYPE is PACKET_LOOPBACK and IFF_LOOPBACK flag is not set). Cc: Erik Hugne <erik.hugne@ericsson.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10ipv6: fix bad free of addrconf_init_netHong Zhiguo
[ Upstream commit a79ca223e029aa4f09abb337accf1812c900a800 ] Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>