summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-04-08nfp: break up nfp_net_{alloc|free}_ringsJakub Kicinski
nfp_net_{alloc|free}_rings contained strange mix of allocations and vector initialization. Remove it, declare vector init as a separate function and handle allocations explicitly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08nfp: move link state interrupt request/free callsJakub Kicinski
We need to be able to disable the link state interrupt when the device is brought down. We used to just free the IRQ at the beginning of .ndo_stop(). As we now move towards more ordered .ndo_open()/.ndo_stop() paths LSC allocation should be placed in the "allocate resource" section. Since the IRQ can't be freed early in .ndo_stop(), it is disabled instead. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08nfp: correct RX buffer length calculationJakub Kicinski
When calculating the RX buffer length we need to account for up to 2 VLAN tags. Rounding up to 1k is an relic of a distant past and can be removed. While at it also remove trivial print statement. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08orangefs: Add KERN_<LEVEL> to gossip_<level> macrosJoe Perches
Emit the logging messages at the appropriate levels. Miscellanea: o Change format to fmt o Use the more common ##__VA_ARGS__ Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-04-08orangefs: strncpy -> strscpyMartin Brandenburg
It would have been possible for a rogue client-core to send in a symlink target which is not NUL terminated. This returns EIO if the client-core gives us corrupt data. Leave debugfs and superblock code as is for now. Other dcache.c and namei.c strncpy instances are safe because ORANGEFS_NAME_MAX = NAME_MAX + 1; there is always enough space for a name plus a NUL byte. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-04-08orangefs: clean up truncate ctime and mtime settingMartin Brandenburg
The ctime and mtime are always updated on a successful ftruncate and only updated on a successful truncate where the size changed. We handle the ``if the size changed'' bit. This matches FUSE's behavior. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-04-08Orangefs: fix ifnullfree.cocci warningskbuild test robot
fs/orangefs/orangefs-debugfs.c:130:2-26: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. NULL check before some freeing functions is not needed. Based on checkpatch warning "kfree(NULL) is safe this check is probably not required" and kfreeaddr.cocci by Julia Lawall. Generated by: scripts/coccinelle/free/ifnullfree.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-04-08Orangefs: optimize boilerplate code.Mike Marshall
Suggested by David Binderman <dcb314@hotmail.com> The former can potentially be a performance win over the latter. memcpy(d, s, len); memset(d+len, c, size-len); memset(d, c, size); memcpy(d, s, len); Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-04-08Orangefs: xattr.c cleanupMike Marshall
1. It is nonsense to test for negative size_t, suggested by David Binderman <dcb314@hotmail.com> 2. By the time Orangefs gets called, the vfs has ensured that name != NULL, and that buffer and size are sane. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-04-08ieee802154/adf7242: fix memory leak of firmwareSudip Mukherjee
If the firmware upload or the firmware verification fails then we printed the error message and exited but we missed releasing the firmware. Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Acked-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-086lowpan: iphc: fix handling of link-local compressionAlexander Aring
This patch fixes handling in case of link-local address compression. A IPv6 link-local address is defined as fe80::/10 prefix which is also what ipv6_addr_type checks for link-local addresses. But IPHC compression for link-local addresses are for fe80::/64 types only. This patch adds additional checks for zero padded bits in case of link-local address compression to match on a fe80::/64 address only. Signed-off-by: Alexander Aring <aar@pengutronix.de> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: vhci: purge unhandled skbsJiri Slaby
The write handler allocates skbs and queues them into data->readq. Read side should read them, if there is any. If there is none, skbs should be dropped by hdev->flush. But this happens only if the device is HCI_UP, i.e. hdev->power_on work was triggered already. When it was not, skbs stay allocated in the queue when /dev/vhci is closed. So purge the queue in ->release. Program to reproduce: #include <err.h> #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/stat.h> #include <sys/types.h> #include <sys/uio.h> int main() { char buf[] = { 0xff, 0 }; struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf), }; int fd; while (1) { fd = open("/dev/vhci", O_RDWR); if (fd < 0) err(1, "open"); usleep(50); if (writev(fd, &iov, 1) < 0) err(1, "writev"); usleep(50); close(fd); } return 0; } Result: kmemleak: 4609 new suspected memory leaks unreferenced object 0xffff88059f4d5440 (size 232): comm "vhci", pid 1084, jiffies 4294912542 (age 37569.296s) hex dump (first 32 bytes): 20 f0 23 87 05 88 ff ff 20 f0 23 87 05 88 ff ff .#..... .#..... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: ... [<ffffffff81ece010>] __alloc_skb+0x0/0x5a0 [<ffffffffa021886c>] vhci_create_device+0x5c/0x580 [hci_vhci] [<ffffffffa0219436>] vhci_write+0x306/0x4c8 [hci_vhci] Fixes: 23424c0d31 (Bluetooth: Add support creating virtual AMP controllers) Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable 3.13+ <stable@vger.kernel.org>
2016-04-08Bluetooth: vhci: fix open_timeout vs. hdev raceJiri Slaby
Both vhci_get_user and vhci_release race with open_timeout work. They both contain cancel_delayed_work_sync, but do not test whether the work actually created hdev or not. Since the work can be in progress and _sync will wait for finishing it, we can have data->hdev allocated when cancel_delayed_work_sync returns. But the call sites do 'if (data->hdev)' *before* cancel_delayed_work_sync. As a result: * vhci_get_user allocates a second hdev and puts it into data->hdev. The former is leaked. * vhci_release does not release data->hdev properly as it thinks there is none. Fix both cases by moving the actual test *after* the call to cancel_delayed_work_sync. This can be hit by this program: #include <err.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <time.h> #include <unistd.h> #include <sys/stat.h> #include <sys/types.h> int main(int argc, char **argv) { int fd; srand(time(NULL)); while (1) { const int delta = (rand() % 200 - 100) * 100; fd = open("/dev/vhci", O_RDWR); if (fd < 0) err(1, "open"); usleep(1000000 + delta); close(fd); } return 0; } And the result is: BUG: KASAN: use-after-free in skb_queue_tail+0x13e/0x150 at addr ffff88006b0c1228 Read of size 8 by task kworker/u13:1/32068 ============================================================================= BUG kmalloc-192 (Tainted: G E ): kasan: bad access detected ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in vhci_open+0x50/0x330 [hci_vhci] age=260 cpu=3 pid=32040 ... kmem_cache_alloc_trace+0x150/0x190 vhci_open+0x50/0x330 [hci_vhci] misc_open+0x35b/0x4e0 chrdev_open+0x23b/0x510 ... INFO: Freed in vhci_release+0xa4/0xd0 [hci_vhci] age=9 cpu=2 pid=32040 ... __slab_free+0x204/0x310 vhci_release+0xa4/0xd0 [hci_vhci] ... INFO: Slab 0xffffea0001ac3000 objects=16 used=13 fp=0xffff88006b0c1e00 flags=0x5fffff80004080 INFO: Object 0xffff88006b0c1200 @offset=4608 fp=0xffff88006b0c0600 Bytes b4 ffff88006b0c11f0: 09 df 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff88006b0c1200: 00 06 0c 6b 00 88 ff ff 00 00 00 00 00 00 00 00 ...k............ Object ffff88006b0c1210: 10 12 0c 6b 00 88 ff ff 10 12 0c 6b 00 88 ff ff ...k.......k.... Object ffff88006b0c1220: c0 46 c2 6b 00 88 ff ff c0 46 c2 6b 00 88 ff ff .F.k.....F.k.... Object ffff88006b0c1230: 01 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 ................ Object ffff88006b0c1240: 40 12 0c 6b 00 88 ff ff 40 12 0c 6b 00 88 ff ff @..k....@..k.... Object ffff88006b0c1250: 50 0d 6e a0 ff ff ff ff 00 02 00 00 00 00 ad de P.n............. Object ffff88006b0c1260: 00 00 00 00 00 00 00 00 ab 62 02 00 01 00 00 00 .........b...... Object ffff88006b0c1270: 90 b9 19 81 ff ff ff ff 38 12 0c 6b 00 88 ff ff ........8..k.... Object ffff88006b0c1280: 03 00 20 00 ff ff ff ff ff ff ff ff 00 00 00 00 .. ............. Object ffff88006b0c1290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff88006b0c12a0: 00 00 00 00 00 00 00 00 00 80 cd 3d 00 88 ff ff ...........=.... Object ffff88006b0c12b0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 . .............. Redzone ffff88006b0c12c0: bb bb bb bb bb bb bb bb ........ Padding ffff88006b0c13f8: 00 00 00 00 00 00 00 00 ........ CPU: 3 PID: 32068 Comm: kworker/u13:1 Tainted: G B E 4.4.6-0-default #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014 Workqueue: hci0 hci_cmd_work [bluetooth] 00000000ffffffff ffffffff81926cfa ffff88006be37c68 ffff88006bc27180 ffff88006b0c1200 ffff88006b0c1234 ffffffff81577993 ffffffff82489320 ffff88006bc24240 0000000000000046 ffff88006a100000 000000026e51eb80 Call Trace: ... [<ffffffff81ec8ebe>] ? skb_queue_tail+0x13e/0x150 [<ffffffffa06e027c>] ? vhci_send_frame+0xac/0x100 [hci_vhci] [<ffffffffa0c61268>] ? hci_send_frame+0x188/0x320 [bluetooth] [<ffffffffa0c61515>] ? hci_cmd_work+0x115/0x310 [bluetooth] [<ffffffff811a1375>] ? process_one_work+0x815/0x1340 [<ffffffff811a1f85>] ? worker_thread+0xe5/0x11f0 [<ffffffff811a1ea0>] ? process_one_work+0x1340/0x1340 [<ffffffff811b3c68>] ? kthread+0x1c8/0x230 ... Memory state around the buggy address: ffff88006b0c1100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88006b0c1180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88006b0c1200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88006b0c1280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88006b0c1300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Fixes: 23424c0d31 (Bluetooth: Add support creating virtual AMP controllers) Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: stable 3.13+ <stable@vger.kernel.org>
2016-04-08Bluetooth: Allow setting BT_SECURITY_FIPS with setsockoptPatrik Flykt
Update the security level check to allow setting BT_SECURITY_FIPS for an L2CAP socket. Signed-off-by: Patrik Flykt <patrik.flykt@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: hci_ldisc: Fix null pointer derefence in case of early dataLoic Poulain
HCI_UART_PROTO_SET flag is set before hci_uart_set_proto call. If we receive data from tty layer during this procedure, proto pointer may not be assigned yet, leading to null pointer dereference in rx method hci_uart_tty_receive. This patch fixes this issue by introducing HCI_UART_PROTO_READY flag in order to avoid any proto operation before proto opening and assignment. Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: hci_bcm: Add BCM2E71 ACPI IDLoic Poulain
This ID is used at least by Asus T100-CHI. Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: Ignore unknown advertising packet typesJohan Hedberg
In case of buggy controllers send advertising packet types that we don't know of we should simply ignore them instead of trying to react to them in some (potentially wrong) way. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: Fix setting NO_BREDR advertising flagJohan Hedberg
If we're dealing with a single-mode controller or BR/EDR is disable for a dual-mode one, the NO_BREDR flag needs to be unconditionally present in the advertising data. This patch moves it out from behind an extra condition to be always set in the create_instance_adv_data() function if BR/EDR is disabled. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08mpls: find_outdev: check for err ptr in addition to NULL checkRoopa Prabhu
find_outdev calls inet{,6}_fib_lookup_dev() or dev_get_by_index() to find the output device. In case of an error, inet{,6}_fib_lookup_dev() returns error pointer and dev_get_by_index() returns NULL. But the function only checks for NULL and thus can end up calling dev_put on an ERR_PTR. This patch adds an additional check for err ptr after the NULL check. Before: Trying to add an mpls route with no oif from user, no available path to 10.1.1.8 and no default route: $ip -f mpls route add 100 as 200 via inet 10.1.1.8 [ 822.337195] BUG: unable to handle kernel NULL pointer dereference at 00000000000003a3 [ 822.340033] IP: [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182 [ 822.340033] PGD 1db38067 PUD 1de9e067 PMD 0 [ 822.340033] Oops: 0000 [#1] SMP [ 822.340033] Modules linked in: [ 822.340033] CPU: 0 PID: 11148 Comm: ip Not tainted 4.5.0-rc7+ #54 [ 822.340033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 [ 822.340033] task: ffff88001db82580 ti: ffff88001dad4000 task.ti: ffff88001dad4000 [ 822.340033] RIP: 0010:[<ffffffff8148781e>] [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182 [ 822.340033] RSP: 0018:ffff88001dad7a88 EFLAGS: 00010282 [ 822.340033] RAX: ffffffffffffff9b RBX: ffffffffffffff9b RCX: 0000000000000002 [ 822.340033] RDX: 00000000ffffff9b RSI: 0000000000000008 RDI: 0000000000000000 [ 822.340033] RBP: ffff88001ddc9ea0 R08: ffff88001e9f1768 R09: 0000000000000000 [ 822.340033] R10: ffff88001d9c1100 R11: ffff88001e3c89f0 R12: ffffffff8187e0c0 [ 822.340033] R13: ffffffff8187e0c0 R14: ffff88001ddc9e80 R15: 0000000000000004 [ 822.340033] FS: 00007ff9ed798700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 [ 822.340033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 822.340033] CR2: 00000000000003a3 CR3: 000000001de89000 CR4: 00000000000006f0 [ 822.340033] Stack: [ 822.340033] 0000000000000000 0000000100000000 0000000000000000 0000000000000000 [ 822.340033] 0000000000000000 0801010a00000000 0000000000000000 0000000000000000 [ 822.340033] 0000000000000004 ffffffff8148749b ffffffff8187e0c0 000000000000001c [ 822.340033] Call Trace: [ 822.340033] [<ffffffff8148749b>] ? mpls_rt_alloc+0x2b/0x3e [ 822.340033] [<ffffffff81488e66>] ? mpls_rtm_newroute+0x358/0x3e2 [ 822.340033] [<ffffffff810e7bbc>] ? get_page+0x5/0xa [ 822.340033] [<ffffffff813b7d94>] ? rtnetlink_rcv_msg+0x17e/0x191 [ 822.340033] [<ffffffff8111794e>] ? __kmalloc_track_caller+0x8c/0x9e [ 822.340033] [<ffffffff813c9393>] ? rht_key_hashfn.isra.20.constprop.57+0x14/0x1f [ 822.340033] [<ffffffff813b7c16>] ? __rtnl_unlock+0xc/0xc [ 822.340033] [<ffffffff813cb794>] ? netlink_rcv_skb+0x36/0x82 [ 822.340033] [<ffffffff813b4507>] ? rtnetlink_rcv+0x1f/0x28 [ 822.340033] [<ffffffff813cb2b1>] ? netlink_unicast+0x106/0x189 [ 822.340033] [<ffffffff813cb5b3>] ? netlink_sendmsg+0x27f/0x2c8 [ 822.340033] [<ffffffff81392ede>] ? sock_sendmsg_nosec+0x10/0x1b [ 822.340033] [<ffffffff81393df1>] ? ___sys_sendmsg+0x182/0x1e3 [ 822.340033] [<ffffffff810e4f35>] ? __alloc_pages_nodemask+0x11c/0x1e4 [ 822.340033] [<ffffffff8110619c>] ? PageAnon+0x5/0xd [ 822.340033] [<ffffffff811062fe>] ? __page_set_anon_rmap+0x45/0x52 [ 822.340033] [<ffffffff810e7bbc>] ? get_page+0x5/0xa [ 822.340033] [<ffffffff810e85ab>] ? __lru_cache_add+0x1a/0x3a [ 822.340033] [<ffffffff81087ea9>] ? current_kernel_time64+0x9/0x30 [ 822.340033] [<ffffffff813940c4>] ? __sys_sendmsg+0x3c/0x5a [ 822.340033] [<ffffffff8148f597>] ? entry_SYSCALL_64_fastpath+0x12/0x6a [ 822.340033] Code: 83 08 04 00 00 65 ff 00 48 8b 3c 24 e8 40 7c f2 ff eb 13 48 c7 c3 9f ff ff ff eb 0f 89 ce e8 f1 ae f1 ff 48 89 c3 48 85 db 74 15 <48> 8b 83 08 04 00 00 65 ff 08 48 81 fb 00 f0 ff ff 76 0d eb 07 [ 822.340033] RIP [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182 [ 822.340033] RSP <ffff88001dad7a88> [ 822.340033] CR2: 00000000000003a3 [ 822.435363] ---[ end trace 98cc65e6f6b8bf11 ]--- After patch: $ip -f mpls route add 100 as 200 via inet 10.1.1.8 RTNETLINK answers: Network is unreachable Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reported-by: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08Merge branch '10GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2016-04-07 This series contains updates to ixgbe and ixgbevf. This entire series (except for one patch from Alex) comes from Mark and is mainly to add support for our new MAC (x550em_a). So let's get Alex's patch out of the way first before we cover Mark's many changes. Alex does his enable bulk free in transmit cleanup for ixgbe and ixgbevf, like his has done for all of our other drivers. First Mark cleans up registers that were not being used, so do some house cleaning. Then to avoid casting lan_id and func fields, just make them u8 since they only hold small values anyways. Found and fixed an issue where on read operations it could be possible to modify locations beyond the length passed in, so change the check to round up in the same way. Cleaned up the interface for issuing firmware commands to use a void * instead of a u32 * which eliminates a number of casts. Added support for the new MAC and provided method pointers and use them to access IOSF-attached devices, since the new MAC will also need a new access method. Added support for SFPs with an external retimer and for an SGMII backplane interface. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07ipv6: Count in extension headers in skb->network_headerJakub Sitnicki
When sending a UDPv6 message longer than MTU, account for the length of fragmentable IPv6 extension headers in skb->network_header offset. Same as we do in alloc_new_skb path in __ip6_append_data(). This ensures that later on __ip6_make_skb() will make space in headroom for fragmentable extension headers: /* move skb->data to ip header from ext header */ if (skb->data < skb_network_header(skb)) __skb_pull(skb, skb_network_offset(skb)); Prevents a splat due to skb_under_panic: skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \ head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:104! invalid opcode: 0000 [#1] KASAN CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65 [...] Call Trace: [<ffffffff813eb7b9>] skb_push+0x79/0x80 [<ffffffff8143397b>] eth_header+0x2b/0x100 [<ffffffff8141e0d0>] neigh_resolve_output+0x210/0x310 [<ffffffff814eab77>] ip6_finish_output2+0x4a7/0x7c0 [<ffffffff814efe3a>] ip6_output+0x16a/0x280 [<ffffffff815440c1>] ip6_local_out+0xb1/0xf0 [<ffffffff814f1115>] ip6_send_skb+0x45/0xd0 [<ffffffff81518836>] udp_v6_send_skb+0x246/0x5d0 [<ffffffff8151985e>] udpv6_sendmsg+0xa6e/0x1090 [...] Reported-by: Ji Jianwen <jiji@redhat.com> Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07Revert "ib_srpt: Convert to percpu_ida tag allocation"Bart Van Assche
This reverts commit 0fd10721fe3664f7549e74af9d28a509c9a68719. That patch causes the ib_srpt driver to crash as soon as the first SCSI command is received: kernel BUG at drivers/infiniband/ulp/srpt/ib_srpt.c:1439! invalid opcode: 0000 [#1] SMP Workqueue: target_completion target_complete_ok_work [target_core_mod] RIP: srpt_queue_response+0x437/0x4a0 [ib_srpt] Call Trace: srpt_queue_data_in+0x9/0x10 [ib_srpt] target_complete_ok_work+0x152/0x2b0 [target_core_mod] process_one_work+0x197/0x480 worker_thread+0x49/0x490 kthread+0xea/0x100 ret_from_fork+0x22/0x40 Aside from the crash, the shortcomings of that patch are as follows: - It makes the ib_srpt driver use I/O contexts allocated by transport_alloc_session_tags() but it does not initialize these I/O contexts properly. All the initializations performed by srpt_alloc_ioctx() are skipped. - It swaps the order of the send ioctx allocation and the transition to RTR mode which is wrong. - The amount of memory that is needed for I/O contexts is doubled. - srpt_rdma_ch.free_list is no longer used but is not removed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-07Merge branch 'bpf-tracepoints'David S. Miller
Alexei Starovoitov says: ==================== allow bpf attach to tracepoints Hi Steven, Peter, v1->v2: addressed Peter's comments: - fixed wording in patch 1, added ack - refactored 2nd patch into 3: 2/10 remove unused __perf_addr macro which frees up an argument in perf_trace_buf_submit 3/10 split perf_trace_buf_prepare into alloc and update parts, so that bpf programs don't have to pay performance penalty for update of struct trace_entry which is not going to be accessed by bpf 4/10 actual addition of bpf filter to perf tracepoint handler is now trivial and bpf prog can be used as proper filter of tracepoints v1 cover: last time we discussed bpf+tracepoints it was a year ago [1] and the reason we didn't proceed with that approach was that bpf would make arguments arg1, arg2 to trace_xx(arg1, arg2) call to be exposed to bpf program and that was considered unnecessary extension of abi. Back then I wanted to avoid the cost of buffer alloc and field assign part in all of the tracepoints, but looks like when optimized the cost is acceptable. So this new apporach doesn't expose any new abi to bpf program. The program is looking at tracepoint fields after they were copied by perf_trace_xx() and described in /sys/kernel/debug/tracing/events/xxx/format We made a tool [2] that takes arguments from /sys/.../format and works as: $ tplist.py -v random:urandom_read int got_bits; int pool_left; int input_left; Then these fields can be copy-pasted into bpf program like: struct urandom_read { __u64 hidden_pad; int got_bits; int pool_left; int input_left; }; and the program can use it: SEC("tracepoint/random/urandom_read") int bpf_prog(struct urandom_read *ctx) { return ctx->pool_left > 0 ? 1 : 0; } This way the program can access tracepoint fields faster than equivalent bpf+kprobe program, which is the main goal of these patches. Patch 1-4 are simple changes in perf core side, please review. I'd like to take the whole set via net-next tree, since the rest of the patches might conflict with other bpf work going on in net-next and we want to avoid cross-tree merge conflicts. Alternatively we can put patches 1-4 into both tip and net-next. Patch 9 is an example of access to tracepoint fields from bpf prog. Patch 10 is a micro benchmark for bpf+kprobe vs bpf+tracepoint. Note that for actual tracing tools the user doesn't need to run tplist.py and copy-paste fields manually. The tools do it automatically. Like argdist tool [3] can be used as: $ argdist -H 't:block:block_rq_complete():u32:nr_sector' where 'nr_sector' is name of tracepoint field taken from /sys/kernel/debug/tracing/events/block/block_rq_complete/format and appropriate bpf program is generated on the fly. [1] http://thread.gmane.org/gmane.linux.kernel.api/8127/focus=8165 [2] https://github.com/iovisor/bcc/blob/master/tools/tplist.py [3] https://github.com/iovisor/bcc/blob/master/tools/argdist.py ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07samples/bpf: add tracepoint vs kprobe performance testsAlexei Starovoitov
the first microbenchmark does fd=open("/proc/self/comm"); for() { write(fd, "test"); } and on 4 cpus in parallel: writes per sec base (no tracepoints, no kprobes) 930k with kprobe at __set_task_comm() 420k with tracepoint at task:task_rename 730k For kprobe + full bpf program manully fetches oldcomm, newcomm via bpf_probe_read. For tracepint bpf program does nothing, since arguments are copied by tracepoint. 2nd microbenchmark does: fd=open("/dev/urandom"); for() { read(fd, buf); } and on 4 cpus in parallel: reads per sec base (no tracepoints, no kprobes) 300k with kprobe at urandom_read() 279k with tracepoint at random:urandom_read 290k bpf progs attached to kprobe and tracepoint are noop. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07samples/bpf: tracepoint exampleAlexei Starovoitov
modify offwaketime to work with sched/sched_switch tracepoint instead of kprobe into finish_task_switch Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07samples/bpf: add tracepoint support to bpf loaderAlexei Starovoitov
Recognize "tracepoint/" section name prefix and attach the program to that tracepoint. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07bpf: sanitize bpf tracepoint accessAlexei Starovoitov
during bpf program loading remember the last byte of ctx access and at the time of attaching the program to tracepoint check that the program doesn't access bytes beyond defined in tracepoint fields This also disallows access to __dynamic_array fields, but can be relaxed in the future. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07bpf: support bpf_get_stackid() and bpf_perf_event_output() in tracepoint ↵Alexei Starovoitov
programs needs two wrapper functions to fetch 'struct pt_regs *' to convert tracepoint bpf context into kprobe bpf context to reuse existing helper functions Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07bpf: register BPF_PROG_TYPE_TRACEPOINT program typeAlexei Starovoitov
register tracepoint bpf program type and let it call the same set of helper functions as BPF_PROG_TYPE_KPROBE Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07perf, bpf: allow bpf programs attach to tracepointsAlexei Starovoitov
introduce BPF_PROG_TYPE_TRACEPOINT program type and allow it to be attached to the perf tracepoint handler, which will copy the arguments into the per-cpu buffer and pass it to the bpf program as its first argument. The layout of the fields can be discovered by doing 'cat /sys/kernel/debug/tracing/events/sched/sched_switch/format' prior to the compilation of the program with exception that first 8 bytes are reserved and not accessible to the program. This area is used to store the pointer to 'struct pt_regs' which some of the bpf helpers will use: +---------+ | 8 bytes | hidden 'struct pt_regs *' (inaccessible to bpf program) +---------+ | N bytes | static tracepoint fields defined in tracepoint/format (bpf readonly) +---------+ | dynamic | __dynamic_array bytes of tracepoint (inaccessible to bpf yet) +---------+ Not that all of the fields are already dumped to user space via perf ring buffer and broken application access it directly without consulting tracepoint/format. Same rule applies here: static tracepoint fields should only be accessed in a format defined in tracepoint/format. The order of fields and field sizes are not an ABI. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07perf: split perf_trace_buf_prepare into alloc and update partsAlexei Starovoitov
split allows to move expensive update of 'struct trace_entry' to later phase. Repurpose unused 1st argument of perf_tp_event() to indicate event type. While splitting use temp variable 'rctx' instead of '*rctx' to avoid unnecessary loads done by the compiler due to -fno-strict-aliasing Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07perf: remove unused __addr variableAlexei Starovoitov
now all calls to perf_trace_buf_submit() pass 0 as 4th argument which will be repurposed in the next patch which will change the meaning of 1st arg of perf_tp_event() to event_type Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07perf: optimize perf_fetch_caller_regsAlexei Starovoitov
avoid memset in perf_fetch_caller_regs, since it's the critical path of all tracepoints. It's called from perf_sw_event_sched, perf_event_task_sched_in and all of perf_trace_##call with this_cpu_ptr(&__perf_regs[..]) which are zero initialized by perpcu init logic and subsequent call to perf_arch_fetch_caller_regs initializes the same fields on all archs, so we can safely drop memset from all of the above cases and move it into perf_ftrace_function_call that calls it with stack allocated pt_regs. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07net: Fix build failure due to lockdep_sock_is_held().David S. Miller
Needs to be protected with CONFIG_LOCKDEP. Based upon a patch by Hannes Frederic Sowa. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "These changes contains a fix for overlayfs interacting with some (badly behaved) dentry code in various file systems. These have been reviewed by Al and the respective file system mtinainers and are going through the ext4 tree for convenience. This also has a few ext4 encryption bug fixes that were discovered in Android testing (yes, we will need to get these sync'ed up with the fs/crypto code; I'll take care of that). It also has some bug fixes and a change to ignore the legacy quota options to allow for xfstests regression testing of ext4's internal quota feature and to be more consistent with how xfs handles this case" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: ignore quota mount options if the quota feature is enabled ext4 crypto: fix some error handling ext4: avoid calling dquot_get_next_id() if quota is not enabled ext4: retry block allocation for failed DIO and DAX writes ext4: add lockdep annotations for i_data_sem ext4: allow readdir()'s of large empty directories to be interrupted btrfs: fix crash/invalid memory access on fsync when using overlayfs ext4 crypto: use dget_parent() in ext4_d_revalidate() ext4: use file_dentry() ext4: use dget_parent() in ext4_file_open() nfs: use file_dentry() fs: add file_dentry() ext4 crypto: don't let data integrity writebacks fail with ENOMEM ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()
2016-04-07ixgbe: Bump version numberMark Rustad
Update ixgbe version number. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Add KR backplane support for x550em_aMark Rustad
Add support for x550em_a-based KR backplane devices. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Add support for SGMII backplane interfaceMark Rustad
Add support for an SGMII backplane interface. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Add support for SFPs with retimerMark Rustad
Add support for SFPs with an external retimer. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Introduce function to control MDIO speedMark Rustad
Move code that controls MDIO speed into a new function because there will be more MACs that need the control. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Read and parse NW_MNG_IF_SEL registerMark Rustad
Read the IXGBE_NW_MNG_IF_SEL register and use it to set interface attributes. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fix from Sage Weil: "This just fixes a few remaining memory allocations in RBD to use GFP_NOIO instead of GFP_ATOMIC" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: use GFP_NOIO consistently for request allocations
2016-04-07ixgbe: Read and set instance idMark Rustad
Read the instance number from EEPROM and save it for later use. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Use new methods for PHY accessMark Rustad
Now x550em_a devices will use a new method for PHY access that will get the firmware token for each access. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio/qemu fixes from Michael S Tsirkin: "A couple of fixes for virtio and for the new QEMU fw cfg driver" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: add VIRTIO_CONFIG_S_NEEDS_RESET device status bit MAINTAINERS: add entry for QEMU firmware: qemu_fw_cfg.c: hold ACPI global lock during device access virtio: virtio 1.0 cs04 spec compliance for reset qemu_fw_cfg: don't leak kobj on init error
2016-04-07ixgbe: Add support for x550em_a 10G MAC typeMark Rustad
Add support for x550em_a 10G MAC type to the ixgbe driver. The new MAC includes new firmware commands that need to be used to control PHY and IOSF access, so that support is also added. The interface supported is a native SFP+ interface. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Use method pointer to access IOSF devicesMark Rustad
Provide method pointers and use them to access IOSF-attached devices. A new MAC will introduce a new access method. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Add definitions for x550em_a 10G MACMark Rustad
Add definitions for a x550em_a 10G MAC device with a native SFP interface. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe: Add support for single-port X550 deviceMark Rustad
Add support for a single-port X550 device. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-07ixgbe/ixgbevf: Add support for bulk free in Tx cleanup & cleanup boolean logicAlexander Duyck
This patch enables bulk free in Tx cleanup for ixgbevf and cleans up the boolean logic in the polling routines for ixgbe and ixgbevf in the hopes of avoiding any mix-ups similar to what occurred with i40e and i40evf. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>