summaryrefslogtreecommitdiff
path: root/net/ipv6
AgeCommit message (Collapse)Author
2010-12-01net/ipv6/sit.c: return unhandled skb to tunnel4_rcvDavid McCullough
I found a problem using an IPv6 over IPv4 tunnel. When CONFIG_IPV6_SIT was enabled, the packets would be rejected as net/ipv6/sit.c was catching all IPPROTO_IPV6 packets and returning an ICMP port unreachable error. I think this patch fixes the problem cleanly. I believe the code in net/ipv4/tunnel4.c:tunnel4_rcv takes care of it properly if none of the handlers claim the skb. Signed-off-by: David McCullough <david_mccullough@mcafee.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01Make the ip6_tunnel reflect the true mtu.Anders Franzen
The ip6_tunnel always assumes it consumes 40 bytes (ip6 hdr) of the mtu of the underlaying device. So for a normal ethernet bearer, the mtu of the ip6_tunnel is 1460. However, when creating a tunnel the encap limit option is enabled by default, and it consumes 8 bytes more, so the true mtu shall be 1452. I dont really know if this breaks some statement in some RFC, so this is a request for comments. Signed-off-by: Anders Franzen <anders.franzen@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30inet: Turn ->remember_stamp into ->get_peer in connection AF ops.David S. Miller
Then we can make a completely generic tcp_remember_stamp() that uses ->get_peer() as a helper, minimizing the AF specific code and minimizing the eventual code duplication when we implement the ipv6 side of TW recycling. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30ipv6: Add infrastructure to bind inet_peer objects to routes.David S. Miller
They are only allowed on cached ipv6 routes. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28ipv6: Prepare the tree for un-inlined jhash.Jozsef Kadlecsik
jhash is widely used in the kernel and because the functions are inlined, the cost in size is significant. Also, the new jhash functions are slightly larger than the previous ones so better un-inline. As a preparation step, the calls to the internal macros are replaced with the plain jhash function calls. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28ipv6: kill two unused macro definitionShan Wei
1. IPV6_TLV_TEL_DST_SIZE This has not been using for several years since created. 2. RT6_INFO_LEN commit 33120b30 kill all RT6_INFO_LEN's references, but only this definition remained. commit 33120b30cc3b8665204d4fcde7288638b0dd04d5 Author: Alexey Dobriyan <adobriyan@sw.ru> Date: Tue Nov 6 05:27:11 2007 -0800 [IPV6]: Convert /proc/net/ipv6_route to seq_file interface Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-27rtnl: make link af-specific updates atomicThomas Graf
As David pointed out correctly, updates to af-specific attributes are currently not atomic. If multiple changes are requested and one of them fails, previous updates may have been applied already leaving the link behind in a undefined state. This patch splits the function parse_link_af() into two functions validate_link_af() and set_link_at(). validate_link_af() is placed to validate_linkmsg() check for errors as early as possible before any changes to the link have been made. set_link_af() is called to commit the changes later. This method is not fail proof, while it is currently sufficient to make set_link_af() inerrable and thus 100% atomic, the validation function method will not be able to detect all error scenarios in the future, there will likely always be errors depending on states which are f.e. not protected by rtnl_mutex and thus may change between validation and setting. Also, instead of silently ignoring unknown address families and config blocks for address families which did not register a set function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are returned to avoid comitting 4 out of 5 update requests without notifying the user. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24ipv6: mcast: RCU conversionEric Dumazet
ipv6_sk_mc_lock rwlock becomes a spinlock. readers (inet6_mc_check()) now takes rcu_read_lock() instead of read lock. Writers dont need to disable BH anymore. struct ipv6_mc_socklist objects are reclaimed after one RCU grace period. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-22Net: ipv6: netfiliter: Makefile: Remove deprecated kbuild goal definitionsTracey Dent
Changed Makefile to use <modules>-y instead of <modules>-objs because -objs is deprecated and not mentioned in Documentation/kbuild/makefiles.txt. Signed-off-by: Tracey Dent <tdent48227@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-22ipv6: fix missing in6_ifa_put in addrconfJohn Fastabend
Fix ref count bug introduced by commit 2de795707294972f6c34bae9de713e502c431296 Author: Lorenzo Colitti <lorenzo@google.com> Date: Wed Oct 27 18:16:49 2010 +0000 ipv6: addrconf: don't remove address state on ifdown if the address is being kept Fix logic so that addrconf_ifdown() decrements the inet6_ifaddr refcnt correctly with in6_ifa_put(). Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-19Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c net/core/net-sysfs.c net/ipv6/addrconf.c
2010-11-18ipv6: Expose reachable and retrans timer values as msecsThomas Graf
Expose reachable and retrans timer values in msecs instead of jiffies. Both timer values are already exposed as msecs in the neighbour table netlink interface. The creation timestamp format with increased precision is kept but cleaned up. Signed-off-by: Thomas Graf <tgraf@infradead.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18ipv6: Expose IFLA_PROTINFO timer values in msecs instead of jiffiesThomas Graf
IFLA_PROTINFO exposes timer related per device settings in jiffies. Change it to expose these values in msecs like the sysctl interface does. I did not find any users of IFLA_PROTINFO which rely on any of these values and even if there are, they are likely already broken because there is no way for them to reliably convert such a value to another time format. Signed-off-by: Thomas Graf <tgraf@infradead.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17net: use the macros defined for the members of flowiChangli Gao
Use the macros defined for the members of flowi to clean the code up. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17ipv6: AF_INET6 link address familyThomas Graf
IPv6 already exposes some address family data via netlink in the IFLA_PROTINFO attribute if RTM_GETLINK request is sent with the address family set to AF_INET6. We take over this format and reuse all the code. Signed-off-by: Thomas Graf <tgraf@infradead.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16udp: use atomic_inc_not_zero_hintEric Dumazet
UDP sockets refcount is usually 2, unless an incoming frame is going to be queued in receive or backlog queue. Using atomic_inc_not_zero_hint() permits to reduce latency, because processor issues less memory transactions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16ipv6: fix missing in6_ifa_put in addrconfJohn Fastabend
Fix ref count bug introduced by commit 2de795707294972f6c34bae9de713e502c431296 Author: Lorenzo Colitti <lorenzo@google.com> Date: Wed Oct 27 18:16:49 2010 +0000 ipv6: addrconf: don't remove address state on ifdown if the address is being kept Fix logic so that addrconf_ifdown() decrements the inet6_ifaddr refcnt correctly with in6_ifa_put(). Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15net/ipv6/mcast.c: Remove unnecessary semicolonsJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12ipv6: Warn users if maximum number of routes is reached.Ben Greear
This gives users at least some clue as to what the problem might be and how to go about fixing it. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12ipv6: addrconf: don't remove address state on ifdown if the address is being ↵Lorenzo Colitti
kept Currently, addrconf_ifdown does not delete statically configured IPv6 addresses when the interface is brought down. The intent is that when the interface comes back up the address will be usable again. However, this doesn't actually work, because the system stops listening on the corresponding solicited-node multicast address, so the address cannot respond to neighbor solicitations and thus receive traffic. Also, the code notifies the rest of the system that the address is being deleted (e.g, RTM_DELADDR), even though it is not. Fix it so that none of this state is updated if the address is being kept on the interface. Tested: Added a statically configured IPv6 address to an interface, started ping, brought link down, brought link up again. When link came up ping kept on going and "ip -6 maddr" showed that the host was still subscribed to there Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6David S. Miller
2010-11-12netfilter: ipv6: fix overlap check for fragmentsShan Wei
The type of FRAG6_CB(prev)->offset is int, skb->len is *unsigned* int, and offset is int. Without this patch, type conversion occurred to this expression, when (FRAG6_CB(prev)->offset + prev->len) is less than offset. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-08ipv6: fix overlap check for fragmentsShan Wei
The type of FRAG6_CB(prev)->offset is int, skb->len is *unsigned* int, and offset is int. Without this patch, type conversion occurred to this expression, when (FRAG6_CB(prev)->offset + prev->len) is less than offset. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-03netfilter: ip6_tables: fix information leak to userspaceJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-03net dst: fix percpu_counter list corruption and poison overwrittenXiaotian Feng
There're some percpu_counter list corruption and poison overwritten warnings in recent kernel, which is resulted by fc66f95c. commit fc66f95c switches to use percpu_counter, in ip6_route_net_init, kernel init the percpu_counter for dst entries, but, the percpu_counter is never destroyed in ip6_route_net_exit. So if the related data is freed by kernel, the freed percpu_counter is still on the list, then if we insert/remove other percpu_counter, list corruption resulted. Also, if the insert/remove option modifies the ->prev,->next pointer of the freed value, the poison overwritten is resulted then. With the following patch, the percpu_counter list corruption and poison overwritten warnings disappeared. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-30ipv6/udp: report SndbufErrors and RcvbufErrorsEric Dumazet
commit a18135eb9389 (Add UDP_MIB_{SND,RCV}BUFERRORS handling.) forgot to make the necessary changes in net/ipv6/proc.c to report additional counters in /proc/net/snmp6 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27tunnels: Fix tunnels change rcu protectionPavel Emelyanov
After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug was introduced into the SIOCCHGTUNNEL code. The tunnel is first unlinked, then addresses change, then it is linked back probably into another bucket. But while changing the parms, the hash table is unlocked to readers and they can lookup the improper tunnel. Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b (gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and 94767632 (ip6tnl: get rid of ip6_tnl_lock). The quick fix is to wait for quiescent state to pass after unlinking, but if it is inappropriate I can invent something better, just let me know. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27net: add __rcu annotations to protocolEric Dumazet
Add __rcu annotations to : struct net_protocol *inet_protos struct net_protocol *inet6_protos And use appropriate casts to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27ipv6: fix refcnt problem related to POSTDAD stateUrsula Braun
After running this bonding setup script modprobe bonding miimon=100 mode=0 max_bonds=1 ifconfig bond0 10.1.1.1/16 ifenslave bond0 eth1 ifenslave bond0 eth3 on s390 with qeth-driven slaves, modprobe -r fails with this message unregister_netdevice: waiting for bond0 to become free. Usage count = 1 due to twice detection of duplicate address. Problem is caused by a missing decrease of ifp->refcnt in addrconf_dad_failure. An extra call of in6_ifa_put(ifp) solves it. Problem has been introduced with commit f2344a131bccdbfc5338e17fa71a807dee7944fa. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26IPv6: Temp addresses are immediately deleted.Glenn Wurster
There is a bug in the interaction between ipv6_create_tempaddr and addrconf_verify. Because ipv6_create_tempaddr uses the cstamp and tstamp from the public address in creating a private address, if we have not received a router advertisement in a while, tstamp + temp_valid_lft might be < now. If this happens, the new address is created inside ipv6_create_tempaddr, then the loop within addrconf_verify starts again and the address is immediately deleted. We are left with no temporary addresses on the interface, and no more will be created until the public IP address is updated. To avoid this, set the expiry time to be the minimum of the time left on the public address or the config option PLUS the current age of the public interface. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26IPv6: Create temporary address if none exists.Glenn Wurster
If privacy extentions are enabled, but no current temporary address exists, then create one when we get a router advertisement. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26netfilter: Add missing CONFIG_SYSCTL checks in ipv6's nf_conntrack_reasm.cDavid S. Miller
Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25net: add __rcu annotation to sk_filterEric Dumazet
Add __rcu annotation to : (struct sock)->sk_filter And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25netfilter: fix module dependency issues with IPv6 defragmentation, ip6tables ↵KOVACS Krisztian
and xt_TPROXY One of the previous tproxy related patches split IPv6 defragmentation and connection tracking, but did not correctly add Kconfig stanzas to handle the new dependencies correctly. This patch fixes that by making the config options mirror the setup we have for IPv4: a distinct config option for defragmentation that is automatically selected by both connection tracking and xt_TPROXY/xt_socket. The patch also changes the #ifdefs enclosing IPv6 specific code in xt_socket and xt_TPROXY: we only compile these in case we have ip6tables support enabled. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25tunnels: add _rcu annotationsEric Dumazet
(struct ip6_tnl)->next is rcu protected : (struct ip_tunnel)->next is rcu protected : (struct xfrm6_tunnel)->next is rcu protected : add __rcu annotation and proper rcu primitives. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-24tproxy: Add missing CAP_NET_ADMIN check to ipv6 sideBalazs Scheidler
IP_TRANSPARENT requires root (more precisely CAP_NET_ADMIN privielges) for IPV6. However as I see right now this check was missed from the IPv6 implementation. Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-24ip6_tunnel dont update the mtu on the route.Anders Franzen
The ip6_tunnel device did not unset the flag, IFF_XMIT_DST_RELEASE. This will make the dev layer to release the dst before calling the tunnel. The tunnel will not update any mtu/pmtu info, since it does not have a dst on the skb. Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2010-10-21tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabledBalazs Scheidler
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21tproxy: added tproxy sockopt interface in the IPV6 layerBalazs Scheidler
Support for IPV6_RECVORIGDSTADDR sockopt for UDP sockets were contributed by Harry Mason. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21tproxy: added udp6_lib_lookup functionBalazs Scheidler
Just like with IPv4, we need access to the UDP hash table to look up local sockets, but instead of exporting the global udp_table, export a lookup function. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21tproxy: added const specifiers to udp lookup functionsBalazs Scheidler
The parameters for various UDP lookup functions were non-const, even though they could be const. TProxy has some const references and instead of downcasting it, I added const specifiers along the path. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21tproxy: split off ipv6 defragmentation to a separate moduleBalazs Scheidler
Like with IPv4, TProxy needs IPv6 defragmentation but does not require connection tracking. Since defragmentation was coupled with conntrack, I split off the two, creating an nf_defrag_ipv6 module, similar to the already existing nf_defrag_ipv4. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21tproxy: fix hash locking issue when using port redirection in ↵Balazs Scheidler
__inet_inherit_port() When __inet_inherit_port() is called on a tproxy connection the wrong locks are held for the inet_bind_bucket it is added to. __inet_inherit_port() made an implicit assumption that the listener's port number (and thus its bind bucket). Unfortunately, if you're using the TPROXY target to redirect skbs to a transparent proxy that assumption is not true anymore and things break. This patch adds code to __inet_inherit_port() so that it can handle this case by looking up or creating a new bind bucket for the child socket and updates callers of __inet_inherit_port() to gracefully handle __inet_inherit_port() failing. Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>. See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21xfrm6: make xfrm6_tunnel_free_spi localstephen hemminger
Function only defined and used in one file. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-18netfilter: fix kconfig unmet dependency warningRandy Dunlap
Fix netfilter kconfig unmet dependencies warning & spell out "compatible" while there. warning: (IP_NF_TARGET_TTL && NET && INET && NETFILTER && IP_NF_IPTABLES && NETFILTER_ADVANCED || IP6_NF_TARGET_HL && NET && INET && IPV6 && NETFILTER && IP6_NF_IPTABLES && NETFILTER_ADVANCED) selects NETFILTER_XT_TARGET_HL which has unmet direct dependencies ((IP_NF_MANGLE || IP6_NF_MANGLE) && NETFILTER_ADVANCED) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-16fib: avoid false sharing on fib_table_hashEric Dumazet
While doing profile analysis, I found fib_hash_table was sometime in a cache line shared by a possibly often written kernel structure. (CONFIG_IP_ROUTE_MULTIPATH || !CONFIG_IPV6_MULTIPLE_TABLES) It's hard to detect because not easily reproductible. Make sure we allocate a full cache line to keep this shared in all cpus caches. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-16fib6: use FIB_LOOKUP_NOREF in fib6_rule_lookup()Eric Dumazet
Avoid two atomic ops on found rule in fib6_rule_lookup() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-13netfilter: xtables: resolve indirect macros 3/3Jan Engelhardt
2010-10-13netfilter: xtables: resolve indirect macros 2/3Jan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>