linux-3.6 / linux-3.6.1 are buggy and do not work as a router

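Linux 3.6, released on 2012-09-30, had a bug in its TCP/IP stack that prevented packets from being forwarded (routed). Because NAT and masquerading did not work, the 3.6 series was completely unusable on routers and similar machines.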

This unfortunate bug was still not fixed in Linux 3.6.1, but a fix was finally committed the other day. Here it is:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=f4ef85bbda96324785097356336bc79cdd37db0a

With this, I can finally move to the 3.6 series.

(Added 2012-10-14)
Linux 3.6.2, which includes the fix above, has been released.
Here is the relevant part of ChangeLog-3.6.2:

commit 52fc5048534e9d4127622fa5a269a92f3bb5218b
Author: Eric Dumazet
Date: Thu Oct 4 01:25:26 2012 +0000

ipv4: add a fib_type to fib_info

[ Upstream commit f4ef85bbda96324785097356336bc79cdd37db0a ]

commit d2d68ba9fe8 (ipv4: Cache input routes in fib_info nexthops.)
introduced a regression for forwarding.

This was hard to reproduce but the symptom was that packets were
delivered to local host instead of being forwarded.

David suggested to add fib_type to fib_info so that we dont
inadvertently share same fib_info for different purposes.

With help from Julian Anastasov who provided very helpful
hints, reproduced here :


Can it be a problem related to fib_info reuse
from different routes. For example, when local IP address
is created for subnet we have:

broadcast 192.168.0.255 dev DEV proto kernel scope link src 192.168.0.1
192.168.0.0/24 dev DEV proto kernel scope link src 192.168.0.1
local 192.168.0.1 dev DEV proto kernel scope host src 192.168.0.1

The "dev DEV proto kernel scope link src 192.168.0.1" is
a reused fib_info structure where we put cached routes.
The result can be same fib_info for 192.168.0.255 and
192.168.0.0/24. RTN_BROADCAST is cached only for input
routes. Incoming broadcast to 192.168.0.255 can be cached
and can cause problems for traffic forwarded to 192.168.0.0/24.
So, this patch should solve the problem because it
separates the broadcast from unicast traffic.

And the ip_route_input_slow caching will work for
local and broadcast input routes (above routes 1 and 3) just
because they differ in scope and use different fib_info.
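
To make the explanation quoted above easier to follow, here is a toy model of the problem in Python. It is not kernel code. `FibInfo`, `FibTable`, `route_input` and the action strings are all hypothetical names made up for this sketch, and the only assumption it takes from the ChangeLog is that routes with identical attributes share one fib_info, and that a cached input route is stored on that shared fib_info.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FibInfo:
    """Toy stand-in for the kernel's fib_info (hypothetical, simplified)."""
    key: Tuple
    # Stands in for the input route cached on the fib_info nexthop.
    cached_input_action: Optional[str] = None


class FibTable:
    """Deduplicates FibInfo objects by key, as the ChangeLog describes."""

    def __init__(self, include_type_in_key: bool):
        # False models 3.6/3.6.1, True models the fix (fib_type in fib_info).
        self.include_type_in_key = include_type_in_key
        self._infos: Dict[Tuple, FibInfo] = {}

    def get_info(self, rtype: str, dev: str, scope: str, src: str) -> FibInfo:
        key = (dev, "kernel", scope, src)
        if self.include_type_in_key:
            key = (rtype,) + key
        return self._infos.setdefault(key, FibInfo(key))


# What should happen to an incoming packet that matches each route type.
ACTION = {"broadcast": "deliver-local", "local": "deliver-local",
          "unicast": "forward"}


def route_input(table: FibTable, rtype: str, dev: str, scope: str, src: str) -> str:
    """Return the cached action if there is one, otherwise compute and cache it."""
    info = table.get_info(rtype, dev, scope, src)
    if info.cached_input_action is None:
        info.cached_input_action = ACTION[rtype]
    return info.cached_input_action


if __name__ == "__main__":
    for fixed in (False, True):
        table = FibTable(include_type_in_key=fixed)
        # 1. A broadcast to 192.168.0.255 arrives and gets cached.
        route_input(table, "broadcast", "DEV", "link", "192.168.0.1")
        # 2. A packet for a host in 192.168.0.0/24 arrives and should be forwarded.
        result = route_input(table, "unicast", "DEV", "link", "192.168.0.1")
        print("with fib_type in key:" if fixed else "without fib_type in key:", result)
```

Without the type in the key, the broadcast route and the 192.168.0.0/24 route end up on the same `FibInfo`, so the second packet picks up the cached "deliver-local" result, which matches the symptom in the ChangeLog. With the type in the key, the two routes get separate entries and the packet is forwarded. The local route (scope host) already has a different key in both cases, as the last paragraph of the quote points out.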