Overview
In the network model of Silverline DDoS mitigation, GRE and IP-IP tunnels are commonly used to return "clean" traffic to the customer.
IP was designed for use on a wide variety of transmission links. Although the maximum length of an IP datagram is 65,535 bytes, most transmission links enforce a smaller maximum packet length, called the maximum transmission unit (MTU). The value of the MTU depends on the type of transmission link. The design of IP accommodates MTU differences by allowing routers to fragment IP datagrams as necessary. The receiving station is responsible for reassembling the fragments back into the original full-size IP datagram.
Tunnels add encapsulation overhead, which shrinks the maximum packet size that can be carried inside a tunnel without fragmentation. Customer internet connections are often Ethernet-based, so a 1500-byte Ethernet MTU minus 24 bytes of GRE encapsulation (a 20-byte outer IP header plus a 4-byte GRE header) leaves 1476 bytes of payload without fragmentation.
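The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the header sizes assume IPv4 without IP options and a basic GRE header without key, sequence, or checksum fields.

```python
# Header sizes in bytes (assumed: IPv4 without options, basic GRE header).
IPV4_HEADER = 20
TCP_HEADER = 20
GRE_HEADER = 4

# Encapsulation overhead per tunnel type: a new outer IPv4 header,
# plus the GRE header when GRE is used.
TUNNEL_OVERHEAD = {
    "gre": IPV4_HEADER + GRE_HEADER,  # 24 bytes
    "ipip": IPV4_HEADER,              # 20 bytes
}


def tunnel_payload_mtu(link_mtu: int, tunnel: str) -> int:
    """Largest inner packet that fits in the tunnel without fragmentation."""
    return link_mtu - TUNNEL_OVERHEAD[tunnel]


def tcp_mss_for(mtu: int) -> int:
    """Largest TCP payload for a given IP MTU (no IP or TCP options)."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(tunnel_payload_mtu(1500, "gre"))   # 1476
print(tunnel_payload_mtu(1500, "ipip"))  # 1480
print(tcp_mss_for(tunnel_payload_mtu(1500, "gre")))  # 1436
```

The last value, 1436, is the TCP MSS that corresponds to a GRE tunnel on a 1500-byte WAN link.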
IP Fragmentation
IP fragmentation involves breaking a datagram into a number of pieces that can be reassembled later. The IP source, destination, identification, total length, and fragment offset fields, along with the "more fragments" and "don't fragment" flags in the IP header, are used for IP fragmentation and reassembly.
There is a small increase in CPU and memory overhead to fragment an IP datagram. This holds true for the sender as well as for a router in the path between a sender and a receiver. Creating fragments simply involves creating fragment headers and copying the original datagram into fragments.
This can be done fairly efficiently because all the information needed to create the fragments is immediately available. Fragmentation causes more overhead for the receiver when reassembling the fragments because the receiver must allocate memory for the arriving fragments and coalesce them back into one datagram after all of the fragments are received. Reassembly on a host is not considered a problem because the host has the time and memory resources to devote to this task.
However, reassembly is very inefficient on a router, whose primary job is to forward packets as quickly as possible. A router is not designed to hold on to packets for any length of time. In addition, a router that performs reassembly chooses the largest buffer available (18K) to work with, because it has no way of knowing the size of the original IP packet until the last fragment is received.
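To make the fragmentation step in this section concrete, here is a minimal Python sketch of splitting one datagram's payload into fragments. It is an illustration, not any vendor's implementation: the `Fragment` class and `fragment` function are hypothetical names, the IP header is assumed to be 20 bytes (no options), and the "don't fragment" flag is ignored (a datagram with DF set would be dropped and answered with an ICMP message instead).

```python
from dataclasses import dataclass
from typing import List

IPV4_HEADER = 20  # bytes, assuming no IP options


@dataclass
class Fragment:
    identification: int   # copied from the original datagram
    offset_units: int     # fragment offset field, in 8-byte units
    more_fragments: bool  # the "more fragments" flag
    payload: bytes

    @property
    def total_length(self) -> int:
        return IPV4_HEADER + len(self.payload)


def fragment(payload: bytes, identification: int, mtu: int) -> List[Fragment]:
    """Split an IP payload into fragments that each fit within mtu."""
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the offset field counts in 8-byte units.
    max_data = (mtu - IPV4_HEADER) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        is_last = start + len(chunk) >= len(payload)
        fragments.append(Fragment(identification, start // 8, not is_last, chunk))
    return fragments


# A 1500-byte datagram (1480 bytes of payload) crossing a 1476-byte tunnel MTU.
frags = fragment(bytes(1480), identification=31260, mtu=1476)
for f in frags:
    print(f.offset_units, f.more_fragments, f.total_length)
```

With these inputs the sketch produces two fragments: one of 1476 bytes at offset 0 with "more fragments" set, and one of 44 bytes at offset 182 (in 8-byte units) with the flag cleared.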
Allow Fragmentation:
Cisco: enabled by default
Juniper: allow-fragmentation;
Reassembly / Virtual-Reassembly
After a packet is fragmented and carried across a tunnel, sometimes an IT security policy can prevent those packets from being correctly reassembled back into the original packet. With virtual-reassembly enabled, when the router receives fragments it will not process them until it has received all fragments (within configurable limits) for that packet. It does not actually reassemble the fragments into a single packet but rather processes the fragments as a single packet for routing, access control, etc.
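The bookkeeping behind virtual reassembly can be sketched as follows. This models only the behavior described above (hold fragments until the whole set has arrived, within limits) and is not Cisco's implementation. The class name, return strings, and flow key layout are hypothetical; the `max_reassemblies`, `max_fragments`, and `timeout` parameters are named after the CLI options for readability.

```python
import time
from typing import Dict, List, Optional, Tuple

# Fragments of one datagram share source, destination, protocol, and IP ID.
FlowKey = Tuple[str, str, int, int]


class VirtualReassembler:
    def __init__(self, max_reassemblies: int = 64, max_fragments: int = 16,
                 timeout: float = 5.0) -> None:
        self.max_reassemblies = max_reassemblies
        self.max_fragments = max_fragments
        self.timeout = timeout
        self._pending: Dict[FlowKey, dict] = {}

    def add(self, key: FlowKey, offset: int, length: int,
            more_fragments: bool, now: Optional[float] = None) -> str:
        """Record one fragment (offset and length in bytes).

        Returns "complete", "pending", or "dropped".
        """
        now = time.monotonic() if now is None else now
        # Discard fragment sets that have waited longer than the timeout.
        expired = [k for k, s in self._pending.items()
                   if now - s["first_seen"] > self.timeout]
        for k in expired:
            del self._pending[k]

        state = self._pending.get(key)
        if state is None:
            if len(self._pending) >= self.max_reassemblies:
                return "dropped"
            state = {"first_seen": now, "ranges": [], "total": None}
            self._pending[key] = state
        if len(state["ranges"]) >= self.max_fragments:
            del self._pending[key]
            return "dropped"

        state["ranges"].append((offset, offset + length))
        if not more_fragments:
            # The last fragment reveals the size of the original payload.
            state["total"] = offset + length
        if state["total"] is not None and self._covers(state["ranges"], state["total"]):
            del self._pending[key]
            return "complete"
        return "pending"

    @staticmethod
    def _covers(ranges: List[Tuple[int, int]], total: int) -> bool:
        """True when the byte ranges cover 0..total with no gaps."""
        covered = 0
        for start, end in sorted(ranges):
            if start > covered:
                return False
            covered = max(covered, end)
        return covered >= total
```

Fragments can arrive in any order: the set is reported as complete only once the last fragment has been seen and every byte before it is accounted for, at which point the fragments can be evaluated together for routing and access control.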
Cisco
<add to existing interface ACLs>
permit ip any any fragments
<add to incoming/WAN interface>
ip virtual-reassembly [max-reassemblies number] [max-fragments number] [timeout seconds] [drop-fragments]
Example:
Router(config-if)# ip virtual-reassembly max-reassemblies 64 max-fragments 16 timeout 5
Juniper
Not all platforms support packet reassembly; there are very specific hardware restrictions on enabling it.
reassemble-packets;
IP TCP adjust-mss
The TCP Maximum Segment Size (MSS) defines the maximum amount of data that a host is willing to accept in a single TCP/IP datagram. This TCP/IP datagram might be fragmented at the IP layer. The MSS value is sent as a TCP header option only in TCP SYN segments. Each side of a TCP connection reports its MSS value to the other side. Contrary to popular belief, the MSS value is not negotiated between hosts. The sending host is required to limit the size of data in a single TCP segment to a value less than or equal to the MSS reported by the receiving host.
Originally, MSS meant how big a buffer (greater than or equal to 65496 bytes) was allocated on a receiving station to be able to store the TCP data contained within a single IP datagram. MSS was the maximum segment (chunk) of data that the TCP receiver was willing to accept. This TCP segment could be as large as 64K (the maximum IP datagram size) and it could be fragmented at the IP layer in order to be transmitted across the network to the receiving host. The receiving host would reassemble the IP datagram before it handed the complete TCP segment to the TCP layer.
In more complex topologies, for example load balancers behind one or more routers, the load balancers can be the single most effective place to enable adjust-mss.
Please note: Because TCP adjust-mss has no impact on non-TCP traffic, it's often part of an overall solution to fragmentation, rather than the only step being taken.
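As an illustration of what MSS adjustment does to a TCP SYN, the sketch below walks the TCP options of a segment and lowers the MSS option when it exceeds a configured ceiling. This is a simplified model under stated assumptions, not a description of any vendor's code: `clamp_mss` is a hypothetical name, the input is only the options portion of the TCP header, and a real device would also update the TCP checksum after rewriting the value.

```python
import struct

TCP_OPT_END = 0  # end of option list
TCP_OPT_NOP = 1  # single-byte padding
TCP_OPT_MSS = 2  # kind 2, length 4, 16-bit MSS value


def clamp_mss(options: bytes, max_mss: int) -> bytes:
    """Return a copy of the TCP options with the MSS lowered to max_mss if larger."""
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == TCP_OPT_END:
            break
        if kind == TCP_OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(out):
            break  # truncated option, stop parsing
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break  # malformed option, stop parsing
        if kind == TCP_OPT_MSS and length == 4:
            (advertised,) = struct.unpack_from("!H", out, i + 2)
            if advertised > max_mss:
                struct.pack_into("!H", out, i + 2, max_mss)
        i += length
    return bytes(out)


# A SYN advertising MSS 1460, followed by two NOPs and a timestamp option.
syn_options = bytes([2, 4]) + struct.pack("!H", 1460) + bytes([1, 1, 8, 10]) + bytes(8)
clamped = clamp_mss(syn_options, 1436)
print(struct.unpack_from("!H", clamped, 2)[0])  # 1436
```

An MSS that is already at or below the ceiling is left untouched, which is why the setting only ever reduces segment sizes.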
Cisco
<apply to outbound WAN interface> <suggest VALUE of 1436 for GRE tunnels on 1500 MTU WAN>
ip tcp adjust-mss <VALUE>
Juniper
Starting in Junos OS Release 14.2, you can configure the TCP MSS value on MX Series routers. To specify a TCP MSS value, include the tcp-mss statement at the following hierarchy level:
<apply to outbound WAN interface> <suggest VALUE of 1436 for GRE tunnels on 1500 MTU WAN>
[edit interfaces interface-name unit logical-unit-number family family] tcp-mss VALUE
Forced MTU - Smaller
Cisco
<apply to outbound WAN interface> ip mtu VALUE
Juniper
Forced MTU - Larger
Sometimes the least disruptive way to avoid fragmenting packets is to run a larger (jumbo) packet size between the customer and Silverline. However, given the number of devices and hops between the two networks, there is no guarantee that a larger MTU will pass unfragmented. If all other options are exhausted, trying a larger MTU can be worthwhile.
Note: The carrier MTU is shown at the top of the page. If the customer path can support it, we suggest that the customer set the IP MTU to 1600 and the tunnel MTU to 1500, with fragmentation disabled.
Cisco
<apply to outbound WAN interface> ip mtu VALUE
TCPDump Examples
Source 50.131.12.22 → SJC → GRE Tunnel → https://x.x.115.202
PMTUD Fragmentation Required Message via tunnel interface:
[elange@lb01-eqx-ams07.path.net] ~> sudo tcpdump -v -nni $_int host 50.131.12.22 and host x.x.79.2
tcpdump: WARNING: $_int: no IPv4 address assigned
tcpdump: listening on $_int, link-type EN10MB (Ethernet), capture size 65535 bytes
20:22:44.065659 IP (tos 0x0, ttl 254, id 0, offset 0, flags [none], proto ICMP (1), length 56)
x.x.1.123 > 50.131.12.22: ICMP 69.163.115.202 unreachable - need to frag (mtu 1476), length 36
IP (tos 0x0, ttl 55, id 31260, offset 0, flags [DF], proto TCP (6), length 1500)
50.131.12.22.57661 > 69.163.115.202.443: [|tcp]
PMTUD Frag Required Received By End User (50.131.12.22 laptop):
$ sudo tcpdump -v -nni en0 host x.x.79.2
Password:
tcpdump: listening on en0, link-type EN10MB (Ethernet), capture size 65535 bytes
13:32:24.991130 IP (tos 0x20, ttl 244, id 0, offset 0, flags [none], proto ICMP (1), length 56)
x.x.1.123 > 192.168.1.7: ICMP x.x.115.202 unreachable - need to frag (mtu 1476), length 36
IP (tos 0x0, ttl 55, id 8550, offset 0, flags [DF], proto TCP (6), length 1500)
192.168.1.7.57993 > x.x.115.202.443: tcp 1480 [bad hdr length 0 - too short, < 20]
Packets from 50.131.12.22: the original oversized segment (length 1500), then the retransmission at the reduced length (1476):
elange@lb01-eqx-ams07.path.net ~> sudo tcpdump -v -nni $_int host 50.131.12.22
[sudo] password for elange
tcpdump: WARNING: $_int: no IPv4 address assigned
tcpdump: listening on $_int, link-type EN10MB (Ethernet), capture size 65535 bytes
20:32:25.391494 IP (tos 0x0, ttl 55, id 8550, offset 0, flags [DF], proto TCP (6), length 1500)
50.131.12.22.57993 > x.x.115.202.443: Flags [.], cksum 0xef4b (correct), seq 651:2099, ack 4310, win 4096, options [nop,nop,TS val 1640118539 ecr 1485654161], length 1448
20:32:25.411732 IP (tos 0x0, ttl 55, id 36142, offset 0, flags [DF], proto TCP (6), length 1476)
50.131.12.22.57993 > x.x.115.202.443: Flags [.], cksum 0xe4df (correct), seq 651:2075, ack 4310, win 4096, options [nop,nop,TS val 1640118539 ecr 1485654161], length 1424
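The segment lengths in this capture follow directly from the header sizes, which a short Python check can confirm. The function name is hypothetical; the 12 bytes of TCP options correspond to the two NOPs plus the 10-byte timestamp option shown in the capture.

```python
IPV4_HEADER = 20          # bytes, no IP options
TCP_HEADER = 20           # bytes, base TCP header
TCP_TIMESTAMP_OPTION = 12  # nop + nop + 10-byte timestamp, as in the capture


def tcp_payload(ip_total_length: int, tcp_options: int = TCP_TIMESTAMP_OPTION) -> int:
    """TCP payload bytes carried by an IP packet of the given total length."""
    return ip_total_length - IPV4_HEADER - TCP_HEADER - tcp_options


print(tcp_payload(1500))  # 1448, the first segment
print(tcp_payload(1476))  # 1424, the segment sent after the ICMP message
```

The second packet carries the same starting sequence number (651) with 24 fewer bytes of payload, which matches the sender lowering its path MTU to the 1476 reported in the "need to frag" message.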
The following capture filter matches all ICMP messages to or from the loopback designated for those responses, excluding traceroute messages. It is useful when turning up a new customer, to see whether a new flurry of messages starts.
$ sudo tshark -nni $_int 'host x.x.79.2 and icmp[icmptype] != 11 and icmp[icmpcode] != 0'
Capturing on '$_int'
1 0.000000 x.x.1.123 -> 65.55.50.190 ICMP 70 Destination unreachable (Fragmentation needed)
2 0.000075 x.x.1.123 -> 65.55.50.190 ICMP 70 Destination unreachable (Fragmentation needed)
3 3.137623 x.x.1.123 -> 96.45.72.111 ICMP 70 Destination unreachable (Fragmentation needed)
4 3.137706 x.x.1.123 -> 96.45.72.111 ICMP 70 Destination unreachable (Fragmentation needed)
5 3.137780 x.x.1.123 -> 96.45.72.111 ICMP 70 Destination unreachable (Fragmentation needed)
6 3.137872 x.x.1.123 -> 96.45.72.111 ICMP 70 Destination unreachable (Fragmentation needed)
7 3.225530 x.x.1.123 -> 96.45.72.111 ICMP 70 Destination unreachable (Fragmentation needed)
8 17.665265 x.x.1.123 -> 73.243.162.228 ICMP 70 Destination unreachable (Fragmentation needed)
9 25.758693 x.x.1.123 -> 204.152.95.98 ICMP 70 Destination unreachable (Fragmentation needed)
10 26.552176 x.x.1.123 -> 204.152.95.7 ICMP 70 Destination unreachable (Fragmentation needed)
11 28.255118 x.x.1.123 -> 65.55.50.158 ICMP 70 Destination unreachable (Fragmentation needed)
12 28.255122 x.x.1.123 -> 65.55.50.158 ICMP 70 Destination unreachable (Fragmentation needed)
13 28.878256 x.x.1.123 -> 67.207.199.19 ICMP 70 Destination unreachable (Fragmentation needed)