Resolving OSPF MTU problems with SROS

OSPF is a popular interior gateway protocol and in many situations it “just works”; however, care must be taken even in simple deployments. An issue that comes up from time to time involves the maximum transmission unit (MTU). The test network is a three-router topology in which I only have direct control of one router, which runs Nokia SROS.

OSPF MTU Test Topology

TL;DR: if an OSPF neighbor is stuck in ExchStart, you need to increase your MTU; if it is stuck in Exchange, you need to decrease your MTU. Keep reading to see how you can identify and resolve MTU issues on Nokia routers running SROS.

Below is the configuration of SR (the router under our administrative control):

configure
    system
        name "SR"
    exit
    card 1
        card-type iom3-xp-b
        mda 1
            mda-type m5-1gb-sfp-b
            no shutdown
        exit
        no shutdown
    exit
    port 1/1/1
        ethernet
            mode access
        exit
        no shutdown
    exit
    port 1/1/2
        ethernet
            mode access
        exit
        no shutdown
    exit
#--------------------------------------------------
echo "Router (Network Side) Configuration"
#--------------------------------------------------
    router Base
        interface "system"
            address 1.1.1.1/32
            no shutdown
        exit
#--------------------------------------------------
echo "OSPFv2 Configuration"
#--------------------------------------------------
        ospf 0
            area 0.0.0.0
                interface "system"
                    no shutdown
                exit
            exit
            no shutdown
        exit
    exit

#--------------------------------------------------
echo "Service Configuration"
#--------------------------------------------------
    service
        customer 1 create
            description "Default customer"
        exit
        ies 100 customer 1 create
            description "PEER1"
            interface "PEER1" create
                address 10.1.2.1/27
                sap 1/1/1 create
                exit
            exit
            no shutdown
        exit
        ies 200 customer 1 create
            description "PEER2"
            interface "PEER2" create
                address 10.1.3.1/27
                sap 1/1/2 create
                exit
            exit
            no shutdown
        exit
    exit
#--------------------------------------------------
echo "Router (Service Side) Configuration"
#--------------------------------------------------
    router
        ospf 0
            area 0.0.0.1
                interface "PEER1"
                    no shutdown
                exit
                interface "PEER2"
                    no shutdown
                exit
            exit
            no shutdown
        exit
    exit
exit all

One thing to note is that the peer routers are attached to an Internet Enhanced Service (IES) and are not part of the OSPF backbone area. From a stored configuration perspective there is a distinction between core network and customer configurations, but from a protocol perspective things are the same. IES interfaces are bound to Service Access Points (SAPs), and the ports carrying those SAPs must be changed from the default mode of network; in this case we are using access, though hybrid is an option as well.

As this post is about resolving issues, things are obviously not as straightforward as expected.

A:SR# show router route-table

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
1.1.1.1/32                                    Local   Local     00h40m51s  0
       system                                                       0
10.1.2.0/27                                   Local   Local     00h33m32s  0
       PEER1                                                        0
10.1.3.0/27                                   Local   Local     00h34m02s  0
       PEER2                                                        0
-------------------------------------------------------------------------------
No. of Routes: 3
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

So nothing from OSPF is in the routing table. While it’s possible (but unlikely) that our peers aren’t advertising anything, an alternate explanation is a connectivity issue, so we’ll ping each peer router first:
A:SR# ping 10.1.2.2 count 3
PING 10.1.2.2 56 data bytes
64 bytes from 10.1.2.2: icmp_seq=1 ttl=64 time=1.28ms.
64 bytes from 10.1.2.2: icmp_seq=2 ttl=64 time=1.15ms.
64 bytes from 10.1.2.2: icmp_seq=3 ttl=64 time=1.08ms.

---- 10.1.2.2 PING Statistics ----
3 packets transmitted, 3 packets received, 0.00% packet loss
round-trip min = 1.08ms, avg = 1.17ms, max = 1.28ms, stddev = 0.081ms
A:SR# ping 10.1.3.3 count 3
PING 10.1.3.3 56 data bytes
64 bytes from 10.1.3.3: icmp_seq=1 ttl=64 time=1.51ms.
64 bytes from 10.1.3.3: icmp_seq=2 ttl=64 time=1.31ms.
64 bytes from 10.1.3.3: icmp_seq=3 ttl=64 time=1.24ms.

---- 10.1.3.3 PING Statistics ----
3 packets transmitted, 3 packets received, 0.00% packet loss
round-trip min = 1.24ms, avg = 1.35ms, max = 1.51ms, stddev = 0.116ms
Okay, so IP connectivity is established. Let’s check the OSPF interface state:
A:SR# show router ospf interface

===============================================================================
Rtr Base OSPFv2 Instance 0 Interfaces
===============================================================================
If Name               Area Id         Designated Rtr  Bkup Desig Rtr  Adm  Oper
-------------------------------------------------------------------------------
system                0.0.0.0         1.1.1.1         0.0.0.0         Up   DR
PEER1                 0.0.0.1         1.1.1.1         100.100.100.100 Up   DR
PEER2                 0.0.0.1         1.1.1.1         0.0.0.0         Up   DR
-------------------------------------------------------------------------------
No. of OSPF Interfaces: 3
===============================================================================

At first glance PEER1 seems okay, but PEER2 doesn’t have a BDR. Since we are using the default OSPF interface type (broadcast), we would expect to see both a DR and a BDR, so let’s get some more details:
A:SR# show router ospf interface "PEER2" detail

===============================================================================
Rtr Base OSPFv2 Instance 0 Interface "PEER2" (detail)
===============================================================================
-------------------------------------------------------------------------------
Configuration
-------------------------------------------------------------------------------
IP Address       : 10.1.3.1
Area Id          : 0.0.0.1              Priority         : 1
Hello Intrvl     : 10 sec               Rtr Dead Intrvl  : 40 sec
Retrans Intrvl   : 5 sec                Poll Intrvl      : 120 sec
Cfg Metric       : 0                    Advert Subnet    : True
Transit Delay    : 1                    Cfg IF Type      : None
Passive          : False                Cfg MTU          : 0
LSA-filter-out   : None                 Adv Rtr Capab    : Yes
LFA              : Include              LFA NH Template  :
RIB-priority     : None
Auth Type        : None
-------------------------------------------------------------------------------
State
-------------------------------------------------------------------------------
Admin Status     : Enabled              Oper State       : Designated Rtr
Designated Rtr   : 1.1.1.1              Backup Desig Rtr : 0.0.0.0
IF Type          : Broadcast            Network Type     : Stub
Oper MTU         : 1500                 Last Enabled     : 06/02/2017 01:44:23
Oper Metric      : 100                  Bfd Enabled      : No
Te Metric        : 100                  Te State         : Down
Admin Groups     : None
Ldp Sync         : outOfService         Ldp Sync Wait    : Disabled
Ldp Timer State  : Disabled             Ldp Tm Left      : 0
-------------------------------------------------------------------------------
Statistics
-------------------------------------------------------------------------------
Nbr Count        : 0                    If Events        : 2
Tot Rx Packets   : 0                    Tot Tx Packets   : 76
Rx Hellos        : 0                    Tx Hellos        : 76
Rx DBDs          : 0                    Tx DBDs          : 0
Rx LSRs          : 0                    Tx LSRs          : 0
Rx LSUs          : 0                    Tx LSUs          : 0
Rx LS Acks       : 0                    Tx LS Acks       : 0
Retransmits      : 0                    Discards         : 78
Bad Networks     : 0                    Bad Virt Links   : 0
Bad Areas        : 78                   Bad Dest Addrs   : 0
Bad Auth Types   : 0                    Auth Failures    : 0
Bad Neighbors    : 0                    Bad Pkt Types    : 0
Bad Lengths      : 0                    Bad Hello Int.   : 0
Bad Dead Int.    : 0                    Bad Options      : 0
Bad Versions     : 0                    Bad Checksums    : 0
LSA Count        : 0                    LSA Checksum     : 0x0
===============================================================================
Okay, we can see that there are discards which align with the Bad Areas count. This means that PEER2 doesn’t believe it’s part of OSPF Area 1.
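Spotting the one non-zero error counter in that statistics block by eye is easy to get wrong. As a minimal sketch, assuming the output has been captured as text and that the counters keep the two-column "Name : value" layout shown above, the non-zero "Bad ..." counters could be picked out like this (the function name and regular expression are my own, not part of SROS):

```python
import re

# Matches counters such as "Bad Areas        : 78" in the two-column statistics block.
BAD_COUNTER = re.compile(r"(Bad [A-Za-z. ]+?)\s*:\s*(\d+)")


def nonzero_bad_counters(detail_output: str) -> dict:
    """Return only the 'Bad ...' counters whose value is greater than zero."""
    counters = {name: int(value) for name, value in BAD_COUNTER.findall(detail_output)}
    return {name: value for name, value in counters.items() if value > 0}


# A few lines copied from the "show router ospf interface PEER2 detail" output.
sample = """\
Retransmits      : 0                    Discards         : 78
Bad Networks     : 0                    Bad Virt Links   : 0
Bad Areas        : 78                   Bad Dest Addrs   : 0
"""

print(nonzero_bad_counters(sample))
```

Only counters whose names start with "Bad" are considered, so Discards is deliberately left out; the point is to see which specific check is failing.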

Log-id 99 is automatically configured on Nokia SROS devices to capture a number of event messages; however, it can get a bit overwhelming to find something specific. Fortunately there are ways to reduce the output by specifying the application (OSPF) and a string that should appear in the log message itself (PEER2):

A:SR# show log log-id 99 application OSPF message PEER2

===============================================================================
Event Log 99
===============================================================================
Description : Default System Log
Memory Log contents  [size=500   next event=944  (wrapped)]

941 2017/06/02 02:10:15.55 UTC WARNING: OSPF #2043 Base VR:  1 OSPFv2 (0)
"LCL_RTR_ID 1.1.1.1: Conflicting configuration areaMismatch on interface PEER2 from 10.1.3.3 in hello"
So we have identified a problem, an OSPF area mismatch, and now we need to overcome it, remembering that we can’t configure PEER2 (the person who manages it is on a training course and cannot be contacted, while your project manager wants solutions, not problems).

This is where show and debug commands can help identify and resolve issues. SROS is quite powerful with its debugging tools, and while they can be used in production, it is always best to narrow down what you are attempting to collect. First we need to create a debug log if one doesn’t already exist. For this example I’m just logging to a circular memory buffer, but it could go to SNMP, syslog or a file if necessary.

A:SR# configure log log-id 10
*A:SR>config>log>log-id$ from debug-trace
*A:SR>config>log>log-id$ to memory
*A:SR>config>log>log-id$ no shutdown
*A:SR>config>log>log-id$ back
*A:SR>config>log# info
----------------------------------------------
        log-id 10
            from debug-trace
            to memory
            no shutdown
        exit
----------------------------------------------
Now to set up the debug. We know it’s from interface PEER2, and the log message kindly told us the packet type (in hello):
*A:SR>config>log# /debug router ospf packet hello "PEER2"
*A:SR>config>log# show debug
debug
    router "Base"
        ospf
            packet hello "PEER2"
        exit
    exit
exit
Router Base is the global routing table of the router; the debug can reference other routing instances (e.g. a VPRN) if necessary by changing the router. After a few seconds (OSPF hello packets arrive every 10 seconds or so) we can look in log 10 to see what was received:
*A:SR>config>log# show log log-id 10

===============================================================================
Event Log 10
===============================================================================
Description : (Not Specified)
Memory Log contents  [size=100   next event=10  (not wrapped)]

9 2017/06/02 02:19:27.10 UTC MINOR: DEBUG #2001 Base OSPFv2
"OSPFv2: PKT

>> Outgoing OSPF packet on I/F PEER2 area 0.0.0.1
OSPF Version      : 2
Router Id         : 1.1.1.1
Area Id           : 0.0.0.1
Checksum          : ecb9
Auth Type         : Null
Auth Key          : 00 00 00 00 00 00 00 00
Packet Type       : HELLO
Packet Length     : 44 "

8 2017/06/02 02:19:26.55 UTC MINOR: DEBUG #2001 Base OSPFv2
"OSPFv2: PKT DROPPED
area mismatch"

7 2017/06/02 02:19:26.54 UTC MINOR: DEBUG #2001 Base OSPFv2
"OSPFv2: PKT

>> Incoming OSPF packet on I/F PEER2 area 0.0.0.2
OSPF Version      : 2
Router Id         : 200.200.200.200
Area Id           : 0.0.0.2
Checksum          : 5d27
Auth Type         : Null
Auth Key          : 00 00 00 00 00 00 00 00
Packet Type       : HELLO
Packet Length     : 44 "
SR is configured with PEER2 in Area 1 but it should be in Area 2, so let’s fix that:
*A:SR>config>log# /configure router ospf
*A:SR>config>router>ospf# info
----------------------------------------------
            area 0.0.0.0
                interface "system"
                    no shutdown
                exit
            exit
            area 0.0.0.1
                interface "PEER1"
                    no shutdown
                exit
                interface "PEER2"
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
*A:SR>config>router>ospf# area 1 interface "PEER2" shutdown
*A:SR>config>router>ospf# area 1 no interface "PEER2"
*A:SR>config>router>ospf# area 2 interface "PEER2" no shutdown
*A:SR>config>router>ospf# info
----------------------------------------------
            area 0.0.0.0
                interface "system"
                    no shutdown
                exit
            exit
            area 0.0.0.1
                interface "PEER1"
                    no shutdown
                exit
            exit
            area 0.0.0.2
                interface "PEER2"
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
Now see if that fixes that problem.
*A:SR>config>router>ospf# show router ospf interface

===============================================================================
Rtr Base OSPFv2 Instance 0 Interfaces
===============================================================================
If Name               Area Id         Designated Rtr  Bkup Desig Rtr  Adm  Oper
-------------------------------------------------------------------------------
system                0.0.0.0         1.1.1.1         0.0.0.0         Up   DR
PEER1                 0.0.0.1         1.1.1.1         100.100.100.100 Up   DR
PEER2                 0.0.0.2         200.200.200.200 1.1.1.1         Up   BDR
-------------------------------------------------------------------------------
No. of OSPF Interfaces: 3
===============================================================================
Yes, we can now see both the DR and BDR for our OSPF peers. Before we move on, we should stop the debug activity:
*A:SR>config>router>ospf# /debug router no ospf
*A:SR>config>router>ospf# show debug
debug
exit
Now let’s see if OSPF routing exchange is occurring:
*A:SR>config>router>ospf# show router route-table

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
1.1.1.1/32                                    Local   Local     01h15m46s  0
       system                                                       0
10.1.2.0/27                                   Local   Local     01h08m27s  0
       PEER1                                                        0
10.1.3.0/27                                   Local   Local     01h08m57s  0
       PEER2                                                        0
-------------------------------------------------------------------------------
No. of Routes: 3
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
Well, that isn’t fixed yet (which should be no surprise, as this post is about MTU issues), so let’s move on to the next phase and examine the state of our OSPF neighbors:
*A:SR>config>router>ospf# show router ospf neighbor

===============================================================================
Rtr Base OSPFv2 Instance 0 Neighbors
===============================================================================
Interface-Name                   Rtr Id          State      Pri  RetxQ   TTL
   Area-Id
-------------------------------------------------------------------------------
PEER1                            100.100.100.100 ExchStart  1    0       34
   0.0.0.1
PEER2                            200.200.200.200 Exchange   1    0       32
   0.0.0.2
-------------------------------------------------------------------------------
No. of Neighbors: 2
===============================================================================
A router that is stuck in ExchStart or Exchange is a hallmark of OSPF MTU related problems.
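The rule of thumb from the TL;DR can be applied mechanically to the neighbor table. The following is a rough sketch, assuming the table has been captured as text with the six-column row layout shown above; the function name and hint wording are hypothetical, and the mapping is this post's rule of thumb rather than a vendor-documented diagnostic:

```python
# Rule of thumb from this post (not a vendor-documented diagnostic).
HINTS = {
    "ExchStart": "local MTU is probably too small: increase it",
    "Exchange": "local MTU is probably too large: decrease it",
}


def mtu_hints(neighbor_table: str) -> dict:
    """Map interface name -> hint for neighbors stuck in ExchStart or Exchange."""
    hints = {}
    for line in neighbor_table.splitlines():
        fields = line.split()
        # Neighbor rows look like: <interface> <router-id> <state> <pri> <retxq> <ttl>
        if len(fields) == 6 and fields[2] in HINTS:
            hints[fields[0]] = HINTS[fields[2]]
    return hints


# Rows copied from the "show router ospf neighbor" output.
sample = """\
PEER1                            100.100.100.100 ExchStart  1    0       34
   0.0.0.1
PEER2                            200.200.200.200 Exchange   1    0       32
   0.0.0.2
"""

for interface, hint in mtu_hints(sample).items():
    print(interface, "->", hint)
```

A neighbor that is only passing through these states briefly would also be flagged, so the hint is only meaningful when the state persists across several samples.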
Let’s start working on PEER1.
*A:SR>config>router>ospf# show router ospf neighbor "PEER1" detail

===============================================================================
Rtr Base OSPFv2 Instance 0 Neighbors for Interface "PEER1" (detail)
===============================================================================
-------------------------------------------------------------------------------
Neighbor : 10.1.2.2
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Neighbor Rtr Id : 100.100.100.100  Interface: PEER1
-------------------------------------------------------------------------------
Neighbor IP Addr : 10.1.2.2
Local IF IP Addr : 10.1.2.1
Area Id          : 0.0.0.1
Designated Rtr   : 1.1.1.1              Backup Desig Rtr : 100.100.100.100
Neighbor State   : ExchStart            Priority         : 1
Retrans Q Length : 0                    Options          : - E - -  -  - - --
Events           : 1068                 Last Event Time  : 06/02/2017 02:59:34
Up Time          : 0d 01:11:00          Time Before Dead : 38 sec
GR Helper        : Not Helping          GR Helper Age    : 0 sec
GR Exit Reason   : None                 GR Restart Reason: Unknown (0)
Bad Nbr States   : 0                    LSA Inst fails   : 0
Bad Seq Nums     : 0                    Bad MTUs         : 1066
Bad Packets      : 0                    LSA not in LSDB  : 0
Option Mismatches: 0                    Nbr Duplicates   : 0
Num Restarts     : 0                    Last Restart at  : Never
===============================================================================
There are quite a few Bad MTUs being reported. While some vendors have an option to ignore the OSPF MTU, that option is not provided here, because of the number of MTU implications that can occur within the core once you consider the various tunnel options.

Before we start to change things, let’s see if our trusty log 99 says anything about this:

*A:SR>config>router>ospf# show log log-id 99 application OSPF message PEER1

===============================================================================
Event Log 99
===============================================================================
Description : Default System Log
Memory Log contents  [size=500   next event=1473  (wrapped)]

1472 2017/06/02 02:40:17.67 UTC WARNING: OSPF #2043 Base VR:  1 OSPFv2 (0)
"LCL_RTR_ID 1.1.1.1: Conflicting configuration mtuMismatch on interface PEER1 from 10.1.2.2 in dbDescript"
We can use another debug to determine what the actual MTU should be (as before with the area mismatch, log 99 gave us a hint as to the packet type we should be investigating):
*A:SR>config>router>ospf# /debug router ospf packet dbdescr ingress "PEER1"
Clear the log and see what we are receiving:
*A:SR>config>router>ospf# /clear log 10
*A:SR>config>router>ospf# /show log log-id 10

===============================================================================
Event Log 10
===============================================================================
Description : (Not Specified)
Memory Log contents  [size=100   next event=3  (not wrapped)]

2 2017/06/02 02:55:17.67 UTC MINOR: DEBUG #2001 Base OSPFv2
"OSPFv2: PKT DROPPED
MTU mismatch"

1 2017/06/02 02:55:17.67 UTC MINOR: DEBUG #2001 Base OSPFv2
"OSPFv2: PKT

>> Incoming OSPF packet on I/F PEER1 area 0.0.0.1
OSPF Version      : 2
Router Id         : 100.100.100.100
Area Id           : 0.0.0.1
Checksum          : e35a
Auth Type         : Null
Auth Key          : 00 00 00 00 00 00 00 00
Packet Type       : DB_DESC
Packet Length     : 32

Interface MTU     : 1504
Options           : 000042
Flags             : 7   INIT MORE MAST
Sequence Num      : 2514
"
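The drop above follows from the check in RFC 2328: a Database Description packet is rejected when its Interface MTU field is larger than the receiving interface can accept without fragmentation. A minimal sketch of pulling that field out of the captured debug text and applying the comparison is shown below; the function names are hypothetical, and the local value of 1500 is an assumed default for this interface:

```python
import re
from typing import Optional


def dbd_interface_mtu(debug_text: str) -> Optional[int]:
    """Extract the Interface MTU field from a decoded DB_DESC debug entry."""
    match = re.search(r"Interface MTU\s*:\s*(\d+)", debug_text)
    return int(match.group(1)) if match else None


def dbd_accepted(neighbor_mtu: int, local_mtu: int) -> bool:
    """RFC 2328 check: reject a DBD that advertises more than we can receive unfragmented."""
    return neighbor_mtu <= local_mtu


# Lines copied from the debug log entry for PEER1.
sample = """\
Packet Type       : DB_DESC
Packet Length     : 32

Interface MTU     : 1504
"""

LOCAL_MTU = 1500  # assumed local OSPF MTU on the interface
advertised = dbd_interface_mtu(sample)
print(advertised, dbd_accepted(advertised, LOCAL_MTU))
```

The comparison is one-sided, which is why only the router with the smaller MTU counts Bad MTUs while its peer does not.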

Okay, so PEER1 requires an MTU of 1504. Let’s modify that within the OSPF configuration:
*A:SR>config>router>ospf# info
----------------------------------------------
            area 0.0.0.0
                interface "system"
                    no shutdown
                exit
            exit
            area 0.0.0.1
                interface "PEER1"
                    no shutdown
                exit
            exit
            area 0.0.0.2
                interface "PEER2"
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
*A:SR>config>router>ospf# area 1 interface "PEER1" mtu 1504
When applying a configuration it is good to verify things are working as expected:
*A:SR>config>router>ospf# show router ospf interface "PEER1" detail | match MTU
Passive          : False                Cfg MTU          : 1504
Oper MTU         : 1500                 Last Enabled     : 06/02/2017 01:44:23
Although we configured the MTU to be 1504, the operational MTU is still 1500. This is because the IP MTU is 1500, so OSPF can’t be given a larger MTU on this interface:
*A:SR>config>router>ospf# show router interface "PEER1" detail | match MTU
IP MTU           : (default)
IP Oper MTU      : 1500
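The relationship between the configured and operational OSPF MTU seen in these outputs can be written down as a small function. This is inferred from the behaviour observed in this post (a Cfg MTU of 0 yields the IP MTU, and a larger Cfg MTU is capped by the IP MTU), not taken from Nokia documentation, and the function name is hypothetical:

```python
def ospf_oper_mtu(cfg_mtu: int, ip_oper_mtu: int) -> int:
    """Effective OSPF MTU as observed in this post's outputs.

    A cfg_mtu of 0 is treated as "not configured", in which case the
    IP MTU is used; otherwise the IP MTU acts as a ceiling.
    """
    if cfg_mtu == 0:
        return ip_oper_mtu
    return min(cfg_mtu, ip_oper_mtu)


print(ospf_oper_mtu(0, 1500))     # unconfigured OSPF MTU
print(ospf_oper_mtu(1504, 1500))  # configured above the IP MTU
```

In other words, raising the OSPF MTU on its own is not enough; the IP MTU underneath has to allow it.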

When Ethernet ports are configured as mode access and left at the default encapsulation (null), the Ethernet port MTU is 1514 bytes, which supports a 1500 byte IP MTU plus 14 bytes of Ethernet header (the FCS is not included in MTU calculations):
*A:SR>config>router>ospf# show port 1/1/1 | match MTU
Physical Link      : Yes                        MTU              : 1514
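The port MTU arithmetic is simple enough to sketch. This assumes a null-encapsulated access port as described above, where the only overhead is the 14 byte Ethernet header; the helper names are hypothetical:

```python
# Destination MAC (6) + source MAC (6) + EtherType (2); the FCS is not counted.
ETHERNET_HEADER = 14


def port_mtu_for_ip_mtu(ip_mtu: int) -> int:
    """Port MTU needed on a null-encapsulated access port for the given IP MTU."""
    return ip_mtu + ETHERNET_HEADER


def ip_mtu_for_port_mtu(port_mtu: int) -> int:
    """Largest IP MTU that fits within the given port MTU (null encapsulation)."""
    return port_mtu - ETHERNET_HEADER


print(port_mtu_for_ip_mtu(1500))  # the default port MTU shown above
print(port_mtu_for_ip_mtu(1504))  # what a 1504 byte IP MTU would need
```

Other encapsulations add their own overhead on top of this, so the constant would need adjusting for tagged ports.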
To get a 1504 byte IP MTU, we can just add 4 bytes to the port’s Ethernet MTU:
*A:SR>config>router>ospf# /configure port 1/1/1 ethernet mtu 1518
*A:SR>config>router>ospf# show router interface "PEER1" detail | match MTU
IP MTU           : (default)
IP Oper MTU      : 1504
*A:SR>config>router>ospf# show router ospf interface "PEER1" detail | match MTU
Passive          : False                Cfg MTU          : 1504
Oper MTU         : 1504                 Last Enabled     : 06/02/2017 01:44:23
This should mean that the OSPF neighbor will now perform the database exchange and enter the Full state.
*A:SR>config>router>ospf# show router ospf neighbor "PEER1"

===============================================================================
Rtr Base OSPFv2 Instance 0 Neighbors for Interface "PEER1"
===============================================================================
Interface-Name                   Rtr Id          State      Pri  RetxQ   TTL
   Area-Id
-------------------------------------------------------------------------------
PEER1                            100.100.100.100 Full       1    0       34
   0.0.0.1
-------------------------------------------------------------------------------
No. of Neighbors: 1
===============================================================================
*A:SR>config>router>ospf# show router route-table

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
1.1.1.1/32                                    Local   Local     02h10m36s  0
       system                                                       0
10.1.2.0/27                                   Local   Local     02h03m16s  0
       PEER1                                                        0
10.1.3.0/27                                   Local   Local     02h03m46s  0
       PEER2                                                        0
100.100.100.100/32                            Remote  OSPF      00h03m18s  10
       10.1.2.2                                                     100
-------------------------------------------------------------------------------
No. of Routes: 4
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
The OSPF issue with PEER1 appears to have been resolved so back to PEER2.

*A:SR>config>router>ospf# show router ospf neighbor "PEER2" detail

===============================================================================
Rtr Base OSPFv2 Instance 0 Neighbors for Interface "PEER2" (detail)
===============================================================================
-------------------------------------------------------------------------------
Neighbor : 10.1.3.3
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Neighbor Rtr Id : 200.200.200.200  Interface: PEER2
-------------------------------------------------------------------------------
Neighbor IP Addr : 10.1.3.3
Local IF IP Addr : 10.1.3.1
Area Id          : 0.0.0.2
Designated Rtr   : 200.200.200.200      Backup Desig Rtr : 1.1.1.1
Neighbor State   : Exchange             Priority         : 1
Retrans Q Length : 3                    Options          : - E - -  -  - O --
Events           : 3                    Last Event Time  : 06/02/2017 02:22:36
Up Time          : 0d 01:01:10          Time Before Dead : 37 sec
GR Helper        : Not Helping          GR Helper Age    : 0 sec
GR Exit Reason   : None                 GR Restart Reason: Unknown (0)
Bad Nbr States   : 0                    LSA Inst fails   : 0
Bad Seq Nums     : 0                    Bad MTUs         : 0
Bad Packets      : 0                    LSA not in LSDB  : 0
Option Mismatches: 0                    Nbr Duplicates   : 917
Num Restarts     : 0                    Last Restart at  : Never
===============================================================================
There are no Bad MTUs being reported here; all we can see is that we are forever in the Exchange state. Let’s check log 99 to see if anything at all related to PEER2 is present:
*A:SR>config>router>ospf# show log log-id 99 message PEER2

===============================================================================
Event Log 99
===============================================================================
Description : Default System Log
Memory Log contents  [size=500   next event=2033  (wrapped)]
There is nothing present (the older events have wrapped around since we are only keeping the last 500 events)
What I have found is that when an OSPF neighbor is stuck in ExchStart, your router is the one whose MTU is too small, whereas when a neighbor is stuck in Exchange, your router is the one whose MTU is too big for its peer.
To work out what the smaller MTU should be, we’ll send ping packets of various lengths to find the biggest unfragmented packet that can be sent to PEER2. Note: when we send a ping and specify the size, we are actually specifying the ICMP payload size, so for IPv4 we need to account for the 20 byte IP header and the 8 byte ICMP header. An IP interface with an IP MTU of 1500 would therefore pass a ping with a payload size of 1472 but fail at 1473.
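The header arithmetic can be captured in two small helpers. This is a sketch assuming plain IPv4 with no IP options (a 20 byte header); the function names are my own:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(ip_mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return ip_mtu - IPV4_HEADER - ICMP_HEADER


def ip_mtu_from_payload(payload: int) -> int:
    """IP packet size produced by a ping with the given ICMP payload."""
    return payload + IPV4_HEADER + ICMP_HEADER


print(max_ping_payload(1500))
print(ip_mtu_from_payload(1472))
```

The second helper is the one that matters when working backwards from the largest ping that succeeds to the IP MTU of the peer.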
We can test this concept on a known quantity: PEER1 has an IP MTU of 1504, so a ping payload of 1476 should get through but 1477 should fail. Make sure to set the DF bit!
*A:SR>config>router>ospf# ping 10.1.2.2 size 1476 do-not-fragment count 3
PING 10.1.2.2 1476 data bytes
1484 bytes from 10.1.2.2: icmp_seq=1 ttl=64 time=1.34ms.
1484 bytes from 10.1.2.2: icmp_seq=2 ttl=64 time=1.27ms.
1484 bytes from 10.1.2.2: icmp_seq=3 ttl=64 time=1.16ms.

---- 10.1.2.2 PING Statistics ----
3 packets transmitted, 3 packets received, 0.00% packet loss
round-trip min = 1.16ms, avg = 1.26ms, max = 1.34ms, stddev = 0.072ms
*A:SR>config>router>ospf# ping 10.1.2.2 size 1477 do-not-fragment count 3
PING 10.1.2.2 1477 data bytes

---- 10.1.2.2 PING Statistics ----
3 packets transmitted, 3 packets bounced, 0 packets received, 100% packet loss
This works as expected, so the concept appears sound.

*A:SR>config>router>ospf# show router interface "PEER2" detail | match MTU
IP MTU           : (default)
IP Oper MTU      : 1500
We know that we have a ceiling of 1500 and we know the peer’s MTU must be lower than this. But just to be certain, we’ll try a ping sized for a 1500 byte IP packet anyway:
*A:SR>config>router>ospf# ping 10.1.3.3 size 1472 do-not-fragment count 3
PING 10.1.3.3 1472 data bytes
Request timed out. icmp_seq=1.
Request timed out. icmp_seq=2.
Request timed out. icmp_seq=3.

---- 10.1.3.3 PING Statistics ----
3 packets transmitted, 0 packets received, 100% packet loss
Unsurprisingly, the peer MTU is less than 1500 bytes. Let’s try a slightly smaller payload:
*A:SR>config>router>ospf# ping 10.1.3.3 size 1462 do-not-fragment count 3
PING 10.1.3.3 1462 data bytes
1470 bytes from 10.1.3.3: icmp_seq=1 ttl=64 time=1.44ms.
1470 bytes from 10.1.3.3: icmp_seq=2 ttl=64 time=1.36ms.
1470 bytes from 10.1.3.3: icmp_seq=3 ttl=64 time=1.34ms.

---- 10.1.3.3 PING Statistics ----
3 packets transmitted, 3 packets received, 0.00% packet loss
round-trip min = 1.34ms, avg = 1.38ms, max = 1.44ms, stddev = 0.042ms
Okay, time to divide and conquer to determine the largest payload that gets through:
*A:SR>config>router>ospf# ping 10.1.3.3 size 1467 do-not-fragment count 3
PING 10.1.3.3 1467 data bytes
1475 bytes from 10.1.3.3: icmp_seq=1 ttl=64 time=1.24ms.
1475 bytes from 10.1.3.3: icmp_seq=2 ttl=64 time=1.30ms.
1475 bytes from 10.1.3.3: icmp_seq=3 ttl=64 time=2.16ms.

---- 10.1.3.3 PING Statistics ----
3 packets transmitted, 3 packets received, 0.00% packet loss
round-trip min = 1.24ms, avg = 1.57ms, max = 2.16ms, stddev = 0.417ms
*A:SR>config>router>ospf# ping 10.1.3.3 size 1469 do-not-fragment count 3
PING 10.1.3.3 1469 data bytes
Request timed out. icmp_seq=1.
Request timed out. icmp_seq=2.
Request timed out. icmp_seq=3.

---- 10.1.3.3 PING Statistics ----
3 packets transmitted, 0 packets received, 100% packet loss
*A:SR>config>router>ospf# ping 10.1.3.3 size 1468 do-not-fragment count 3
PING 10.1.3.3 1468 data bytes
1476 bytes from 10.1.3.3: icmp_seq=1 ttl=64 time=1.19ms.
1476 bytes from 10.1.3.3: icmp_seq=2 ttl=64 time=1.30ms.
1476 bytes from 10.1.3.3: icmp_seq=3 ttl=64 time=1.26ms.

---- 10.1.3.3 PING Statistics ----
3 packets transmitted, 3 packets received, 0.00% packet loss
round-trip min = 1.19ms, avg = 1.25ms, max = 1.30ms, stddev = 0.043ms
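The manual search above is a bisection over the ICMP payload size. As a rough sketch of the same idea (not something that runs on the router), the Python below bisects between a payload known to work and one known to fail, then converts the result to an IP packet size. The `simulated_probe` function is a hypothetical stand-in for the real `ping ... do-not-fragment`, and its 1496-byte limit is simply the value this walkthrough arrives at.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def ip_packet_size(icmp_payload: int) -> int:
    """Size of the IPv4 packet carrying an ICMP echo with this payload."""
    return icmp_payload + ICMP_HEADER_BYTES + IPV4_HEADER_BYTES


def largest_payload(probe, low: int, high: int) -> int:
    """Bisect for the largest payload for which probe(size) is True.

    Assumes probe(low) succeeds and probe(high) fails.
    """
    while high - low > 1:
        mid = (low + high) // 2
        if probe(mid):
            low = mid   # mid got through, so the answer is mid or larger
        else:
            high = mid  # mid was dropped, so the answer is smaller than mid
    return low


def simulated_probe(payload: int) -> bool:
    """Hypothetical peer that drops any IP packet larger than 1496 bytes."""
    return ip_packet_size(payload) <= 1496


# 1462 worked and 1472 timed out in the pings above.
best = largest_payload(simulated_probe, low=1462, high=1472)
print(best, ip_packet_size(best))
```

Starting from the same two bounds, the sketch happens to probe 1467, 1469 and 1468, which is the sequence tried by hand above.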
An ICMP payload of 1468 bytes fits within an IP packet of 1496 bytes (1468 bytes of payload, plus 8 bytes of ICMP header, plus 20 bytes of IP header). Adjust the OSPF MTU to 1496 and see whether that results in a full adjacency:
*A:SR>config>router>ospf# area 2 interface "PEER2" mtu 1496
*A:SR>config>router>ospf# show router ospf interface "PEER2" detail | match MTU
Passive          : False                Cfg MTU          : 1496
Oper MTU         : 1496                 Last Enabled     : 06/02/2017 02:22:36
*A:SR>config>router>ospf# show router ospf neighbor "PEER2"

===============================================================================
Rtr Base OSPFv2 Instance 0 Neighbors for Interface "PEER2"
===============================================================================
Interface-Name                   Rtr Id          State      Pri  RetxQ   TTL
   Area-Id
-------------------------------------------------------------------------------
PEER2                            200.200.200.200 Full       1    0       35
   0.0.0.2
-------------------------------------------------------------------------------
No. of Neighbors: 1
===============================================================================
The adjacency is up. Let’s see what routes we have learnt:
*A:SR>config>router>ospf# show router route-table

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
1.1.1.1/32                                    Local   Local     02h40m46s  0
       system                                                       0
10.1.2.0/27                                   Local   Local     02h33m27s  0
       PEER1                                                        0
10.1.3.0/27                                   Local   Local     02h33m56s  0
       PEER2                                                        0
100.100.100.100/32                            Remote  OSPF      00h33m29s  10
       10.1.2.2                                                     100
200.200.200.200/32                            Remote  OSPF      00h01m27s  10
       10.1.3.3                                                     100
-------------------------------------------------------------------------------
No. of Routes: 5
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

We have now learnt routes from PEER1 and PEER2, so it is time for a quick data plane verification:
*A:SR>config>router>ospf# ping 100.100.100.100 source 1.1.1.1 count 1
PING 100.100.100.100 56 data bytes
64 bytes from 100.100.100.100: icmp_seq=1 ttl=64 time=1.38ms.

---- 100.100.100.100 PING Statistics ----
1 packet transmitted, 1 packet received, 0.00% packet loss
round-trip min = 1.38ms, avg = 1.38ms, max = 1.38ms, stddev = 0.000ms
*A:SR>config>router>ospf# ping 200.200.200.200 source 1.1.1.1 count 1
PING 200.200.200.200 56 data bytes
64 bytes from 200.200.200.200: icmp_seq=1 ttl=64 time=1.14ms.

---- 200.200.200.200 PING Statistics ----
1 packet transmitted, 1 packet received, 0.00% packet loss
round-trip min = 1.14ms, avg = 1.14ms, max = 1.14ms, stddev = 0.000ms
We now have successful routing exchange and data plane reachability.

The case of Nokia Virtual Service Router and the non-unique Chassis MAC Address

So I’ve been playing with eve-ng and decided to work on a Layer 2 scenario. A few problems with my emulation environment came up which needed a way forward, which resulted in this rambling tale…

SROS 12.0R6 5 Router Topology

R1, R2 and R3 will be the MPLS core with VPLS configured, while R4 and R5 will be Layer 3 CE devices that talk to each other over the VPLS.

The CE devices are pretty straightforward, so we’ll get those up first.

R4 is a single-ended configuration, with interface R5 on port 1/1/1 having the IP 192.168.1.4/27:

configure
    system
        name "R4"
    exit
    card 1
        card-type iom3-xp-b
        mda 1
            mda-type m5-1gb-sfp-b
            no shutdown               
        exit
        no shutdown
    exit
    port 1/1/1
        ethernet
        exit
        no shutdown
    exit
    router 
        interface "R5"
            address 192.168.1.4/27
            port 1/1/1
            no shutdown
        exit
        interface "system"
            no shutdown
        exit
    exit
exit all

R5 is a little more complex: it has a LAG toward the core (R1 and R2), with interface R4 on LAG-1 (ports 1/1/1 and 1/1/2) having the IP 192.168.1.5/27:

configure
    system
        name "R5"
    exit
    card 1
        card-type iom3-xp-b
        mda 1
            mda-type m5-1gb-sfp-b
            no shutdown               
        exit
        no shutdown
    exit
    port 1/1/1
        ethernet
            autonegotiate limited
        exit
        no shutdown
    exit
    port 1/1/2
        ethernet
            autonegotiate limited
        exit
        no shutdown
    exit
    lag 1                             
        port 1/1/1 
        port 1/1/2 
        lacp active administrative-key 32768 
        no shutdown
    exit
    router 
        interface "R4"
            address 192.168.1.5/27
            port lag-1
            no shutdown
        exit
        interface "system"
            no shutdown
        exit                          
    exit
exit all

Multi-speed Ethernet interfaces associated with a LAG must have autonegotiate set to limited, which controls the member port speed so that all bundle members operate at the same speed.

Now to develop the MPLS core configuration on R1, R2 and R3. This is quite straightforward: we are just going to use OSPF and LDP on the directly connected interfaces:

configure
    system
        name "R1"
    exit
    card 1
        card-type iom3-xp-b
        mda 1
            mda-type m5-1gb-sfp-b
            no shutdown               
        exit
        no shutdown
    exit
    port 1/1/1
        ethernet
        exit
        no shutdown
    exit
    port 1/1/2
        ethernet
        exit
        no shutdown
    exit
    port 1/1/3
        shutdown
        ethernet
        exit
    exit
    router 
        interface "R2"
            address 10.1.2.1/27
            port 1/1/1
            no shutdown
        exit
        interface "R3"
            address 10.1.3.1/27
            port 1/1/2
            no shutdown
        exit
        interface "system"
            address 10.10.10.1/32
            no shutdown
        exit
        ospf
            area 0.0.0.0              
                interface "system"
                    no shutdown
                exit
                interface "R2"
                    no shutdown
                exit
                interface "R3"
                    no shutdown
                exit
            exit
        exit
        ldp
            interface-parameters
                interface "R2"
                exit
                interface "R3"
                exit
            exit
            targeted-session
            exit                      
            no shutdown
        exit
    exit
exit all

configure
    system
        name "R2"
    exit
    card 1
        card-type iom3-xp-b
        mda 1
            mda-type m5-1gb-sfp-b
            no shutdown               
        exit
        no shutdown
    exit
    port 1/1/1
        ethernet
        exit
        no shutdown
    exit
    port 1/1/2
        ethernet
        exit
        no shutdown
    exit
    port 1/1/3
        shutdown
        ethernet
        exit
    exit
    router 
        interface "R1"
            address 10.1.2.2/27
            port 1/1/1
            no shutdown
        exit
        interface "R3"
            address 10.2.3.2/27
            port 1/1/2
            no shutdown
        exit
        interface "system"
            address 10.10.10.2/32
            no shutdown
        exit
        ospf
            area 0.0.0.0              
                interface "system"
                    no shutdown
                exit
                interface "R1"
                    no shutdown
                exit
                interface "R3"
                    no shutdown
                exit
            exit
        exit
        ldp
            interface-parameters
                interface "R1"
                exit
                interface "R3"
                exit
            exit
            targeted-session
            exit                      
            no shutdown
        exit
    exit
exit all

configure
    system
        name "R3"
    exit
    card 1
        card-type iom3-xp-b
        mda 1
            mda-type m5-1gb-sfp-b
            no shutdown               
        exit
        no shutdown
    exit
    port 1/1/1
        ethernet
        exit
        no shutdown
    exit
    port 1/1/2
        ethernet
        exit
        no shutdown
    exit
    port 1/1/3
        shutdown
        ethernet
        exit
    exit
    router 
        interface "R1"
            address 10.1.3.3/27
            port 1/1/2
            no shutdown
        exit
        interface "R2"
            address 10.2.3.3/27
            port 1/1/3
            no shutdown
        exit
        interface "system"
            address 10.10.10.3/32
            no shutdown
        exit
        ospf
            area 0.0.0.0              
                interface "system"
                    no shutdown
                exit
                interface "R1"
                    no shutdown
                exit
                interface "R2"
                    no shutdown
                exit
            exit
        exit
        ldp
            interface-parameters
                interface "R1"
                exit
                interface "R2"
                exit
            exit
            targeted-session
            exit                      
            no shutdown
        exit
    exit
exit all

The Layer 2 service that we are going to build is a VPLS using spoke-SDPs connected to each adjacent router (an alternative would be a full mesh, but I specifically want to test STP operation here):

*A:R1>config>service# info 
----------------------------------------------
        sdp 2 mpls create
            far-end 10.10.10.2
            ldp
            keep-alive
                shutdown
            exit
            no shutdown
        exit
        sdp 3 mpls create
            far-end 10.10.10.3
            ldp
            keep-alive
                shutdown
            exit
            no shutdown
        exit

*A:R2>config>service# info 
----------------------------------------------
        sdp 1 mpls create
            far-end 10.10.10.1
            ldp
            keep-alive
                shutdown
            exit
            no shutdown
        exit
        sdp 3 mpls create
            far-end 10.10.10.3
            ldp
            keep-alive
                shutdown
            exit
            no shutdown
        exit

*A:R3>config>service# info 
----------------------------------------------
        sdp 1 mpls create
            far-end 10.10.10.1
            ldp
            keep-alive
                shutdown
            exit
            no shutdown
        exit
        sdp 2 mpls create
            far-end 10.10.10.2
            ldp
            keep-alive
                shutdown
            exit
            no shutdown
        exit

Verifying the SDPs are up:

A:R1# show service sdp 

============================================================================
Services: Service Destination Points
============================================================================
SdpId  AdmMTU  OprMTU  Far End          Adm  Opr         Del     LSP   Sig
----------------------------------------------------------------------------
2      0       8914    10.10.10.2       Up   Up          MPLS    L     TLDP
3      0       8914    10.10.10.3       Up   Up          MPLS    L     TLDP
----------------------------------------------------------------------------
Number of SDPs : 2
----------------------------------------------------------------------------
Legend: R = RSVP, L = LDP, B = BGP, M = MPLS-TP, n/a = Not Applicable
============================================================================

A:R2# show service sdp 

============================================================================
Services: Service Destination Points
============================================================================
SdpId  AdmMTU  OprMTU  Far End          Adm  Opr         Del     LSP   Sig
----------------------------------------------------------------------------
1      0       8914    10.10.10.1       Up   Up          MPLS    L     TLDP
3      0       8914    10.10.10.3       Up   Up          MPLS    L     TLDP
----------------------------------------------------------------------------
Number of SDPs : 2
----------------------------------------------------------------------------
Legend: R = RSVP, L = LDP, B = BGP, M = MPLS-TP, n/a = Not Applicable
============================================================================

A:R3# show service sdp 

============================================================================
Services: Service Destination Points
============================================================================
SdpId  AdmMTU  OprMTU  Far End          Adm  Opr         Del     LSP   Sig
----------------------------------------------------------------------------
1      0       8914    10.10.10.1       Up   Up          MPLS    L     TLDP
2      0       8914    10.10.10.2       Up   Up          MPLS    L     TLDP
----------------------------------------------------------------------------
Number of SDPs : 2
----------------------------------------------------------------------------
Legend: R = RSVP, L = LDP, B = BGP, M = MPLS-TP, n/a = Not Applicable
============================================================================

With the transport infrastructure in place, VPLS 100 can be set up, for now without the customer access components:

*A:R1>config>service>vpls$ pwc 
-------------------------------------------------------------------------------
Present Working Context :
-------------------------------------------------------------------------------
 <root>
  configure 
  service 
  vpls "100" customer 1 create 
-------------------------------------------------------------------------------
A:R1>config>service>vpls$ info 
----------------------------------------------
            stp
                no shutdown
            exit
            spoke-sdp 2:100 create
                no shutdown
            exit
            spoke-sdp 3:100 create
                no shutdown
            exit
            no shutdown

*A:R2>config>service>vpls$ pwc 
-------------------------------------------------------------------------------
Present Working Context :
-------------------------------------------------------------------------------
 <root>
  configure 
  service 
  vpls "100" customer 1 create 
-------------------------------------------------------------------------------
A:R2>config>service>vpls$ info 
----------------------------------------------
            stp
                no shutdown
            exit
            spoke-sdp 1:100 create
                no shutdown
            exit
            spoke-sdp 3:100 create
                no shutdown
            exit
            no shutdown

*A:R3>config>service>vpls$ pwc 
-------------------------------------------------------------------------------
Present Working Context :
-------------------------------------------------------------------------------
 <root>
  configure 
  service 
  vpls "100" customer 1 create 
-------------------------------------------------------------------------------
A:R3>config>service>vpls$ info 
----------------------------------------------
            stp
                no shutdown
            exit
            spoke-sdp 1:100 create
                no shutdown
            exit
            spoke-sdp 2:100 create
                no shutdown
            exit
            no shutdown

Verify that VPLS 100 is up and running:

*A:R1# show service id 100 base | match Ident post-lines 3 
Identifier                               Type         AdmMTU  OprMTU  Adm  Opr
-------------------------------------------------------------------------------
sdp:2:100 S(10.10.10.2)                  Spok         0       8914    Up   Up
sdp:3:100 S(10.10.10.3)                  Spok         0       8914    Up   Up

A:R2# show service id 100 base | match Ident post-lines 3 
Identifier                               Type         AdmMTU  OprMTU  Adm  Opr
-------------------------------------------------------------------------------
sdp:1:100 S(10.10.10.1)                  Spok         0       8914    Up   Up
sdp:3:100 S(10.10.10.3)                  Spok         0       8914    Up   Up

A:R3# show service id 100 base | match Ident post-lines 3 
Identifier                               Type         AdmMTU  OprMTU  Adm  Opr
-------------------------------------------------------------------------------
sdp:1:100 S(10.10.10.1)                  Spok         0       8914    Up   Up
sdp:2:100 S(10.10.10.2)                  Spok         0       8914    Up   Up

Looks good. With three routers each connecting to the other two using spoke-SDPs, we have introduced a bridging loop, so we need a loop avoidance mechanism. Luckily we enabled STP, so let’s see how STP is behaving.
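To see why the three spokes form a loop, it helps to treat each router as a node and each spoke pseudowire as an undirected link. The Python below is only an illustrative sketch of that reasoning using a union-find cycle check. The `has_loop` helper and the link list are my own modelling of the topology, not anything SROS exposes.

```python
def has_loop(links):
    """Return True if the undirected links contain a cycle (union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # a and b were already connected, so this link closes a loop
            return True
        parent[root_a] = root_b
    return False


# One entry per spoke pseudowire between a pair of routers in VPLS 100.
spoke_pseudowires = [("R1", "R2"), ("R1", "R3"), ("R2", "R3")]

print(has_loop(spoke_pseudowires))      # all three spokes: a triangle
print(has_loop(spoke_pseudowires[:2]))  # with one spoke blocked
```

With all three spokes the check reports a loop. With any one of them removed it does not, which is the state STP should converge to by putting a single port into discarding. The per-router STP state follows.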

*A:R1# show service id 100 stp                        

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id          : 80:00.da:00:ff:00:00:01  Top. Change Count : 4
Root Bridge        : This Bridge              Stp Oper State    : Up
Primary Bridge     : N/A                      Topology Change   : Inactive
Mode               : Rstp                     Last Top. Change  : 0d 00:10:13
Vcp Active Prot.   : N/A                      
Root Port          : N/A                      External RPC      : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id     Oper-     Port-      Port-       Port-  Oper-  Link-  Active
                   State     Role       State       Num    Edge   Type   Prot.
-------------------------------------------------------------------------------
2:100              Up        Designated Forward     2049   True   Pt-pt  Rstp
3:100              Up        Backup     Discard     2050   False  Pt-pt  Rstp
===============================================================================

*A:R2# show service id 100 stp 

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id          : 80:00.da:00:ff:00:00:01  Top. Change Count : 3
Root Bridge        : This Bridge              Stp Oper State    : Up
Primary Bridge     : N/A                      Topology Change   : Inactive
Mode               : Rstp                     Last Top. Change  : 0d 00:10:47
Vcp Active Prot.   : N/A                      
Root Port          : N/A                      External RPC      : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id     Oper-     Port-      Port-       Port-  Oper-  Link-  Active
                   State     Role       State       Num    Edge   Type   Prot.
-------------------------------------------------------------------------------
1:100              DwnstrmLp Designated Discard     2049   False  Pt-pt  Rstp
3:100              Up        Backup     Discard     2050   False  Pt-pt  Rstp
===============================================================================

*A:R3# show service id 100 stp 

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id          : 80:00.da:00:ff:00:00:01  Top. Change Count : 3
Root Bridge        : This Bridge              Stp Oper State    : Up
Primary Bridge     : N/A                      Topology Change   : Inactive
Mode               : Rstp                     Last Top. Change  : 0d 00:10:54
Vcp Active Prot.   : N/A                      
Root Port          : N/A                      External RPC      : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id     Oper-     Port-      Port-       Port-  Oper-  Link-  Active
                   State     Role       State       Num    Edge   Type   Prot.
-------------------------------------------------------------------------------
1:100              Up        Designated Forward     2048   False  Pt-pt  Rstp
2:100              Up        Designated Forward     2049   False  Pt-pt  Rstp
===============================================================================

This doesn’t seem right: SDP 1:100 on R2 reports that the downstream interface is looped, and both of R2’s spoke-SDPs are discarding!

If we look at the Bridge Id line in each router’s output, we notice that all routers in the VPLS have the same Bridge ID (80:00.da:00:ff:00:00:01) and each believes it is the root bridge, which is definitely a bad thing.

For SROS, the Bridge Id is partly derived from the chassis MAC address:

*A:R1# show chassis detail | match MAC  
  Base MAC address                  : da:00:ff:00:00:01

*A:R2# show chassis detail | match MAC  
  Base MAC address                  : da:00:ff:00:00:01

*A:R3# show chassis detail | match MAC  
  Base MAC address                  : da:00:ff:00:00:01

With real hardware, the chassis MAC address is actually unique, so this problem won’t come up. With the VSRs, however, they’re all the same.

As an aside, the chassis MAC address is used in a few places besides STP. One of them is the SNMP engine ID:

*A:R1# show chassis detail | match MAC      
  Base MAC address                  : da:00:ff:00:00:01
*A:R1# show system information | match Engine 
SNMP Engine ID         : 0000197f0000da00ff000001
SNMP Engine Boots      : 11
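
The relationship between the two values can be checked mechanically. This small Python sketch only compares the strings shown in the output above. The `split_engine_id` helper is hypothetical, and the leading 12 hex digits of the engine ID are left uninterpreted.

```python
def split_engine_id(engine_id: str, chassis_mac: str):
    """Split an engine ID into (prefix, tail, mac_hex).

    tail is the trailing part of the engine ID with the same length
    as the MAC address written without colons.
    """
    mac_hex = chassis_mac.replace(":", "").lower()
    engine_id = engine_id.lower()
    return engine_id[: -len(mac_hex)], engine_id[-len(mac_hex):], mac_hex


prefix, tail, mac_hex = split_engine_id(
    "0000197f0000da00ff000001",  # SNMP Engine ID from the output above
    "da:00:ff:00:00:01",         # Base MAC address from the output above
)
print(prefix)
print(tail == mac_hex)
```

The tail of the engine ID matches the chassis MAC, so VSRs that share a chassis MAC will also share a default engine ID.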

It is possible within the configuration to manually set the Engine ID (I think it would probably be best to do this in production, just in case you end up replacing faulty hardware).

With SROS version 14.0R4 a new option for the boot options file (or bof) was introduced which allows the manual setting of the chassis MAC address (followed by a reboot):

*A:R14# bof system-base-mac 00:11:22:33:44:02 
*A:R14# bof save 
Writing BOF to cf3:/bof.cfg ... OK
Completed.
Writing configuration to cf3:\config.cfg
Saving configuration ... OK
Completed.
A:R14# /admin reboot 
Are you sure you want to reboot (y/n)? y

Which is great, but this particular setup is using SROS 12.0R6, where that BOF option doesn’t exist, so an alternate method is required.

For STP we can cast our mind back to remember what the Bridge ID consists of… It’s both the Priority (which by default is 32768) and the Bridge MAC address.

So as a quick and nasty fix, I should be able to change the STP priority in VPLS 100 on R1/R2/R3 and resolve the STP problem. It will also allow me to specifically select a root bridge, which is probably a good thing to do:
*A:R1# configure service vpls 100 stp priority 4096
*A:R2# configure service vpls 100 stp priority 8192
*A:R3# configure service vpls 100 stp priority 16384
Let’s see how things are going now:

*A:R1# show service id 100 stp 

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id          : 10:00.da:00:ff:00:00:01  Top. Change Count : 6
Root Bridge        : This Bridge              Stp Oper State    : Up
Primary Bridge     : N/A                      Topology Change   : Inactive
Mode               : Rstp                     Last Top. Change  : 0d 00:00:35
Vcp Active Prot.   : N/A                      
Root Port          : N/A                      External RPC      : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id     Oper-     Port-      Port-       Port-  Oper-  Link-  Active
                   State     Role       State       Num    Edge   Type   Prot.
-------------------------------------------------------------------------------
2:100              Up        Designated Forward     2049   False  Pt-pt  Rstp
3:100              Up        Designated Forward     2050   False  Pt-pt  Rstp
===============================================================================

*A:R2# show service id 100 stp 

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id          : 20:00.da:00:ff:00:00:01  Top. Change Count : 4
Root Bridge        : 10:00.da:00:ff:00:00:01  Stp Oper State    : Up
Primary Bridge     : N/A                      Topology Change   : Inactive
Mode               : Rstp                     Last Top. Change  : 0d 00:01:07
Vcp Active Prot.   : N/A                      
Root Port          : 2049                     External RPC      : 10

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id     Oper-     Port-      Port-       Port-  Oper-  Link-  Active
                   State     Role       State       Num    Edge   Type   Prot.
-------------------------------------------------------------------------------
1:100              Up        Root       Forward     2049   False  Pt-pt  Rstp
3:100              Up        Designated Forward     2050   False  Pt-pt  Rstp
===============================================================================

*A:R3# show service id 100 stp 

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id          : 40:00.da:00:ff:00:00:01  Top. Change Count : 4
Root Bridge        : 10:00.da:00:ff:00:00:01  Stp Oper State    : Up
Primary Bridge     : N/A                      Topology Change   : Inactive
Mode               : Rstp                     Last Top. Change  : 0d 00:01:52
Vcp Active Prot.   : N/A                      
Root Port          : 2048                     External RPC      : 10

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id     Oper-     Port-      Port-       Port-  Oper-  Link-  Active
                   State     Role       State       Num    Edge   Type   Prot.
-------------------------------------------------------------------------------
1:100              Up        Root       Forward     2048   False  Pt-pt  Rstp
2:100              Up        Alternate  Discard     2049   False  Pt-pt  Rstp
===============================================================================

Success: all routers have different Bridge IDs, all agree that R1 is the root, and only one port is in the discarding state.
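
The Bridge IDs in the outputs above follow directly from the priority and the shared chassis MAC. As a rough sketch, the Python below formats a 16-bit priority and a MAC the way the `show service id 100 stp` output prints them, and picks the root as the lowest Bridge ID. The `bridge_id` helper is hypothetical and only mirrors the display format seen in this post.

```python
SHARED_MAC = "da:00:ff:00:00:01"  # chassis MAC reported by all three VSRs


def bridge_id(priority: int, mac: str) -> str:
    """Format priority + MAC as printed in the STP output, e.g. 80:00.<mac>."""
    if not 0 <= priority <= 0xFFFF:
        raise ValueError("priority must fit in 16 bits")
    high, low = divmod(priority, 256)
    return f"{high:02x}:{low:02x}.{mac.lower()}"


# Default priority everywhere: every router ends up with the same Bridge ID.
defaults = {name: bridge_id(32768, SHARED_MAC) for name in ("R1", "R2", "R3")}
print(defaults["R1"], len(set(defaults.values())))

# Distinct priorities as configured above: the Bridge IDs now differ.
tuned = {
    "R1": bridge_id(4096, SHARED_MAC),
    "R2": bridge_id(8192, SHARED_MAC),
    "R3": bridge_id(16384, SHARED_MAC),
}
# The IDs are fixed-width lowercase hex, so string order matches numeric order.
root = min(tuned, key=tuned.get)
print(tuned, root)
```

With the default priority the set of Bridge IDs collapses to a single value, which is the broken state seen earlier. With distinct priorities R1 has the lowest ID and becomes the root.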

Now we will create the CE router attachments (Service Access Points) on the core, starting with R3, which faces R4. By default Ethernet ports are in network mode; to bind a port to a service, it must be in access (or hybrid) mode:

*A:R3# /configure port 1/1/1     
*A:R3>config>port# shutdown 
*A:R3>config>port# ethernet mode access 
*A:R3>config>port# ethernet encap-type null 
*A:R3>config>port# no shutdown 
*A:R3>config>port# /configure service vpls 100 
*A:R3>config>service>vpls# sap 1/1/1 create 
*A:R3>config>service>vpls>sap$ show service id 100 base

===============================================================================
Service Basic Information
===============================================================================
Service Id        : 100                 Vpn Id            : 0
Service Type      : VPLS                
Name              : (Not Specified)
Description       : (Not Specified)
Customer Id       : 1                   Creation Origin   : manual
Last Status Change: 04/21/2017 13:20:28 
Last Mgmt Change  : 04/21/2017 13:44:59 
Etree Mode        : Disabled            
Admin State       : Up                  Oper State        : Up
MTU               : 1514                Def. Mesh VC Id   : 100
SAP Count         : 1                   SDP Bind Count    : 2
Snd Flush on Fail : Disabled            Host Conn Verify  : Disabled
Propagate MacFlush: Disabled            Per Svc Hashing   : Disabled
Allow IP Intf Bind: Disabled            
Def. Gateway IP   : None                
Def. Gateway MAC  : None                
Temp Flood Time   : Disabled            Temp Flood        : Inactive
Temp Flood Chg Cnt: 0                   
VSD Domain        : <none>            
 
-------------------------------------------------------------------------------
Service Access & Destination Points
-------------------------------------------------------------------------------
Identifier                               Type         AdmMTU  OprMTU  Adm  Opr
-------------------------------------------------------------------------------
sap:1/1/1                                null         1514    1514    Up   Up
sdp:1:100 S(10.10.10.1)                  Spok         0       8914    Up   Up
sdp:2:100 S(10.10.10.2)                  Spok         0       8914    Up   Up
===============================================================================

Now things are going to get a little more complicated on R1 and R2, as we are going to establish a Multi-Chassis LAG towards R5. R5 is unaware of the MC-LAG; it is just talking LACP to R1 and R2, thinking they are one system. R1 and R2 need to synchronise with each other to set up the Active-Standby LAG.

We’ll start by creating a regular LAG 1 facing R5 on R1 and R2, with a single port in each:

*A:R1# /configure port 1/1/3 shutdown                          
*A:R1# /configure port 1/1/3 ethernet mode access 
*A:R1# /configure port 1/1/3 ethernet encap-type null 
*A:R1# /configure port 1/1/3 ethernet autonegotiate limited 
*A:R1# /configure port 1/1/3 no shutdown                    
*A:R1# /configure lag 1 
*A:R1>config>lag$ mode access 
*A:R1>config>lag$ lacp active 
*A:R1>config>lag$ port 1/1/3 
*A:R1>config>lag$ no shutdown

*A:R2# /configure port 1/1/3 shutdown                          
*A:R2# /configure port 1/1/3 ethernet mode access 
*A:R2# /configure port 1/1/3 ethernet encap-type null 
*A:R2# /configure port 1/1/3 ethernet autonegotiate limited 
*A:R2# /configure port 1/1/3 no shutdown                    
*A:R2# /configure lag 1 
*A:R2>config>lag$ mode access 
*A:R2>config>lag$ lacp active 
*A:R2>config>lag$ port 1/1/3 
*A:R2>config>lag$ no shutdown

Now to set up MC-LAG we need to set up a multi-chassis peering between R1 and R2 (multi-chassis redundancy supports more than just MC-LAG):

*A:R1>config>lag# /configure redundancy multi-chassis peer 10.10.10.2 create
*A:R1>config>redundancy>multi-chassis>peer# no shutdown

*A:R2>config>lag# /configure redundancy multi-chassis peer 10.10.10.1 create 
*A:R2>config>redundancy>multi-chassis>peer# no shutdown

Then we create the MC-LAG itself. The lacp-key, system-id and system-priority must be the same on each router:

*A:R1>config>redundancy>multi-chassis>peer# mc-lag
*A:R1>config>redundancy>mc>peer>mc-lag# lag 1 lacp-key 2468 remote-lag 1 system-id 00:00:be:ef:ca:fe system-priority 1000
*A:R1>config>redundancy>mc>peer>mc-lag# no shutdown

*A:R2>config>redundancy>multi-chassis>peer# mc-lag
*A:R2>config>redundancy>mc>peer>mc-lag# lag 1 lacp-key 2468 remote-lag 1 system-id 00:00:be:ef:ca:fe system-priority 1000
*A:R2>config>redundancy>mc>peer>mc-lag# no shutdown
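
Since the lacp-key, system-id and system-priority have to match on both routers, it can be handy to compare the two `lag ...` lines before going further. The following is a minimal sketch in Python, not an SROS tool: the helper names `parse_mc_lag` and `mc_lag_mismatches` are hypothetical, and the regular expression assumes the parameters appear in the same order as in the commands above.

```python
import re

# Assumes the parameter order used in the commands above.
MC_LAG_RE = re.compile(
    r"lag\s+(?P<lag_id>\d+)\s+lacp-key\s+(?P<lacp_key>\d+)\s+"
    r"remote-lag\s+(?P<remote_lag>\d+)\s+"
    r"system-id\s+(?P<system_id>[0-9a-fA-F:]+)\s+"
    r"system-priority\s+(?P<system_priority>\d+)"
)


def parse_mc_lag(line):
    """Extract the MC-LAG parameters from one 'lag ...' config line."""
    match = MC_LAG_RE.search(line)
    if match is None:
        raise ValueError(f"not an mc-lag line: {line!r}")
    params = match.groupdict()
    params["system_id"] = params["system_id"].lower()  # compare MACs case-insensitively
    return params


def mc_lag_mismatches(line_a, line_b):
    """Return the names of the must-match parameters that differ."""
    a, b = parse_mc_lag(line_a), parse_mc_lag(line_b)
    return [k for k in ("lacp_key", "system_id", "system_priority") if a[k] != b[k]]


# The two lines entered on R1 and R2 above.
r1 = "lag 1 lacp-key 2468 remote-lag 1 system-id 00:00:be:ef:ca:fe system-priority 1000"
r2 = "lag 1 lacp-key 2468 remote-lag 1 system-id 00:00:be:ef:ca:fe system-priority 1000"

print(mc_lag_mismatches(r1, r2))  # []
```

An empty list means the three shared parameters agree; any name in the list points at the value to fix.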

Now the MC-LAG should be up and running. First we’ll check the peering:

*A:R1>config>redundancy>mc>peer>mc-lag# show redundancy multi-chassis all 

===============================================================================
Multi-Chassis Peers
===============================================================================
Peer IP          Peer Admin      Client    Admin        Oper         State
 Src IP           Auth                                               
-------------------------------------------------------------------------------
10.10.10.2       Enabled         MC-Sync:  --           --           --
 10.10.10.1       None           MC-Ring:  --           --           --
                                 MC-Endpt: --           --           --
                                 MC-Lag:   Enabled      Enabled      --
                                 MC-IPsec: --           --           Disabled
===============================================================================

*A:R2>config>redundancy>mc>peer>mc-lag# show redundancy multi-chassis all 

===============================================================================
Multi-Chassis Peers
===============================================================================
Peer IP          Peer Admin      Client    Admin        Oper         State
 Src IP           Auth                                               
-------------------------------------------------------------------------------
10.10.10.1       Enabled         MC-Sync:  --           --           --
 10.10.10.2       None           MC-Ring:  --           --           --
                                 MC-Endpt: --           --           --
                                 MC-Lag:   Enabled      Enabled      --
                                 MC-IPsec: --           --           Disabled
===============================================================================

Looks promising. Let’s check our LAG status:
*A:R1>config>redundancy>mc>peer>mc-lag# show lag 

===============================================================================
Lag Data
===============================================================================
Lag-id         Adm     Opr     Weighted Threshold Up-Count MC Act/Stdby
-------------------------------------------------------------------------------
1              up      down    No       0         0        standby
-------------------------------------------------------------------------------
Total Lag-ids: 1       Single Chassis: 0        MC Act: 0       MC Stdby: 1
===============================================================================

*A:R2>config>redundancy>mc>peer>mc-lag# show lag 

===============================================================================
Lag Data
===============================================================================
Lag-id         Adm     Opr     Weighted Threshold Up-Count MC Act/Stdby
-------------------------------------------------------------------------------
1              up      down    No       0         0        standby
-------------------------------------------------------------------------------
Total Lag-ids: 1       Single Chassis: 0        MC Act: 0       MC Stdby: 1
===============================================================================

Ummm… both routers report that they are in Multi-Chassis Standby, so neither LAG is operationally up.
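
This "everyone is standby" condition is easy to miss when reading the two outputs separately. As a rough sketch (the function names `lag_states` and `all_standby` are hypothetical, and the parser assumes the seven-column `show lag` row layout shown above), the check could be automated in Python like this:

```python
def lag_states(show_lag_output):
    """Map each LAG id to (oper_state, mc_state) from 'show lag' text."""
    states = {}
    for line in show_lag_output.splitlines():
        fields = line.split()
        # Data rows have seven columns and start with a numeric LAG id.
        if len(fields) == 7 and fields[0].isdigit():
            states[int(fields[0])] = (fields[2], fields[6])
    return states


def all_standby(outputs, lag_id):
    """True when every router reports the given LAG as MC standby."""
    return all(
        lag_states(text).get(lag_id, (None, None))[1] == "standby"
        for text in outputs
    )


# Rows copied from the output above; R1 and R2 showed the same data row.
SAMPLE = """\
Lag-id         Adm     Opr     Weighted Threshold Up-Count MC Act/Stdby
-------------------------------------------------------------------------------
1              up      down    No       0         0        standby
-------------------------------------------------------------------------------
Total Lag-ids: 1       Single Chassis: 0        MC Act: 0       MC Stdby: 1
"""

print(lag_states(SAMPLE))                # {1: ('down', 'standby')}
print(all_standby([SAMPLE, SAMPLE], 1))  # True
```

A result of `True` for an MC-LAG pair means no chassis has been elected active, which is the symptom seen here.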

It turns out that, for MC-LAG to work, the base chassis MAC address must be unique on each router too. While we cannot directly change the base MAC prior to SROS version 14.0R4, there is an alternative method available: if we set the out-of-band management Ethernet IP address in the BOF, this will influence the chassis MAC address.

*A:R1>config>lag# show bof 
===============================================================================
BOF (Memory)
===============================================================================
    primary-image    cf3:\timos\both.tim
    primary-config   cf3:\config.cfg
    autonegotiate
    duplex           full
    speed            100
    wait             3
    persist          off
    no li-local-save
    no li-separate
    console-speed    115200
===============================================================================
*A:R1>config>lag# /bof address 192.168.100.1/24 
*A:R1>config>lag# /bof save 
Writing BOF to cf3:/bof.cfg ... OK
Completed.
*A:R1>config>lag# show bof 
===============================================================================
BOF (Memory)
===============================================================================
    primary-image    cf3:\timos\both.tim
    primary-config   cf3:\config.cfg
    address          192.168.100.1/24 active
    autonegotiate
    duplex           full
    speed            100
    wait             3
    persist          off
    no li-local-save
    no li-separate
    console-speed    115200
===============================================================================

Save and reboot:
*A:R1>config>lag# /admin save 
Writing configuration to cf3:\config.cfg
Saving configuration ... OK
Completed.
A:R1>config>lag# /admin reboot 
Are you sure you want to reboot (y/n)? y

We’ll do the same thing on R2 but give it a different IP address, so that the MAC addresses should differ:
*A:R2>config>lag# /bof address 192.168.100.2/24 
*A:R2>config>lag# /bof save 
Writing BOF to cf3:/bof.cfg ... OK
Completed.
*A:R2>config>lag# /admin save 
Writing configuration to cf3:\config.cfg
Saving configuration ... OK
Completed.
A:R2>config>lag# /admin reboot 
Are you sure you want to reboot (y/n)? y 

After the reboot we can compare the base MAC addresses of R1 and R2:
A:R1# show chassis detail | match MAC 
  Base MAC address                  : c8:01:ff:00:00:00

A:R2# show chassis detail | match MAC 
  Base MAC address                  : c8:02:ff:00:00:00
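
With more than two virtual routers in a lab, comparing these lines by eye gets tedious. Below is a small Python sketch that flags any base MAC shared by more than one router. The helper names `base_mac` and `duplicate_macs` are hypothetical, and the pattern assumes the `Base MAC address : xx:xx:xx:xx:xx:xx` line format shown above.

```python
import re

# Matches the line printed by "show chassis detail | match MAC" above.
MAC_RE = re.compile(r"Base MAC address\s*:\s*([0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})")


def base_mac(output):
    """Return the base MAC address found in the given show output."""
    match = MAC_RE.search(output)
    if match is None:
        raise ValueError("no 'Base MAC address' line found")
    return match.group(1).lower()


def duplicate_macs(outputs_by_router):
    """Return {mac: [routers]} for every MAC used by more than one router."""
    seen = {}
    for router, output in outputs_by_router.items():
        seen.setdefault(base_mac(output), []).append(router)
    return {mac: routers for mac, routers in seen.items() if len(routers) > 1}


# Lines copied from the R1 and R2 output above.
outputs = {
    "R1": "  Base MAC address                  : c8:01:ff:00:00:00",
    "R2": "  Base MAC address                  : c8:02:ff:00:00:00",
}

print(duplicate_macs(outputs))  # {}
```

An empty dictionary means every router has its own base MAC, which is the state we want before retrying the MC-LAG.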

Okay, they are different now – has it resolved our MC-LAG issue?
A:R1# show lag 1 port 

===============================================================================
Lag Port States
LACP Status: e - Enabled, d - Disabled
===============================================================================
Lag-id Port-id   Adm   Act/Stdby Opr   Primary  Sub-group     Forced  Priority
-------------------------------------------------------------------------------
1(e)   1/1/3     up    active    up    yes      1             -       32768
===============================================================================

A:R2# show lag 1 port 

===============================================================================
Lag Port States
LACP Status: e - Enabled, d - Disabled
===============================================================================
Lag-id Port-id   Adm   Act/Stdby Opr   Primary  Sub-group     Forced  Priority
-------------------------------------------------------------------------------
1(e)   1/1/3     up    standby   down  yes      1             -       32768
===============================================================================

A:R5# show lag 1 port 

===============================================================================
Lag Port States
LACP Status: e - Enabled, d - Disabled
===============================================================================
Lag-id Port-id   Adm   Act/Stdby Opr   Primary  Sub-group     Forced  Priority
-------------------------------------------------------------------------------
1(e)   1/1/1     up    active    up    yes      1             -       32768
       1/1/2     up    active    down           1             -       32768
===============================================================================

Yes, R1, R2 and R5 are in alignment: R1 is active, R2 is standby, and R5 has one member port up and one down. Now let’s put the LAG into VPLS 100 on R1 and R2:
A:R1# /configure service vpls 100 sap lag-1 create
A:R2# /configure service vpls 100 sap lag-1 create
Let’s see if R5 can ping R4:
A:R5# ping 192.168.1.4 count 1 
PING 192.168.1.4 56 data bytes
64 bytes from 192.168.1.4: icmp_seq=1 ttl=64 time=12.3ms.

---- 192.168.1.4 PING Statistics ----
1 packet transmitted, 1 packet received, 0.00% packet loss
round-trip min = 12.3ms, avg = 12.3ms, max = 12.3ms, stddev = 0.000ms

Success!

Let’s check the MAC address table (forwarding database) in VPLS 100:

*A:R1>config>service>vpls>sap$ show service id 100 fdb detail 

===============================================================================
Forwarding Database, Service 100
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
100       50:00:00:07:00:01 sdp:3:100                L/0      04/21/17 14:47:33
100       da:00:ff:00:01:42 sap:lag-1                L/0      04/21/17 14:52:57
-------------------------------------------------------------------------------
No. of MAC Entries: 2
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

*A:R2>config>service>vpls>sap$ show service id 100 fdb detail 

===============================================================================
Forwarding Database, Service 100
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
100       50:00:00:07:00:01 sdp:1:100                L/90     04/21/17 14:53:01
100       da:00:ff:00:01:42 sdp:1:100                L/90     04/21/17 14:45:05
-------------------------------------------------------------------------------
No. of MAC Entries: 2
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

*A:R3>config>service>vpls>sap$ show service id 100 fdb detail

===============================================================================
Forwarding Database, Service 100
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
100       50:00:00:07:00:01 sap:1/1/1                L/0      04/21/17 14:52:42
100       da:00:ff:00:01:42 sdp:1:100                L/0      04/21/17 14:44:46
-------------------------------------------------------------------------------
No. of MAC Entries: 2
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

Now to check the MC-LAG resiliency, we’ll start a continuous ping from R5 to R4 and then shut down port 1/1/3 (the LAG 1 member) on R1:
*A:R1>config>service>vpls>sap$ /configure port 1/1/3 shutdown
And check whether port 1/1/3 in LAG 1 on R2 goes from standby to active:
*A:R2>config>service>vpls>sap$ show lag 1 port 

===============================================================================
Lag Port States
LACP Status: e - Enabled, d - Disabled
===============================================================================
Lag-id Port-id   Adm   Act/Stdby Opr   Primary  Sub-group     Forced  Priority
-------------------------------------------------------------------------------
1(e)   1/1/3     up    active    up    yes      1             -       32768
===============================================================================

We can see the interface has come up. A few packets were lost, but the link recovered. We could speed up the link convergence time, but I think the general concept has been demonstrated successfully.
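
To put a number on "a few packets", the gaps in the `icmp_seq` values of the continuous ping can be counted. This is only a sketch: the function name `missing_sequences` is hypothetical, and the sample reply lines below are made up for illustration (they follow the reply format shown earlier but are not captured from this lab).

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def missing_sequences(ping_output):
    """Return the icmp_seq numbers missing between the first and last reply."""
    received = {int(seq) for seq in SEQ_RE.findall(ping_output)}
    if not received:
        return []
    return [n for n in range(min(received), max(received) + 1) if n not in received]


# Illustrative data only: replies 3 to 5 are assumed lost during the failover.
sample = """\
64 bytes from 192.168.1.4: icmp_seq=1 ttl=64 time=12.3ms.
64 bytes from 192.168.1.4: icmp_seq=2 ttl=64 time=11.9ms.
64 bytes from 192.168.1.4: icmp_seq=6 ttl=64 time=13.1ms.
64 bytes from 192.168.1.4: icmp_seq=7 ttl=64 time=12.0ms.
"""

print(missing_sequences(sample))  # [3, 4, 5]
```

Multiplying the number of missing replies by the ping interval gives a rough estimate of the outage during the switchover.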

The moral of the story here – with Virtual SROS systems, it’s worth ensuring you have a unique chassis MAC address!