TACACS+ Authentication with Nokia Service Routers

Nokia SROS supports AAA for a range of tasks. Some of the more interesting and complicated ones relate to subscriber management, when the Service Router is acting as a Broadband Network Gateway (BNG). AAA is also useful for network operations teams, providing centralised authentication across a fleet of routers where managing individual local accounts is impractical.

SROS supports RADIUS or TACACS+ for this management access control. This post uses TACACS+, with a Linux daemon based on code from http://www.shrubbery.net/tac_plus/ configured to support a Nokia Service Router (the same configuration would also be quite Cisco IOS friendly). The SROS router will use TACACS+ both for authentication and for identifying a user’s access rights by mapping privilege levels to profiles.

Nokia SR and TACACS+ Server Test Topology

As you can see above, R1 (instantiated service router in eve-ng) has port 1/1/3 bridged to the internal Ethernet of the computer running eve-ng so both the router interface and the internal Ethernet are on the same IP subnet allowing connectivity to the TACACS+ server that will be run on the laptop.

To install the TACACS+ software, as eve-ng is built on Ubuntu 16.04, installation is as simple as invoking:
[php]root@m4600:~# apt-get install tacacs+[/php]
The config was then modified to look like below:
[php]root@m4600:~# cat /etc/tacacs+/tac_plus.conf
# shared secret with TACACS client
key = "tac_secret"
# Set where to send accounting records
accounting syslog;
accounting file = /var/log/tac_plus/tac_plus.acct

acl = mgmt_acl {
# regex to allow access hosts from 192.168.1.0/24
permit = 192\.168\.1\.([1-9]|[1-9]\d|1\d{2}|2[0-4]\d|25[0-4])
}

# administrative group, priv-lvl 15 to be mapped to SROS administrative profile
group = administrative {
default service = permit
expires = "Jan 1 2020"
acl = mgmt_acl
service = exec {
priv-lvl = 15
}
}
# limited group, priv-lvl 1 to be mapped to SROS limited profile
group = limited {
default service = permit
expires = "Jan 1 2020"
acl = mgmt_acl
service = exec {
priv-lvl = 1
}
}

# our tacacs test accounts
# des password is generated by running tac_pwd on the plaintext
user = testadmin {
member = administrative
login = des JZ1fHFoSp.v/E
# plaintext password = pass
}

user = testlimited {
member = limited
login = des O8ZepJOyIIuYo
# plaintext password = test
}[/php]
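
The mgmt_acl regex is easy to get subtly wrong, so it can be worth checking offline. What follows is a rough Python sketch, not part of tac_plus: it assumes Python's re syntax is close enough to the regex engine tac_plus is built with, and the helper names are made up for illustration. It also shows that the answer depends on whether the pattern is anchored, which I have not verified for tac_plus.

```python
import re

# Pattern copied from the mgmt_acl "permit" line in tac_plus.conf above.
ACL_PATTERN = r"192\.168\.1\.([1-9]|[1-9]\d|1\d{2}|2[0-4]\d|25[0-4])"


def permitted_anchored(ip: str) -> bool:
    """True only if the whole address matches the pattern (hosts .1 to .254)."""
    return re.fullmatch(ACL_PATTERN, ip) is not None


def permitted_unanchored(ip: str) -> bool:
    """True if the pattern matches anywhere inside the address string."""
    return re.search(ACL_PATTERN, ip) is not None


if __name__ == "__main__":
    for ip in ("192.168.1.123", "192.168.1.0", "192.168.1.255", "10.0.0.1"):
        print(ip, permitted_anchored(ip), permitted_unanchored(ip))
```

With an unanchored search, 192.168.1.255 is accepted because the prefix 192.168.1.2 already satisfies the pattern. If that matters, anchoring the pattern explicitly with ^ and $ in the config is one option to test.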

A couple of things are worth noting here. Besides the key, which is the shared secret between the TACACS+ server and the router, two groups are defined, administrative and limited, and the only difference between them is the priv-lvl. On Cisco platforms, priv-lvl is what TACACS+ returns during the authorisation stage to tell IOS what access rights a user has. SROS is able to map this value to a “profile”.

Out of the box, SROS has two built-in profiles: administrative (used for most installation and commissioning activities) and default, which is somewhat less capable. It is also possible to define specific profiles in line with the roles of your users. In the config above there is a group called limited, which will be identified by priv-lvl 1.

On R1 we can define a custom profile in the system security configuration context:
[php]A:R1# /configure system security profile "limited"
A:R1>config>system>security>profile# info
----------------------------------------------
default-action deny-all
entry 10
match "show router route-table"
action permit
exit
entry 20
match "show users"
action permit
exit
entry 30
match "show system security user"
action permit
exit
entry 40
match "logout"
action permit
exit
[/php]
This example is deliberately restrictive: because of default-action deny-all, every permitted command has to be whitelisted explicitly.
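
To make the whitelist behaviour concrete, here is a small Python model of the profile. It is only a sketch under assumptions: it treats each match string as a command prefix and lets the first matching entry decide, which is how the profile behaves in the tests later in this post but is not taken from Nokia documentation. The Entry class and is_permitted function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    match: str   # command prefix taken from the profile entry
    action: str  # "permit" or "deny"


# Mirrors entries 10 to 40 of the "limited" profile above.
LIMITED = [
    Entry("show router route-table", "permit"),
    Entry("show users", "permit"),
    Entry("show system security user", "permit"),
    Entry("logout", "permit"),
]


def is_permitted(command: str, entries=LIMITED, default_permit: bool = False) -> bool:
    """First entry whose match string prefixes the command decides the outcome."""
    for entry in entries:
        if command.startswith(entry.match):
            return entry.action == "permit"
    # No entry matched: fall back to the default action (deny-all here).
    return default_permit


if __name__ == "__main__":
    for cmd in ("show router route-table", "admin display-config"):
        print(cmd, "->", "permit" if is_permitted(cmd) else "deny")
```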
To enable TACACS+ support on the router, we first need to configure the TACACS+ server using the agreed shared secret. Configuring the timeout is optional; it specifies how many seconds the router waits for a response, so if the server is down this is effectively how long you will wait before falling back to local authentication:
[php]A:R1>config>system>security>profile# /configure system security tacplus
*A:R1>config>system>security>tacplus$ server 1 address 192.168.1.47 secret "tac_secret"
*A:R1>config>system>security>tacplus$ timeout 5[/php]
Now to create the priv-lvl mapping to profiles:
[php]*A:R1>config>system>security>tacplus$ priv-lvl-map
A:R1>config>system>security>tacplus$ priv-lvl-map
A:R1>config>system>security>tacplus>priv$ priv-lvl 1 "limited"
A:R1>config>system>security>tacplus>priv$ priv-lvl 15 "administrative"[/php]
We also need to enable authorisation to be associated with these mappings:
[php]A:R1>config>system>security>tacplus>priv$ back
A:R1>config>system>security>tacplus$ authorization use-priv-lvl[/php]
Now to actually enable TACACS+ authentication: within the password context, we specify the authentication order to include the methods we prefer.
[php]*A:R1>config>system>security>tacplus$ /configure system security password
*A:R1>config>system>security>password# authentication-order tacplus local exit-on-reject[/php]
If TACACS+ is unavailable, we fall back to local authentication accounts. If we hadn’t included “exit-on-reject”, a failed authentication attempt with TACACS+ (a reject) would move on to the next authentication mechanism (local).
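
The difference between "server unavailable" and "server said no" is the whole point of exit-on-reject, so here is a minimal Python sketch of the decision logic as described above. It is an illustration only, not SROS code: the Result enum, the authenticate function and the stub methods are all hypothetical.

```python
from enum import Enum
from typing import Callable, Sequence


class Result(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    UNAVAILABLE = "unavailable"  # e.g. the server did not answer within the timeout


Method = Callable[[str, str], Result]


def authenticate(user: str, password: str, methods: Sequence[Method],
                 exit_on_reject: bool) -> bool:
    """Walk the authentication order and return True on the first accept."""
    for method in methods:
        result = method(user, password)
        if result is Result.ACCEPT:
            return True
        if result is Result.REJECT and exit_on_reject:
            return False
        # Unavailable, or a reject without exit-on-reject: try the next method.
    return False


# Stub methods standing in for TACACS+ and the local user database.
def tacplus_rejects(user: str, password: str) -> Result:
    return Result.REJECT


def tacplus_down(user: str, password: str) -> Result:
    return Result.UNAVAILABLE


def local_accepts(user: str, password: str) -> Result:
    return Result.ACCEPT


if __name__ == "__main__":
    print(authenticate("u", "p", [tacplus_rejects, local_accepts], exit_on_reject=True))
    print(authenticate("u", "p", [tacplus_rejects, local_accepts], exit_on_reject=False))
    print(authenticate("u", "p", [tacplus_down, local_accepts], exit_on_reject=True))
```

The middle case is the one to avoid in practice: without exit-on-reject, a user rejected by TACACS+ can still get in with a matching local account.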

SROS performs an AAA server health check by sending dummy authentication requests to a server and deciding whether the server is alive based on whether it gets a response. This can fill the authentication logs with failed access attempts, but the check can be disabled if desired:
[php]*A:R1>config>system>security>password# no health-check[/php]
For this testing, I’ll be using telnet, so I need to enable the telnet-server (outside of a lab, I would not suggest this at all!)
[php]*A:R1>config>system>security>password# back
*A:R1>config>system>security# telnet-server[/php]
So to recap the router configuration:
[php]*A:R1>config>system>security# info
———————————————-
telnet-server
profile "limited"
default-action deny-all
entry 10
match "show router route-table"
action permit
exit
entry 20
match "show users"
action permit
exit
entry 30
match "show system security user"
action permit
exit
entry 40
match "logout"
action permit
exit
exit
password
authentication-order tacplus local exit-on-reject
no health-check
exit
tacplus
authorization use-priv-lvl
priv-lvl-map
priv-lvl 1 "limited"
priv-lvl 15 "administrative"
exit
timeout 5
server 1 address 192.168.1.47 secret "1mSYRiobfhHAdFA9cZH3wBviQtXKFDld" hash2
exit
[/php]
Time to test if this works. Start the TACACS+ service on m4600 (it has the IP 192.168.1.47, which is what R1 will be communicating with):
[php]root@m4600:~# tac_plus -d 16 -L -C /etc/tacacs+/tac_plus.conf[/php]
And start viewing syslog
[php]root@m4600:~# tail -f /var/log/syslog
May 14 14:39:14 m4600 tac_plus[28164]: Reading config
May 14 14:39:14 m4600 tac_plus[28164]: Version F4.0.4.27a Initialized 1
May 14 14:39:14 m4600 tac_plus[28164]: tac_plus server F4.0.4.27a starting
May 14 14:39:14 m4600 tac_plus[28165]: Backgrounded
May 14 14:39:14 m4600 tac_plus[28166]: socket FD 0 AF 2
May 14 14:39:14 m4600 tac_plus[28166]: socket FD 2 AF 10
May 14 14:39:14 m4600 tac_plus[28166]: uid=0 euid=0 gid=0 egid=0 s=-1637085952[/php]
Open up another session on m4600 and telnet to R1 (192.168.1.123) using the credentials testadmin/pass:
[php]May 14 14:39:23 m4600 tac_plus[28201]: connect from 192.168.1.123 [192.168.1.123]
May 14 14:39:23 m4600 tac_plus[28201]: cfg_acl_check(mgmt_acl, 192.168.1.123)
May 14 14:39:23 m4600 tac_plus[28201]: ip 192.168.1.123 matched permit regex 192\.168\.1\.([1-9]|[1-9]\d|1\d{2}|2[0-4]\d|25[0-4]) of acl filter mgmt_acl
May 14 14:39:23 m4600 tac_plus[28201]: host ACLs for user ‘testadmin’ permit
May 14 14:39:23 m4600 tac_plus[28201]: login query for ‘testadmin’ port console from 192.168.1.123 accepted
May 14 14:39:23 m4600 tac_plus[28202]: connect from 192.168.1.123 [192.168.1.123]
May 14 14:39:23 m4600 tac_plus[28202]: cfg_acl_check(mgmt_acl, 192.168.1.123)
May 14 14:39:23 m4600 tac_plus[28202]: ip 192.168.1.123 matched permit regex 192\.168\.1\.([1-9]|[1-9]\d|1\d{2}|2[0-4]\d|25[0-4]) of acl filter mgmt_acl
May 14 14:39:23 m4600 tac_plus[28202]: host ACLs for user ‘testadmin’ permit
May 14 14:39:23 m4600 tac_plus[28202]: authorization query for ‘testadmin’ console from 192.168.1.123 accepted[/php]
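
When there are more than a couple of test logins, picking the interesting lines out of syslog by eye gets tedious. The sketch below pulls the user, source and outcome out of the "login query" and "authorization query" lines shown above. It is only a sketch: the regular expression is derived from these few sample lines, and the Query type and parse_query name are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Query(NamedTuple):
    kind: str     # "login" or "authorization"
    user: str
    source: str   # address the request came from
    outcome: str  # last word of the line, e.g. "accepted"


# Quotes around the username may be straight or curly depending on how the log was copied.
QUERY_RE = re.compile(
    r"(?P<kind>login|authorization) query for ['‘](?P<user>[^'’]+)['’] "
    r".*?from (?P<source>\S+) (?P<outcome>\w+)$"
)


def parse_query(line: str) -> Optional[Query]:
    """Return a Query for login/authorization lines, or None for anything else."""
    m = QUERY_RE.search(line.strip())
    if m is None:
        return None
    return Query(m["kind"], m["user"], m["source"], m["outcome"])


if __name__ == "__main__":
    sample = ("May 14 14:39:23 m4600 tac_plus[28201]: login query for "
              "'testadmin' port console from 192.168.1.123 accepted")
    print(parse_query(sample))
```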
Let’s go back to the telnet session and check who we are and our access rights:
[php highlight=”30,38″]*A:R1# show users
===============================================================================
User Type Login time Idle time
From
===============================================================================
Console — 0d 00:00:21

testadmin Telnet 14MAY2017 04:41:41 0d 00:00:00
192.168.1.47
——————————————————————————-
Number of users : 1
===============================================================================
*A:R1>config>system>security# show system security user testadmin detail

===============================================================================
Users
===============================================================================
User ID New User Permissions Password Login Failed Local
Pwd console ftp li snmp netconf Expires Attempts Logins Conf
——————————————————————————-
testadmin n y n n n n never 1 0 n
——————————————————————————-
Number of users : 1
===============================================================================

===============================================================================
Temporary User Configuration Detail
===============================================================================
===============================================================================
user id : testadmin
——————————————————————————-
console parameters
——————————————————————————-
new pw required : n/a cannot change pw : n/a
home directory :
restricted to home : no
login exec file :
profile : administrative
locked-out : no
===============================================================================[/php]
Okay, that’s good. Let’s log out and log back in to R1 using the credentials testlimited/test:
[php]May 14 14:43:37 m4600 tac_plus[29058]: connect from 192.168.1.123 [192.168.1.123]
May 14 14:43:37 m4600 tac_plus[29058]: cfg_acl_check(mgmt_acl, 192.168.1.123)
May 14 14:43:37 m4600 tac_plus[29058]: ip 192.168.1.123 matched permit regex 192\.168\.1\.([1-9]|[1-9]\d|1\d{2}|2[0-4]\d|25[0-4]) of acl filter mgmt_acl
May 14 14:43:37 m4600 tac_plus[29058]: host ACLs for user ‘testlimited’ permit
May 14 14:43:37 m4600 tac_plus[29058]: login query for ‘testlimited’ port telnet from 192.168.1.123 accepted
May 14 14:43:37 m4600 tac_plus[29059]: connect from 192.168.1.123 [192.168.1.123]
May 14 14:43:37 m4600 tac_plus[29059]: cfg_acl_check(mgmt_acl, 192.168.1.123)
May 14 14:43:37 m4600 tac_plus[29059]: ip 192.168.1.123 matched permit regex 192\.168\.1\.([1-9]|[1-9]\d|1\d{2}|2[0-4]\d|25[0-4]) of acl filter mgmt_acl
May 14 14:43:37 m4600 tac_plus[29059]: host ACLs for user ‘testlimited’ permit
May 14 14:43:37 m4600 tac_plus[29059]: authorization query for ‘testlimited’ telnet from 192.168.1.123 accepted[/php]
Looks promising from the TACACS+ server. Let’s go back to the telnet session and check who we are and our access rights:
[php]*A:R1# show users
===============================================================================
User Type Login time Idle time
From
===============================================================================
Console — 0d 00:04:55

testlimited Telnet 14MAY2017 04:43:36 0d 00:00:00
192.168.1.47
——————————————————————————-
Number of users : 1
===============================================================================
*A:R1# show router route-table

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
——————————————————————————-
192.168.1.0/24 Local Local 01h07m25s 0
TACACS 0
——————————————————————————-
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
*A:R1# admin display-config
MINOR: CLI Command not allowed for this user.[/php]
Certainly appears to be a limited user:

[php highlight=”18,26″]*A:R1# show system security user testlimited detail

===============================================================================
Users
===============================================================================
User ID New User Permissions Password Login Failed Local
Pwd console ftp li snmp netconf Expires Attempts Logins Conf
——————————————————————————-
testlimited n y n n n n never 1 0 n
——————————————————————————-
Number of users : 1
===============================================================================

===============================================================================
Temporary User Configuration Detail
===============================================================================
===============================================================================
user id : testlimited
——————————————————————————-
console parameters
——————————————————————————-
new pw required : n/a cannot change pw : n/a
home directory :
restricted to home : no
login exec file :
profile : limited
locked-out : no
===============================================================================[/php]
Okay, so we’re correctly associated with the limited profile.

Role-based access control is a good idea for managing your network, and being able to leverage your existing AAA infrastructure helps make operating a heterogeneous network that little bit easier.

The case of Nokia Virtual Service Router and the non-unique Chassis MAC Address

So I’m playing with eve-ng and decided to work on a Layer 2 scenario. A few problems with my emulation environment came up that needed a way forward, which resulted in this rambling tale…

SROS 12.0R6 5 Router Topology

R1, R2 and R3 will be the MPLS core with VPLS configured, while R4 and R5 will be Layer 3 CE devices that talk to each other over the VPLS.

The CE devices are pretty straightforward, so we’ll get those up first.

R4 is a single-ended configuration with Interface R5 on Port 1/1/1 having the IP 192.168.1.4/27
[python linenumbers=”false” tab=”R4 CE Config”]
configure
system
name "R4"
exit
card 1
card-type iom3-xp-b
mda 1
mda-type m5-1gb-sfp-b
no shutdown
exit
no shutdown
exit
port 1/1/1
ethernet
exit
no shutdown
exit
router
interface "R5"
address 192.168.1.4/27
port 1/1/1
no shutdown
exit
interface “system”
no shutdown
exit
exit
exit all
[/python]

R5 is a little more complex: it has a LAG, with interface R4 on LAG-1 (ports 1/1/1 and 1/1/2) having the IP 192.168.1.5/27
[python linenumbers=”false” tab=”R5 CE Config”]
configure
system
name “R5”
exit
card 1
card-type iom3-xp-b
mda 1
mda-type m5-1gb-sfp-b
no shutdown
exit
no shutdown
exit
port 1/1/1
ethernet
autonegotiate limited
exit
no shutdown
exit
port 1/1/2
ethernet
autonegotiate limited
exit
no shutdown
exit
lag 1
port 1/1/1
port 1/1/2
lacp active administrative-key 32768
no shutdown
exit
router
interface “R4”
address 192.168.1.5/27
port lag-1
no shutdown
exit
interface “system”
no shutdown
exit
exit
exit all
[/python]
Multi-speed Ethernet interfaces associated with a LAG must have autonegotiate set to limited, which controls the member speed so that all bundle members operate at the same speed.

Now to develop the MPLS core configuration on R1, R2 and R3. This is quite straightforward: we are just going to use OSPF and LDP on the directly connected interfaces:

[codegroup]
[python linenumbers=”false” tab=”R1 Core Base Config”]
configure
system
name “R1”
exit
card 1
card-type iom3-xp-b
mda 1
mda-type m5-1gb-sfp-b
no shutdown
exit
no shutdown
exit
port 1/1/1
ethernet
exit
no shutdown
exit
port 1/1/2
ethernet
exit
no shutdown
exit
port 1/1/3
shutdown
ethernet
exit
exit
router
interface “R2”
address 10.1.2.1/27
port 1/1/1
no shutdown
exit
interface “R3”
address 10.1.3.1/27
port 1/1/2
no shutdown
exit
interface “system”
address 10.10.10.1/32
no shutdown
exit
ospf
area 0.0.0.0
interface “system”
no shutdown
exit
interface “R2”
no shutdown
exit
interface “R3”
no shutdown
exit
exit
exit
ldp
interface-parameters
interface "R2"
exit
interface "R3"
exit
exit
targeted-session
exit
no shutdown
exit
exit
exit all
[/python]
[python linenumbers=”false” tab=”R2 Core Base Config”]
configure
system
name “R2”
exit
card 1
card-type iom3-xp-b
mda 1
mda-type m5-1gb-sfp-b
no shutdown
exit
no shutdown
exit
port 1/1/1
ethernet
exit
no shutdown
exit
port 1/1/2
ethernet
exit
no shutdown
exit
port 1/1/3
shutdown
ethernet
exit
exit
router
interface “R1”
address 10.1.2.2/27
port 1/1/1
no shutdown
exit
interface “R3”
address 10.2.3.2/27
port 1/1/2
no shutdown
exit
interface “system”
address 10.10.10.2/32
no shutdown
exit
ospf
area 0.0.0.0
interface “system”
no shutdown
exit
interface “R1”
no shutdown
exit
interface “R3”
no shutdown
exit
exit
exit
ldp
interface-parameters
interface "R1"
exit
interface "R3"
exit
exit
targeted-session
exit
no shutdown
exit
exit
exit all
[/python]
[python linenumbers=”false” tab=”R3 Core Base Config”]
configure
system
name “R3”
exit
card 1
card-type iom3-xp-b
mda 1
mda-type m5-1gb-sfp-b
no shutdown
exit
no shutdown
exit
port 1/1/1
ethernet
exit
no shutdown
exit
port 1/1/2
ethernet
exit
no shutdown
exit
port 1/1/3
shutdown
ethernet
exit
exit
router
interface “R1”
address 10.1.3.3/27
port 1/1/2
no shutdown
exit
interface “R2”
address 10.2.3.3/27
port 1/1/3
no shutdown
exit
interface “system”
address 10.10.10.3/32
no shutdown
exit
ospf
area 0.0.0.0
interface “system”
no shutdown
exit
interface “R1”
no shutdown
exit
interface “R2”
no shutdown
exit
exit
exit
ldp
interface-parameters
interface "R1"
exit
interface "R2"
exit
exit
targeted-session
exit
no shutdown
exit
exit
exit all
[/python][/codegroup]
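
With this many hand-typed /27 addresses, a quick sanity check that both ends of each link really share a subnet can save some head scratching when OSPF adjacencies don't form. This is a small sketch using Python's standard ipaddress module; the LINKS table is copied from the configs above, while same_subnet and check_links are hypothetical helper names.

```python
import ipaddress

# Interface addresses copied from the CE and core configs above.
LINKS = {
    ("R1", "R2"): ("10.1.2.1/27", "10.1.2.2/27"),
    ("R1", "R3"): ("10.1.3.1/27", "10.1.3.3/27"),
    ("R2", "R3"): ("10.2.3.2/27", "10.2.3.3/27"),
    ("R4", "R5"): ("192.168.1.4/27", "192.168.1.5/27"),
}


def same_subnet(a: str, b: str) -> bool:
    """True if both interface addresses are distinct hosts in the same network."""
    ia, ib = ipaddress.ip_interface(a), ipaddress.ip_interface(b)
    return ia.network == ib.network and ia.ip != ib.ip


def check_links(links=LINKS) -> dict:
    """Map each router pair to whether its two addresses are compatible."""
    return {pair: same_subnet(*addrs) for pair, addrs in links.items()}


if __name__ == "__main__":
    for pair, ok in check_links().items():
        print(pair, "ok" if ok else "MISMATCH")
```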
The Layer 2 service that we are going to build is a VPLS using spoke-SDPs that connect to each adjacent router (an alternative would be a full mesh, but I specifically want to test STP operation here):
[codegroup]
[python linenumbers=”false” tab=”R1 SDP to R2 and R3″]
*A:R1>config>service# info
———————————————-
sdp 2 mpls create
far-end 10.10.10.2
ldp
keep-alive
shutdown
exit
no shutdown
exit
sdp 3 mpls create
far-end 10.10.10.3
ldp
keep-alive
shutdown
exit
no shutdown
exit
[/python]
[python linenumbers=”false” tab=”R2 SDP to R1 and R3″]
*A:R2>config>service# info
———————————————-
sdp 1 mpls create
far-end 10.10.10.1
ldp
keep-alive
shutdown
exit
no shutdown
exit
sdp 3 mpls create
far-end 10.10.10.3
ldp
keep-alive
shutdown
exit
no shutdown
exit
[/python]
[python linenumbers=”false” tab=”R3 SDP to R1 and R2″]
*A:R3>config>service# info
———————————————-
sdp 1 mpls create
far-end 10.10.10.1
ldp
keep-alive
shutdown
exit
no shutdown
exit
sdp 2 mpls create
far-end 10.10.10.2
ldp
keep-alive
shutdown
exit
no shutdown
exit
[/python][/codegroup]
Verifying the SDPs are up:
[codegroup]
[python linenumbers=”false” tab=”R1 SDP State”]
A:R1# show service sdp

============================================================================
Services: Service Destination Points
============================================================================
SdpId AdmMTU OprMTU Far End Adm Opr Del LSP Sig
—————————————————————————-
2 0 8914 10.10.10.2 Up Up MPLS L TLDP
3 0 8914 10.10.10.3 Up Up MPLS L TLDP
—————————————————————————-
Number of SDPs : 2
—————————————————————————-
Legend: R = RSVP, L = LDP, B = BGP, M = MPLS-TP, n/a = Not Applicable
============================================================================
[/python]
[python linenumbers=”false” tab=”R2 SDP State”]
A:R2# show service sdp

============================================================================
Services: Service Destination Points
============================================================================
SdpId AdmMTU OprMTU Far End Adm Opr Del LSP Sig
—————————————————————————-
1 0 8914 10.10.10.1 Up Up MPLS L TLDP
3 0 8914 10.10.10.3 Up Up MPLS L TLDP
—————————————————————————-
Number of SDPs : 2
—————————————————————————-
Legend: R = RSVP, L = LDP, B = BGP, M = MPLS-TP, n/a = Not Applicable
============================================================================
[/python]
[python linenumbers=”false” tab=”R3 SDP State”]
A:R3# show service sdp

============================================================================
Services: Service Destination Points
============================================================================
SdpId AdmMTU OprMTU Far End Adm Opr Del LSP Sig
—————————————————————————-
1 0 8914 10.10.10.1 Up Up MPLS L TLDP
2 0 8914 10.10.10.2 Up Up MPLS L TLDP
—————————————————————————-
Number of SDPs : 2
—————————————————————————-
Legend: R = RSVP, L = LDP, B = BGP, M = MPLS-TP, n/a = Not Applicable
============================================================================
[/python]
[/codegroup]
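
Checking three routers by eye is fine, but the same check is easy to script if the output is captured to text. The sketch below parses the data rows of "show service sdp" as shown above and lists any SDP that is not Up/Up. It assumes the nine-column layout seen in this output and that a failed state would appear as something other than "Up" (the "Down" string in the test is my assumption); parse_sdps and down_sdps are hypothetical names.

```python
SAMPLE = """\
SdpId AdmMTU OprMTU Far End Adm Opr Del LSP Sig
2 0 8914 10.10.10.2 Up Up MPLS L TLDP
3 0 8914 10.10.10.3 Up Up MPLS L TLDP
Number of SDPs : 2
"""


def parse_sdps(text: str) -> list:
    """Return one dict per SDP row; header, separator and summary lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have nine whitespace-separated fields and start with the SDP id.
        if len(fields) == 9 and fields[0].isdigit():
            sdp_id, _adm_mtu, _opr_mtu, far_end, adm, opr = fields[:6]
            rows.append({"id": int(sdp_id), "far_end": far_end, "adm": adm, "opr": opr})
    return rows


def down_sdps(text: str) -> list:
    """Return the rows whose admin or operational state is not Up."""
    return [r for r in parse_sdps(text) if (r["adm"], r["opr"]) != ("Up", "Up")]


if __name__ == "__main__":
    print(parse_sdps(SAMPLE))
    print("not up:", down_sdps(SAMPLE))
```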
With the transport infrastructure in place, VPLS 100 can be set up without the customer access components:
[codegroup]
[python linenumbers=”false” tab=”Initial R1 VPLS 100 Config”]
*A:R1>config>service>vpls$ pwc
——————————————————————————-
Present Working Context :
——————————————————————————-

configure
service
vpls “100” customer 1 create
——————————————————————————-
A:R1>config>service>vpls$ info
———————————————-
stp
no shutdown
exit
spoke-sdp 2:100 create
no shutdown
exit
spoke-sdp 3:100 create
no shutdown
exit
no shutdown
[/python]
[python linenumbers=”false” tab=”Initial R2 VPLS 100 Config”]
*A:R2>config>service>vpls$ pwc
——————————————————————————-
Present Working Context :
——————————————————————————-

configure
service
vpls “100” customer 1 create
——————————————————————————-
A:R2>config>service>vpls$ info
———————————————-
stp
no shutdown
exit
spoke-sdp 1:100 create
no shutdown
exit
spoke-sdp 3:100 create
no shutdown
exit
no shutdown
[/python]
[python linenumbers=”false” tab=”Initial R3 VPLS 100 Config”]
*A:R3>config>service>vpls$ pwc
——————————————————————————-
Present Working Context :
——————————————————————————-

configure
service
vpls “100” customer 1 create
——————————————————————————-
A:R3>config>service>vpls$ info
———————————————-
stp
no shutdown
exit
spoke-sdp 1:100 create
no shutdown
exit
spoke-sdp 2:100 create
no shutdown
exit
no shutdown
[/python]
[/codegroup]
Verify that VPLS 100 is up and running:
[codegroup]
[python linenumbers=”false” tab=”R1 VPLS 100 Spoke SDP State”]
*A:R1# show service id 100 base | match Ident post-lines 3
Identifier Type AdmMTU OprMTU Adm Opr
——————————————————————————-
sdp:2:100 S(10.10.10.2) Spok 0 8914 Up Up
sdp:3:100 S(10.10.10.3) Spok 0 8914 Up Up
[/python]
[python linenumbers=”false” tab=”R2 VPLS 100 Spoke SDP State”]
A:R2# show service id 100 base | match Ident post-lines 3
Identifier Type AdmMTU OprMTU Adm Opr
——————————————————————————-
sdp:1:100 S(10.10.10.1) Spok 0 8914 Up Up
sdp:3:100 S(10.10.10.3) Spok 0 8914 Up Up
[/python]
[python linenumbers=”false” tab=”R3 VPLS 100 Spoke SDP State”]
A:R3# show service id 100 base | match Ident post-lines 3
Identifier Type AdmMTU OprMTU Adm Opr
——————————————————————————-
sdp:1:100 S(10.10.10.1) Spok 0 8914 Up Up
sdp:2:100 S(10.10.10.2) Spok 0 8914 Up Up
[/python]
[/codegroup]
Looks good. With three routers each connecting to the other two using spokes, we have introduced a bridging loop, so we need a loop avoidance mechanism. Luckily we enabled STP, so let’s see how STP is behaving:
[codegroup]
[python linenumbers=”false” tab=”R1 VPLS 100 STP State” highlight=”6-7″]
*A:R1# show service id 100 stp

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id : 80:00.da:00:ff:00:00:01 Top. Change Count : 4
Root Bridge : This Bridge Stp Oper State : Up
Primary Bridge : N/A Topology Change : Inactive
Mode : Rstp Last Top. Change : 0d 00:10:13
Vcp Active Prot. : N/A
Root Port : N/A External RPC : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id Oper- Port- Port- Port- Oper- Link- Active
State Role State Num Edge Type Prot.
——————————————————————————-
2:100 Up Designated Forward 2049 True Pt-pt Rstp
3:100 Up Backup Discard 2050 False Pt-pt Rstp
===============================================================================
[/python]
[python linenumbers=”false” tab=”R2 VPLS 100 STP State” highlight=”6-7″]
*A:R2# show service id 100 stp

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id : 80:00.da:00:ff:00:00:01 Top. Change Count : 3
Root Bridge : This Bridge Stp Oper State : Up
Primary Bridge : N/A Topology Change : Inactive
Mode : Rstp Last Top. Change : 0d 00:10:47
Vcp Active Prot. : N/A
Root Port : N/A External RPC : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id Oper- Port- Port- Port- Oper- Link- Active
State Role State Num Edge Type Prot.
——————————————————————————-
1:100 DwnstrmLp Designated Discard 2049 False Pt-pt Rstp
3:100 Up Backup Discard 2050 False Pt-pt Rstp
===============================================================================
[/python]
[python linenumbers=”false” tab=”R3 VPLS 100 STP State” highlight=”6-7″]
*A:R3# show service id 100 stp

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id : 80:00.da:00:ff:00:00:01 Top. Change Count : 3
Root Bridge : This Bridge Stp Oper State : Up
Primary Bridge : N/A Topology Change : Inactive
Mode : Rstp Last Top. Change : 0d 00:10:54
Vcp Active Prot. : N/A
Root Port : N/A External RPC : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id Oper- Port- Port- Port- Oper- Link- Active
State Role State Num Edge Type Prot.
——————————————————————————-
1:100 Up Designated Forward 2048 False Pt-pt Rstp
2:100 Up Designated Forward 2049 False Pt-pt Rstp
===============================================================================
[/python][/codegroup]
This doesn’t seem right: SDP 1:100 on R2 is reporting a downstream loop, and both of R2’s SDPs are discarding!

If we look at the highlighted lines on each of the router outputs we notice that all Routers in the VPLS have the same Bridge ID, which is definitely a bad thing.

For SROS, the Bridge Id is partly derived from the chassis MAC address:
[python linenumbers=”false”]*A:R1# show chassis detail | match MAC
Base MAC address : da:00:ff:00:00:01[/python]
[python linenumbers=”false”]*A:R2# show chassis detail | match MAC
Base MAC address : da:00:ff:00:00:01[/python]
[python linenumbers=”false”]*A:R3# show chassis detail | match MAC
Base MAC address : da:00:ff:00:00:01[/python]
With real hardware, the chassis MAC address is unique so this problem won’t come up. With the VSRs, however, they’re all the same.

As an aside, the chassis MAC address is used in a few places besides STP; one is the SNMP engine ID:
[python linenumbers=”false” highlight=”2,4″]*A:R1# show chassis detail | match MAC
Base MAC address : da:00:ff:00:00:01
*A:R1# show system information | match Engine
SNMP Engine ID : 0000197f0000da00ff000001
SNMP Engine Boots : 11[/python]
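
The relationship between the two highlighted values can be written down in a couple of lines. This is a sketch based purely on the single sample above: the prefix is treated as a fixed constant because that is what this output shows, not because I have checked how SROS builds it, and default_engine_id is a hypothetical name. For what it's worth, 0x197f is 6527 in decimal, which appears to be the vendor's enterprise number.

```python
# Prefix observed in the "SNMP Engine ID" output above; treated here as a fixed constant.
OBSERVED_PREFIX = "0000197f0000"


def default_engine_id(base_mac: str) -> str:
    """Append the chassis MAC (colons removed) to the observed prefix."""
    return OBSERVED_PREFIX + base_mac.replace(":", "").lower()


if __name__ == "__main__":
    print(default_engine_id("da:00:ff:00:00:01"))
```

The practical consequence is the same as for STP: VSRs sharing a base MAC would also share a default engine ID.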

It is possible within the configuration to manually set the Engine ID (I think it would probably be best to do this in production just in case you end up replacing faulty hardware)

With SROS version 14.0R4 a new option for the boot options file (or bof) was introduced which allows the manual setting of the chassis MAC address (followed by a reboot):
[python linenumbers=”false”]*A:R14# bof system-base-mac 00:11:22:33:44:02
*A:R14# bof save
Writing BOF to cf3:/bof.cfg … OK
Completed.
Writing configuration to cf3:\config.cfg
Saving configuration … OK
Completed.
A:R14# /admin reboot
Are you sure you want to reboot (y/n)? y[/python]
Which is great, but this particular setup is using SROS 12.0R6, where that BOF option doesn’t exist, so an alternate method is required.

For STP we can cast our mind back to remember what the Bridge ID consists of… It’s both the Priority (which by default is 32768) and the Bridge MAC address.
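
As a worked example, the Bridge Id string from the show output can be rebuilt from those two parts, and the root election is just "lowest ID wins". The Python below is a minimal sketch of that idea: bridge_id and elect_root are hypothetical helpers, and the lower priorities in the usage are example values only.

```python
def bridge_id(priority: int, mac: str) -> str:
    """Format a bridge ID the way the show output does: 2-byte priority, dot, MAC."""
    if not 0 <= priority <= 0xFFFF:
        raise ValueError("priority must fit in 16 bits")
    return f"{priority >> 8:02x}:{priority & 0xFF:02x}.{mac.lower()}"


def elect_root(bridges: dict) -> list:
    """Return the names holding the lowest bridge ID; more than one name means a clash."""
    def key(value):
        priority, mac = value
        return (priority, bytes.fromhex(mac.replace(":", "")))

    lowest = min(key(v) for v in bridges.values())
    return sorted(name for name, v in bridges.items() if key(v) == lowest)


if __name__ == "__main__":
    mac = "da:00:ff:00:00:01"
    print(bridge_id(32768, mac))
    # Same priority and same MAC everywhere: every bridge ties for root.
    print(elect_root({"R1": (32768, mac), "R2": (32768, mac), "R3": (32768, mac)}))
    # Distinct priorities break the tie even though the MAC is shared.
    print(elect_root({"R1": (4096, mac), "R2": (8192, mac), "R3": (16384, mac)}))
```

With the default priority of 32768 the first two bytes come out as 80:00, which matches the Bridge Id seen on all three routers.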

So as a quick and nasty fix, I should just be able to change the STP priority in VPLS 100 on R1/R2/R3 and resolve the STP problem. It also allows me to specifically select a root bridge, which is probably a good thing to do.
[python linenumbers=”false” tab= “R1 VPLS 100 STP”]*A:R1# configure service vpls 100 stp priority 4096[/python]
[python linenumbers=”false” tab= “R2 VPLS 100 STP”]*A:R2# configure service vpls 100 stp priority 8192[/python]
[python linenumbers=”false” tab= “R3 VPLS 100 STP”]*A:R3# configure service vpls 100 stp priority 16384[/python]
Let’s see how things are going now:

[codegroup]
[python linenumbers=”false” tab= “R1 VPLS 100 STP”]*A:R1# show service id 100 stp

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id : 10:00.da:00:ff:00:00:01 Top. Change Count : 6
Root Bridge : This Bridge Stp Oper State : Up
Primary Bridge : N/A Topology Change : Inactive
Mode : Rstp Last Top. Change : 0d 00:00:35
Vcp Active Prot. : N/A
Root Port : N/A External RPC : 0

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id Oper- Port- Port- Port- Oper- Link- Active
State Role State Num Edge Type Prot.
——————————————————————————-
2:100 Up Designated Forward 2049 False Pt-pt Rstp
3:100 Up Designated Forward 2050 False Pt-pt Rstp
===============================================================================[/python]
[python linenumbers=”false” tab= “R2 VPLS 100 STP”]*A:R2# show service id 100 stp

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id : 20:00.da:00:ff:00:00:01 Top. Change Count : 4
Root Bridge : 10:00.da:00:ff:00:00:01 Stp Oper State : Up
Primary Bridge : N/A Topology Change : Inactive
Mode : Rstp Last Top. Change : 0d 00:01:07
Vcp Active Prot. : N/A
Root Port : 2049 External RPC : 10

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id Oper- Port- Port- Port- Oper- Link- Active
State Role State Num Edge Type Prot.
——————————————————————————-
1:100 Up Root Forward 2049 False Pt-pt Rstp
3:100 Up Designated Forward 2050 False Pt-pt Rstp
===============================================================================[/python]
[python linenumbers=”false” tab= “R3 VPLS 100 STP”]*A:R3# show service id 100 stp

===============================================================================
Stp info, Service 100
===============================================================================
Bridge Id : 40:00.da:00:ff:00:00:01 Top. Change Count : 4
Root Bridge : 10:00.da:00:ff:00:00:01 Stp Oper State : Up
Primary Bridge : N/A Topology Change : Inactive
Mode : Rstp Last Top. Change : 0d 00:01:52
Vcp Active Prot. : N/A
Root Port : 2048 External RPC : 10

===============================================================================
Stp port info
===============================================================================
Sap/Sdp/PIP Id Oper- Port- Port- Port- Oper- Link- Active
State Role State Num Edge Type Prot.
——————————————————————————-
1:100 Up Root Forward 2048 False Pt-pt Rstp
2:100 Up Alternate Discard 2049 False Pt-pt Rstp
===============================================================================[/python][/codegroup]
Success: all routers have different bridge IDs, all agree that R1 is the root, and only one port is in the discarding state.
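The three checks made by eye above (unique bridge IDs, one agreed root, a single discarding port) can also be written down as a small script. This is only a sketch: the values are transcribed by hand from the `show service id 100 stp` outputs, the `StpView` and `check_stp` names are made up for illustration, and R1's bridge ID is assumed to equal the root bridge ID that R2 and R3 report, since R1 is the root.

```python
from dataclasses import dataclass, field


@dataclass
class StpView:
    """One router's view of the spanning tree (hypothetical helper type)."""
    bridge_id: str
    root_bridge: str
    port_states: dict = field(default_factory=dict)  # SDP id -> Port-State


def check_stp(views):
    """Summarise whether the routers' STP views are consistent."""
    bridge_ids = [v.bridge_id for v in views.values()]
    roots = {v.root_bridge for v in views.values()}
    discarding = [
        (router, port)
        for router, view in views.items()
        for port, state in view.port_states.items()
        if state == "Discard"
    ]
    return {
        "unique_bridge_ids": len(set(bridge_ids)) == len(bridge_ids),
        "agreed_root": next(iter(roots)) if len(roots) == 1 else None,
        "discarding": discarding,
    }


# Root bridge ID as reported by R2 and R3; assumed to be R1's own bridge ID.
ROOT = "10:00.da:00:ff:00:00:01"

views = {
    "R1": StpView(ROOT, ROOT, {"2:100": "Forward", "3:100": "Forward"}),
    "R2": StpView("20:00.da:00:ff:00:00:01", ROOT,
                  {"1:100": "Forward", "3:100": "Forward"}),
    "R3": StpView("40:00.da:00:ff:00:00:01", ROOT,
                  {"1:100": "Forward", "2:100": "Discard"}),
}

result = check_stp(views)
print(result)
```

A loop-free triangle of three bridges should leave exactly one blocked port, which is what the `discarding` list reports here.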

Now we will create the CE router attachments (Service Access Points) on the core, starting with R3, which faces R4. By default Ethernet ports are in network mode; to bind a port to a service, it must be in access (or hybrid) mode:
[python linenumbers=”false”]*A:R3# /configure port 1/1/1
*A:R3>config>port# shutdown
*A:R3>config>port# ethernet mode access
*A:R3>config>port# ethernet encap-type null
*A:R3>config>port# no shutdown
*A:R3>config>port# /configure service vpls 100
*A:R3>config>service>vpls# sap 1/1/1 create
*A:R3>config>service>vpls>sap$ show service id 100 base

===============================================================================
Service Basic Information
===============================================================================
Service Id : 100 Vpn Id : 0
Service Type : VPLS
Name : (Not Specified)
Description : (Not Specified)
Customer Id : 1 Creation Origin : manual
Last Status Change: 04/21/2017 13:20:28
Last Mgmt Change : 04/21/2017 13:44:59
Etree Mode : Disabled
Admin State : Up Oper State : Up
MTU : 1514 Def. Mesh VC Id : 100
SAP Count : 1 SDP Bind Count : 2
Snd Flush on Fail : Disabled Host Conn Verify : Disabled
Propagate MacFlush: Disabled Per Svc Hashing : Disabled
Allow IP Intf Bind: Disabled
Def. Gateway IP : None
Def. Gateway MAC : None
Temp Flood Time : Disabled Temp Flood : Inactive
Temp Flood Chg Cnt: 0
VSD Domain :

——————————————————————————-
Service Access & Destination Points
——————————————————————————-
Identifier Type AdmMTU OprMTU Adm Opr
——————————————————————————-
sap:1/1/1 null 1514 1514 Up Up
sdp:1:100 S(10.10.10.1) Spok 0 8914 Up Up
sdp:2:100 S(10.10.10.2) Spok 0 8914 Up Up
===============================================================================[/python]
Now things are going to get a little more complicated on R1 and R2, as we are going to establish a Multi-Chassis LAG towards R5. R5 is unaware of the MC-LAG; it simply speaks LACP to R1 and R2, believing they are a single system. R1 and R2 need to synchronise with each other to set up the active-standby LAG.

We’ll start by creating a regular LAG 1 facing R5 on R1 and R2, with a single port in each:
[codegroup]
[python linenumbers=”false” tab=”R1″]*A:R1# /configure port 1/1/3 shutdown
*A:R1# /configure port 1/1/3 ethernet mode access
*A:R1# /configure port 1/1/3 ethernet encap-type null
*A:R1# /configure port 1/1/3 ethernet autonegotiate limited
*A:R1# /configure port 1/1/3 no shutdown
*A:R1# /configure lag 1
*A:R1>config>lag$ mode access
*A:R1>config>lag$ lacp active
*A:R1>config>lag$ port 1/1/3
*A:R1>config>lag$ no shutdown[/python]
[python linenumbers=”false” tab=”R2″]*A:R2# /configure port 1/1/3 shutdown
*A:R2# /configure port 1/1/3 ethernet mode access
*A:R2# /configure port 1/1/3 ethernet encap-type null
*A:R2# /configure port 1/1/3 ethernet autonegotiate limited
*A:R2# /configure port 1/1/3 no shutdown
*A:R2# /configure lag 1
*A:R2>config>lag$ mode access
*A:R2>config>lag$ lacp active
*A:R2>config>lag$ port 1/1/3
*A:R2>config>lag$ no shutdown[/python]
[/codegroup]
Now to set up MC-LAG we need to set up a multi-chassis peering between R1 and R2 (multi-chassis redundancy supports more than just MC-LAG):
[codegroup]
[python linenumbers=”false” tab=”R1 MC Peer with R2″]*A:R1>config>lag# /configure redundancy multi-chassis peer 10.10.10.2 create
*A:R1>config>redundancy>multi-chassis>peer# no shutdown[/python]
[python linenumbers=”false” tab=”R2 MC Peer with R1″]*A:R2>config>lag# /configure redundancy multi-chassis peer 10.10.10.1 create
*A:R2>config>redundancy>multi-chassis>peer# no shutdown[/python]
[/codegroup]
Then we create the MC-LAG itself. The lacp-key, system-id and system-priority must be the same on both routers:
[codegroup]
[python linenumbers=”false” tab=”R1 MC-LAG to R5″]*A:R1>config>redundancy>multi-chassis>peer# mc-lag
*A:R1>config>redundancy>mc>peer>mc-lag# lag 1 lacp-key 2468 remote-lag 1 system-id 00:00:be:ef:ca:fe system-priority 1000
*A:R1>config>redundancy>mc>peer>mc-lag# no shutdown[/python]
[python linenumbers=”false” tab=”R2 MC-LAG to R5″]*A:R2>config>redundancy>multi-chassis>peer# mc-lag
*A:R2>config>redundancy>mc>peer>mc-lag# lag 1 lacp-key 2468 remote-lag 1 system-id 00:00:be:ef:ca:fe system-priority 1000
*A:R2>config>redundancy>mc>peer>mc-lag# no shutdown[/python]
[/codegroup]
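Because the lacp-key, system-id and system-priority have to match on both chassis, it can be worth comparing the two `lag ...` lines mechanically before committing them. The following is a minimal sketch that works on the command text as typed above; `parse_mc_lag` and `mismatches` are hypothetical helper names, not part of any Nokia tooling.

```python
# Parameters that must be identical on both MC-LAG peers.
MUST_MATCH = ("lacp-key", "system-id", "system-priority")


def parse_mc_lag(line):
    """Split a 'lag <id> <key> <value> ...' line into a dict of parameters."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "lag" or len(tokens) % 2:
        raise ValueError(f"unexpected mc-lag line: {line!r}")
    params = {"lag": tokens[1]}
    # Remaining tokens alternate between keyword and value.
    params.update(zip(tokens[2::2], tokens[3::2]))
    return params


def mismatches(line_a, line_b):
    """Return the must-match parameters that differ between two peers."""
    a, b = parse_mc_lag(line_a), parse_mc_lag(line_b)
    return [key for key in MUST_MATCH if a.get(key) != b.get(key)]


r1 = ("lag 1 lacp-key 2468 remote-lag 1 "
      "system-id 00:00:be:ef:ca:fe system-priority 1000")
r2 = ("lag 1 lacp-key 2468 remote-lag 1 "
      "system-id 00:00:be:ef:ca:fe system-priority 1000")

print(mismatches(r1, r2))  # an empty list means the peers agree
```

The `lag` and `remote-lag` values are deliberately left out of the comparison, since the local LAG IDs do not have to be the same number on each router.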
Now the MC-LAG should be up and running. First we’ll check the peering:
[python linenumbers=”false”]*A:R1>config>redundancy>mc>peer>mc-lag# show redundancy multi-chassis all

===============================================================================
Multi-Chassis Peers
===============================================================================
Peer IP Peer Admin Client Admin Oper State
Src IP Auth
——————————————————————————-
10.10.10.2 Enabled MC-Sync: — — —
10.10.10.1 None MC-Ring: — — —
MC-Endpt: — — —
MC-Lag: Enabled Enabled —
MC-IPsec: — — Disabled
===============================================================================[/python]
[python linenumbers=”false”]*A:R2>config>redundancy>mc>peer>mc-lag# show redundancy multi-chassis all

===============================================================================
Multi-Chassis Peers
===============================================================================
Peer IP Peer Admin Client Admin Oper State
Src IP Auth
——————————————————————————-
10.10.10.1 Enabled MC-Sync: — — —
10.10.10.2 None MC-Ring: — — —
MC-Endpt: — — —
MC-Lag: Enabled Enabled —
MC-IPsec: — — Disabled
===============================================================================[/python]
Looks promising, let’s check our LAG status:
[codegroup][python linenumbers=”false” tab=”R1 LAG Status”]*A:R1>config>redundancy>mc>peer>mc-lag# show lag

===============================================================================
Lag Data
===============================================================================
Lag-id Adm Opr Weighted Threshold Up-Count MC Act/Stdby
——————————————————————————-
1 up down No 0 0 standby
——————————————————————————-
Total Lag-ids: 1 Single Chassis: 0 MC Act: 0 MC Stdby: 1
===============================================================================[/python]
[python linenumbers=”false” tab=”R2 LAG Status”]*A:R2>config>redundancy>mc>peer>mc-lag# show lag

===============================================================================
Lag Data
===============================================================================
Lag-id Adm Opr Weighted Threshold Up-Count MC Act/Stdby
——————————————————————————-
1 up down No 0 0 standby
——————————————————————————-
Total Lag-ids: 1 Single Chassis: 0 MC Act: 0 MC Stdby: 1
===============================================================================[/python][/codegroup]
Ummm… both routers are showing the LAG as operationally down and in multi-chassis standby.
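A condition like this is easy to spot in a script as well. The sketch below assumes the `show lag` row has the seven whitespace-separated columns seen above and that the last one is the MC Act/Stdby state; the sample text is a trimmed copy of the output, and the function names are made up for illustration.

```python
R1_SHOW_LAG = """\
Lag-id Adm Opr Weighted Threshold Up-Count MC Act/Stdby
1 up down No 0 0 standby
Total Lag-ids: 1 Single Chassis: 0 MC Act: 0 MC Stdby: 1
"""
R2_SHOW_LAG = R1_SHOW_LAG  # R2 printed the same row in the test above


def mc_state(show_lag_output, lag_id="1"):
    """Return the MC Act/Stdby column for one LAG, or None if not found."""
    for line in show_lag_output.splitlines():
        fields = line.split()
        # A data row has exactly seven fields and starts with the LAG id.
        if len(fields) == 7 and fields[0] == lag_id:
            return fields[6]
    return None


def mc_lag_problem(states):
    """Describe what is wrong with a pair of MC-LAG states, or return None."""
    active = [router for router, state in states.items() if state == "active"]
    if len(active) == 1:
        return None
    if not active:
        return "no chassis is active"
    return "more than one chassis is active"


states = {"R1": mc_state(R1_SHOW_LAG), "R2": mc_state(R2_SHOW_LAG)}
print(states, mc_lag_problem(states))
```

In a healthy active-standby MC-LAG exactly one chassis should report `active`, so anything else is worth investigating.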

It turns out that for MC-LAG the base chassis MAC needs to be unique on each router too. While we cannot directly change the base MAC prior to SROS version 14.0R4, there is an alternative method available: if we set the out-of-band management Ethernet IP address in the BOF, this influences the chassis MAC address.
[python linenumbers=”false”]*A:R1>config>lag# show bof
===============================================================================
BOF (Memory)
===============================================================================
primary-image cf3:\timos\both.tim
primary-config cf3:\config.cfg
autonegotiate
duplex full
speed 100
wait 3
persist off
no li-local-save
no li-separate
console-speed 115200
===============================================================================
*A:R1>config>lag# /bof address 192.168.100.1/24
*A:R1>config>lag# /bof save
Writing BOF to cf3:/bof.cfg … OK
Completed.
*A:R1>config>lag# show bof
===============================================================================
BOF (Memory)
===============================================================================
primary-image cf3:\timos\both.tim
primary-config cf3:\config.cfg
address 192.168.100.1/24 active
autonegotiate
duplex full
speed 100
wait 3
persist off
no li-local-save
no li-separate
console-speed 115200
===============================================================================[/python]
Save and reboot:
[python linenumbers=”false”]*A:R1>config>lag# /admin save
Writing configuration to cf3:\config.cfg
Saving configuration … OK
Completed.
A:R1>config>lag# /admin reboot
Are you sure you want to reboot (y/n)? y[/python]
We’ll do the same thing with R2 but give it a different IP so the MAC Addresses should be different:
[python linenumbers=”false”]*A:R2>config>lag# /bof address 192.168.100.2/24
*A:R2>config>lag# /bof save
Writing BOF to cf3:/bof.cfg … OK
Completed.
*A:R2>config>lag# /admin save
Writing configuration to cf3:\config.cfg
Saving configuration … OK
Completed.
A:R2>config>lag# /admin reboot
Are you sure you want to reboot (y/n)? y [/python]
After the reboot we can compare the base MAC addresses of R1 and R2:
[python linenumbers=”false”]A:R1# show chassis detail | match MAC
Base MAC address : c8:01:ff:00:00:00[/python]
[python linenumbers=”false”]A:R2# show chassis detail | match MAC
Base MAC address : c8:02:ff:00:00:00[/python]
Okay, they are different now. Has it resolved our MC-LAG issue?
[codegroup][python linenumbers=”false” tab=”R1 LAG Port”]A:R1# show lag 1 port

===============================================================================
Lag Port States
LACP Status: e – Enabled, d – Disabled
===============================================================================
Lag-id Port-id Adm Act/Stdby Opr Primary Sub-group Forced Priority
——————————————————————————-
1(e) 1/1/3 up active up yes 1 – 32768
===============================================================================[/python]

[python linenumbers=”false” tab=”R2 LAG Port”]A:R2# show lag 1 port

===============================================================================
Lag Port States
LACP Status: e – Enabled, d – Disabled
===============================================================================
Lag-id Port-id Adm Act/Stdby Opr Primary Sub-group Forced Priority
——————————————————————————-
1(e) 1/1/3 up standby down yes 1 – 32768
===============================================================================[/python]
[python linenumbers=”false” tab=”R5 LAG Port”]A:R5# show lag 1 port

===============================================================================
Lag Port States
LACP Status: e – Enabled, d – Disabled
===============================================================================
Lag-id Port-id Adm Act/Stdby Opr Primary Sub-group Forced Priority
——————————————————————————-
1(e) 1/1/1 up active up yes 1 – 32768
1/1/2 up active down 1 – 32768
===============================================================================[/python][/codegroup]
Yes, R1, R2 and R5 are in alignment. Now let’s put the LAG into VPLS 100 on R1 and R2:
[python linenumbers=”false”]A:R1# /configure service vpls 100 sap lag-1 create[/python]
[python linenumbers=”false”]A:R2# /configure service vpls 100 sap lag-1 create[/python]
Let’s see if R5 can ping R4:
[python linenumbers=”false”]A:R5# ping 192.168.1.4 count 1
PING 192.168.1.4 56 data bytes
64 bytes from 192.168.1.4: icmp_seq=1 ttl=64 time=12.3ms.

—- 192.168.1.4 PING Statistics —-
1 packet transmitted, 1 packet received, 0.00% packet loss
round-trip min = 12.3ms, avg = 12.3ms, max = 12.3ms, stddev = 0.000ms[/python]
Success!

Let’s check the MAC address table (Forwarding Database) in VPLS 100:
[codegroup]
[python linenumbers=”false” tab=”R1 FDB” highlight “9,10”]*A:R1>config>service>vpls>sap$ show service id 100 fdb detail

===============================================================================
Forwarding Database, Service 100
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
——————————————————————————-
100 50:00:00:07:00:01 sdp:3:100 L/0 04/21/17 14:47:33
100 da:00:ff:00:01:42 sap:lag-1 L/0 04/21/17 14:52:57
——————————————————————————-
No. of MAC Entries: 2
——————————————————————————-
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================[/python]
[python linenumbers=”false” tab=”R2 FDB” highlight “9,10”]*A:R2>config>service>vpls>sap$ show service id 100 fdb detail

===============================================================================
Forwarding Database, Service 100
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
——————————————————————————-
100 50:00:00:07:00:01 sdp:1:100 L/90 04/21/17 14:53:01
100 da:00:ff:00:01:42 sdp:1:100 L/90 04/21/17 14:45:05
——————————————————————————-
No. of MAC Entries: 2
——————————————————————————-
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================[/python]
[python linenumbers=”false” tab=”R3 FDB” highlight “9,10”]*A:R3>config>service>vpls>sap$ show service id 100 fdb detail

===============================================================================
Forwarding Database, Service 100
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
——————————————————————————-
100 50:00:00:07:00:01 sap:1/1/1 L/0 04/21/17 14:52:42
100 da:00:ff:00:01:42 sdp:1:100 L/0 04/21/17 14:44:46
——————————————————————————-
No. of MAC Entries: 2
——————————————————————————-
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================[/python][/codegroup]
Now to check out the MC-LAG resiliency, we’ll start a continuous ping from R5 to R4 and then shut down port 1/1/3 (the LAG 1 member) on R1:
[python linenumbers=”false”]*A:R1>config>service>vpls>sap$ /configure port 1/1/3 shutdown[/python]
And check whether port 1/1/3 in LAG 1 on R2 goes from standby to active:
[python linenumbers=”false”]*A:R2>config>service>vpls>sap$ show lag 1 port

===============================================================================
Lag Port States
LACP Status: e – Enabled, d – Disabled
===============================================================================
Lag-id Port-id Adm Act/Stdby Opr Primary Sub-group Forced Priority
——————————————————————————-
1(e) 1/1/3 up active up yes 1 – 32768
===============================================================================[/python]
We can see the port has come up. A few packets were lost but the link recovered. We could speed up the convergence time, but I think the general concept has been demonstrated successfully.

The moral of the story here: with virtual SROS systems, it’s worth ensuring each one has a unique chassis MAC address!
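For a lab with more than a couple of virtual routers, a uniqueness check on the collected base MAC addresses is cheap insurance. A minimal sketch, assuming the MACs have already been gathered from `show chassis detail | match MAC` on each node; `duplicate_macs` is a hypothetical helper and the `clash` data is invented purely to show the failure case.

```python
from collections import defaultdict


def duplicate_macs(base_macs):
    """Map each base MAC used by more than one router to those routers."""
    owners = defaultdict(list)
    for router, mac in base_macs.items():
        owners[mac.lower()].append(router)  # compare case-insensitively
    return {mac: routers for mac, routers in owners.items() if len(routers) > 1}


# Values reported by R1 and R2 after the BOF change and reboot.
after_fix = {"R1": "c8:01:ff:00:00:00", "R2": "c8:02:ff:00:00:00"}
print(duplicate_macs(after_fix))  # empty dict: no clash

# Hypothetical example of two nodes that ended up with the same base MAC.
clash = {"RA": "AA:BB:CC:00:00:01", "RB": "aa:bb:cc:00:00:01"}
print(duplicate_macs(clash))
```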

Working with eve-ng (Active-Backup Bond Interfaces with eth0 and wlan0)

After a lack of updates, it’s time for a new blog post. This post is about Linux networking, particularly using active-backup bond interfaces for wired and wireless LAN interfaces, which is part of creating my updated virtual network lab environment.

Unetlab, which was pretty much an alternative to GNS3, has now evolved into eve-ng, which is quite a nice system. Amongst other things, it has a custom Linux kernel that doesn’t block L2 slow protocols like LACP. One of the things I specifically like about eve-ng, besides the facility to use HTML5 sessions to handle telnet/VNC consoles (as well as native tools), is that there are some SROS-specific modifications that support the distributed VSR models as well as passing SMBIOS parameters etc.

I did a bare metal install pretty much following the process described in http://www.eve-ng.net/index.php/documentation/installation/bare-install but did a few more things.

I installed xfce4 so that I have a graphical desktop with Firefox and can use eve-ng locally, not just by remoting into it.

I also made a few changes to the base install network configuration to allow the use of the copper Ethernet as the primary interface, falling back to wireless.

Normally you cannot add a wireless interface into a bridge (eve-ng normally binds eth0 into bridge pnet0, but simply adding wlan0 didn’t work).

Fortunately you can add a bond into a bridge, and the bond is less picky about who joins.

These are the items I added to /etc/network/interfaces:

#Bond0 Config
auto bond0
iface bond0 inet manual
    bond-slaves eth0 wlan0
    bond-mode 1
    bond-miimon 100
  • bond-slaves are the link members of the bond (eth0 and wlan0 are my copper and wireless lan interfaces respectively)
  • bond-mode 1 is active-backup: only one interface at a time will be operational, with preference given to the interface configured as bond-primary
  • bond-miimon 100 means that every 100ms the link state is checked
# Wireless interface
allow-hotplug wlan0
iface wlan0 inet manual
    wpa-ssid ReplaceThisWithYourSSID
    wpa-psk ReplaceThisWithYourPresharedKey
    bond-master bond0

I’m not sure if it’s mandatory, but allow-hotplug wlan0 seems to help and bond-master seemed to be required.

The eth0 section was modified to the following:

# The primary network interface
allow-hotplug eth0
iface eth0 inet manual
    bond-master bond0
    bond-primary eth0
  • Here allow-hotplug eth0 seems to wake the system to the fact a cable has been connected
  • bond-master bond0 as with wlan0, this appears to be needed
  • bond-primary eth0 means that when both eth0 and wlan0 are up, eth0 should be the one used.
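To make the effect of bond-mode 1 together with bond-primary concrete, here is a deliberately simplified model of how the active slave is chosen. It is a sketch of the behaviour described above (with the primary always reselected when its link returns), not the kernel bonding driver's actual code, and it ignores timers such as miimon and the up/down delays. The function name and event list are made up for illustration.

```python
def select_active(link_up, primary, current=None):
    """Pick the active slave for an active-backup bond that prefers `primary`."""
    if link_up.get(primary, False):
        return primary                      # primary wins whenever it is up
    if current is not None and link_up.get(current, False):
        return current                      # otherwise stay where we are
    for name, up in link_up.items():
        if up:
            return name                     # fail over to any working slave
    return None                             # no usable link at all


events = [
    ("both links up", {"eth0": True, "wlan0": True}),
    ("cable pulled", {"eth0": False, "wlan0": True}),
    ("cable reinserted", {"eth0": True, "wlan0": True}),
]

history = []
active = None
for label, links in events:
    active = select_active(links, primary="eth0", current=active)
    history.append((label, active))
    print(f"{label}: active slave = {active}")
```

The sequence mirrors a cable pull and reinsert: traffic moves to wlan0 while eth0 is down and returns to eth0 as soon as its link is back.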

And finally pnet0 was modified to use bond0 instead of eth0:

auto pnet0
iface pnet0 inet dhcp
    bridge_ports bond0
    bridge_stp off

So after issuing a “service networking restart”, here’s our verification that the bond interface is up:

root@m4600:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth0 (primary_reselect always)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: d0:67:e5:57:12:9e
Slave queue ID: 0

Slave Interface: wlan0
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 24:77:03:b1:9f:78
Slave queue ID: 0
root@m4600:~# brctl show pnet0
bridge name     bridge id               STP enabled     interfaces
pnet0           8000.d067e557129e       no              bond0
root@m4600:~# ip -4 addr show pnet0
4: pnet0:  mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 192.168.1.31/24 brd 192.168.1.255 scope global pnet0
       valid_lft forever preferred_lft forever

Quick Network Verification:

root@m4600:~# ping 8.8.8.8 -c 3
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=55 time=30.1 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=55 time=24.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=55 time=28.3 ms

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 24.924/27.798/30.131/2.164 ms

Summarise the bond status:

root@m4600:~# grep -A 1 "Interface\|Primary" /proc/net/bonding/bond0
Primary Slave: eth0 (primary_reselect always)
Currently Active Slave: eth0
--
Slave Interface: eth0
MII Status: up
--
Slave Interface: wlan0
MII Status: up
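The same summary can be produced in Python, which is handy if the result needs to feed a monitoring script rather than a terminal. This is a minimal sketch: `summarise_bond` is a hypothetical helper, and `SAMPLE` is a trimmed copy of the /proc/net/bonding/bond0 contents shown earlier with both links up.

```python
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth0 (primary_reselect always)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up
Permanent HW addr: d0:67:e5:57:12:9e

Slave Interface: wlan0
MII Status: up
Permanent HW addr: 24:77:03:b1:9f:78
"""


def summarise_bond(text):
    """Extract primary, active slave and per-slave MII status from the file."""
    summary = {"primary": None, "active": None, "slaves": {}}
    current = None
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue                        # blank or non "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Primary Slave":
            summary["primary"] = value.split()[0]
        elif key == "Currently Active Slave":
            summary["active"] = value
        elif key == "Slave Interface":
            current = value
            summary["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # Only statuses that follow a "Slave Interface" line are per-slave.
            summary["slaves"][current] = value
    return summary


summary = summarise_bond(SAMPLE)
print(summary)
```

On the lab machine the text would come from reading /proc/net/bonding/bond0 instead of the embedded sample.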

Now pull out the Ethernet cable:

root@m4600:~# ping 8.8.8.8 -c 3
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=55 time=28.0 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=55 time=35.6 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=55 time=27.9 ms

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 27.910/30.537/35.614/3.593 ms

Verify the bond interface is using wlan0:

root@m4600:~# grep -A 1 "Interface\|Primary" /proc/net/bonding/bond0
Primary Slave: eth0 (primary_reselect always)
Currently Active Slave: wlan0
--
Slave Interface: eth0
MII Status: down
--
Slave Interface: wlan0
MII Status: up

Reinsert the Ethernet cable:

root@m4600:~# ping 8.8.8.8 -c 3
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=55 time=23.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=55 time=21.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=55 time=23.1 ms

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 21.968/22.822/23.397/0.628 ms
root@m4600:~# grep -A 1 "Interface\|Primary" /proc/net/bonding/bond0
Primary Slave: eth0 (primary_reselect always)
Currently Active Slave: eth0
--
Slave Interface: eth0
MII Status: up
--
Slave Interface: wlan0
MII Status: up

So this is all good. (Actually, during this testing I was SSHed into the laptop and the session didn’t break.)