...
High Availability Test Scenarios
Test | Failure Scenario | TestON Test name | Roadmap
This test runs through all the state and functionality checks of the HA Test Suite but waits 60 seconds instead of inducing a failure. It is run on a 7-node ONOS cluster. |
HAsanity |
now
Restart 3 of 7 ONOS nodes by gracefully stopping the process once the system is running and stable. | HAstopNodes |
Minority of ONOS Nodes continuous shutdown | Restart 1 of 7 ONOS nodes 1000 times in succession by gracefully stopping the process once the system is running and stable. Then verify that the node restarts correctly and rejoins the cluster. | HAcontinuousStopNodes |
Restart 3 of 7 ONOS nodes by killing the process once the system is running and stable. | HAkillNodes |
now
Restart 7 of 7 ONOS nodes by killing the process once the system is running and stable. | HAclusterRestart |
Restart 1 of 1 ONOS nodes by killing the process once the system is running and stable. | HAsingleInstanceRestart |
now
Partition the control network by creating iptables rules once the system is in a stable state. During partition:
After partition is healed:
|
HAfullNetPartition |
Dynamic Clustering: Swap nodes | Change membership of an ONOS cluster at run time | HAswapNodes |
Dynamic Clustering: Scale up/down | Change the size of an ONOS cluster at run time | HAscaling |
Offline Backup Recovery | Take a backup of ONOS data and restore ONOS using the backup | HAbackupRecover |
ISSU | Perform an In-Service Software Upgrade (ISSU) of ONOS | HAupgrade |
ISSU - Rollback | Roll back an In-Service Software Upgrade (ISSU) of ONOS | HAupgradeRollback |
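The control network partition scenario relies on iptables rules that block traffic between two groups of nodes. The following is a minimal sketch of how such rules could be generated for a 7-node cluster and later removed to heal the partition. The node addresses, the 3/4 split, and the helper names `partition_rules` and `heal_rules` are assumptions made for illustration; this is not the HAfullNetPartition implementation.

```python
from itertools import product


def partition_rules(side_a, side_b):
    """Build iptables commands, keyed by node IP, that drop all traffic
    exchanged with nodes on the other side of the partition."""
    rules = {ip: [] for ip in side_a + side_b}
    for a, b in product(side_a, side_b):
        # On node a: drop packets from b and packets destined for b.
        rules[a].append(f"iptables -A INPUT -s {b} -j DROP")
        rules[a].append(f"iptables -A OUTPUT -d {b} -j DROP")
        # On node b: the mirror-image rules.
        rules[b].append(f"iptables -A INPUT -s {a} -j DROP")
        rules[b].append(f"iptables -A OUTPUT -d {a} -j DROP")
    return rules


def heal_rules(rules):
    """Turn each append (-A) into the matching delete (-D) so that the
    partition can be healed by removing exactly the rules that were added."""
    return {
        ip: [cmd.replace(" -A ", " -D ", 1) for cmd in cmds]
        for ip, cmds in rules.items()
    }


# Hypothetical addresses for a 7-node cluster, split into 3 and 4 nodes.
nodes = [f"10.0.0.{i}" for i in range(1, 8)]
minority, majority = nodes[:3], nodes[3:]
rules = partition_rules(minority, majority)
healing = heal_rules(rules)
```

In this sketch each node only receives rules that mention nodes on the opposite side, so traffic within each side of the partition is left untouched.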
State and Functionality Checks in the HA Test Suite
Description | Passing Criteria | Roadmap
Topology Discovery |
|
Device Mastership |
|
Intents |
|
Switch Failure |
|
Link Failure |
|
Leadership Election | Applications can run for leadership of topics. This service should be safe, stable and fault tolerant.
|
Distributed Sets | Call each of the following APIs and make sure they are functional and cluster-wide
In addition, we check that sets are unaffected by ONOS failures |
Distributed Atomic Counters | Call each of the following APIs and make sure they are functional and cluster-wide
In addition, we check that counters are unaffected by ONOS failures. Note: In-memory counters will not persist across cluster-wide restarts |
now
Cluster Service |
|
now
Application Service |
|
now
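The "functional and cluster-wide" checks for distributed sets and counters come down to issuing the same query to every node and requiring that all answers agree, both before and after a failure. A minimal sketch of that comparison follows. It assumes the per-node responses have already been collected into a dictionary; the function names and node names are hypothetical and do not come from the HA Test Suite.

```python
def normalise(value):
    """Compare collections by content, ignoring the order in which a node
    happened to list the elements."""
    if isinstance(value, (set, list, tuple)):
        return frozenset(value)
    return value


def cluster_agrees(responses):
    """True when every node in `responses` (node name -> value) reported
    the same value. An empty response map counts as a failure."""
    return len({normalise(v) for v in responses.values()}) == 1


def state_unaffected(before, after):
    """True when all nodes agree before and after the failure and the
    agreed value is unchanged."""
    if not (cluster_agrees(before) and cluster_agrees(after)):
        return False
    return normalise(next(iter(before.values()))) == normalise(next(iter(after.values())))


# Example: a set reported by three hypothetical nodes, in different orders.
set_view = {"onos1": ["a", "b"], "onos2": ["b", "a"], "onos3": {"a", "b"}}
print(cluster_agrees(set_view))
```

For in-memory counters the `state_unaffected` comparison would be expected to fail after a cluster-wide restart, as noted in the table above.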
...