28 Mar

Resolving config_database_cassandra Container Restart Loop

A not-so-fun error occurred earlier today when my standalone Contrail/Tungsten Fabric Controller host went down. After I brought it back up, the Cassandra DB did not recover, and the services that depend on it reported connection errors.

Note: a standalone Controller is not a recommended design, because of the nature of the components running on it and the way the vRouter connects to controllers. This post discusses a PoC setup.

Running the contrail-status command displays all services on that node (the output will be different on vRouter nodes):

== Contrail control ==
control: initializing (Database:Cassandra connection down)
nodemgr: initializing

== Contrail config-database ==
nodemgr: initializing (Cassandra state detected DOWN. )

== Contrail database ==
nodemgr: initializing

== Contrail analytics ==
snmp-collector: initializing (Database:Cassandra[] connection down)
query-engine: initializing
alarm-gen: initializing (Database:Cassandra[] connection down)
nodemgr: initializing
collector: initializing (Database:Cassandra, Database:contrail-01.ameen.lab:Global connection down)
topology: initializing (Database:Cassandra[] connection down)

== Contrail webui ==

== Contrail config ==
svc-monitor: initializing (Database:Cassandra[] connection down)
nodemgr: initializing
device-manager: initializing (ApiServer:ApiServer[] connection down)
api: initializing (Database:Cassandra[] connection down)
schema: initializing (ApiServer:ApiServer[] connection down)
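
With this many services listed, it can help to filter the output down to the ones that are not healthy. The following is a minimal sketch, not part of Contrail itself: it parses text in the format shown above and returns every service whose state is not `active`. The function name `stuck_services` is hypothetical, and the assumption that healthy services report `active` should be checked against your own output.

```python
import re

def stuck_services(status_text):
    """Return (section, service, state, reason) for each service that is not 'active'.

    Assumes the layout shown above: '== Section ==' headers followed by
    'service: state (optional reason)' lines.
    """
    results = []
    section = None
    for raw in status_text.splitlines():
        line = raw.strip()
        header = re.match(r"^==\s*(.+?)\s*==$", line)
        if header:
            section = header.group(1)
            continue
        m = re.match(r"^([\w-]+):\s*(\w+)\s*(?:\((.*)\))?\s*$", line)
        # Assumption: a healthy service reports the state 'active'.
        if m and m.group(2) != "active":
            reason = (m.group(3) or "").strip()
            results.append((section, m.group(1), m.group(2), reason))
    return results
```

Feeding it the output above would list every `initializing` service along with the reason in parentheses, which makes it easy to see that almost everything points back to Cassandra.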

Also, some containers showed an uptime of less than 2 minutes even though the controller node itself had been up for almost an hour, which suggests they were restarting repeatedly:

Pod              Service         Original Name                          State    Status      
config-database  cassandra       contrail-external-cassandra            running  Up 11 seconds  
database         cassandra       contrail-external-cassandra            running  Up About a minute  
control          nodemgr         contrail-nodemgr                       running  Up About a minute 
config-database  nodemgr         contrail-nodemgr                       running  Up 34 seconds 
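
The comparison between container uptime and host uptime can be automated. Below is a rough sketch under a few assumptions: the Status column uses strings like the ones above ("Up 11 seconds", "Up About a minute"), the host uptime is supplied by the caller in seconds, and the `ratio` threshold is an arbitrary value chosen for illustration. The function names are hypothetical. A short uptime on its own does not prove a restart loop, so treat the result as a hint to go and look at the container.

```python
import re

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}

def uptime_seconds(status):
    """Convert 'Up 11 seconds' or 'Up About a minute' to approximate seconds.

    Returns None when the status does not match the expected pattern.
    """
    m = re.match(
        r"^Up\s+(?:About\s+)?(\d+|an?)\s+(second|minute|hour|day|week)s?\b",
        status.strip(),
    )
    if not m:
        return None
    count = 1 if m.group(1) in ("a", "an") else int(m.group(1))
    return count * UNIT_SECONDS[m.group(2)]

def recently_restarted(rows, host_uptime_s, ratio=0.1):
    """Flag containers whose uptime is below ratio * host uptime.

    rows is an iterable of (pod, service, status) tuples.
    """
    flagged = []
    for pod, service, status in rows:
        secs = uptime_seconds(status)
        if secs is not None and secs < host_uptime_s * ratio:
            flagged.append((pod, service, secs))
    return flagged
```

With a host uptime of roughly 55 minutes, all four rows in the table above fall well under the threshold, while a container reporting "Up 54 minutes" would not be flagged.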

Checking on the Cassandra container revealed the issue: