  1. JBoss A-MQ
  2. ENTMQ-382

mq-discover; MQ broker instances disappear from cluster-list when the ZooKeeper session expires and a new session is established - this can result in two active brokers in a master/slave setup.

Details

    • Bug
    • Resolution: Done
    • Blocker
    • JBoss A-MQ 6.1
    • JBoss A-MQ 6.0
    • None
    • None

    Description

      When a ZooKeeper session expires and a new session is created from the container (as below), the container is listed as "active" in the container-list, but the cluster-list does not list the broker running within that container.

      2013-06-22 00:23:27,246 | INFO | .40.26.207:2181) | ClientCnxn | .zookeeper.ClientCnxn$SendThread 1049 | 58 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.2.0.redhat-024 | Unable to reconnect to ZooKeeper service, session 0x23f655f62400001 has expired, closing socket connection

      ... shortly after, a new session is created:

      2013-06-22 00:23:27,586 | INFO | .40.26.209:2181) | ClientCnxn | .zookeeper.ClientCnxn$SendThread 1175 | 58 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.2.0.redhat-024 | Session establishment complete on server <myip_address>:2181, sessionid = 0x23f655f6240001c, negotiated timeout = 30000

      I am assuming this is because the ephemeral node for the broker cluster is not recreated when the ZooKeeper session is restarted after expiry. I think this behavior is problematic:

      1. Potential loss of slave instances from the cluster group
          • the ZooKeeper session expires on the slave instance
          • the ephemeral znode is removed, as it is associated with that session
          • a new ZooKeeper session is created, but the ephemeral node is not recreated in the cluster
          • the instance will not be "discovered" by the mq-discovery mechanism, as no node is registered in ZooKeeper

      2. Potentially two active brokers in the cluster group (two masters)
          • the ZooKeeper session expires on the master instance
          • the ephemeral znode is removed, as it is associated with that session
          • a new ZooKeeper session is created, but the ephemeral node is not recreated in the cluster
          • the slave broker is promoted to master
          • the original master broker is still running (but is not listed in the cluster group)

      HOW TO REPLICATE
      =============

      (Scenario 1)
      --------------------
      Issue the following karaf/fabric commands:

      1. fabric:create
      2. fabric:mq-create --group mq_g50 --create-container child_1,child_2 my_mq_profile

      Assuming child_1 is the master, pause container child_2 for >30 seconds (using "kill -17 PID" to pause and "kill -19 PID" to resume; these signal numbers are platform-specific, and the portable forms are "kill -STOP PID" and "kill -CONT PID").

      3. container-list - will show the child_2 container as active again (as expected)
      4. cluster-list - will show no reference to the child_2 broker
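
The pause/resume step can also be scripted. The following is a minimal sketch, assuming a POSIX system; `pause_process` is a hypothetical helper name, not part of Fabric or Karaf. It uses the symbolic signal names so that it does not depend on platform-specific signal numbers.

```python
import os
import signal
import time


def pause_process(pid: int, seconds: float) -> None:
    """Suspend the process `pid` for `seconds`, then let it continue."""
    # SIGSTOP freezes the process, so a container JVM stops sending
    # ZooKeeper heartbeats while it is suspended.
    os.kill(pid, signal.SIGSTOP)
    try:
        # To trigger session expiry this must exceed the negotiated
        # session timeout (30000 ms in the log above).
        time.sleep(seconds)
    finally:
        # Always resume the process, even if the sleep is interrupted.
        os.kill(pid, signal.SIGCONT)
```

For example, `pause_process(pid_of_child_2, 35)` would suspend the child_2 container for 35 seconds, where `pid_of_child_2` stands for the PID looked up by hand.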

      (Scenario 2)
      --------------------
      Same setup as scenario 1, but:
      1. ensure the KahaDB store is not sharing the same master/slave lock
      2. pause the master container rather than the slave.
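
The behavior assumed in the description can be illustrated with a small in-memory model. This is only a sketch of the hypothesis, not Fabric or ZooKeeper code: `Registry`, `Broker`, `simulate`, and the `/mq_g50/<name>` paths are all hypothetical names chosen for illustration. It shows that an ephemeral node tied to an expired session disappears, and that the broker stays out of the group unless it re-registers under the new session.

```python
class Registry:
    """Toy stand-in for ZooKeeper: each ephemeral node belongs to one session."""

    def __init__(self):
        self.nodes = {}            # path -> id of the owning session
        self._next_session = 0

    def new_session(self) -> int:
        self._next_session += 1
        return self._next_session

    def create_ephemeral(self, session: int, path: str) -> None:
        self.nodes[path] = session

    def expire(self, session: int) -> None:
        # Expiry removes every ephemeral node owned by that session.
        self.nodes = {p: s for p, s in self.nodes.items() if s != session}

    def members(self):
        return sorted(self.nodes)


class Broker:
    """Hypothetical broker that registers itself in a cluster group."""

    def __init__(self, name: str, registry: Registry, reregister: bool):
        self.path = f"/mq_g50/{name}"   # illustrative path, not Fabric's layout
        self.registry = registry
        self.reregister = reregister
        self.session = registry.new_session()
        registry.create_ephemeral(self.session, self.path)

    def session_expired(self) -> None:
        self.registry.expire(self.session)
        self.session = self.registry.new_session()   # container reconnects
        if self.reregister:
            # The step the description assumes is missing.
            self.registry.create_ephemeral(self.session, self.path)


def simulate(reregister: bool):
    """Scenario 1: child_2 is paused past the session timeout."""
    registry = Registry()
    Broker("child_1", registry, reregister)
    child_2 = Broker("child_2", registry, reregister)
    child_2.session_expired()
    return registry.members()


print(simulate(reregister=False))   # ['/mq_g50/child_1']
print(simulate(reregister=True))    # ['/mq_g50/child_1', '/mq_g50/child_2']
```

In the first run child_2 has a live session but no node, which corresponds to a container that is "active" in container-list while its broker is absent from cluster-list.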

            People

              dejanbosanac Dejan Bosanac
              rhn-support-pfox Patrick Fox (Inactive)