
Key: MB-1156
Type: Improvement
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Gary Tully
Reporter: Dave Stanley
Votes: 0
Watchers: 0
FUSE Message Broker

Add support for lease based lock to jdbc persistent adapter.

Created: 11/May/12 04:19 PM   Updated: 20/Jun/12 03:12 PM
Component/s: broker
Affects Version/s: None
Fix Version/s: 5.5.1-fuse-07-11

Environment: Fuse MB 5.5.x
Issue Links:
Fixed By
 

External Issue URL: https://issues.apache.org/jira/browse/AMQ-3654


Description
Add support for lease based lock to jdbc persistent adapter.

The current transaction based locking mechanism works well for a single broker but
is difficult to configure correctly in a master/slave scenario.

The two main problems are:

1) You need to use a combination of an IOExceptionHandler and the lockKeepAlivePeriod
in order to get the desired master/slave behavior, making the current solution difficult
to configure.

2) It would be preferable for a master broker to stay alive when it loses its connection
to the DB, but shut down its transports and then go into retry mode to reacquire the lock.
In reacquire mode, it is effectively a slave with its transports shut down.

This is currently possible using an IOExceptionHandler, but the problem
is that once the broker refreshes its connection, it doesn't refresh the lock,
so there are edge conditions where the slave can incorrectly acquire
the lock after a failover (and mixed configs between master and slave are
required to work around the issue).

This enhancement is to add support for a lease based lock. This would allow
us to simplify master/slave jdbc config.

Master Behavior:

When the first broker starts it acquires a lease for a time slice, renewing it periodically.

When the master broker:
+ Terminates - it automatically releases the lease.

+ Crashes - the slave detects that the lease has expired and if it can
acquire the lease it becomes the new master.

+ Network Glitch - once the master detects the connection is gone (so timeouts would
need to be configured, via a broker thread or JDBC driver timeouts), it goes into retry mode.
This would shut down the transport connectors until it can reacquire the lease.
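
The three cases above can be sketched with a small in-memory model of the lease row. This is only an illustration under stated assumptions: the LeaseTable class, its method names, and the explicit `now` argument are hypothetical and are not taken from the broker code, and a real implementation would perform each step as a single atomic SQL update against the lock table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaseTable:
    """Hypothetical stand-in for the single lock row in the database."""
    holder: Optional[str] = None
    expires_at: float = 0.0

    def try_acquire(self, broker: str, now: float, duration: float) -> bool:
        # A lease can be taken only if it is free or has already expired.
        if self.holder is None or self.expires_at <= now:
            self.holder = broker
            self.expires_at = now + duration
            return True
        return False

    def renew(self, broker: str, now: float, duration: float) -> bool:
        # Only the broker that still holds the row may extend the lease.
        # A False result tells a former master that it has lost the lease.
        if self.holder == broker:
            self.expires_at = now + duration
            return True
        return False

    def release(self, broker: str) -> None:
        # Clean termination: give the lease up immediately.
        if self.holder == broker:
            self.holder = None
            self.expires_at = 0.0
```

In this model a crashed master simply stops calling renew, so a slave's try_acquire starts succeeding once `now` reaches `expires_at`. A master returning from a network glitch sees renew return False if a slave took over in the meantime.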

Benefits:

In this manner, both graceful and ungraceful broker process terminations are detected. Also
we don't have the transaction log overhead associated with the current mechanism.

Downside:

A lease based approach would add lease renewal traffic between the broker and the DB. The slaves
will need to periodically try to acquire the lease, and the master will need to periodically renew
its lease. This is not expected to be significant.

The solution will require clocks to be synced between master and slave for reliable operation. It would be nice
if brokers were able to detect/protect against out-of-sync peers.

Config:

1) lease_ping_time - amount of time between lease renewals

2) lease_reap_time - if broker doesn't renew within lease_reap_time,
the db lock is released and open to a new master. lease_reap_time should
be larger than lease_ping_time.
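
As a rough illustration of how the two proposed settings relate, the sketch below validates them and counts how many renewal attempts fit inside one reap window. The function name and the millisecond units are assumptions made for the example; the proposal itself only names lease_ping_time and lease_reap_time.

```python
def check_lease_config(lease_ping_time_ms: int, lease_reap_time_ms: int) -> int:
    """Validate the two settings and return how many renewal attempts
    land strictly inside one reap window after a successful renewal."""
    if lease_ping_time_ms <= 0 or lease_reap_time_ms <= 0:
        raise ValueError("both times must be positive")
    if lease_reap_time_ms <= lease_ping_time_ms:
        raise ValueError("lease_reap_time must be larger than lease_ping_time")
    # Attempts happen at ping, 2*ping, ... after the last successful renewal;
    # count the ones that occur before the reap deadline.
    return (lease_reap_time_ms - 1) // lease_ping_time_ms
```

With a ping time of 1000 ms and a reap time of 3500 ms, the master gets three renewal attempts (at 1000, 2000 and 3000 ms) before the lease becomes available to a slave. A larger ratio tolerates more transient DB errors at the cost of slower failover.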

Nice to have:

+ activemq-admin command to tell you the current master
+ activemq-admin command to do a soft-failover (force expiration of current lease) ?



Gary Tully added a comment - 13/Jun/12 09:03 PM
https://issues.apache.org/jira/browse/AMQ-3654 shares a common problem that a lease based lock can help with.

Gary Tully added a comment - 14/Jun/12 04:06 PM - edited
Additional LeaseLocker on the 5.5.1 branch.
<ioExceptionHandler>
    <jDBCIOExceptionHandler/>
</ioExceptionHandler>

<persistenceAdapter>
    <jdbcPersistenceAdapter lockKeepAlivePeriod="1000" lockAcquireSleepInterval="2000">
        <databaseLocker>
            <lease-database-locker/>
        </databaseLocker>
    </jdbcPersistenceAdapter>
</persistenceAdapter>

The IOExceptionHandler will pause/resume the transport connectors on any IO exception related to access to the DB. This is important because transport restart is gated on a successful keepAlive, so that we avoid contending masters if the lease expires before the db comes back online.
The lease based lock is acquired by blocking at start and retained by renewing it every lockKeepAlivePeriod. On each renewal, the lease is extended by the lockAcquireSleepInterval, so in theory the master is always (lockAcquireSleepInterval - lockKeepAlivePeriod) ahead of the slave with respect to the lease.
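
To make the margin concrete, here is a small timeline calculation using the values from the config above (lockKeepAlivePeriod="1000", lockAcquireSleepInterval="2000"). The helper is purely illustrative and assumes renewals fire exactly on schedule with no DB latency.

```python
def lease_margins(lock_keep_alive_period, lock_acquire_sleep_interval, renewals=3):
    """Return (renewed_at, expires_at, margin) tuples in milliseconds.

    margin is the lease time still remaining when the next renewal is due.
    """
    rows = []
    for i in range(renewals):
        renewed_at = i * lock_keep_alive_period
        expires_at = renewed_at + lock_acquire_sleep_interval
        next_renewal = renewed_at + lock_keep_alive_period
        rows.append((renewed_at, expires_at, expires_at - next_renewal))
    return rows
```

For the values above, every row has a margin of 1000 ms, which is the lockAcquireSleepInterval minus the lockKeepAlivePeriod. In practice the margin shrinks by however long each renewal round trip takes.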
The lease is dropped on normal shutdown.
If the broker system clock is not in sync with the DB, setting maxAllowableDiffFromDBTime > 0 will adjust the lease duration whenever the skew exceeds the absolute maxAllowableDiffFromDBTime value, allowing the DB to dictate the UTC basis for the lease.
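
One way to picture the skew handling is sketched below. It is an assumption about the general shape of such an adjustment, not the locker's actual code: the function names, the millisecond units, and the idea of applying a single offset to the broker clock are all hypothetical.

```python
def lease_clock_offset(broker_now_ms: int, db_now_ms: int,
                       max_allowable_diff_ms: int) -> int:
    """Return the correction to add to the broker clock for lease arithmetic.

    Assumption: a value of 0 or less disables the check, and small skews
    within the allowed difference are left uncorrected.
    """
    if max_allowable_diff_ms <= 0:
        return 0
    skew = broker_now_ms - db_now_ms
    if abs(skew) > max_allowable_diff_ms:
        # Shift the broker's view of time onto the DB's time basis.
        return -skew
    return 0


def lease_expiry(broker_now_ms: int, lease_duration_ms: int, offset_ms: int) -> int:
    # Expiry expressed on the (possibly corrected) time basis.
    return broker_now_ms + offset_ms + lease_duration_ms
```

For example, a broker whose clock runs 500 ms ahead of the DB with an allowed difference of 100 ms would get an offset of -500 ms, so a 2000 ms lease taken at broker time 10500 expires at 12000 on the DB's time basis.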
There is no support for moving from a master state back to a slave. If the lease is lost, the master will exit.