Tuesday, July 21, 2009

Failover in EMS

Step 1
A backup server detects a failure of the primary in either of two ways:
  • Heartbeat Failure—the primary server sends heartbeat messages to the backup server to indicate that it is still operating. When a network failure stops the servers from communicating with each other, the backup server detects the interruption in the steady stream of heartbeats.
  • Connection Failure—the backup server can detect the failure of its TCP connection with the primary server. When the primary process terminates unexpectedly, the backup server detects the broken connection.
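The two detection paths above can be sketched as a small state holder. This is only an illustration, not how EMS implements detection: the class name HeartbeatMonitor, its methods, and the timeout value are all assumptions made for the example.

```java
/** Hypothetical sketch: tracks the two signals a backup could use to detect primary failure. */
final class HeartbeatMonitor {
    private final long timeoutMillis;      // assumed maximum silence before declaring failure
    private long lastHeartbeatMillis;      // time of the most recent heartbeat
    private boolean connectionOpen = true; // state of the TCP connection to the primary

    HeartbeatMonitor(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeartbeatMillis = nowMillis;
    }

    /** Record a heartbeat received from the primary. */
    void onHeartbeat(long nowMillis) {
        lastHeartbeatMillis = nowMillis;
    }

    /** Record that the TCP connection to the primary was broken. */
    void onConnectionClosed() {
        connectionOpen = false;
    }

    /** True if either path (missed heartbeats or broken connection) indicates failure. */
    boolean primaryFailed(long nowMillis) {
        boolean heartbeatMissed = nowMillis - lastHeartbeatMillis > timeoutMillis;
        return heartbeatMissed || !connectionOpen;
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real monitor would read a clock and be driven by network events.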

Step 2
When a backup server (B) detects the failure of the primary server (A), B attempts to assume the role of primary server. First, B tries to obtain the lock on the current shared state. Once B holds the lock and can access the shared state, it becomes the new primary server.

  • If B cannot obtain the lock immediately, it alternates between attempting to obtain the lock (and become the primary server), and attempting to reconnect to A (and resume as a backup server)—until one of these attempts succeeds.
  • When server A comes back up, it becomes a backup server, and server B continues to be the primary.
  • Clients of A that are configured to failover to backup server B automatically transfer to B when it becomes the new primary server.
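The alternation described in the first bullet above (try the lock, then try to reconnect to A, and repeat) might look roughly like the loop below. This is a sketch under stated assumptions: BackupFailover, Attempt, and Role are hypothetical names, and the maxRounds limit exists only to keep the example finite, whereas the text says the server keeps trying until one attempt succeeds.

```java
/** Hypothetical sketch of how a backup could decide its role after detecting a failure. */
final class BackupFailover {
    /** Hypothetical probe: returns true when the attempt succeeds. */
    interface Attempt {
        boolean tryOnce();
    }

    enum Role { PRIMARY, BACKUP, UNDECIDED }

    /**
     * Alternates between acquiring the shared-state lock and reconnecting to the
     * old primary. The first attempt to succeed decides the role.
     */
    static Role resolve(Attempt acquireLock, Attempt reconnectToPrimary, int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            if (acquireLock.tryOnce()) {
                return Role.PRIMARY;   // lock obtained: become the new primary
            }
            if (reconnectToPrimary.tryOnce()) {
                return Role.BACKUP;    // A is reachable again: resume as backup
            }
        }
        return Role.UNDECIDED;         // assumption: give up after maxRounds in this sketch
    }
}
```

For example, calling resolve with a lock attempt that fails twice and then succeeds, and a reconnect attempt that always fails, returns Role.PRIMARY on the third round.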

Step 3
After a failure, EMS attempts message redelivery according to the message delivery mode and the destination configuration.

  • Persistent: When a failure occurs, messages with delivery mode PERSISTENT that were not successfully acknowledged before the failure are redelivered.
  • Failsafe: EMS guarantees that a message with PERSISTENT delivery mode and a failsafe destination will not be lost during a failure.
  • Any messages that have been successfully acknowledged or committed are not redelivered, in compliance with the JMS 1.1 specification.
  • All topic subscribers continue normal operation after a failover.
  • Queues: For queue receivers, any messages that have been sent to receivers but have not been acknowledged before the failover may be sent to other receivers immediately after the failover. A receiver that tries to acknowledge such a message after a failover may receive a javax.jms.IllegalStateException. This exception signifies that the attempted acknowledgement is for a message that has already been sent to another queue receiver.
  • After a failover, attempting to commit the active transaction results in a javax.jms.TransactionRolledBackException. Clients that use transactions must handle this exception, and resend any messages sent during the transaction.
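The last point, resending a transaction's messages after a rolled-back commit, can be sketched as a retry loop. To stay runnable without a JMS provider on the classpath, this example uses a stand-in exception (RolledBack) and a hypothetical TransactedWork callback. In a real client the callback would send the messages on a transacted session and call commit, and the catch clause would name javax.jms.TransactionRolledBackException. The retry limit is also an assumption.

```java
/** Hypothetical sketch: resend a transacted batch when the commit is rolled back by a failover. */
final class TransactedSender {
    /** Stand-in for javax.jms.TransactionRolledBackException (assumption for this sketch). */
    static final class RolledBack extends Exception {
        RolledBack(String message) {
            super(message);
        }
    }

    /** Hypothetical unit of work: send every message in the batch, then commit. */
    interface TransactedWork {
        void sendAndCommit() throws RolledBack;
    }

    /** Runs the work, repeating the whole batch after a rollback. Returns the attempts used. */
    static int runWithResend(TransactedWork work, int maxAttempts) throws RolledBack {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RolledBack last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                work.sendAndCommit();
                return attempt;
            } catch (RolledBack e) {
                last = e; // the sends in this transaction were discarded; loop to resend them
            }
        }
        throw last; // every attempt was rolled back
    }
}
```

The design point is that the whole batch is sent again, not just the commit, because a rolled-back transaction discards every message sent within it.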
