Effective October 27, 2012, online and email support for FuseSource products will move to Red Hat support channels. For more information, please see the JIRA Migration to Red Hat FAQ.
As of October 27th, please open all new issues in the Red Hat Customer Portal .
Issue Details (XML | Word | Printable)

Key: MB-1199
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Timothy Bish
Reporter: Susan Javurek
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
FUSE Message Broker

NPE exception in kahadb after index issue

Created: 02/Aug/12 02:11 PM   Updated: 06/May/13 03:39 PM
Component/s: broker
Affects Version/s: 5.5.1-fuse-04-01
Fix Version/s: 5.5.1-fuse-10-16

File Attachments: 1. Text File testcase.patch (5 kB)


External Issue URL: https://issues.apache.org/jira/browse/AMQ-3982


 Description  « Hide
2012-07-24 07:11:19,476 [n.ch] Scheduler] ERROR Queue                          - Problem retrieving message for browse
java.util.NoSuchElementException: Could not locate data file /var/lib/activemq/data/activemq_gridmsg104.cern.ch/tmp_storage/db-1093.log
        at org.apache.activemq.store.kahadb.plist.PList$PListIterator.next(PList.java:214)
        at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:508)
        at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:492)
        at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.next(FilePendingMessageCursor.java:305)
        at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:1053)
        at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:804)
        at org.apache.activemq.broker.region.Queue.access$100(Queue.java:91)
        at org.apache.activemq.broker.region.Queue$2.run(Queue.java:131)
        at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
Caused by: java.io.IOException: Could not locate data file /var/lib/activemq/data/activemq_gridmsg104.cern.ch/tmp_storage/db-1093.log
        at org.apache.kahadb.journal.Journal.getDataFile(Journal.java:350)
        at org.apache.kahadb.journal.Journal.read(Journal.java:597)
        at org.apache.activemq.store.kahadb.plist.PListStore.getPayload(PListStore.java:394)
        at org.apache.activemq.store.kahadb.plist.PList$PListIterator.next(PList.java:212)
        ... 10 more

After a restart, we now see NPE related to KahaDB:

2012-07-24 11:39:05,287 [n.ch] Scheduler] ERROR Queue                          - Problem retrieving message for browse
java.lang.NullPointerException
	at org.apache.kahadb.index.ListNode.access$200(ListNode.java:37)
	at org.apache.kahadb.index.ListNode$NodeMarshaller.writePayload(ListNode.java:265)
	at org.apache.kahadb.index.ListNode$NodeMarshaller.writePayload(ListNode.java:255)
	at org.apache.kahadb.page.Transaction.store(Transaction.java:248)
	at org.apache.kahadb.index.ListIndex.storeNode(ListIndex.java:326)
	at org.apache.kahadb.index.ListNode.store(ListNode.java:348)
	at org.apache.kahadb.index.ListNode.access$500(ListNode.java:37)
	at org.apache.kahadb.index.ListNode$ListIterator.remove(ListNode.java:227)
	at org.apache.activemq.store.kahadb.plist.PList$PListIterator$1.execute(PList.java:228)
	at org.apache.kahadb.page.Transaction.execute(Transaction.java:765)
	at org.apache.activemq.store.kahadb.plist.PList$PListIterator.remove(PList.java:225)
	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.remove(FilePendingMessageCursor.java:517)
	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.remove(FilePendingMessageCursor.java:321)
	at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:1061)
	at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:804)
	at org.apache.activemq.broker.region.Queue.access$100(Queue.java:91)
	at org.apache.activemq.broker.region.Queue$2.run(Queue.java:131)
	at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
	at java.util.TimerThread.mainLoop(Timer.java:512)
	at java.util.TimerThread.run(Timer.java:462)

Seems to be a variant of https://issues.apache.org/jira/browse/AMQ-3434 and Gary put together a test case which we believe represents the problem.



 All   Comments   Change History      Sort Order: Ascending order - Click to sort in descending order
Gary Tully added a comment - 10/Aug/12 10:54 AM
I have a test case that runs for <5mins and always shows a problem, but the symptoms vary (From iterators reading too much to chunk stream not exist to NPE). I am getting closer to the root cause, but don't yet have it narrowed doen completely.

Joe Luo added a comment - 10/Aug/12 11:31 AM
Sometimes, following stack trace can also be seen:

2012-08-08 20:49:47,918 [n.ch] Scheduler] ERROR Queue - Problem retrieving message for browse
java.util.NoSuchElementException: Chunk stream does not exist, page: 75281 is marked free
at org.apache.kahadb.index.ListNode$ListIterator.getFromNextNode(ListNode.java:165)
at org.apache.kahadb.index.ListNode$ListIterator.hasNext(ListNode.java:176)
at org.apache.activemq.store.kahadb.plist.PList$PListIterator.hasNext(PList.java:204)
at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.hasNext(FilePendingMessageCursor.java:503)
at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.hasNext(FilePendingMessageCursor.java:297)
at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:1052)
at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:804)
at org.apache.activemq.broker.region.Queue.access$100(Queue.java:91)
at org.apache.activemq.broker.region.Queue$2.run(Queue.java:131)
at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Caused by: java.io.EOFException: Chunk stream does not exist, page: 75281 is marked free
at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:455)
at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:432)
at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:429)
at org.apache.kahadb.page.Transaction.load(Transaction.java:405)
at org.apache.kahadb.page.Transaction.load(Transaction.java:362)
at org.apache.kahadb.index.ListIndex.loadNode(ListIndex.java:306)


Gary Tully added a comment - 10/Aug/12 02:44 PM
An additional test that breaks and can demonstrate the problem:
org.apache.activemq.store.kahadb.plist.PListTest#testConcurrentAddIterate

Gary Tully added a comment - 20/Aug/12 02:48 PM
great news on this one, looks like the problem is in reading and seek. Top find tim.

Susan Javurek added a comment - 22/Aug/12 02:12 PM
Hi Tim,

It looks like a new snapshot built last night:

apache-activemq-5.5.1.fuse-20120822.002833-49-bin.tar.gz.md5 Wed Aug 22 09:07:15 UTC 2012 32

Will it be in there or do you think we need to wait for the next one?

Susan