Database corruption after clean rebuild

Javier Garcia

I can't build a healthy database for notmuch. My mail directory has
quite a few mails, around 20,000.

$ rm -rf ~/.mail/.notmuch
$ notmuch new
$ xapian-check ~/.mail/.notmuch/xapian/
> docdata:
> blocksize=8K items=63 firstunused=1 revision=2 levels=0 root=0
> B-tree checked okay
> docdata table structure checked OK
>
> termlist:
> blocksize=8K items=43520 firstunused=8291 revision=2 levels=2 root=748
> xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0

I've tried to repair the database to no avail

$ xapian-check ~/.mail/.notmuch/xapian/ F
> docdata:
> B-tree checked okay
> docdata table structure checked OK
>
> termlist:
> xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0

I've also tried to delete the mail directory and sync again with the
same result. I used mbsync(isync) to build it.

Notmuch version: 0.25
Xapian-core: 1.4.5
isync version: 1.2.3
_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch
David Bremner

Re: Database corruption after clean rebuild

Javier Garcia <[hidden email]> writes:

> I can't build a healthy database for notmuch. My mail directory has
> quite a few mails, around 20,000.
>
> $ rm -rf ~/.mail/.notmuch
> $ notmuch new
> $ xapian-check ~/.mail/.notmuch/xapian/
>> docdata:
>> blocksize=8K items=63 firstunused=1 revision=2 levels=0 root=0
>> B-tree checked okay
>> docdata table structure checked OK
>>
>> termlist:
>> blocksize=8K items=43520 firstunused=8291 revision=2 levels=2 root=748
>> xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0
>
There was recently a similar report that turned out to be related to a
reference loop in the mail.  Do you actually have any symptoms of
database corruption other than the message about the free list? if not,
it might be worth trying the attached patch, which attempts to break
reference loops.


From 753c8d366f3ffde2a14de7157b55b27b555b39d8 Mon Sep 17 00:00:00 2001
From: David Bremner <[hidden email]>
Date: Mon, 2 Apr 2018 08:02:05 -0300
Subject: [PATCH] WIP: test patch for reference loop problem

---
 lib/thread.cc | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/thread.cc b/lib/thread.cc
index 3561b27f..356d63ce 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -391,10 +391,15 @@ static void
 _resolve_thread_relationships (notmuch_thread_t *thread)
 {
     notmuch_message_node_t *node;
-    notmuch_message_t *message, *parent;
+    notmuch_message_t *message, *first_message = NULL, *parent;
     const char *in_reply_to;
 
-    for (node = thread->message_list->head; node; node = node->next) {
+    node = thread->message_list->head;
+    if (node) {
+ first_message = node->message;
+ node = node->next;
+    }
+    for (; node; node = node->next) {
  message = node->message;
  in_reply_to = _notmuch_message_get_in_reply_to (message);
  if (in_reply_to && strlen (in_reply_to) &&
@@ -406,6 +411,19 @@ _resolve_thread_relationships (notmuch_thread_t *thread)
     _notmuch_message_list_add_message (thread->toplevel_list, message);
     }
 
+    /* XXX: this is probably nonsense: if we didn't find any top level
+     * messages, choose one at random */
+    if (first_message) {
+ in_reply_to = _notmuch_message_get_in_reply_to (first_message);
+ if (thread->toplevel_list->head && in_reply_to && strlen (in_reply_to) &&
+    g_hash_table_lookup_extended (thread->message_hash,
+  in_reply_to, NULL,
+  (void **) &parent))
+    _notmuch_message_add_reply (parent, first_message);
+ else
+    _notmuch_message_list_add_message (thread->toplevel_list, first_message);
+    }
+
     /* XXX: After scanning through the entire list looking for parents
      * via "In-Reply-To", we should do a second pass that looks at the
      * list of messages IDs in the "References" header instead. (And
--
2.16.3


_______________________________________________
Javier Garcia

Re: Database corruption after clean rebuild

I've applied the patch to notmuch 0.26.1 without success.

$ rm -rf ~/.mail/.notmuch
$ LD_LIBRARY_PATH=/hidden-path/notmuch-0.26.1/lib/:$LD_LIBRARY_PATH ./notmuch new
   Found 20065 total files (that's not much mail).
   Processed 20065 total files in 58s (341 files/sec.).
   Added 19605 new messages to the database.

$ xapian-check .mail/.notmuch/xapian/
   docdata:
   blocksize=8K items=63 firstunused=1 revision=2 levels=0 root=0
   B-tree checked okay
   docdata table structure checked OK
   termlist:
   blocksize=8K items=43520 firstunused=8293 revision=2 levels=2 root=748
   xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0

With or without the patch, the "corrupted" database works fine most of
the time. For instance this works:

$ notmuch tag +new2 -- tag:new

It's just that afew can't work with the db in this state; it complains
that the database is corrupted.

$ rm -rf ~/.mail/.notmuch
$ notmuch new
$ afew -tn -vv
   <normal operations>
   terminate called after throwing an instance of 'Xapian::DatabaseCorruptError'
   Aborted (core dumped)

Afew doesn't always crash, even though the database is always corrupted.
Afew crashes when it's called just after a fresh notmuch database is
built and randomly thereafter. The error is always the same.

The following one-liner prevents most afew crashes. It works well
for the cases in which afew is called right after notmuch database
creation. For random crashes it is not as effective.

$ notmuch tag +new2 -- tag:new


_______________________________________________
David Bremner

Re: Database corruption after clean rebuild

Javier Garcia <[hidden email]> writes:

> I've applied the patch to notmuch 0.26.1 without success.
>
> $ rm -rf ~/.mail/.notmuch
> $ LD_LIBRARY_PATH=/hidden-path/notmuch-0.26.1/lib/:$LD_LIBRARY_PATH ./notmuch new
>    Found 20065 total files (that's not much mail).
>    Processed 20065 total files in 58s (341 files/sec.).
>    Added 19605 new messages to the database.
>
> $ xapian-check .mail/.notmuch/xapian/
>    docdata:
>    blocksize=8K items=63 firstunused=1 revision=2 levels=0 root=0
>    B-tree checked okay
>    docdata table structure checked OK
>    termlist:
>    blocksize=8K items=43520 firstunused=8293 revision=2 levels=2 root=748
>    xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0

OK, so probably not related to reference loops (although that patch is
not very well tested).  It's not clear how notmuch is triggering it, but
this looks like the same bug in Xapian that Olly fixed recently [1].

A possible next step is to try building xapian master, and linking
notmuch against that.

Maybe Patrick or Justus (in copy) has some idea why you're only seeing
problems in afew.

Another debugging direction is to try to duplicate your problem with
some subset of mail that you're willing to share (bisection is the usual
strategy).
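
One step of such a bisection could be sketched roughly as follows. This is only an illustration: split_list and the list-file layout are hypothetical names, not part of notmuch, and the surrounding steps (copying each half into a scratch mail directory, pointing NOTMUCH_CONFIG at a throwaway config, then running notmuch new and xapian-check) are assumed rather than shown.

```shell
#!/bin/sh
# Hypothetical helper for bisecting a mail store: split a file that lists
# mail paths (one per line) into two halves, "$list.a" and "$list.b".
# The list itself could come from something like:
#   find ~/.mail -type f ! -path '*/.notmuch/*' > all.txt
split_list() {
    list=$1
    total=$(wc -l < "$list")
    # Round up so the first half gets the extra line for odd counts.
    half=$(( (total + 1) / 2 ))
    head -n "$half" "$list" > "$list.a"
    tail -n +"$(( half + 1 ))" "$list" > "$list.b"
}
```

Each half would then be indexed on its own; whichever half still produces the xapian-check error is split again, until the subset is small enough to inspect or share.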

[1] https://notmuchmail.org/pipermail/notmuch/2018/026369.html
_______________________________________________
Javier Garcia

Re: Database corruption after clean rebuild


Unfortunately I can't share my emails without the approval of other
parties. The minimum subsets that trigger the error are in the range of
1000-5000 mails, so asking each and every one of them is out of my reach.
I tried to replicate the problem using just spam folders without success.

The following is a solid workaround I've stumbled upon. Afew no longer
complains and database corruption is gone.

$ notmuch compact
$ xapian-check ~/.mail/.notmuch/xapian
   <check messages>
   No errors found
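
For anyone who wants to automate this, a minimal sketch is below. It assumes the database lives under ~/.mail/.notmuch/xapian as in this thread, and the function names has_freelist_error and check_and_compact are made up for the example. It only compacts when xapian-check prints the free-list message quoted earlier.

```shell
#!/bin/sh
# Sketch: compact the notmuch database only when xapian-check reports the
# "missing from the free list" error seen in this thread.

# Succeeds (exit 0) if the text on stdin contains the free-list error.
has_freelist_error() {
    grep -qF 'unused block(s) missing from the free'
}

# The default path is an assumption taken from this thread; pass your own
# xapian directory as the first argument if it differs.
check_and_compact() {
    db_path=${1:-"$HOME/.mail/.notmuch/xapian"}
    if xapian-check "$db_path" 2>&1 | has_freelist_error; then
        # notmuch compact operates on the database named in the notmuch config.
        notmuch compact
    fi
}
```

Calling check_and_compact after notmuch new and before afew would apply the workaround only when it is needed.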

I built xapian-core 1.50 but I can't compile notmuch 0.26.1 against it.
I will wait and test again in a few weeks.

If you are interested in my setup, the error happens with this minimal
configuration.

#~/.config/afew/config
[Filter.1]
query = 'folder:"//(INBOX|Inbox|inbox)$/" AND (NOT tag:inbox)'
tags = +inbox;-new
message = Messages in INBOX folder are tagged as inbox

[Filter.2]
query = '(NOT folder:"//(INBOX|Inbox|inbox)$/") AND (tag:inbox)'
tags = -inbox
message = Messages not in INBOX folder cannot be inbox

#~/.notmuch-config
[database]
path=/home-path/.mail
[new]
tags=new


_______________________________________________
David Bremner

Re: Database corruption after clean rebuild

Javier Garcia <[hidden email]> writes:

> Unfortunately I can't share my emails without the approval of other
> parties. The minimum subsets that trigger the error are in the range of
> 1000-5000 mails, so asking each and everyone of them is out of my reach.
> I tried to replicate the problem using just spam folders without success.
>
> The following is a solid workaround I've stumbled upon. Afew no longer
> complains and database corruption is gone.
>
> $ notmuch compact
> $ xapian-check ~/.mail/.notmuch/xapian
>    <check messages>
>    No errors found

Right, I should have thought of compaction; that's a workaround Olly
mentioned before. That strongly suggests that you are hitting the known
Xapian bug.


d
_______________________________________________
Olly Betts

Re: Database corruption after clean rebuild

On Sat, Apr 07, 2018 at 12:17:39PM -0300, David Bremner wrote:

> Javier Garcia <[hidden email]> writes:
>
> > The following is a solid workaround I've stumbled upon. Afew no longer
> > complains and database corruption is gone.
> >
> > $ notmuch compact
> > $ xapian-check ~/.mail/.notmuch/xapian
> >    <check messages>
> >    No errors found
>
> Right, I should have thought of compaction, that's a workaround Olly
> mentioned before. That strongly suggests that you are hitting the known
> Xapian bug.

Yes - the error exactly matches that too.

Cheers,
    Olly
_______________________________________________
Gregor Zattler

Re: Database corruption after clean rebuild

In reply to this post by Javier Garcia
Hi notmuch developers,

I also had this database corruption. I waited for the fix to land
in notmuch 0.26.2, built it, moved the xapian directory away, ran
notmuch new and restored the tags from a dump.  But the problem
remains:

~$ xapian-check ~/Mail/.notmuch/xapian
docdata:
blocksize=8K items=10841 firstunused=75 revision=82 levels=1 root=2
B-tree checked okay
docdata table structure checked OK

termlist:
blocksize=8K items=1893162 firstunused=368983 revision=82 levels=3 root=177608
xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0


this is very similar to the old database which I had moved away:
~$ xapian-check ~/Mail/.notmuch/xapian-2018-04-29-00-22/
docdata:
blocksize=8K items=10863 firstunused=78 revision=59623 levels=1 root=2
B-tree checked okay
docdata table structure checked OK

termlist:
blocksize=8K items=1894648 firstunused=360821 revision=59623 levels=3 root=360580
xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0


I then ran notmuch compact, and the database check now reports no
errors.

It seems to me that either the fix did not help or there is another
problem.


$ notmuch --version
notmuch 0.26.2+26~g9e158fb
~$ xapian-compact --version
xapian-compact - xapian-core 1.4.3

Thanks for developing notmuch, Gregor


Ciao; Gregor
--
 -... --- .-. . -.. ..--.. ...-.-

_______________________________________________
David Bremner

Re: Database corruption after clean rebuild

Gregor Zattler <[hidden email]> writes:

> Hi notmuch developers,
>
> I also had this database corruption, I waited for the fix to land
> in notmuch 0.26.2, build it, moved the xapian directory away, did
> a notmuch new and restored the tags from a dump.  But the problem
> remains:
>
> ~$ xapian-check ~/Mail/.notmuch/xapian
> docdata:
> blocksize=8K items=10841 firstunused=75 revision=82 levels=1 root=2
> B-tree checked okay
> docdata table structure checked OK
>
> termlist:
> blocksize=8K items=1893162 firstunused=368983 revision=82 levels=3 root=177608
> xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0

This is a known bug in Xapian, fixed in xapian master. The message will
go away if you run notmuch compact, or you can just ignore it.

d
_______________________________________________
Gregor Zattler

Re: Database corruption after clean rebuild

Hi David,
* David Bremner <[hidden email]> [2018-04-29; 11:27]:

> Gregor Zattler <[hidden email]> writes:
>> I also had this database corruption, I waited for the fix to land
>> in notmuch 0.26.2, build it, moved the xapian directory away, did
>> a notmuch new and restored the tags from a dump.  But the problem
>> remains:
>>
>> ~$ xapian-check ~/Mail/.notmuch/xapian
>> docdata:
>> blocksize=8K items=10841 firstunused=75 revision=82 levels=1 root=2
>> B-tree checked okay
>> docdata table structure checked OK
>>
>> termlist:
>> blocksize=8K items=1893162 firstunused=368983 revision=82 levels=3 root=177608
>> xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0
>
> This is a known bug in Xapian, fixed in xapian master. The message will
> go away if you run notmuch-compact, or you can just ignore it.

OK, thanks.  Gregor
