Info about notmuch database

boyska

Info about notmuch database

Hello!
I like notmuch a lot, so I'm writing a (conceptually) similar program
for address books: it will scan all your emails and store the email
addresses in a Xapian database (you can think of it as the Little
Brother's Database [1] on steroids).
The part that I'd like to re-implement is "notmuch new": it seems that
the Xapian db holds not only information about each mail, but also the
mtime of each directory. This strikes me as "chaotic", but I am
probably just missing the point.
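
For context, a common reason to keep a per-directory mtime next to an index is to make rescans incremental: on POSIX filesystems, adding, removing, or renaming a file updates the mtime of the directory that contains it, so a directory whose mtime matches the stored value need not be listed again. A minimal sketch of that idea, assuming a plain dict stands in for the database; `changed_dirs` and `stored` are hypothetical names and this is not notmuch's actual code:

```python
import os

def changed_dirs(root, stored):
    """Yield directories under root whose mtime differs from the stored one.

    `stored` maps directory path -> mtime (ns) recorded on the previous scan
    and is updated in place. Subdirectories are always visited, because a
    change inside a subdirectory does not touch the parent's mtime.
    """
    for dirpath, _dirnames, _filenames in os.walk(root):
        mtime = os.stat(dirpath).st_mtime_ns
        if stored.get(dirpath) != mtime:
            stored[dirpath] = mtime
            yield dirpath
```

Only the directories yielded here would need their files re-read; everything else can be skipped on that run.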

So, here's the question: how is the db "structured"? Is there any
documentation to look at?

[1] http://www.spinnaker.de/lbdb/

--
boyska
GPG: 0x520CE393
_______________________________________________
notmuch mailing list
[hidden email]
http://notmuchmail.org/mailman/listinfo/notmuch

Thomas Jost

Re: Info about notmuch database

On Wed, 04 Jan 2012 15:49:19 +0000, boyska <[hidden email]> wrote:

> Hello!
> I like notmuch a lot, so I'm writing a (conceptually) similar program
> for address books: it will scan all your emails and store the email
> addresses in a Xapian database (you can think of it as the Little
> Brother's Database [1] on steroids).
> The part that I'd like to re-implement is "notmuch new": it seems that
> the Xapian db holds not only information about each mail, but also the
> mtime of each directory. This strikes me as "chaotic", but I am
> probably just missing the point.
>
> So, here's the question: how is the db "structured"? Is there any
> documentation to look at?
>
> [1] http://www.spinnaker.de/lbdb/
>
> --
> boyska
> GPG: 0x520CE393
There's a description of the DB "schema" in lib/database.cc in the
notmuch source code. But you may also consider just using libnotmuch
instead, if that's enough for what you want to do.

Also: why Xapian? I'm already using something similar I wrote with
Python, storing everything in a dictionary, using Pickle to save that to
disk: 162 lines of code and 45 kb of data are enough to store my
addressbook and have completion in Emacs...

Regards,

--
Thomas/Schnouki

boyska

Re: Info about notmuch database

On Thu, Jan 05, 2012 at 04:04:22PM +0100, Thomas Jost wrote:

> There's a description of the DB "schema" in lib/database.cc in the
> notmuch source code. But you may also consider just using libnotmuch
> instead, if that's enough for what you want to do.

Thanks, found it; it's much clearer now.
But I really can't understand why these things aren't just put in a
separate file :) Atomic consistency issues?

> Also: why Xapian? I'm already using something similar I wrote with
> Python, storing everything in a dictionary, using Pickle to save that to
> disk: 162 lines of code and 45 kb of data are enough to store my
> addressbook and have completion in Emacs...

The dictionary approach is fine for managing a "manual" addressbook,
where you store addresses yourself. But what I want is an _automatic_
addressbook, like the lbdb one, which just indexes all the emails it
has seen.
The grep approach is better from this point of view, but still not
advanced enough for me.
For example, I'd like to store "co-occurrences": if two addresses appear
in the same mail, a relationship between them should be recorded; for
example, your address should be correlated with the notmuch mailing
list, because you wrote to it (these should be 0-weighted Xapian terms).
Also, I want to give more importance to email addresses that are seen
frequently, and much less to those seen rarely. Xapian makes this really
easy, so the question is "why not use it?" ;)
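
The two ideas above (co-occurrence and frequency) can be illustrated without Xapian. The following is a plain-Python stand-in built on `collections.Counter`, not the Xapian-based design described in this message; `index_message`, `related`, and the in-memory `seen`/`together` structures are hypothetical names chosen for the sketch:

```python
from collections import Counter, defaultdict
from email.utils import getaddresses
from itertools import permutations

seen = Counter()                 # address -> number of mails it appeared in
together = defaultdict(Counter)  # address -> Counter of co-occurring addresses

def index_message(header_values):
    """Record every address found in the given From/To/Cc header values."""
    addrs = {addr.lower() for _name, addr in getaddresses(header_values) if addr}
    seen.update(addrs)
    # Every ordered pair of distinct addresses in one mail is a relationship.
    for a, b in permutations(addrs, 2):
        together[a][b] += 1

def related(addr):
    """Addresses seen alongside `addr`, most frequent co-occurrence first."""
    return [other for other, _n in together[addr.lower()].most_common()]
```

After indexing a few mails sent to a list, `related()` on the sender's address would rank the list address first, and `seen` gives the per-address frequency that a ranking could use.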

Thomas Jost

Re: Info about notmuch database

On Thu, 5 Jan 2012 16:38:07 +0100, boyska <[hidden email]> wrote:
> > There's a description of the DB "schema" in lib/database.cc in the
> > notmuch source code. But you may also consider just using libnotmuch
> > instead, if that's enough for what you want to do.
>
> Thanks, found it; it's much clearer now.
> But I really can't understand why these things aren't just put in a
> separate file :) Atomic consistency issues?

I doubt it's for consistency (see commit 824dad76), more likely it's
because people should use libnotmuch rather than directly hacking into
the DB ;)


> > Also: why Xapian? I'm already using something similar I wrote with
> > Python, storing everything in a dictionary, using Pickle to save that to
> > disk: 162 lines of code and 45 kb of data are enough to store my
> > addressbook and have completion in Emacs...
>
> The dictionary approach is fine for managing a "manual" addressbook,
> where you store addresses yourself. But what I want is an _automatic_
> addressbook, like the lbdb one, which just indexes all the emails it
> has seen.

That's what my little script does too: index emails and how many times
they appear in the DB so that completion shows more frequently used ones
first. The indexing is done after running "notmuch new", when running my
auto-tagging script. I'm too lazy to maintain a "manual" addressbook
correctly :)
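
The approach described above (count how often each address appears, persist the counts with pickle, offer the most frequent matches first) might look roughly like this. It is a sketch under those assumptions only, not the actual script; `load`, `save`, `complete`, and the file path are hypothetical:

```python
import pickle
from collections import Counter

def load(path):
    """Load the address counts, or start empty if the file does not exist."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return Counter()

def save(book, path):
    """Write the address counts to disk."""
    with open(path, "wb") as f:
        pickle.dump(book, f)

def complete(book, prefix):
    """Addresses starting with `prefix`, most frequently seen first."""
    prefix = prefix.lower()
    hits = [(n, a) for a, n in book.items() if a.lower().startswith(prefix)]
    # Sort by descending count, then alphabetically for a stable order.
    return [a for n, a in sorted(hits, key=lambda h: (-h[0], h[1]))]
```

An indexing step run after "notmuch new" would call `book.update(addresses)` for each new mail and then `save()`.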

> The grep approach is better from this point of view, but still not
> advanced enough for me.
> For example, I'd like to store "co-occurrences": if two addresses appear
> in the same mail, a relationship between them should be recorded; for
> example, your address should be correlated with the notmuch mailing
> list, because you wrote to it (these should be 0-weighted Xapian terms).
> Also, I want to give more importance to email addresses that are seen
> frequently, and much less to those seen rarely. Xapian makes this really
> easy, so the question is "why not use it?" ;)

Nice ideas, and Xapian is probably a good choice for doing that kind of
stuff :)

Do you plan to use this addressbook with notmuch-address.el, or will it
be a standalone program?

Regards,

--
Thomas/Schnouki

boyska

Re: Info about notmuch database

On Thu, Jan 05, 2012 at 05:35:55PM +0100, Thomas Jost wrote:

> On Thu, 5 Jan 2012 16:38:07 +0100, boyska <[hidden email]> wrote:
> > > There's a description of the DB "schema" in lib/database.cc in the
> > > notmuch source code. But you may also consider just using libnotmuch
> > > instead, if that's enough for what you want to do.
> >
> > Thanks, found it; it's much clearer now.
> > But I really can't understand why these things aren't just put in a
> > separate file :) Atomic consistency issues?
>
> I doubt it's for consistency (see commit 824dad76), more likely it's
> because people should use libnotmuch rather than directly hacking into
> the DB ;)

Fine; I'll probably keep the whole output of "find" as the data of a
SINGLE entry, instead of one entry per directory. This just seems easier
to me.
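
One way to read the single-entry idea: serialize the whole file listing as one value and compare it with the previous listing on the next run. A minimal sketch, assuming the listing is a path-to-mtime mapping stored as a JSON string; `snapshot`, `to_entry`, `from_entry`, and `diff` are hypothetical names, and this is not how notmuch itself works:

```python
import json
import os

def snapshot(root):
    """Return {relative path: mtime in ns} for every file under root."""
    files = {}
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            files[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return files

def to_entry(files):
    """Serialize the whole listing into one string, storable as one value."""
    return json.dumps(files, sort_keys=True)

def from_entry(blob):
    """Inverse of to_entry()."""
    return json.loads(blob)

def diff(old, new):
    """Return (added or modified paths, removed paths), each sorted."""
    changed = sorted(p for p, m in new.items() if old.get(p) != m)
    removed = sorted(p for p in old if p not in new)
    return changed, removed
```

The trade-off is that every run walks the whole tree, whereas per-directory mtimes let unchanged directories be skipped.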

> Do you plan to use this addressbook with notmuch-address.el, or will it
> be a standalone program?

It will be a standalone program, meant to be used with mutt-query [1].
So just call "notmany thomas" on the command line, and your email will
appear.
I don't use emacs, so I won't write an emacs tool (nor do I know how
notmuch-address.el works), but I am trying to keep the library and the
UI separate, so writing a wrapper suitable for emacs is possible, and
probably very easy.
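
For reference, mutt's query_command reads one free-form status line followed by one line per match, each holding the address, a tab, and the name (optionally a further tab and extra information). A sketch of an output formatter for such a tool; `format_for_mutt` and the sample data are hypothetical:

```python
def format_for_mutt(matches):
    """Format (address, name) pairs in the layout mutt's query_command reads:
    one status line, then one "address<TAB>name" line per match."""
    lines = ["%d match(es) found" % len(matches)]
    lines += ["%s\t%s" % (addr, name) for addr, name in matches]
    return "\n".join(lines)
```

With this in place, a muttrc line such as `set query_command = "notmany %s"` would let mutt call the program, where `%s` is replaced by the search string.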

[1] http://wiki.mutt.org/?QueryCommand

spaetz

Re: Info about notmuch database

On Thu, 05 Jan 2012 16:04:22 +0100, Thomas Jost <[hidden email]> wrote:
> There's a description of the DB "schema" in lib/database.cc in the
> notmuch source code. But you may also consider just using libnotmuch
> instead, if that's enough for what you want to do.
>
> Also: why Xapian? I'm already using something similar I wrote with
> Python, storing everything in a dictionary, using Pickle to save that to
> disk: 162 lines of code and 45 kb of data are enough to store my
> addressbook and have completion in Emacs...

Ohh, that sounds nice. Is that public somewhere?

Sebastian

Thomas Jost

Re: Info about notmuch database

On Sun, 08 Jan 2012 13:59:30 +0100, Sebastian Spaeth <[hidden email]> wrote:

> On Thu, 05 Jan 2012 16:04:22 +0100, Thomas Jost <[hidden email]> wrote:
> > There's a description of the DB "schema" in lib/database.cc in the
> > notmuch source code. But you may also consider just using libnotmuch
> > instead, if that's enough for what you want to do.
> >
> > Also: why Xapian? I'm already using something similar I wrote with
> > Python, storing everything in a dictionary, using Pickle to save that to
> > disk: 162 lines of code and 45 kb of data are enough to store my
> > addressbook and have completion in Emacs...
>
> Ohh, that sounds nice. Is that public somewhere?
>
> Sebastian
https://github.com/Schnouki/dotfiles/blob/master/notmuch/addrbook.py

Maybe I should add more comments in it :)

Regards,

--
Thomas/Schnouki
