Linked by Thom Holwerda on Thu 18th Jan 2007 15:11 UTC, submitted by Torsten Rahn
Benchmarks "A number of search engines are available for the Gnome and KDE desktop environments, many based around the open source Lucene search engine. It would be tremendous if we could adopt one of these search engines for the Gnome platform, so we can provide the type of integrated search experience for our users that they really need, irrespective of which distro they are using. So to help in this assessment we have carried out a comparison of four different Unix based indexers [.pdf]."
Thread beginning with comment 203115
Well one thing is obvious
by tristan on Thu 18th Jan 2007 18:10 UTC
tristan
Member since:
2006-02-01

If there's one thing that this report makes absolutely clear, it's that managed environments like Mono and Java have absolutely no place in the world of low-level userspace daemons. Beagle's memory usage was more than 10 times that of Tracker and Strigi, and JIndex was even worse. Beagle is good at what it does, but anyone who thinks that it is a long-term solution for metadata indexing is either mad, or working for Novell.

So given that we are left with a choice between Tracker and Strigi, the question is, of course, which one? Clearly both of them still have some deficiencies (which is why most distros are going for Beagle right now), and both are in heavy development. A clear winner is hard to pick.

Unfortunately I think it might be the case that Gnome goes for Tracker and KDE goes for Strigi, which would be a shame, as this is definitely an opportunity for the two desktops to work together to achieve a common goal. Even if they do go for separate solutions, I hope that they could work out a common search API, so that the user wouldn't have to have two daemons running to be able to use Gnome and KDE apps at the same time.

It's also worth noting that Tracker aims to be not "just" an indexer, but a complete metadata database. So, for example, Rhythmbox (or Amarok!) wouldn't need to maintain its own database, but would instead just be able to query Tracker for all the audio files on the system. I don't know whether the author of Strigi has similar plans, or indeed whether such a thing is practical.
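
To make the "one shared metadata database" idea concrete, here is a rough sketch in Python. It is not Tracker's actual schema or API: the `files` table, its columns, and the helper names `build_metadata_store` and `audio_files` are all made up for illustration. The point is only that a music player could ask a central store for audio files instead of keeping its own library database.

```python
import sqlite3


def build_metadata_store(files):
    """Create an in-memory store with one row per indexed file.

    The table layout is hypothetical, not Tracker's real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        "path TEXT PRIMARY KEY, mime TEXT, artist TEXT, title TEXT)"
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", files)
    return conn


def audio_files(conn):
    """Return the paths of all files whose MIME type starts with 'audio/'."""
    rows = conn.execute(
        "SELECT path FROM files WHERE mime LIKE 'audio/%' ORDER BY path"
    )
    return [row[0] for row in rows]


# Made-up sample data standing in for what an indexer would have collected.
store = build_metadata_store([
    ("/home/u/music/a.ogg", "audio/ogg", "Artist A", "Song A"),
    ("/home/u/docs/report.pdf", "application/pdf", None, None),
    ("/home/u/music/b.mp3", "audio/mpeg", "Artist B", "Song B"),
])
print(audio_files(store))
```

Any application that can reach the store could run the same query, which is the appeal of the design. Whether it is practical at desktop scale is exactly the open question.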


Lastly, I have to say that this is the least professionally-written report I've ever seen. I realise that it's primarily for internal use, but if I were to hand my boss something like this ("yes, I do like cakes") I would probably find myself in some quite hot water...

Edited 2007-01-18 18:14

Reply Score: 5

RE: Well one thing is obvious
by g2devi on Thu 18th Jan 2007 19:14 in reply to "Well one thing is obvious"
g2devi Member since:
2005-07-09

From what I've read (sort of hinted at in the review), Tracker and Strigi are working on a common API so that, in theory, one could use Tracker as the front end database (that'll be used for more than just indexing, e.g. a bookmarks database) and Strigi as the backend indexer. Not sure if this will pan out.

Reply Parent Score: 2

RE: Well one thing is obvious
by anda_skoa on Thu 18th Jan 2007 19:23 in reply to "Well one thing is obvious"
anda_skoa Member since:
2005-07-07

"If there's one thing that this report makes absolutely clear, it's that managed environments like Mono and Java have absolutely no place in the world of low-level userspace daemons"

True; however, I found it quite awesome how fast the Java indexer starts up after the first time (startup times diagram, page 10).

"I hope that they could work out a common search API"

There is a lively discussion about this on the main freedesktop.org mailing list, starting back in November and still continuing (subject "Simple search API"):
http://lists.freedesktop.org/archives/xdg/2006-November/thread.html

"I don't know whether the author of Strigi has similar plans"

Well, Strigi already indexes metadata, and I think there is a goal to make it collaborate with Nepomuk's data relation framework.

Reply Parent Score: 2

RE[2]: Well one thing is obvious
by Jamie on Thu 18th Jan 2007 19:31 in reply to "RE: Well one thing is obvious"
Jamie Member since:
2005-07-06

"Well, Strigi already indexes metadata and I think there is a goal to make it collaborate with Nepomuk's data relation framework"

Not as good as having an all-in-one integrated database/indexer like Tracker (and Vista). The problem with a dedicated indexer is that it can't be coupled efficiently with a database without duplicating all the metadata in both stores. And then which engine do you use for searching: Lucene or the DB?

Tracker was designed from the ground up to integrate both, using a tightly coupled SQLite database and a custom inverted word index. This gives you tremendous power without duplicating metadata, and a single interface for searching.
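
As a toy illustration of that coupling (and nothing more), the Python sketch below keeps file metadata in SQLite and word postings in a separate in-memory inverted index, both keyed by the same file id, with a single `search` method in front. The class name `Store`, the table layout, and the tokenizer are assumptions made up for this example, not how Tracker is actually implemented.

```python
import re
import sqlite3
from collections import defaultdict


class Store:
    """Hypothetical combined store: SQLite for metadata, a dict for words."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT, mime TEXT)"
        )
        # Inverted index: word -> set of file ids (the SQLite row ids).
        self.index = defaultdict(set)

    def add(self, path, mime, text):
        """Store the metadata once and post each word against the row id."""
        cur = self.db.execute(
            "INSERT INTO files (path, mime) VALUES (?, ?)", (path, mime)
        )
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.index[word].add(cur.lastrowid)

    def search(self, word, mime_prefix=""):
        """Single entry point: word lookup plus an optional metadata filter."""
        ids = sorted(self.index.get(word.lower(), ()))
        if not ids:
            return []
        marks = ",".join("?" * len(ids))
        rows = self.db.execute(
            f"SELECT path FROM files WHERE id IN ({marks}) "
            "AND mime LIKE ? ORDER BY path",
            (*ids, mime_prefix + "%"),
        )
        return [row[0] for row in rows]


store = Store()
store.add("/home/u/notes.txt", "text/plain", "Lucene versus sqlite notes")
store.add("/home/u/talk.ogg", "audio/ogg", "Recorded talk about Lucene")
print(store.search("lucene"))
print(store.search("lucene", mime_prefix="audio/"))
```

The metadata lives in exactly one place, and the word index only stores ids, so nothing is duplicated and the caller never has to choose between two search engines.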

Edited 2007-01-18 19:35

Reply Parent Score: 1