[ale] rehashed - Linux laptops

Jim Lynch ale_nospam at fayettedigital.com
Mon Jan 12 20:01:18 EST 2009


Ed Cashin wrote:
> On Mon, Jan 12, 2009 at 3:32 PM, Jim Lynch
> <ale_nospam at fayettedigital.com> wrote:
>   
>> Ed Cashin wrote:
>>     
>>> It looks like gmane has the archives at least since 2003.
>>>       
> ...
>   
>>> And there's search available---not super great search, but
>>> search nonetheless.
>>>       
> ...
>   
>> I'm just curious, what do you find lacking in the search?  The reason I
>> asked is that it is using the Xapian search engine, which I find quite
>> good and am interested in why people don't like it.
>>     
>
> To be honest, this is a very old opinion, and I would have trouble
> justifying it.  Now that you mention it, I see there are "AND" and "OR"
> options presented if I click "Searching" on the left instead of just
> going to the group page.
>
> I suppose perl or egrep-style regular expression searching would
> be "super great".  Maybe that's impractical.
>   

OK, thanks.  Yes, this type of search engine does not do a full text
search the way an editor does.  A search that scans the full text and
allows regexes and unlimited wildcards, like an editor, becomes
impractical once the documents get over a certain size or number.
Once we have large storage devices made from quantum transistors, it
will be easier, but for now you don't have the time to wait for a full
text scan of large quantities of documents.  So shortcuts have to be
taken, and that's what the "probabilistic" search engines try to do.
Xapian does allow a post expansion wildcard (refer*), but it can get
rather slow if it finds lots of terms that fit the pattern.
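
To make the trade-off above concrete, here is a minimal sketch in plain
Python.  It is not Xapian's implementation; the function names and the
three sample documents are made up for illustration.  It contrasts an
editor-style regex scan over every document with a lookup in a
precomputed term index, and shows why a trailing wildcard such as refer*
costs more as the number of matching terms grows: every expanded term
adds another postings list to merge.

```python
import re
from bisect import bisect_left
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def regex_scan(docs, pattern):
    """Editor-style search: run the regex over the full text of every document."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {doc_id for doc_id, text in docs.items() if rx.search(text)}


def expand_prefix(sorted_terms, prefix):
    """Return every indexed term that starts with prefix (the 'refer*' expansion)."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order means no later term can match
        matches.append(term)
    return matches


def wildcard_search(index, sorted_terms, prefix):
    """Union the postings of every expanded term; cost grows with the expansion."""
    hits = set()
    for term in expand_prefix(sorted_terms, prefix):
        hits |= index[term]
    return hits


# Hypothetical sample data, for illustration only.
docs = {
    1: "Please refer to the archives.",
    2: "The referee made a reference to the rules.",
    3: "Nothing relevant here.",
}
index = build_index(docs)
sorted_terms = sorted(index)

print(expand_prefix(sorted_terms, "refer"))
print(wildcard_search(index, sorted_terms, "refer"))
print(regex_scan(docs, r"\brefer\w*"))
```

The regex scan touches the text of every document on every query, while
the indexed lookup only touches the term list and the postings for the
terms that match.  That is the shortcut; the price is that only
patterns the index can answer, such as a trailing wildcard, are cheap.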

They probably don't go into it in the docs on Gmane, but phrase
searches ("Search for me"), boolean searches (AND, OR, NOT), proximity
searches (walnut NEAR fruit, dog NEAR/3 cat) and a few more are
available.  For a free (GPL) product it is remarkably fast.  They are
still actively developing it and making it better all the time.  IR
(information retrieval) is a fairly complex topic, one that I've been
working with for a number of years and still don't have my head
completely around.  That may be 'cause my head might not be big enough.  ;)
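
The query types listed above can be sketched with a toy positional
index in plain Python.  This is only an illustration, not how Xapian
does it: the helper names and sample documents are hypothetical, and
reading NEAR/3 as "at most 3 word positions apart, in either order" is
an assumption that may differ from Xapian's exact window rule.

```python
import re
from collections import defaultdict


def build_positional_index(docs):
    """Map term -> {doc_id: [word positions]} so word order and distance are known."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
            index[term][doc_id].append(pos)
    return index


def docs_with(index, term):
    """Set of document ids containing term; AND, OR, NOT become set operations."""
    return set(index.get(term.lower(), {}))


def near(index, a, b, distance):
    """Documents where a and b occur within `distance` positions, in either order."""
    pa, pb = index.get(a.lower(), {}), index.get(b.lower(), {})
    hits = set()
    for doc_id in set(pa) & set(pb):
        if any(abs(x - y) <= distance for x in pa[doc_id] for y in pb[doc_id]):
            hits.add(doc_id)
    return hits


def phrase(index, words):
    """Documents where the words occur consecutively and in the given order."""
    postings = [index.get(w.lower(), {}) for w in words]
    common = set(postings[0])
    for p in postings[1:]:
        common &= set(p)
    hits = set()
    for doc_id in common:
        later = [set(p[doc_id]) for p in postings[1:]]
        # A phrase starts at `start` if word i+1 sits at position start + i + 1.
        if any(all(start + i + 1 in s for i, s in enumerate(later))
               for start in postings[0][doc_id]):
            hits.add(doc_id)
    return hits


# Hypothetical sample data, for illustration only.
docs = {
    1: "The dog chased the cat",
    2: "A cat sat while far across the yard a dog slept",
    3: "Search for me in the archives",
    4: "Search me for the keys",
}
index = build_positional_index(docs)

print(docs_with(index, "dog") & docs_with(index, "cat"))          # dog AND cat
print(docs_with(index, "search") - docs_with(index, "archives"))  # search NOT archives
print(near(index, "dog", "cat", 3))                               # dog NEAR/3 cat
print(phrase(index, ["search", "for", "me"]))                     # "search for me"
```

Boolean queries need only the document sets, while phrase and proximity
queries also need the stored word positions.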

Jim.




