Some newbie questions:
Thanks!
Shawn
Hello Shawn,
We developed a standalone provider as a proof of concept that can merge multiple HTML/Text modules into a single result on each tab. I'm including the VB file in this answer so you can see how we're doing it (sorry, the comments are in French). Please note, however, that since it doesn't match how DotNetNuke handles page composition, there are some limitations (for example, we haven't yet decided how to handle permissions when they are not homogeneous across modules, so we default to the tab permissions). It is definitely possible, though.
EDIT: apparently I cannot attach a VB file; you will find it here
Regarding your ISearchable controller, did you make sure in the Indexer settings that you checked the module that you want to index?
Logging is mainly available for the indexing steps. It gives you an idea of what is happening in the indexer process: how many documents were pulled out of the various modules, how many documents will be overwritten, etc. The logging happens under the Debug Info category of DNN, so you have to activate it to see the logs. We are aware of a problem in the search logging and are in the process of correcting it; apparently some search terms aren't logged correctly, and we will certainly fall back to logging the entire Lucene query.
Regarding the quote-wrapped search terms, Lucene.Net handles them natively but does not correctly handle MultiPhrase queries (e.g. "search all t*"), so some filters don't use this feature: the light filter uses a keyword query, and the standard filter uses a combination of keyword and prefix queries. However, the advanced filter makes use of it in the "have this exact phrase" textbox (the first one), where "this test" and "test this" won't return the same results. We don't have a filter that parses the input directly as Lucene syntax, but this can easily be added.
"Clear portal index" should do what it says :) but what I sometimes saw was a clear followed immediately by an automatic indexing run by DNN. If two consecutive clicks on Clear portal index still yield results, there may be a problem we're not aware of; do you have any error messages in the logs?
Boosting the fields happens during field declaration; LuceneSearch declares some fields out of the box, so their boost is already defined, and we don't offer a way to change the boosts through the UI. You can, however, change boosts in code by using the following:
FieldFactory.Instance().GetDefinition("MyFieldName").Boost = 0.1;
Finally, the documentation may be a bit behind at the moment since we are reworking some parts of LuceneSearch; most notably, there were some big UI improvements in the latest versions that are not reflected in our docs. For the moment, I think the forum is the best place for your questions; I will try to answer as quickly as possible. If you have multiple questions, we can also schedule a Skype session. Your input and questions are appreciated since they show us which points need to be clarified.
Don't hesitate to let me know if you have more questions.
Best regards, Samy
Hi Samy,
Thanks very much for the reply.
Thanks for the stand-alone provider sample code. I understand what it does, but I can't help but wonder if a spider-type approach might be better. Crawling portals has advantages, including removing the need for a custom provider. But the drawback is the loss of granularity in terms of excluding specific page content (e.g., navigation menus). Will need to think on this one.
Still not seeing any logs generated. I've enabled debugging in web.config, set the level for log4net to debug, and enabled the progress events in the event viewer. Am I missing something else? Using DNNLog, I am seeing messages from my custom provider but no other kinds of indexing progress hints.
Quote-wrapped search terms: my vote would be that if the search value is entered with quotes, the filter should switch to something like the "exact phrase" mode, separating the search fields with an "OR", e.g., Content:"search term" OR Title:"search term" OR .... Just my opinion. I've used the Java Lucene (actually Apache Solr) on an intranet and did just that.
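To make the idea concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class name, the build helper and the field list are hypothetical and are not part of LuceneSearch or Lucene.Net. It only builds the query string that a Lucene-syntax parser would then receive, and it assumes unquoted input is left to whatever the filter already does.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only: not a LuceneSearch or Lucene.Net API.
final class QuotedQueryBuilder {

    // If the input is wrapped in double quotes, return an exact-phrase clause
    // per field joined with OR; otherwise return the trimmed input unchanged.
    static String build(String input, List<String> fields) {
        String trimmed = input.trim();
        boolean quoted = trimmed.length() > 2
                && trimmed.startsWith("\"")
                && trimmed.endsWith("\"");
        if (!quoted) {
            return trimmed; // assumption: the existing filter handles this case
        }
        // Strip the outer quotes, then escape backslashes and inner quotes
        // so the phrase stays a single quoted term in Lucene syntax.
        String phrase = trimmed.substring(1, trimmed.length() - 1)
                .replace("\\", "\\\\")
                .replace("\"", "\\\"");
        return fields.stream()
                .map(field -> field + ":\"" + phrase + "\"")
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        // Field names are assumptions taken from the example above.
        List<String> fields = List.of("Content", "Title");
        System.out.println(build("\"search term\"", fields));
        // prints: Content:"search term" OR Title:"search term"
    }
}
```

The resulting string would still have to go through a query parser that understands Lucene syntax, which is the piece no filter currently exposes.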
Clear Portal Index definitely not clearing. My guess is a file locking issue.
Will mess around with the FieldFactory stuff for boosting purposes. Thanks for the info.
A couple of other issues found:
Overall I like the module but it desperately needs better documentation, a decent custom provider example, and better logging. It's like flying blind attempting to figure out the exact query getting passed to Lucene.net.
Thanks again for your continued support.
Shawn,
It's possible to change the provider I sent you so that it looks only at certain types of content, so I guess that's the option I would look into in order to create a tab indexing provider that is able to intelligently decide what's worth indexing on a page; for example, include all content on a tab that comes from modules X, Y, and Z, except modules that have the "menu" container applied. It can be very versatile, although I don't really see how we could offer this kind of behavior with a simple UI. However, it's definitely feasible.
We do have a spider-type provider that looks for documents linked from HTML/Text content (called ReflexiveLinkedDocumentProvider), but I don't think that's the spidering you had in mind.
Regarding logs, it's really strange that you're not seeing any; here is what you should be seeing if you activate the indexing logs:
Are you sure you enabled the DEBUG entries in the log? Here is a screenshot of the configuration I have on my machine
We are in the process of evaluating alternative solutions for our logging system, for example bundling a logging framework such as log4net with our modules; we can't rely on it being present out of the box since DotNetNuke only includes it as of version 6 (if I remember correctly). But this is definitely a point we are working on; thanks for pointing it out and confirming our priorities.
Regarding the quote-wrapped search terms, that's an option we haven't considered yet; as of now, each component defines the kind of LuceneQuery it will create.
About the various issues:
Thanks for all your feedback; I should start working on the module again soon and will look into what you reported. I have a custom search results article in the works, and a custom provider tutorial will pop up soon, so keep an eye on the blog.
Best regards,
Samy
You may be happy to hear that the quote-wrapped search issue has been investigated and solved for the light filter. What happened is that I released the latest light filter in a mode where it ignores the Lucene syntax but in exchange allows for searching on a different set of fields.
To change that, you can do either of the following:
- edit /DesktopModules/Aricie.LuceneSearch/Controls/AdvancedFilters/LightFilter.ascx, look at the KeyWordFilter control on line 18 and change its QueryMode parameter from "Filter" to "Search"
- download this file, rename its extension to .ascx and replace your previous LightFilter.ascx with it.
It should let you search with quotes; let me know if you have any further questions on this point.