Date: Wed, 09 May 2007 09:50:32 +1200
Subject: Re: [greenstone-users] Search Questions
> Searcher Side:
> 1. Can I replace search engine with my own?
Yes, but this is a non-trivial task involving C++ and Perl code.
> 2. Does your search engine look through actual document contents - as
> a background process or at the time of the actual user search?
The text content of documents is extracted and indexed when the
collection is built, not at search time; a user's search consults that
index at run-time, which is what makes it fast.
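To illustrate the build-time versus run-time split described above, here is
a minimal sketch in Python. It is not Greenstone code (Greenstone's indexers
are written in C++ and driven from Perl); the function names build_index and
search, and the sample documents, are made up for this example.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Build step: map each lowercased word to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Run-time step: return ids of documents that contain every query word."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample collection, indexed once up front.
docs = {
    "d1": "Digital libraries store documents",
    "d2": "Search engines index documents at build time",
}
index = build_index(docs)
```

A call such as search(index, "index documents") only touches the prebuilt
index; the document text is never re-read at search time.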
> 3. What FORMATs of actual documents does your search engine look at?
> (Ascii, Microsoft, PDF, etc.)
> 4. When searching the contents of a PDF file, does the background
> process, using OCR, create an additional file in another format? What
> 5. Does your OCR routine search FORMATS other than PDF? If yes, what
> formats can the OCR search?
> 6. What are the resolution requirements for your OCR routines?
Greenstone does not perform OCR.
> 8. Can the search engine search using both requested metadata element
> values and keywords from the document contents?
Yes, this is done using the fielded (advanced) search.
> 9. Can I start with keywords from the document contents and then
> later filter the results using user inputted metadata element values?
> 10. Can I start with user input metadata element values and then
> later filter down the results with document contents?
This is not built into the interface, but the user can modify the
original keyword search to add the metadata filtering, using the
fielded (advanced) search as above.
> 11. After an initial search, can I refine my search by only looking
> at the results of the previous search?
Again, this can be achieved by combining the two searches in the
advanced search page.
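
The idea of refining by combining searches can be sketched as follows. This
is only an illustration in Python, not how Greenstone implements fielded
search; the function fielded_search, the field names "text" and "Creator",
and the sample records are all assumptions made for the example.

```python
import re


def tokens(value):
    """Split a field value into a set of lowercased words."""
    return set(re.findall(r"\w+", value.lower()))


def fielded_search(docs, criteria):
    """Return ids of documents where each field in `criteria` contains all of its words."""
    hits = set()
    for doc_id, fields in docs.items():
        if all(tokens(wanted) <= tokens(fields.get(field, ""))
               for field, wanted in criteria.items()):
            hits.add(doc_id)
    return hits


# Hypothetical collection: document text plus one metadata element.
docs = {
    "d1": {"text": "river flooding in the delta", "Creator": "Smith"},
    "d2": {"text": "river transport history", "Creator": "Jones"},
    "d3": {"text": "mountain weather", "Creator": "Smith"},
}

# Initial keyword search on document contents only.
first = fielded_search(docs, {"text": "river"})

# "Refined" search: the same keywords plus a metadata constraint in one query.
refined = fielded_search(docs, {"text": "river", "Creator": "Smith"})
```

Because every criterion must match, the combined query returns the same
documents as searching within the results of the first search, so the
refined result is always a subset of the initial one.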