As far as I can tell, no one has a perfect solution for searching documents in various formats. My steps:
1, Index all searchable files and generate the database files to search against, EVERY NIGHT! (People may post documents and bulletins during the day, and many of them are huge.)
2, Search through the generated files when the engine is launched.
Unfortunately there don't seem to be many tools for converting binary files to plain text. Searchable files must be plain text, like .html or .txt files.
3, A search engine like rolia's seems able to search .html files only.
Hmm :). This is a limitation. The catdoc and xls2csv tools may work, and I am trying others.
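To make steps 1 and 2 concrete, here is a rough Python sketch of what I have in mind, not a finished engine. Everything in it is my own assumption: the function names (extract_text, build_index, search), the JSON index file, and the idea that catdoc and xls2csv are on the PATH and print the converted text to standard output when given a file name. Formats with no converter are simply skipped.

```python
import json
import re
import shutil
import subprocess
from html.parser import HTMLParser
from pathlib import Path

# Assumed mapping: file extension -> external converter that prints text to stdout.
CONVERTERS = {".doc": "catdoc", ".xls": "xls2csv"}


class _TextOnly(HTMLParser):
    """Collects the text nodes of an HTML page, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(path):
    """Return plain text for one file, or "" if it cannot be converted."""
    ext = path.suffix.lower()
    if ext == ".txt":
        return path.read_text(errors="ignore")
    if ext in (".html", ".htm"):
        parser = _TextOnly()
        parser.feed(path.read_text(errors="ignore"))
        return " ".join(parser.parts)
    tool = CONVERTERS.get(ext)
    if tool and shutil.which(tool):  # only call the converter if it is installed
        result = subprocess.run([tool, str(path)], capture_output=True,
                                text=True, errors="ignore")
        return result.stdout
    return ""


def build_index(root, index_file):
    """Nightly job (step 1): map every word to the files that contain it."""
    index = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        for word in set(re.findall(r"\w+", extract_text(path).lower())):
            index.setdefault(word, []).append(str(path))
    Path(index_file).write_text(json.dumps(index))


def search(index_file, query):
    """Query time (step 2): return files that contain every word of the query."""
    index = json.loads(Path(index_file).read_text())
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)
```

The nightly job would call build_index on the document folder, and the search page would only ever call search against the saved index, so the huge files are never read while someone is waiting for results.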
Any ideas? Any demo sites?
I'd appreciate it.
Thanks