    AddType text/plain .doc .log
    DefaultType text/html   # like DefaultContents currently
The file source "plug-in" (whatever that ends up being) would return a content-type; if it did not, Swish-e would map the type from the file name using the mime.types file or any AddType directives.
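The lookup order described above could be sketched roughly as follows. This is only an illustration: the tables, the function name `resolve_content_type`, and the choice to check AddType before mime.types are all assumptions, not anything in Swish-e.

```python
import os

# Hypothetical in-memory view of the proposed directives (not Swish-e code).
ADD_TYPE = {".doc": "text/plain", ".log": "text/plain"}  # AddType text/plain .doc .log
MIME_TYPES = {".html": "text/html", ".xml": "text/xml"}  # stand-in for mime.types entries
DEFAULT_TYPE = "text/html"                               # DefaultType text/html


def resolve_content_type(path, source_type=None):
    """Return a content-type for a document.

    source_type is whatever the file source returned, or None if it
    returned nothing.
    """
    # 1. A content-type returned by the file source wins.
    if source_type:
        return source_type
    # 2. Otherwise map from the file name. Checking AddType before
    #    mime.types is an assumption; the notes do not give an order.
    ext = os.path.splitext(path)[1].lower()
    if ext in ADD_TYPE:
        return ADD_TYPE[ext]
    if ext in MIME_TYPES:
        return MIME_TYPES[ext]
    # 3. Fall back to DefaultType.
    return DEFAULT_TYPE
```

With these tables, a `.doc` file with no type from the file source resolves to text/plain, and a file with no extension falls through to the DefaultType.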
Again, internally Swish-e only knows about text/[TXT|HTML|XML], so there should be a way to map other types; otherwise Swish-e might ignore the file. We could continue to use the three type names or switch completely to content-types.
For example, if we continued to use [TXT|HTML|XML]:
    MapType TXT text/directory text/logfile
    MapType HTML text/html
Or maybe just extend the current directives:
    IndexContents HTML .htm .html text/html
Here the content-type would take precedence over the file extensions.
This would tell Swish-e that those types are handled by those internal handlers.
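A rough sketch of that precedence rule, using the MapType and IndexContents examples above as data. The table and function names are hypothetical, and falling back to the extension when the content-type is unknown is an assumption.

```python
import os

# Hypothetical tables built from the example directives above:
#   MapType TXT text/directory text/logfile
#   IndexContents HTML .htm .html text/html
PARSER_BY_TYPE = {
    "text/directory": "TXT",
    "text/logfile": "TXT",
    "text/html": "HTML",
}
PARSER_BY_EXT = {".htm": "HTML", ".html": "HTML"}


def pick_parser(path, content_type=None):
    """Return "TXT", "HTML", "XML", or None when nothing matches."""
    # The content-type is consulted first, so it overrides the extension.
    if content_type in PARSER_BY_TYPE:
        return PARSER_BY_TYPE[content_type]
    # Assumption: an unknown or missing content-type falls back to the extension.
    ext = os.path.splitext(path)[1].lower()
    return PARSER_BY_EXT.get(ext)
```

A return value of None corresponds to the case where Swish-e might ignore the file.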
Then, as I've mentioned before, you might specify filters like this:
    FilterDocument application/msword /path/to/word-to-text
And word-to-text would convert to text and return one of the three content-types that Swish-e knows how to parse, or a different content-type if filters were to be chained.
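Filter chaining along these lines might look like the sketch below. Everything here is hypothetical: the filters are plain Python functions standing in for external programs, `application/x-example-wrapped` is an invented type, and treating text/plain, text/html, and text/xml as the three parseable types is an assumption.

```python
# Assumed content-types for the three internal parsers.
PARSEABLE = {"text/plain", "text/html", "text/xml"}


def word_to_text(data):
    # Stand-in for /path/to/word-to-text; a real filter would be an external program.
    return "text/plain", data


def unwrap(data):
    # Invented filter that returns a type Swish-e cannot parse, forcing a second step.
    return "application/msword", data


# Hypothetical table built from FilterDocument directives.
FILTERS = {
    "application/msword": word_to_text,
    "application/x-example-wrapped": unwrap,
}


def run_filters(content_type, data, filters, max_steps=10):
    """Apply filters until the content-type is one Swish-e can parse."""
    steps = 0
    while content_type not in PARSEABLE:
        if steps == max_steps:
            raise RuntimeError("filter chain too long")
        handler = filters.get(content_type)
        if handler is None:
            raise LookupError("no filter or parser for " + content_type)
        # Each filter returns the new content-type along with the converted data.
        content_type, data = handler(data)
        steps += 1
    return content_type, data
```

The step limit is there only to stop two filters from handing a document back and forth forever.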
Moseley: Updated Jan 13, 2001
If the PropertyNames directive were enhanced to limit the number of characters stored, optionally extract text from HTML, and define which document types (text, XML, HTML) it applies to, then the existing PropertyNames feature would work like the new StoreDescription feature but would serve more than one purpose.
I'm not clear how to enhance the syntax of Properties and/or Metanames, but best and commonly understood. That's a good idea. Below are some older ideas that I had. But you will get the idea...
The metaname structure could have flags for properties:
1 - limiting to a length
2 - stripping HTML
3 - encoding HTML entities on output
Oct 9, 2001 - The code is now in Swish-e to limit a string property to a length. The stripping of HTML is an issue for discussion. And encoding entities on output should be a result_output.c issue.
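As an illustration of the three flags, here is a small sketch. The flag names, the bit values, and the split between index-time and output-time processing are assumptions, and the regular expression is a deliberately crude stand-in for real HTML stripping.

```python
import html
import re
from enum import IntFlag


class PropFlag(IntFlag):
    # Hypothetical bit flags for the three ideas listed above.
    LIMIT_LENGTH = 1
    STRIP_HTML = 2
    ENCODE_ENTITIES = 4


TAG_RE = re.compile(r"<[^>]*>")  # crude tag matcher, not a real HTML parser


def store_property(value, flags, max_len=0):
    """Index-time processing: strip HTML first, then limit the length."""
    if flags & PropFlag.STRIP_HTML:
        value = html.unescape(TAG_RE.sub("", value))
    if flags & PropFlag.LIMIT_LENGTH and max_len > 0:
        value = value[:max_len]
    return value


def output_property(value, flags):
    """Output-time processing: encode entities when the flag is set."""
    if flags & PropFlag.ENCODE_ENTITIES:
        value = html.escape(value)
    return value
```

Stripping before truncating means the length limit counts visible characters rather than markup.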
This would allow some directives to be set per directory, or per file extension (or content-type).
$Id: SWISH-3.0.pod,v 1.6 2002/04/15 02:34:43 whmoseley Exp $