I'm surprised nobody else has asked.
The question is: how should one get SQL FTS to index the content of
files whose suffix/extension is not among those supplied by default,
but which are perfectly searchable by existing filters? I do *not* mean
content which requires a new IFilter.
I have an IMAGE column with content taken from files, and I have
populated the "extension mapping" column with the original file's
extension. All is working fine. However...
Users could add *any* type of file. Take an example: if they add
"fred.txt" it will be indexed (with text filter), but if the file
happens to be, say, "fred.cs", it will not be indexed.
So far as I can see, via the Citeknet IFilter Explorer, for SQL
2000/2005 purposes (but not other MSSearch apps) essentially only
extensions with subkeys in registry
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ContentIndex Common\Filters\Extension
(about 30 entries) get indexed. This seems to be a list installed by
Microsoft.
Now, ".cs" files themselves are known as "text/plain", and have the
right "PersistentHandler", but because they do not appear there they
are not indexed by SQL FTS; and so on for many other file types which
are plain text.
That gives me 2 possibilities:
1. Tell my users to (ask their admins to) add extensions to this list
(somehow, e.g. directly in registry).
2. Invent a "pseudo-extension" column in my SQL table, which does not
hold the true file extension but rather the extension to tell SQL FTS
to use for indexing, e.g. ".txt" for a ".cs" file. Then allow a
"mappings" table (or whatever) to be set up, and use that to decide
for each file added into IMAGE column what to put in the "Extension"
column.
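Option 2 could be sketched roughly as below, in Python, purely as an illustration of the lookup the application would do before inserting a row. Everything named here is hypothetical: `PSEUDO_EXTENSION_MAP` stands in for the "mappings" table, `NATIVELY_INDEXED` is an illustrative subset (not the real registry list), and `indexing_extension` is an invented helper.

```python
import os

# Hypothetical stand-in for the "mappings" table: true extension -> the
# extension to store in the "Extension" column so that FTS picks a filter.
PSEUDO_EXTENSION_MAP = {
    ".cs": ".txt",
    ".sql": ".txt",
    ".log": ".txt",
}

# Illustrative subset of extensions assumed to be indexed out of the box;
# the real list would come from the server, not from this constant.
NATIVELY_INDEXED = {".txt", ".doc", ".htm", ".html", ".xls", ".ppt"}


def indexing_extension(filename, mapping=PSEUDO_EXTENSION_MAP,
                       native=NATIVELY_INDEXED):
    """Return the value to store in the "Extension" column for this file."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in native:
        # Already known to the indexer: store the true extension.
        return ext
    # Mapped types get their pseudo-extension; unknown types keep their
    # true extension and simply will not be indexed.
    return mapping.get(ext, ext)
```

With this sketch, `indexing_extension("fred.cs")` yields `".txt"`, while an unmapped type such as `"data.bin"` keeps `".bin"`. The true extension would still need its own column if the application has to reproduce the original file name.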
Which is the "correct"/"recommended" way to do this? Have others
chosen either path?
You could use a trigger or a computed column, but it would be more efficient
to resolve this in the client application (i.e. map text files with
extensions other than .txt to .txt).
ML
Matija Lah, SQL Server MVP
http://milambda.blogspot.com/
|||OK, thank you, so you see this as an issue which requires each
application to map whatever file extensions it happens to come across
to one of the few known to SQL FTS as supplied, rather than augmenting
those actually known to SQL FTS? I'm still surprised not to have come
across other posts asking this; this behaviour isn't happening only
to me, is it...
On 6 Aug, 23:30, ML <M...@.discussions.microsoft.com> wrote:
> You could use a trigger or a computed column, but it would be more efficient
> to resolve this in the client application (i.e. map text files with
> extensions other than .txt to .txt).
> ML
> --
> Matija Lah, SQL Server MVP
> http://milambda.blogspot.com/
|||Perhaps you're the only one overthinking it...?

ML
Matija Lah, SQL Server MVP
http://milambda.blogspot.com/