Monday, February 20, 2012

NEAR operator

Is it possible in any way to control the NEAR operator so that it returns
only records containing the search words within a certain distance, e.g.
within 3 words, 5 words, or a paragraph?
Apparently the NEAR operator returns all (or almost all) of the records
containing the specified words and then ranks them based on the words'
"nearness". The problem with this approach is that if the search results
are displayed in some order other than rank, using NEAR is exactly the same
as using AND. As a matter of fact, newspaper librarians and journalists
always sort search results by publishing date, not by rank, so the NEAR
operator in SQL full-text search is of no use to them.
Thank you
- Michele

Michele,
Unfortunately, no. There is no way to control how the NEAR operator
determines "nearness": the distance is hard-coded at 50 words, the
"definition of nearness is fixed inside mssearch", and neither is user
controllable :-(
The following quote (from a Microsoft FTS developer) was taken from another
thread on this subject. It concerns SQL Server 2005 (Yukon), but it also
applies to proximity (NEAR) searches and RANK in SQL Server 2000:
"distance between terms for a match
number of matches
document length
etc..
so it is possible for a document with term1 right next to term2 to return a
lower rank than another document with many matches with greater distance
between terms:
eg:
document1 = term1 term2 word word word word word word word word word word
word word word.... word word
document2 = term1 word term1 word term1 word term2 word term1 word term2
word term1 word term2 word term1 word term2 word term1 word term2
document1 may have a lower rank than document2 because it has fewer matches
even though the one match it has is very "near"."
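
To make the two example documents concrete, here is a small Python sketch that measures the quantities named in the quote (distance between terms, number of matches, document length) for document1 and document2. It is only an illustration: the helper names `tokenize` and `proximity_stats` are made up for this post, the "...." in document1 is assumed to stand for 15 filler words, and nothing here is the actual mssearch ranking formula, which is not published in this thread.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (hypothetical helper)."""
    return re.findall(r"\w+", text.lower())


def proximity_stats(text, term_a, term_b):
    """Return simple proximity figures for two terms, or None if either is absent.

    This only measures raw inputs a ranker might consider; it does NOT
    reproduce how mssearch combines them into a RANK value.
    """
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    if not pos_a or not pos_b:
        return None
    return {
        # Smallest gap, in word positions, between any term_a and any term_b.
        "min_distance": min(abs(a - b) for a in pos_a for b in pos_b),
        # Total number of times either term occurs.
        "occurrences": len(pos_a) + len(pos_b),
        # Document length in words.
        "length": len(words),
    }


# The two documents from the quote; 15 filler words is an assumption.
document1 = "term1 term2 " + "word " * 15
document2 = ("term1 word term1 word term1 word term2 word term1 word term2 "
             "word term1 word term2 word term1 word term2 word term1 word term2")

stats1 = proximity_stats(document1, "term1", "term2")
stats2 = proximity_stats(document2, "term1", "term2")

print("document1:", stats1)
print("document2:", stats2)
```

document1 has the closer pair but far fewer occurrences, so a rank that weighs the number of matches heavily could put document2 first, as the quote describes. A hard cutoff of the kind asked about would be a filter on `min_distance`, which is a different thing from a rank.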
Hopefully this sheds more light on this subject.
Thanks,
John
"Michele Mottini" <Michele Mottini@.discussions.microsoft.com> wrote in
message news:380FD11C-01C0-4AC8-BC8B-82EC0524F6B7@.microsoft.com...
> Is it possible in any way to control the NEAR operator so that it returns
> only records containing the search words within a certain distance, e.g.
> within 3 words, 5 words, or a paragraph?
> Apparently the NEAR operator returns all (or almost all) of the records
> containing the specified words and then ranks them based on the words'
> "nearness". The problem with this approach is that if the search results
> are displayed in some order other than rank, using NEAR is exactly the same
> as using AND. As a matter of fact, newspaper librarians and journalists
> always sort search results by publishing date, not by rank, so the NEAR
> operator in SQL full-text search is of no use to them.
> Thank you
> - Michele
