sql - How to improve MySQL NATURAL LANGUAGE MODE search query?
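I have this query:
select * from mytable where match (name) against ("apple m1" in natural language mode)
If I search for `apple m1`, the first results are rows like `orange m1`, and only in third position or lower do I get `apple m-1`, which is the value that is actually stored and which I assumed should come first!
My question is: is there a way to fine-tune the MySQL search?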
The best way to improve a MySQL natural language mode search is to use boolean full-text searches instead. They work much like the natural language mode search, but you can use additional modifiers to fine-tune your results, e.g. `>` and `<`.
These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The `>` operator increases the contribution, and the `<` operator decreases it.
There is one minor difference: a boolean mode search does not automatically order the rows by relevance, so you have to order them yourself.
select * from mytable where match (name) against (">apple m1" in boolean mode) order by match (name) against (">apple m1" in boolean mode) desc
And as a remark: neither version of the full-text search will find `m-1` if you match against `m1` (even with a minimum word length setting of 2). They only look for exact (usually case-insensitive) word matches, not for similar words (unless you use `*`). They "just" weigh the combination of (exact) words with some algorithm and, if you use them, the modifiers.
Update with some additional clarification according to the comments:
If you match against `apple m1`, it returns all rows that contain (case-insensitively) `apple` or `m1` in any order, e.g. `m1 apple`, `apple m4`, `apple m-1`, `orange m1`. It will not find `apples m4` or `orange m-1`, because those do not contain the same words. Likewise, `like '%m-1%'` wouldn't find `apple m1` either. If you like, you can match against `apple*` (in boolean mode) to find `apple` and `apples`, but the wildcard only works at the end of a word. `*apple*` is not possible, so you would have to use `like '%apple%'` then.
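As a rough illustration of that difference, here is a minimal sketch. It assumes the `mytable` table and `name` column from the question and a full-text index on `name`:

```sql
-- Prefix search: the * wildcard is a boolean-mode operator and only
-- works at the end of a word. Matches apple, apples, ...
select * from mytable
where match (name) against ('apple*' in boolean mode);

-- Substring search: not possible with full-text search, so fall back to like.
-- This does not use the full-text index and will typically scan the table.
select * from mytable
where name like '%apple%';
```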
These rows are ordered by a scoring algorithm that basically scores words that are less common in your texts higher than common words. And if you add `>apple`, it will give `apple` a higher value. The score is just a number, so you can add it to your select, e.g. `select ..., match (name) against (">apple m1" in boolean mode) as score`, to get a feeling for that.
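Written out in full, such a query could look like the following sketch. It again assumes the `mytable` table and `name` column from the question, with a full-text index on `name`:

```sql
-- Show the relevance value next to each row and sort by it.
-- The match ... against expression must be repeated in the where clause;
-- the alias can be reused in order by.
select name,
       match (name) against ('>apple m1' in boolean mode) as score
from mytable
where match (name) against ('>apple m1' in boolean mode)
order by score desc;
```

Comparing the scores with and without the `>` in front of `apple` shows how much the modifier shifts the ranking for your data.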
There are some other things to consider:
- Only words that have a minimum length are added to the index. That length is given by `innodb_ft_min_token_size` for InnoDB or `ft_min_word_len` for MyISAM. You should set it to e.g. 2 to include `m1` (otherwise, that word will not have any effect in your search; since you found `orange m1` in your example, I assume you have set it correctly).
- `-` is considered a hyphen. `m-1` in your text will be split into the two words `m` and `1` (which may or may not be included according to your minimum word length setting, so maybe set it to 1). You can change that behaviour by adding `-` to the character set (see Fine-Tuning MySQL Full-Text Search, the part beginning with "Modify a character set file"), but then you will not find `blue-green` anymore if you search for `blue` and/or `green`.
- The full-text search uses stopwords. These are words that are not included in the index. That list includes `a` and `i`, so even with a minimum word length of 1, you would not find them. You can edit that list.
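To check these settings on your server, something like the following sketch may help. Both variables can only be set at server startup (e.g. in the option file), and their defaults are 3 for InnoDB and 4 for MyISAM. The index name `ft_name` is a hypothetical placeholder; use whatever your full-text index is actually called:

```sql
-- Check the current minimum word length.
show variables like 'innodb_ft_min_token_size';  -- InnoDB
show variables like 'ft_min_word_len';           -- MyISAM

-- List the default InnoDB stopwords.
select * from information_schema.innodb_ft_default_stopword;

-- After changing the setting and restarting the server, existing
-- full-text indexes have to be rebuilt before the change takes effect.
-- InnoDB: drop and re-create the index (ft_name is a hypothetical name).
alter table mytable drop index ft_name;
alter table mytable add fulltext index ft_name (name);

-- MyISAM: a quick repair rebuilds the index instead.
-- repair table mytable quick;
```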
Here are some ideas for your potential problem with `m1`/`m-1`. To adjust them to your exact requirements, you would have to add more information about your searches and your data (and maybe ask that in another question), but as a starting point:
- You can replace user input that contains `-` by including both versions in the search query: once with the `-`, enclosed in `""`, and once without. If the user enters `apple m-1`, you create the search `apple m1 "m-1"` (that will work with or without the modified character set, but without the new character set, the minimum word length has to be 1). If the user enters `m1`, you should detect that and replace it with `m1 "m-1"` too.
- Another alternative would be to save an additional column with clean, hyphenless words, add that column to the full-text index, and then use `match (name, clean_name) against ("m1" ...`.
- And you can of course combine `like` and `match`, e.g. if you detect a product number in your input, you can use `where match(...) against(...) or product_id like 'm%1%'`, or `where match(...) against(...) or product_id = 'm-1' or product_id = 'm1'`, or `where match(...) against(...) or name like '%m%1%'`, although the latter will be a lot slower and will contain a lot of noise. Those rows might not score correctly, but at least they are in the result set.
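The first two ideas could be sketched as follows. This is only an illustration under assumptions: `clean_name`, its `varchar(255)` type, and the index name `ft_name_clean` are hypothetical, and the rewritten search string in the first query would be built by your application code:

```sql
-- Idea 1: search for both spellings. The quoted part is a phrase search,
-- so the whole search string is wrapped in single quotes.
select * from mytable
where match (name) against ('apple m1 "m-1"' in boolean mode);

-- Idea 2: keep a hyphenless copy of the name in an extra column
-- (clean_name and ft_name_clean are hypothetical names).
alter table mytable add column clean_name varchar(255);
update mytable set clean_name = replace(name, '-', '');

-- The column list in match (...) has to correspond to a full-text index
-- over exactly these columns.
alter table mytable add fulltext index ft_name_clean (name, clean_name);

select * from mytable
where match (name, clean_name) against ('m1' in boolean mode);
```

With the second idea, `clean_name` has to be kept in sync whenever `name` is inserted or updated, and the minimum word length still has to be 2 or lower for `m1` to be indexed.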
But as said, that will depend on your data and requirements.