Elasticsearch custom_score multiplication is inaccurate
I've inserted documents that are identical except for one floating-point field, called a.
When the script of the custom_score query is set to just _score, the resulting score is 0.40464813 for a particular query matching some fields. When the script is then changed to _score * a (MVEL) for the same query, where a is 9.908349251612433, the final score becomes 4.0619955.
Now, if I run this calculation via Chrome's JS console, I get 4.009394996051871.
- 4.0619955 (elasticsearch)
- 4.009394996051871 (chrome)
This is quite a difference, and it produces an incorrect ordering of results. Why would this be, and is there a way to correct it?
If I run a simple calculation using the numbers you provided, then I get the result that you expect:
curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1' -d '
{
   "a" : 9.90834925161243
}
'

curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
   "query" : {
      "custom_score" : {
         "script" : "0.40464813 *doc[\u0027a\u0027].value",
         "query" : {
            "match_all" : {}
         }
      }
   }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "a" : 9.90834925161243
#             },
#             "_score" : 4.009395,
#             "_index" : "test",
#             "_id" : "lpesz0j6rt-xt76aatcfow",
#             "_type" : "test"
#          }
#       ],
#       "max_score" : 4.009395,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "took" : 1
# }
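As a quick sanity check on the arithmetic itself, here is a small sketch that uses only the numbers quoted in the question. It computes the product you would expect, and also works backwards from the observed score to the _score that must have been in effect. The variable names are just for illustration.

```shell
# Values quoted in the question.
base_score='0.40464813'     # score reported when the script is just _score
a='9.908349251612433'       # value of field a
observed='4.0619955'        # score reported for _score * a

# Product one would expect, rounded to 6 decimals.
expected=$(awk -v s="$base_score" -v a="$a" 'BEGIN { printf "%.6f", s * a }')

# The _score that would have to be in effect to produce the observed result.
implied=$(awk -v o="$observed" -v a="$a" 'BEGIN { printf "%.6f", o / a }')

echo "expected product: $expected"
echo "implied _score:   $implied"
```

The implied _score comes out at roughly 0.41 rather than 0.40464813. That is a gap of more than one percent, which is far too large to be floating-point rounding, so it looks like the _score going into the multiplication was different, not that the multiplication was inaccurate.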
I think what you are running into here is testing with too little data across multiple shards.
Doc frequencies are calculated per shard by default. So if you have 2 identical docs on shard_1 and 1 doc on shard_2, then the docs on shard_1 will score lower than the docs on shard_2.
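To make the per-shard effect concrete, here is a rough sketch. It assumes Lucene's classic TF-IDF similarity, where idf = 1 + ln(numDocs / (docFreq + 1)), and it assumes two hypothetical shards that each hold 10 docs, so that only the doc frequency differs. The shard sizes and the idf helper are made up for illustration.

```shell
# Classic Lucene idf (assumed similarity): 1 + ln(numDocs / (docFreq + 1)).
idf() {
  # $1 = docFreq (docs on the shard containing the term)
  # $2 = numDocs (total docs on the shard)
  awk -v df="$1" -v n="$2" 'BEGIN { printf "%.4f", 1 + log(n / (df + 1)) }'
}

# Hypothetical shards, each holding 10 docs in total.
idf_shard_1=$(idf 2 10)   # term appears in 2 docs on shard_1
idf_shard_2=$(idf 1 10)   # term appears in 1 doc on shard_2

echo "idf on shard_1: $idf_shard_1"
echo "idf on shard_2: $idf_shard_2"
```

Under these assumptions the same term gets a lower idf on shard_1 than on shard_2, so otherwise identical docs end up with different scores depending on which shard they live on.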
With more data, the document frequencies tend to even out across shards. But when testing with small amounts of data, you either want to create an index with only 1 shard, or to add search_type=dfs_query_then_fetch to the query string params. This calculates global doc frequencies across all involved shards before calculating the scores.
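Both options could look something like the sketch below. It assumes the same local node and the test index used earlier; the function and variable names are only for illustration, and newer Elasticsearch versions may also need a Content-Type header on requests with a body.

```shell
# Assumed local node and index name, matching the earlier curl example.
ES='http://127.0.0.1:9200'
INDEX='test'

# Option 1: create the index with a single shard
# (the index must not already exist).
create_single_shard_index() {
  curl -XPUT "$ES/$INDEX/" -d '
  { "settings" : { "index" : { "number_of_shards" : 1 } } }'
}

# Option 2: keep the shards, but ask for global doc frequencies at search time.
DFS_URL="$ES/$INDEX/_search?pretty=1&search_type=dfs_query_then_fetch"

search_with_dfs() {
  # match_all is a placeholder; substitute the real query here.
  curl -XGET "$DFS_URL" -d '
  { "query" : { "match_all" : {} } }'
}
```

Call create_single_shard_index or search_with_dfs against a running node. The dfs variant costs an extra round trip to the shards, so it is mostly useful while testing with small data sets.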
If you set explain to true in your query, then you can see exactly how your scores are being calculated.
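A request with explain turned on might look roughly like this. The inner match_all query is only a placeholder, because the original query was not posted, and the script is my guess at how _score * a would be written against field a.

```shell
# Assumed local node, as in the earlier example.
ES='http://127.0.0.1:9200'

# Search body with explain enabled. The inner query is a placeholder.
EXPLAIN_BODY='{
  "explain" : true,
  "query" : {
    "custom_score" : {
      "script" : "_score * doc[\u0027a\u0027].value",
      "query" : { "match_all" : {} }
    }
  }
}'

search_with_explain() {
  curl -XGET "$ES/test/test/_search?pretty=1" -d "$EXPLAIN_BODY"
}
```

Each hit in the response should then carry an _explanation section that breaks the score down into its parts, which makes it easy to compare the idf values used for docs on different shards.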