algorithm - Is it possible to compute the minimum of a set of numbers modulo a given number in amortized sublinear time? -
is there data structure representing large set s
of (64-bit) integers, starts out empty , supports following 2 operations:
insert(s)
inserts numbers
s
;minmod(m)
returns numbers
ins
suchs mod m
minimal.
an example:
insert(11) insert(15) minmod(7) -> answer 15 (which mod 7 = 1) insert(14) minmod(7) -> answer 14 (which mod 7 = 0) minmod(10) -> answer 11 (which mod 10 = 1)
i interested in minimizing maximal total time spent on sequence of n
such operations. possible maintain list of elements s
, iterate through them every minmod
operation; insert o(1)
, minmod o(|s|)
, take o(n^2)
time n
operations (e.g., n/2
insert
operations followed n/2
minmod
operations take n^2/4
operations).
so: possible better o(n^2)
sequence of n
operations? maybe o(n sqrt(n))
or o(n log(n))
? if possible, interested know if there data structures additionally admit removing single elements s
, or removing numbers within interval.
another idea based on balanced binary search tree, in keith's answer.
suppose inserted elements far stored in balanced bst, , need compute minmod(m)
. consider our set s
union of subsets of numbers, lying in intervals [0,m-1], [m, 2m-1], [2m, 3m-1] .. etc. answer among minimal numbers have in each of intervals. so, can consequently lookup tree find minimal numbers of intervals. it's easy do, example if need find minimal number in [a,b], we'll move left if current value greater a, , right otherwise, keeping track of minimal value in [a,b] we've met far.
now if suppose m uniformly distributed in [1, 2^64], let's calculate mathematical expectation of number of queries we'll need.
for m in [2^63, 2^64-1] we'll need 2 queries. probability of 1/2.
m in [2^62, 2^63-1] we'll need 4 queries. probability of 1/4.
...
mathematical expectation sum[ 1/(2^k) * 2^k ], k in [1,64], 64 queries.
so, sum up, average minmod(m)
query complexity o(64*logn). in general, if m has unknown upper bound, o(logmlogn). bst update is, known, o(logn), overall complexity in case of n queries o(nlogm*logn).
Comments
Post a Comment