2d range trees?

hendu · Post by **hendu** » Mon Nov 05, 2012 3:43 pm

I need a fast 2d range search for a project. 2d range trees seem like the best solution.

There are 40 years' worth of papers, but no code - my google-fu only finds one, CGAL, and that's GPL.

Anyone know of a library for that?

--

Background:

Optimized 2d range trees can do searches like that, in O(log n). K-d trees, while multiple implementations exist, do range searches in O(sqrt(n)), which is terrible in comparison.

My current implementation is a really fast O(n^2). It's about 10k times faster than the naive loop on a test set, but it's still not fast enough.
(an array of hash maps, with a simple Bloom filter in front at 74% efficiency)

I may end up having to write one, but rather would not - I estimate it'd take over two weeks, including optimization and bug hunting.

chronologicaldot · Post by **chronologicaldot** » Fri Nov 09, 2012 6:48 am

First: What's this for?

Second: I assume that drawing is representing your data points and your boundaries (what you want to find is in the box). Is that right?

hendu · Post by **hendu** » Fri Nov 09, 2012 1:49 pm

I have a huge set of points, and want to find which of them are in the view area.

Second: I assume that drawing is representing your data points and your boundaries (what you want to find is in the box). Is that right?

Yes. Find all points where X is between A and B, and Y is between C and D.

Or using a tax data example like in one of the papers, find all people between ages 30-60 with income between 10k-50k.

Dareltibus · Post by **Dareltibus** » Fri Nov 16, 2012 11:46 am

quadtree doesn't work?

hendu · Post by **hendu** » Fri Nov 16, 2012 3:24 pm

Quadtree is not good enough. It's logarithmic only up to the last levels, which then are linear.

If I'm going to do this, might just as well do it well and use the best algorithm available

Dareltibus · Post by **Dareltibus** » Sun Nov 18, 2012 9:29 am

use recursive quatree. instead of having each node divided into 4 pieces you divide it in more (9/16 etc.) theorically should be faster even if it keeps O(logN) execution time. EDIT: or better just use a grid. you check nearby quads etc. and only accurate check for quads on corners of your range

serengeor · Post by **serengeor** » Sun Nov 18, 2012 11:41 am

I was doing a research for a 3d solution, though I couldn't really find something that could be as easy as plain old octrees.

I wanted to try (Looks like what you're trying to do?):
http://en.wikipedia.org/wiki/Z-order_cu ... _searching
Though, that seamed a bit too complex. I actually had written some parts for this, but never got to range searches.

Also found this:
http://mrl.snu.ac.kr/research/ProjectLi ... index.html
http://zeus.mat.puc-rio.br/tomlew/publi ... octree_sgp

If you give any of this stuff a try please do share

Edit:

Quadtree is not good enough. It's logarithmic only up to the last levels, which then are linear.

How does it become linear?

REDDemon · Post by **REDDemon** » Sun Nov 18, 2012 5:06 pm

I have a little sugguestion maybe you can take a try. (cost is linear, but should be faster.. you never know. just try and see if it works)

When a "enemy move" (I guess those red points are enemies) just update it's distance from the player (in 2D are just 2 multiply operations and 3 sums(including difference with player coordinates) and anyway you are not forced to compute distance for every enemy at every frame.). then use Insertion Sort on them using as key distance from player. (if is faster sorting only indices to enemies instead of 32x4 bits depends, try both cases).

If you have few enough enemies (several thousands) their whole positions should fit into cache entirely, well measuring for different numbers of enemies will probably shows a peak when you go out of cache capacity (I assume you store on a separate list
float posX; //lol pronunce like POSIX
float posY;
float distance;
u32 enemyID; //assuming you have component /entity system allow to find more data on that enemy, for example for apply damage)

)
and since distance order does not change very often from one frame to another (excluding worst case, when they are all placed in a circle around the player), the insertion short should short their distances in O(N) time (around few nanoseconds for several thousands without cache misses.. still bigger than logarithm cost, but maybe faster).

If then you have more than several thousands enemies you can still hold several pools, each pool have thousands enemies and each pool bound to a different "sector"(and you have to test only few sectors at once exactly the sectors that are in range.). If you need to test distance based on a rectangle instead of a circle you can still get "enemies in the range of a circle" and then test if they are inside your rectangle.

On paper this should be slower than a quatree, but you never know without testing.

EDIT:
just a consideration, if you want to find N enemies, unless you have "group of enemies" you can't perform better than N since you need to access N enemies. And logN is anyway better than N. (so i guess you refer N to also enemies that are not "in range" --> in wich case your quadtree perform Linear at last levels because once you found all enemies says M/N you still need to access all M enemies and that can't be better than M).

hendu · Post by **hendu** » Sun Nov 18, 2012 8:45 pm

serengeor wrote:
Quadtree is not good enough. It's logarithmic only up to the last levels, which then are linear.
How does it become linear?

The smallest quad (that crosses the range border) has to be linearly searched.

Or, you could have the smallest quad contain just one point, but that would then cost a lot of RAM.

If you give any of this stuff a try please do share

Will do. If I do make it to a proper lib, it'll be AGPL though - there's clearly money in this

@REDDemon

This is about a static, huge map with several searches per second. Precomputation time is not an issue. Only RAM use and search speed matter.

REDDemon · Post by **REDDemon** » Sun Nov 18, 2012 10:08 pm

Several searches per second.. or per frame? (So my solution doesn't suit your needs since assume 1 sorted array for every search and can be a lot of memory).

So probably best solution is just to use a map divided into little sectors (little enough to not include more than several units like 50-100) but big enough to not increase insanely the number of sectors.

Every sector is accessed directly from a "Grid map[X][Y]" by just doing an arithmetic operation on coordinates.
so that given posX,posY of the Range point

the function returns from 1 to 4 sets of indices that tell you in wich sectors you have to search. (linear search, should be very fast anyway).

hendu · Post by **hendu** » Mon Nov 19, 2012 4:19 pm

"Pointer chasing is just about the most expensive thing you can do on modern CPU's."
- Linus Torvalds

If you do 100 pointer hops per each sector on the range border, that's about 10us just chasing memory around, per sector. The cpu can't cache that kind of patterns.

I hope for a whole search to be < 5us. That kind of performance has been reported of the 2d range trees, with much greater amount of points than what I will use, and with lesser hardware.

If not storing pointers but direct coordinates, then your solution is the same as the quadtree one proposed above.

REDDemon · Post by **REDDemon** » Wed Nov 21, 2012 7:01 am

sorry but on my cpu 5us is the time required for several instructions, this seems to be a big request. Anyway if you need it for a game just try and profile, when speed is enough that's ok

. anyway that should not be 100 pointer hops since what I want is every sector to be completely cached at once.. (so store all coordinates in sectors, and that's not the same of a quadtree)

My solution was not a quatree, that surprise me you don't see difference.

Quadtree is a structure with O(logN) time for find a node. that's because every node contains other nodes. you can be lucky and find immediatly a leaf (can happen if your map have big low-density zones), or you can be unlucky and perform (at worst case) logN time.

My solution access directly every sector in fixed time. (and once accessed every sector is fully cached ensuring high speed, even better you can use compilers-built-ins, so when you are accessing 1 sector, you can prefetech the next sector, removing at all cachin time, prefeteching this is not viable only on CPUs with low level of assiociativity, like mobiles and small consoles, but in that case assume lesser enemies and assume more sectors to be present in same pages).

If every zone have a surface of 100 (says m^2) and every unit have a surface of 1 (m^2) then there can't be more than 100 units per sector (in practice there will almost be never 100 units since usually they will wander around with some kind of steering behaviour).

All those parameters needs only to be tuned.

2D range trees is a halfway between grid-constant accesstime and quadtrees logarithmic search.

You add perframe- pre-processing time in order to be able to do more queries/always per frame.
Some timings:

K => number of returned elements (in a unit of time)
M => number of queries /unit of time
N => number of total elements

Following timings are only time-profiling referred to worst case, but also memory usage profiling is interesting
Quadtree:

Initial building cost:
O(NlogN)

Updating cost (1 update per unit of time):
O(NlogN) //assuming ALL elements moved enough to change LEAF, rarely happens and even if it happens now most leaf already exists and less time required anyway)

Query cost:
O(K + M*logN)

2DRange Tree: (note this would be usefull for really big databases, where you can have A LOT of queries before any change, I can't imagine it usefull for a videogame)

Initial building cost:
O(n*logN*logN) (note that initial data is pre-sorted, the cost is reduced to O(NlogN). We assume that initial build will also sort previous( unsorted) data.

Updating cost:
O(N+n*logN) (sorting with insertion sort, partially sorted data requires N time, and then building requires N*logN only, total rebuild almost always needed.)

Query cost:
O(K + M*logN)

Grid constant access:

Initial buildling cost:
O(N + DIM^2) //a grid of size DIM^2 requires DIM^2 time for building

Updating cost:
O(N)

Query cost:
O(M+K)

Note that K can be thinked like M*Z (total number of returned elements is equal to elements returned in average by every query, multiplied by number of queries)

so that:
O(K + M*logN) becomes O(M(1+Z+logN))
O(M+K) becomes O( M(1+Z))

and updating cost really vary on how you tunes parameters (if you have zones/ sectors bigger enough it can greatly reduce).

Note the only nice think of 2D range trees is that their sectors have not fixed size ( every sector goes exactly from 1 enemy to next enemy), so you have no parameters to tune => high scalability for how concerns elements density. Quadtrees perform really bad and needs tuning since are not scalable with density, while GRIDS, are not scalable for bigger maps (due to high memory usage)

2DRangeTree => scalability on elements density / range of search (uses additional memory compared to quadtree), perform bad with many updates
Quadtree => scalability on number of elements (uses least memory for storing additional info for retrieving nodes)
Grid => memory inefficent but have constant time, works better with sames search ranges (not scalable on range of search, need tuning for that), perfect for quick updates

notes about memory usage of grid.

if you can have maximum 100 elements than grid requires

DIM^2 * 100 memory. This can be reduced using sectors pooling (so allocating a sector and keeping it around only when needed and some units are in it, but if units are sparse, then pooling is useless since most sectors will be active anyway.).

There's no perfect solution, all depends on your needs. and that's why Try&Profile should be your first rule when you need to decide the right algorithm.

hendu · Post by **hendu** » Wed Nov 21, 2012 2:58 pm

5us = 14 000 instructions here (2.8Ghz) - that's a lot

My solution was not a quatree, that surprise me you don't see difference.

Yes, I should have said "single-level quadtree", which is a grid essentially.

Grid Query cost:
O(M+K)

I think you have a mistake here. For one cell it's constant, but the number of cells in one grid part is big. This is what I meant by the quadtree comment, using a grid is no faster, as you still have the linear search in every border part.

There's no perfect solution, all depends on your needs. and that's why Try&Profile should be your first rule when you need to decide the right algorithm.

Indeed. I have yet to find a better algorithm than the range trees for this case.

REDDemon · Post by **REDDemon** » Fri Nov 23, 2012 3:14 pm

hendu wrote:5us = 14 000 instructions here (2.8Ghz) - that's a lot

My solution was not a quatree, that surprise me you don't see difference.
Yes, I should have said "single-level quadtree", which is a grid essentially.

Grid Query cost:
O(M+K)
I think you have a mistake here. For one cell it's constant, but the number of cells in one grid part is big. This is what I meant by the quadtree comment, using a grid is no faster, as you still have the linear search in every border part.

There's no perfect solution, all depends on your needs. and that's why Try&Profile should be your first rule when you need to decide the right algorithm.
Indeed. I have yet to find a better algorithm than the range trees for this case.

That's not a linear search. In a grid, but that's constant time access. O_o

a grid basically is a 2-d array. that's mean each element is accessed in constant time. The point obscure to you is that is possible obtain (in few instructions).
The exact X-Y index simply by transforming coordinates of the point (usually casting to integer or by dividing) and that's NOT A SEARCH, just an arithmetic operation.. and on modern computers even complex arithmetic can be still faster than memory access (also know as "arithmetic intensity/density").

EXAMPLE:
map with size 500x500 (500x500 sectors. so 250.000 sectors, already sad Grid wasn't memory-size efficient isn't it?) and coordinates that goes from 0 to 2499.99f (coordinate of 2500 is not valid => out of map).
If a point is in (1354,890) then you just need to access following sectors:

Code: Select all

 
const int Div= 2500/500; 
int X = ((int)point.x)/Div; //1354 /5  = 270.8 rounded down to 270 thanks to integer divison (wanted behaviour)
int Y = ((int)point.y)/Div; //890 /5  = 178
 
Grid[Y][X] // your map sector in 4 C instructions (don't know how disassembly will look
               // but I assume this use less than 10 assembly instructions so less than 5us)

anyway, if I remember correctly:
5us = 5x10^-9 seconds (5/billion seconds)

on a 2,8 GHz cpu a instruction takes

every singol instruction takes (1/2,8 billions seconds) = 0,000000000357 (3,57x10^-10) seconds/instruction.

in 5 us we have
(5x10^-9)/(3,57x10^-10) = (5/3,57)x(10) = 14 instructions.

Assuming correct pipelining, no branch-mis-prediction and no cache misses, there can be up to Nx14 instructions (I doubt any machine can go above 50 instructions, unless is using SSE or other special instructions for SIMD processing,and of course counting SIMD instructions as x4

).

50 < 14000

and anyway a 2DRangeTree is a complex algorithm, including lot of cache misses and probably lot of mis-preditiction that can reduce pipeline and memory efficiency, With a Quatree/2DRangetree the effective number of instructions will be lesser than 14 (you could expect as well 2-3 instructions in a time lapse of 5us, but of course no one can say that without profiling).

hendu · Post by **hendu** » Fri Nov 23, 2012 3:49 pm

You miss my point.

Yes, with a grid you can access one cell in constant time.

But if the cell sits on the search border, say, the grid cell is from 0-100 in both x and y, and I'm searching for >= 50 in both x and y.

That means I have to linearly check each item contained in that cell, to see whether it's inside or outside the range. Is this not so?

in 5 us we have
(5x10^-9)/(3,57x10^-10) = (5/3,57)x(10) = 14 instructions.

Microsecond = 10^-6.

Irrlicht Engine

2d range trees?

2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?

Re: 2d range trees?