# 2.6. algorithmsÂ¶

Some notes on algorithms following the book Introduction to algorithms and based on UVAâs course.

## 2.6.1. asymptoticsÂ¶

Big-O

big-oh: O(g): functions that grow no faster than g - upper bound, runs in time less than g

\(f(n) \leq c\cdot g(n)â\) for some c, large n

set of functions s.t. there exists c,k>0, 0 â€ f(n) â€ c*g(n), for all n > k

big-theta: Î(g): functions that grow at the same rate as g

big-oh(g) and big-theta(g) - asymptotic tight bound

big-omega: Î©(g): functions that grow at least as fast as g

f(n)â„c*g(n) for some c, large n

Example: f = 57n+3

O(n^2) - or anything bigger

Î(n)

Î©(n^.5) - or anything smaller

input must be positive

We always analyze the worst case run-time

little-omega: omega(g) - functions that grow faster than g

little-o: o(g) - functions that grow slower than g

we write f(n) â O(g(n)), not f(n) = O(g(n))

They are all reflexive and transitive, but only Î is symmetric.

Î defines an equivalence relation.

add 2 functions, growth rate will be \(O(max(g_1(n)+g_2(n))\) (same for sequential code)

recurrence thm:\( f(n) = O(n^c) => T(n) = 0(n^c)\)

\(T(n) = a*T(n/b) + f(n)\)

\(c = log_b(a)\)

over bounded number of elements, almost everything is constant time

## 2.6.2. recursionÂ¶

moving down/right on an NxN grid - each path has length (N-1)+(N-1)

we must move right N-1 times

ans = (N-1+N-1 choose N-1)

for recursion, if a list is declared outside static recursive method, it shouldnât be static

*generate permutations*- recursive, add char at each spotthink hard about the base case before starting

look for lengths that you know

look for symmetry

n-queens - one array of length n, go row by row

## 2.6.3. dynamic programmingÂ¶

```
//returns max value for knapsack of capacity W, weights wt, vals val
int knapSack(int W, int wt[], int val[])
int n = wt.length;
int K[n+1][W+1];
//build table K[][] in bottom up manner
for (int i = 0; i <= n; i++)
for (int w = 0; w <= W; w++)
if $(i==0 \vert \vert w==0)$ // base case
K[i][w] = 0;
else if (wt[i-1] <= w) //max of including weight, not including
K[i][w] = max(val[i-1] + K[i-1][w-wt[i-1]], K[i-1][w]);
else //weight too large
K[i][w] = K[i-1][w];
return K[n][W];
```

## 2.6.4. min-cut / max-cutÂ¶

## 2.6.5. hungarianÂ¶

assign N things to N targets, each with an associated cost

## 2.6.6. max-flowÂ¶

A list of pipes is given, with different flow-capacities. These pipes are connected at their endpoints. What is the maximum amount of water that you can route from a given starting point to a given ending point?

## 2.6.7. sortingÂ¶

you can assume w.l.o.g. all input numbers are unique

sorting requires Î©(nlog n) (proof w/ tree)

considerations: worst case, average, in practice, input distribution, stability (order coming in is preserved for things with same keys), in-situ (in-place), stack depth, having to read/write to disk (disk is much slower), parallelizable, online (more data coming in)

adaptive - changes its behavior based on input (ex. bubble sort will stop)

### 2.6.7.1. comparised-basedÂ¶

#### 2.6.7.1.1. bubble sortÂ¶

keep swapping adjacent pairs

```
for i=1:n-1
if a[i+1]<a[i]
swap(a,i,i+1)
```

have a flag that tells if you did no swaps - done

number of passes ~ how far elements are from their final positions

O(n^2)

#### 2.6.7.1.2. odd-even sortÂ¶

swap even pairs

then swap odd pairs

parallelizable

#### 2.6.7.1.3. selection sortÂ¶

move largest to current position for i=1:n-1 for j=1:n x = max(x,a[j]) jmax = j swap(a,i,j)

0(n^2)

#### 2.6.7.1.4. insertion sortÂ¶

insert each item into lists for i=2:n insert a[i] into a[1..(i-1)] shift

O(n^2), O(nk) where k is max dist from final position

best when alsmost sorted

#### 2.6.7.1.5. heap sortÂ¶

insert everything into heap

kepp remove max

can do in place by storing everything in array

can use any height-balanced tree instead of heap

traverse tree to get order

ex. B-tree: multi-rotations occur infrequently, average O(log n) height

0(n log n)

#### 2.6.7.1.6. smooth sortÂ¶

adaptive heapsort

collection of heaps (each one is a factor larger than the one before)

can add and remove in essentially constant time if data is in order

#### 2.6.7.1.7. merge sortÂ¶

split into smaller arrays, sort, merge

T(n) = 2T(n/2) + n = 0(n log n)

stable, parallelizable (if parallel, not in place)

#### 2.6.7.1.8. quicksortÂ¶

split on pivot, put smaller elements on left, larger on right

O(n log n) average, O(n^2) worst

O(log n) space

#### 2.6.7.1.9. shell sortÂ¶

generalize insertion sort

insertion-sort all items i apart where i starts big and then becomes small

sorted after last pass (i=1)

O(n^2), O(n^(3/2), âŠ unsure what complexity is

no matter what must be more than n log n

not used much in practice

### 2.6.7.2. not comparison-basedÂ¶

#### 2.6.7.2.1. counting sortÂ¶

use values as array indices in new sort

keep count of number of times at each index

for specialized data only, need small values

0(n) time, 0(k) space

#### 2.6.7.2.2. bucket sortÂ¶

spread data into buckets based on value

sort the buckets

O(n+k) time

buckets could be trees

#### 2.6.7.2.3. radix sortÂ¶

sort each digit in turn

stable sort on each digit

like bucket sort d times

0(d*n time), 0(k+n) space

#### 2.6.7.2.4. meta sortÂ¶

like quicksort, but 0(nlogn) worst case

run quicksort, mergesort in parallel

stop when one stops

there is an overhead but doesnât affect big-oh analysis

ave, worst-cast = O(n log n)

### 2.6.7.3. sorting overviewÂ¶

in exceptional cases insertion-sort or radix-sort are much better than the generic QuickSort / MergeSort / HeapSort answers.

merge a and b sorted - start from the back

## 2.6.8. searchingÂ¶

binary sort canât do better than linear if there are duplicates

if data is too large, we need to do external sort (sort parts of it and write them back to file)

write binary search recursively

use low<= val and high >=val so you get correct bounds

binary search with empty strings - make sure that there is an element at the end of it

âaâ.compareTo(âbâ) is -1

we always round up for these

finding minimum is Î©(n)

pf: assume an element was ignored, that element could have been minimum

simple algorithm - keep track of best so far

thm: n/2 comparisons are necessary because each comparison involves 2 elements

thm: n-1 comparisons are necessary - need to keep track of knowledge gained

every non-min element must win atleast once (move from unkown to known)

find min and max

naive solution has 2n-2 comparison

pairwise compare all elements, array of maxes, array of mins = n/2 comparisons

check min array, max array = 2* (n/2-1)

3n/2-2 comparisons are sufficient (and necessary)

pf: 4 categories (not tested, only won, only lost, both)

not tested-> w or l =n/2 comparisons

w or l -> both = n/2-1

therefore 3n/2-2 comparisons necessary

find max and next-to-max

thm: n-2 + log(n) comparisons are sufficient

consider elimination tournament, pairwise compare elements repeatedly

2nd best must have played best at some point - look for it in log(n)

selection - find ith largest integer

repeatedly finding median finds ith largest

finding median linear yields ith largest linear

T(n) = T(n/2) + M(n) where M(n) is time to find median

quickselect - partition around pivot and recur

average time linear, worst case O(n^2)

median in linear time - quickly eliminate a constant fraction and repeat

partition into n/5 groups of 5

sort each group high to low

find median of each group

compute median of medians recursively

move groups with larger medians to right

move groups with smaller medians to left

now we know 3/10 of elements larger than median of medians

3/10 of elements smaller than median of medians

partition all elements around median of medians

recur like quickselect

guarantees each partition contains at most 7n/10 elements

T(n) = T(n/5) + T(7n/10) + O(n) -> f(x+y)â„f(x)+f(y)

T(n) â€ T(9n/10) + O(n) -> this had to be less than T(n)

## 2.6.9. computational geometryÂ¶

range queries

input = n points (vectors) with preprocessing

output - number of points within any query rectangle

1D

range query is a pair of binary searches

O(log n) time per query

O(n) space, O(n log n) preprocessing time

2D

subtract out rectangles you donât want

add back things you double subtracted

we want rectangles anchored at origin

nD

make regions by making a grid that includes all points

precompute southwest counts for all regions - different ways to do this - tradeoffs between space and time

O(log n) time per query (after precomputing) - binary search x,y

polygon-point intersection

polygon - a closed sequence of segments

simple polygon - has no intersections

thm (Jordan) - a simple polygon partitions the plane into 3 regions: interior, exterior, boundary

convex polygon - intersection of half-planes

polytope - higher-dimensional polygon

raycasting

intersections = odd - interior, even - exterior

check for tangent lines, intersecting corners

O(n) time per query, O(1) space and time

convex case

preprocessing

find an interior point p (pick a vertext or average the vertices)

partition into wedges (slicing through vertices) w.r.t. p

sort wedges by polar angle

query

find containing wedge (look up by angle)

test interior / exterior

check triangle - cast ray from p to point, see if it crosses edge

O(log n) time per query (we binary search the wedges)

O(n) space and O(n log n) preprocessing time

non-convex case

preprocessing

sort vertices by x

find vertical slices

partition into trapezoids (triangle is trapezoid)

sort slice trapezoids by y

query

find containing slice

find trapezoid in slice

report interior/ exterior

O(log n) time per query (two binary searches)

O(n^2) space and O(n^2) preprocessing time

convex hull

input: set of n points

output: smallest containing convex polygon

simple solution 1 - Jarvisâs march

simple solution 2 - Grahamâs scan

mergehull

partition into two sets - computer MergeHull of each set

merge the two resulting CHS

pick point p with least x

form angle-monotone chains w.r.t p

merge chains into angle-sorted list

run Grahamâs scan to form CH

T(n) = 2T(n/2) + n = 0(n log n)

generalizes to higher dimensions

parallelizes

quickhull (like quicksort)

find right and left-most points

partition points along this line

find points farthest from line - make quadrilateral

eliminate all internal points

recurse on 4 remaining regions

concatenate resulting CHs

O(n log n) expected time

O(n^2) worst-case time - ex. circle

generalizes to higher dim, parallelizes

lower bound - CH requires Î©(n log n) comparisons

pf - reduce sorting to convex hull

consider arbitrary set of x_i to be sorted

raise the x_i to the parabola (x_i,x_i^2) - could be any concave function

compute convex hull of the parabola - all connected and line on top

from convex hull we can get sorted x_i => convex hull did sorting so at least n log n comparisons

corollary - Grahamâs scan is optimal

Chanâs convex hull algorithm

assume we know CH size m=h

partitoin points into n/m sets of m each

convex polygon diameter

Voronoi diagrams - input n points - takes O(nlogn) time to compute

problems that are solved

Voronoi cell - the set of points closer to any given point than all others form a convex polygon

generalizes to other metrics (not just Euclidean distance)

a Voronoi cell is unbounded if and only if itâs point is on the convex hull

corollary - convex hull can be computed in linear time

Voronoi diagram has at most 2n-5 vertices and 3n-6 edges

every nearest neighbor of a point defines an edge of the Voronoi diagram

corollary - all nearest neighbors can be computed from the Voronoi diagram in linear time

corollary - nearest neighbor search in O(log n) time using planar subdivision search (binary search in 2D)

connection points of neighboring Voronoi diagram cells form a triangulation (Delanuay triangulation)

a Delanuay triangulation maximizes the minimum angle over all triangulations - no long slivery triangles

Euclidean minimum spanning tree is a subset of the Delanuay triangulation (can be computed easily)

calculating Voronoi diagram

discrete case / bitmap - expand breadth-first waves from all points

time is O(bitmap size)

time is independent of #points

intersecting half planes

Voronoi cell of a point is intersection of all half-planes induced by the perpendicular bisectors w.r.t all other points

use intersection of convex polygons to intersect half-planes (nlogn time per cell)

can be computed in nlogn total time

idea divide and conquer

merging is complex

sweep line using parabolas