# 2.6. algorithms¶

Some notes on algorithms, following the book Introduction to Algorithms and based on UVA's course.

## 2.6.1. asymptotics¶

• Big-O

• big-oh: O(g): functions that grow no faster than g - upper bound, runs in time less than g

• $$f(n) \leq c\cdot g(n)$$ for some c, large n

• set of functions s.t. there exists c,k>0, 0 ≤ f(n) ≤ c*g(n), for all n > k

• big-theta: Θ(g): functions that grow at the same rate as g

• in big-oh(g) and big-omega(g) - asymptotically tight bound

• big-omega: Ω(g): functions that grow at least as fast as g

• f(n)≥c*g(n) for some c, large n

• Example: f = 57n+3

• O(n^2) - or anything bigger

• Θ(n)

• Ω(n^.5) - or anything smaller

• input must be positive

• We always analyze the worst case run-time

• little-omega: ω(g) - functions that grow strictly faster than g

• little-o: o(g) - functions that grow strictly slower than g

• we write f(n) ∈ O(g(n)), not f(n) = O(g(n))

• They are all reflexive and transitive, but only Θ is symmetric.

• Θ defines an equivalence relation.

• add 2 functions: the growth rate will be $$O(\max(g_1(n), g_2(n)))$$ (same for sequential code)

• master (recurrence) thm: for recurrences of the form

• $$T(n) = a \cdot T(n/b) + f(n)$$

• $$c = \log_b(a)$$

• if $$f(n) = O(n^{c-\epsilon})$$ for some $$\epsilon > 0$$, then $$T(n) = \Theta(n^c)$$

• if $$f(n) = \Theta(n^c)$$, then $$T(n) = \Theta(n^c \log n)$$

• if $$f(n) = \Omega(n^{c+\epsilon})$$ (and a regularity condition holds), then $$T(n) = \Theta(f(n))$$

• over a bounded number of elements, almost everything is constant time
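As a worked example, merge sort's recurrence fits this form:

$$T(n) = 2T(n/2) + n, \qquad a = b = 2, \quad c = \log_2 2 = 1$$

Here $$f(n) = n = \Theta(n^c)$$, the "same rate" case, so the solution picks up a log factor: $$T(n) = \Theta(n \log n)$$.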

## 2.6.2. recursion¶

• moving down/right on an NxN grid - each path has length (N-1)+(N-1)

• we must move right N-1 times

• ans = $$\binom{2N-2}{N-1}$$ - choose which of the 2N-2 moves are the N-1 right-moves
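The count above can be checked directly; a minimal Python sketch using the standard-library `math.comb`:

```python
from math import comb

def grid_paths(n):
    """Paths from top-left to bottom-right of an n x n grid, moving only
    down or right: choose which of the 2(n-1) moves are the n-1 rights."""
    return comb(2 * (n - 1), n - 1)
```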

• for recursion in Java, a list used by a static recursive method must itself be declared static (or better, passed as a parameter)

• generate permutations - recursive, add char at each spot

• think hard about the base case before starting

• look for lengths that you know

• look for symmetry

• n-queens - one array of length n, go row by row
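The one-array n-queens idea can be sketched as a backtracking counter (my own illustration, counting solutions rather than printing them):

```python
def n_queens(n):
    col = [0] * n          # col[r] = column of the queen placed in row r
    def safe(r, c):
        # no earlier queen shares this column or a diagonal
        return all(col[p] != c and abs(col[p] - c) != r - p for p in range(r))
    def place(r):
        if r == n:
            return 1       # all rows filled: one valid arrangement
        count = 0
        for c in range(n):
            if safe(r, c):
                col[r] = c
                count += place(r + 1)
        return count
    return place(0)
```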

## 2.6.3. dynamic programming¶

    //returns max value for knapsack of capacity W, weights wt, values val
    int knapSack(int W, int[] wt, int[] val) {
        int n = wt.length;
        int[][] K = new int[n + 1][W + 1];
        // build table K[][] in bottom-up manner
        for (int i = 0; i <= n; i++) {
            for (int w = 0; w <= W; w++) {
                if (i == 0 || w == 0)        // base case: no items or no capacity
                    K[i][w] = 0;
                else if (wt[i - 1] <= w)     // max of including item i, not including
                    K[i][w] = Math.max(val[i - 1] + K[i - 1][w - wt[i - 1]], K[i - 1][w]);
                else                         // weight too large
                    K[i][w] = K[i - 1][w];
            }
        }
        return K[n][W];
    }


## 2.6.5. hungarian¶

• assign N things to N targets, each assignment with an associated cost, minimizing the total cost

## 2.6.6. max-flow¶

• A list of pipes is given, with different flow-capacities. These pipes are connected at their endpoints. What is the maximum amount of water that you can route from a given starting point to a given ending point?
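One standard answer is Ford-Fulkerson with BFS augmenting paths (Edmonds-Karp); a minimal sketch, where the nested-dict capacity representation is my own choice:

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly augment along shortest (BFS) paths.
    cap[u][v] = capacity of the pipe from u to v."""
    res = defaultdict(lambda: defaultdict(int))  # residual capacities
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    flow = 0
    while True:
        # BFS for an augmenting path s -> t in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in res[u]:
                if v not in parent and res[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow                       # no augmenting path left
        path = []
        v = t
        while parent[v] is not None:          # recover the path edges
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:                     # push flow, grow reverse edges
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck
```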

## 2.6.7. sorting¶

• you can assume w.l.o.g. all input numbers are unique

• comparison sorting requires Ω(n log n) comparisons (proof w/ decision tree)

• considerations: worst case, average, in practice, input distribution, stability (order coming in is preserved for things with same keys), in-situ (in-place), stack depth, having to read/write to disk (disk is much slower), parallelizable, online (more data coming in)

• adaptive - changes its behavior based on the input (ex. bubble sort will stop early on sorted input)

### 2.6.7.1. comparison-based¶

#### 2.6.7.1.1. bubble sort¶

• keep swapping adjacent pairs

    repeat
        for i=1:n-1
            if a[i+1] < a[i]
                swap(a, i, i+1)
    until no swaps were made in the pass

• have a flag that tells if you did no swaps - done

• number of passes ~ how far elements are from their final positions

• O(n^2)

#### 2.6.7.1.2. odd-even sort¶

• swap even pairs

• then swap odd pairs

• parallelizable

#### 2.6.7.1.3. selection sort¶

• move the largest remaining element to the current position:

    for i = n:-1:2
        jmax = 1
        for j = 2:i
            if a[j] > a[jmax]
                jmax = j
        swap(a, i, jmax)

• O(n^2)

#### 2.6.7.1.4. insertion sort¶

• insert each item into the sorted prefix:

    for i = 2:n
        insert a[i] into sorted a[1..(i-1)], shifting larger items right

• O(n^2), O(nk) where k is max dist from final position

• best when almost sorted
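A minimal Python sketch of the shifting idea:

```python
def insertion_sort(a):
    """Sort a in place: grow a sorted prefix, shifting larger items right."""
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]   # shift right to make room
            j -= 1
        a[j + 1] = x
    return a
```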

#### 2.6.7.1.5. heap sort¶

• insert everything into heap

• keep removing the max

• can do in place by storing everything in array

• can use any height-balanced tree instead of heap

• traverse tree to get order

• ex. B-tree: multi-rotations occur infrequently, average O(log n) height

• O(n log n)

#### 2.6.7.1.6. smooth sort¶

• adaptive heapsort

• collection of heaps (each one is a factor larger than the one before)

• can add and remove in essentially constant time if data is in order

#### 2.6.7.1.7. merge sort¶

• split into smaller arrays, sort, merge

• T(n) = 2T(n/2) + n = O(n log n)

• stable, parallelizable (if parallel, not in place)

#### 2.6.7.1.8. quicksort¶

• split on pivot, put smaller elements on left, larger on right

• O(n log n) average, O(n^2) worst

• O(log n) space

#### 2.6.7.1.9. shell sort¶

• generalize insertion sort

• insertion-sort all items i apart where i starts big and then becomes small

• sorted after last pass (i=1)

• O(n^2), O(n^(3/2)), … - complexity depends on the gap sequence and is not fully understood

• no matter what gap sequence is used, it must be more than n log n

• not used much in practice

### 2.6.7.2. not comparison-based¶

#### 2.6.7.2.1. counting sort¶

• use values as array indices in a new array

• keep count of number of times at each index

• for specialized data only, need small values

• O(n) time, O(k) space
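A minimal sketch, assuming non-negative integer keys below a known bound k:

```python
def counting_sort(a, k):
    """Sort non-negative ints in range [0, k) by counting occurrences."""
    count = [0] * k
    for x in a:
        count[x] += 1          # tally each value
    out = []
    for v in range(k):
        out.extend([v] * count[v])   # emit values in order
    return out
```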

#### 2.6.7.2.2. bucket sort¶

• spread data into buckets based on value

• sort the buckets

• O(n+k) time

• buckets could be trees

#### 2.6.7.2.3. radix sort¶

• sort each digit in turn

• stable sort on each digit

• like bucket sort d times

• O(d*n) time, O(k+n) space

#### 2.6.7.2.4. meta sort¶

• like quicksort, but O(n log n) worst case

• run quicksort, mergesort in parallel

• stop when one stops

• there is an overhead but doesn’t affect big-oh analysis

• average, worst case = O(n log n)

### 2.6.7.3. sorting overview¶

• in exceptional cases insertion-sort or radix-sort are much better than the generic QuickSort / MergeSort / HeapSort answers.

• merge a and b sorted - start from the back
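The backwards-merge trick: when a has empty slots at its end, merging from the back never overwrites unread elements. A minimal sketch (the signature is my own):

```python
def merge_into(a, m, b):
    """a holds m sorted values followed by len(b) empty slots;
    merge sorted b into a from the back so nothing is overwritten."""
    i, j, k = m - 1, len(b) - 1, m + len(b) - 1
    while j >= 0:
        if i >= 0 and a[i] > b[j]:
            a[k] = a[i]     # largest remaining value goes to the back
            i -= 1
        else:
            a[k] = b[j]
            j -= 1
        k -= 1
    return a
```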

## 2.6.8. searching¶

• binary search can’t do better than linear if there are duplicates

• if data is too large, we need to do external sort (sort parts of it and write them back to file)

• write binary search recursively

• maintain the invariants a[low] <= val and a[high] >= val so you get correct bounds

• binary search with empty strings - make sure that there is an element at the end of it

• "a".compareTo("b") is -1

• we always round up for these
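A recursive binary search along the lines suggested above (bounds handling here is one common variant, not necessarily the course's):

```python
def binary_search(a, val, low=0, high=None):
    """Return an index of val in sorted a, or -1 if absent; recursive form."""
    if high is None:
        high = len(a) - 1
    if low > high:
        return -1               # empty range: not found
    mid = (low + high) // 2
    if a[mid] == val:
        return mid
    if a[mid] < val:
        return binary_search(a, val, mid + 1, high)
    return binary_search(a, val, low, mid - 1)
```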

• finding minimum is Ω(n)

• pf: assume an element was ignored, that element could have been minimum

• simple algorithm - keep track of best so far

• thm: n/2 comparisons are necessary because each comparison involves 2 elements

• thm: n-1 comparisons are necessary - need to keep track of knowledge gained

• every non-min element must win at least once (move from unknown to known)

• find min and max

• naive solution has 2n-2 comparisons

• pairwise compare all elements, array of maxes, array of mins = n/2 comparisons

• check min array, max array = 2* (n/2-1)

• 3n/2-2 comparisons are sufficient (and necessary)

• pf: 4 categories (not tested, only won, only lost, both)

• not tested -> won or lost = n/2 comparisons

• w or l -> both = n/2-1

• therefore 3n/2-2 comparisons necessary
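One way to implement the pairwise scheme (3 comparisons per pair of elements, ~3n/2 total):

```python
def min_and_max(a):
    """Find (min, max) with ~3n/2 comparisons via pairwise grouping."""
    n = len(a)
    if n % 2:                  # odd length: seed with a single element
        lo = hi = a[0]
        i = 1
    else:                      # even length: seed with the first pair
        lo, hi = (a[0], a[1]) if a[0] <= a[1] else (a[1], a[0])
        i = 2
    while i < n:
        x, y = a[i], a[i + 1]
        small, big = (x, y) if x <= y else (y, x)  # 1 comparison per pair
        if small < lo:                             # min only vs pair-loser
            lo = small
        if big > hi:                               # max only vs pair-winner
            hi = big
        i += 2
    return lo, hi
```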

• find max and next-to-max

• thm: n-2 + log(n) comparisons are sufficient

• consider elimination tournament, pairwise compare elements repeatedly

• 2nd best must have played best at some point - look for it in log(n)

• selection - find ith largest integer

• repeatedly finding median finds ith largest

• finding the median in linear time yields the ith largest in linear time

• T(n) = T(n/2) + M(n) where M(n) is time to find median

• quickselect - partition around pivot and recur

• average time linear, worst case O(n^2)

• median in linear time - quickly eliminate a constant fraction and repeat

• partition into n/5 groups of 5

• sort each group high to low

• find median of each group

• compute median of medians recursively

• move groups with larger medians to right

• move groups with smaller medians to left

• now we know 3/10 of elements larger than median of medians

• 3/10 of elements smaller than median of medians

• partition all elements around median of medians

• recur like quickselect

• guarantees each partition contains at most 7n/10 elements

• T(n) = T(n/5) + T(7n/10) + O(n); since T is superadditive (f(x+y) ≥ f(x)+f(y)), the two recursive terms combine: T(n/5) + T(7n/10) ≤ T(9n/10)

• so T(n) ≤ T(9n/10) + O(n) => T(n) = O(n)
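A minimal quickselect sketch (randomized pivot; this list-partitioning version uses O(n) extra space for clarity rather than partitioning in place):

```python
import random

def quickselect(a, k):
    """Return the k-th smallest (0-indexed) element; average O(n) time."""
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    if k < len(less):
        return quickselect(less, k)            # answer is left of the pivot
    if k < len(less) + len(equal):
        return pivot                           # answer is the pivot itself
    greater = [x for x in a if x > pivot]
    return quickselect(greater, k - len(less) - len(equal))
```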

## 2.6.9. computational geometry¶

• range queries

• input = n points (vectors) with preprocessing

• output - number of points within any query rectangle

• 1D

• range query is a pair of binary searches

• O(log n) time per query

• O(n) space, O(n log n) preprocessing time

• 2D

• subtract out rectangles you don’t want

• add back things you double subtracted

• we want rectangles anchored at origin

• nD

• make regions by making a grid that includes all points

• precompute southwest counts for all regions - different ways to do this - tradeoffs between space and time

• O(log n) time per query (after precomputing) - binary search x,y

• polygon-point intersection

• polygon - a closed sequence of segments

• simple polygon - has no self-intersections

• thm (Jordan) - a simple polygon partitions the plane into 3 regions: interior, exterior, boundary

• convex polygon - intersection of half-planes

• polytope - higher-dimensional polygon

• raycasting

• count ray-boundary intersections: odd -> interior, even -> exterior

• check for tangent lines, intersecting corners

• O(n) time per query, O(1) space and preprocessing time
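The raycasting test can be sketched as follows (horizontal ray to the right; the degenerate cases noted above, tangent edges and intersected corners, are not handled):

```python
def point_in_polygon(pt, poly):
    """Ray casting: count crossings of a horizontal ray from pt to the right.
    poly is a list of (x, y) vertices of a simple polygon."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        # edge crosses the horizontal line at height y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:        # crossing lies to the right of the point
                inside = not inside
    return inside
```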

• convex case

• preprocessing

• find an interior point p (pick a vertex or average the vertices)

• partition into wedges (slicing through vertices) w.r.t. p

• sort wedges by polar angle

• query

• find containing wedge (look up by angle)

• test interior / exterior

• check triangle - cast ray from p to point, see if it crosses edge

• O(log n) time per query (we binary search the wedges)

• O(n) space and O(n log n) preprocessing time

• non-convex case

• preprocessing

• sort vertices by x

• find vertical slices

• partition into trapezoids (triangle is trapezoid)

• sort slice trapezoids by y

• query

• find containing slice

• find trapezoid in slice

• report interior/ exterior

• O(log n) time per query (two binary searches)

• O(n^2) space and O(n^2) preprocessing time

• convex hull

• input: set of n points

• output: smallest containing convex polygon

• simple solution 1 - Jarvis’s march

• simple solution 2 - Graham’s scan

• mergehull

• partition into two sets - compute the MergeHull of each set

• merge the two resulting CHs

• pick point p with least x

• form angle-monotone chains w.r.t p

• merge chains into angle-sorted list

• run Graham’s scan to form CH

• T(n) = 2T(n/2) + n = O(n log n)

• generalizes to higher dimensions

• parallelizes

• quickhull (like quicksort)

• find right and left-most points

• partition points along this line

• find points farthest from line - make quadrilateral

• eliminate all internal points

• recurse on 4 remaining regions

• concatenate resulting CHs

• O(n log n) expected time

• O(n^2) worst-case time - ex. circle

• generalizes to higher dim, parallelizes

• lower bound - CH requires Ω(n log n) comparisons

• pf - reduce sorting to convex hull

• consider arbitrary set of x_i to be sorted

• raise the x_i to the parabola (x_i, x_i^2) - could be any convex function

• compute the convex hull of the lifted points - all of them lie on the hull, connected in sorted x-order

• from the convex hull we can read off the sorted x_i => the convex hull did the sorting, so it needs at least n log n comparisons

• corollary - Graham’s scan is optimal

• Chan’s convex hull algorithm

• assume we know CH size m=h

• partition points into n/m sets of m each

• convex polygon diameter

• Voronoi diagrams - input n points - takes O(n log n) time to compute

• problems that are solved

• Voronoi cell - the set of points closer to any given point than all others form a convex polygon

• generalizes to other metrics (not just Euclidean distance)

• a Voronoi cell is unbounded if and only if its point is on the convex hull

• corollary - convex hull can be computed in linear time

• Voronoi diagram has at most 2n-5 vertices and 3n-6 edges

• every nearest neighbor of a point defines an edge of the Voronoi diagram

• corollary - all nearest neighbors can be computed from the Voronoi diagram in linear time

• corollary - nearest neighbor search in O(log n) time using planar subdivision search (binary search in 2D)

• connecting the points of neighboring Voronoi cells forms a triangulation (Delaunay triangulation)

• a Delaunay triangulation maximizes the minimum angle over all triangulations - no long slivery triangles

• the Euclidean minimum spanning tree is a subset of the Delaunay triangulation (so it can be computed easily)

• calculating Voronoi diagram

• discrete case / bitmap - expand breadth-first waves from all points

• time is O(bitmap size)

• time is independent of #points

• intersecting half planes

• Voronoi cell of a point is intersection of all half-planes induced by the perpendicular bisectors w.r.t all other points

• use intersection of convex polygons to intersect half-planes (n log n time per cell)

• can be computed in n log n total time

1. idea: divide and conquer

• merging is complex

2. sweep line using parabolas (Fortune’s algorithm)