INF102 notes
March 01, 2019
Mathias Bøe
BIG DISCLAIMER: There is a PDF of these notes that is actually correct and proper; right now this is just lying here as a template, so if you want to contribute, please do so!
Todos
Runtime analysis (Pretty important, but up there)
Theoretical
Empirical (Not that important)
Collections (Really important to understand)
LinkedList
Dynamically resizeable arrays
Sorting
Merge (important)
Quick (important)
Heap (important)
Shell
Insertion
Selection
Union find
MST (Minimum-spanning-tree)
Indexed priority queue
Symbol tables
BSTs (important)
2-3 trees
Red-black trees
Hashing (important)
Graphs (important)
Exploration (important)
DFS (important)
BFS (important)
- Shortest Path (important)
Dijkstra (important)
Bellman-Ford
Floyd-Warshall
- Connected components / Strongly connected components
Topological sort
Cycle detection / back edge etc
Notes
Algorithms
Dijkstra
Purpose: tells you the shortest distance from one node to every other node.
Weighted directed graph.
Start in one node. What can you reach from this node?
Time complexity: O(|E| + |V| log |V|)
Searches the cheapest "roads" first; this way you can have the answer for a node before you have searched every node.
A problem with Dijkstra shows up where direction matters, for example in Google Maps: if you follow Dijkstra blindly you might end up travelling in the wrong direction for several minutes before the algorithm realizes it is going the wrong way, because it only looks at accumulated edge weights and has no sense of where the target lies.
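The idea above can be sketched in Java with `java.util.PriorityQueue`. This is a minimal sketch, not the course's own code: the class and method names are mine, and the graph representation (adjacency lists of `{neighbour, weight}` pairs) is an assumption.

```java
import java.util.*;

// A minimal Dijkstra sketch (names are my own, not from the course code).
// adj.get(u) holds int[]{v, weight} pairs for each edge u -> v.
class DijkstraSketch {
    static int[] shortestDistances(List<List<int[]>> adj, int source) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE); // "infinity" = unreachable
        dist[source] = 0;
        // The priority queue orders nodes by tentative distance,
        // so the cheapest "road" is always explored first.
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt((int[] a) -> a[1]));
        pq.add(new int[]{source, 0});
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            int u = cur[0];
            if (cur[1] > dist[u]) continue; // stale queue entry, skip it
            for (int[] edge : adj.get(u)) {
                int v = edge[0], w = edge[1];
                if (dist[u] + w < dist[v]) { // found a shorter road to v
                    dist[v] = dist[u] + w;
                    pq.add(new int[]{v, dist[v]});
                }
            }
        }
        return dist;
    }
}
```

Because stale queue entries are skipped instead of being re-prioritized, this is the "lazy" variant; it is simpler than an indexed priority queue but does the same job for these notes.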
Sorting
What is Big Oh?
Merge sort
Usually done recursively; a divide-and-conquer algorithm.
[2 , 8 , 5 , 3 , 9 , 4 , 1 , 7]
[2 , 8 , 5 , 3] [9 , 4 , 1 , 7]
[2 , 8] [5 , 3] [9 , 4] [1 , 7]
[2 , 8] [3 , 5] [4 , 9] [1 , 7]
[2 , 3 , 5 , 8] [1 , 4 , 7 , 9]
[1 , 2 , 3 , 4 , 5 , 7 , 8 , 9]
Time complexity: O(n log n). Each merge level has to visit all n items, and there are log n levels, from the maximum height of the recursion tree.
Pseudo code:
mergesort(array a)
n = length of a
if (n == 1)
return a
arrayOne = a[0] ... a[n/2 - 1];
arrayTwo = a[n/2] ... a[n - 1];
arrayOne = mergesort(arrayOne);
arrayTwo = mergesort(arrayTwo);
return merge(arrayOne, arrayTwo);
merge(array a, array b)
array c;
while(a and b have elements)
if(a[0]>b[0])
add b[0] to the end of c;
remove b[0] from b;
else
add a[0] to the end of c;
remove a[0] from a;
// At this point either a or b is empty
while(a has elements)
add a[0] to the end of c;
remove a[0] from a;
while(b has elements)
add b[0] to the end of c;
remove b[0] from b;
return c;
Quick sort
A recursive algorithm; quick sort revolves around a pivot element which, after partitioning, satisfies:
- The pivot is in its correct position in the final, sorted array.
- Items to the left of the pivot are smaller.
- Items to the right are larger.
[2 , 6 , 5 , 3 , 8 , 7 , 1 , 0]
Choose a pivot; we use 3 in this example.
1.
Move the pivot to the right (swap it with the last element).
[2 , 6 , 5 , 0 , 8 , 7 , 1 , 3]
Find the item with the smallest index that is larger than the pivot (item from left).
Find the item with the largest index that is smaller than the pivot (item from right).
In this case the item from the left is 6 and the item from the right is 1. Swap them.
[2 , 1 , 5 , 0 , 8 , 7 , 6 , 3]
repeat
[2 , 1 , 0 , 5 , 8 , 7 , 6 , 3]
When the index of the item from the left is greater than the index of the item from the right, swap the item from the left with the pivot.
[2 , 1 , 0 , 3 , 8 , 7 , 6 , 5]
Now pivot will be in the right spot.
Repeat this whole process recursively on each side of the pivot, with a new pivot each time. (We use 7 for the right half in this example.)
[2 , 1 , 0 , 3 , 6 , 5 , 7 , 8]
After recursion this will end up as a sorted array.
[0 , 1 , 2 , 3 , 5 , 6 , 7 , 8]
How do we choose the pivot? A popular method is called median of three. It looks at the first, last and middle element of the array and chooses the middle value of those three, the logic being that this value will be close to the median value of the array.
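The partitioning and median-of-three steps described above can be sketched in Java. This is my own sketch (class and method names are mine), using the common variant that parks the median pivot at the right end before partitioning:

```java
// A quicksort sketch with median-of-three pivot selection (names are my own).
class QuickSortSketch {
    static void sort(int[] a) { sort(a, 0, a.length - 1); }

    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);  // items left of the pivot are smaller
        sort(a, p + 1, hi);  // items right of the pivot are larger
    }

    static int partition(int[] a, int lo, int hi) {
        // Median of three: sort a[lo], a[mid], a[hi] among themselves...
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < a[lo]) swap(a, mid, lo);
        if (a[hi] < a[lo]) swap(a, hi, lo);
        if (a[hi] < a[mid]) swap(a, hi, mid);
        swap(a, mid, hi); // ...then move the median (the pivot) to the right
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) swap(a, i++, j); // smaller items go left
        swap(a, i, hi); // pivot lands in its final, correct position
        return i;
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}
```

After `partition` returns, the pivot is in its correct spot and the two halves are sorted recursively, just like the worked example above.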
Heap sort
Heap: an ordered binary tree. Max heap: every parent is larger than its children.
Build-max-heap: creates a max heap from an unsorted array. Heapify: similar to build-max-heap, but assumes the subtrees below the given position are already heaps.
[2 , 8 , 5 , 3 , 9 , 1]
- Create max heap
- Remove largest item
- Place item in sorted partition
Represent the array in a tree
YOU NEED TO LOOK AT THIS SOME MORE! https://www.youtube.com/watch?v=2DmK_H7IdTo
Heap sort pseudo code:
Heapsort(A as array) {
BuildMaxHeap(A);
for i = n to 2
swap (A[1], A[i]);
n = n - 1;
Heapify(A, 1);
}
BuildMaxHeap(A as array) {
n = elements_in(A)
for i = floor (n/2) to 1
Heapify(A,i)
}
Heapify(A as array, i as int) {
left = 2i;
right = 2i+1;
if(left <= n) and (A[left] > A[i])
max = left;
else
max = i;
if(right <= n) and (A[right] > A[max])
max = right;
if(max != i) {
swap (A[i], A[max])
Heapify(A, max)
}
}
Time complexity: O(n log n). Build-max-heap: O(n). Heapify: O(log n), called n-1 times.
jepp it works
Runtime analysis
Theoretical
- Common Data Structure Operations
Average and worst case are equal in these examples (except that average time uses Θ instead of O, and except the binary search tree, whose worst case is O(n) for all operations).
Data Structure | Access | Search | Insertion | Deletion |
---|---|---|---|---|
Array | \(O(1)\) | \(O(n)\) | \(O(n)\) | \(O(n)\) |
Stack | \(O(n)\) | \(O(n)\) | \(O(1)\) | \(O(1)\) |
Queue | \(O(n)\) | \(O(n)\) | \(O(1)\) | \(O(1)\) |
Singly-Linked List | \(O(n)\) | \(O(n)\) | \(O(1)\) | \(O(1)\) |
Doubly-Linked List | \(O(n)\) | \(O(n)\) | \(O(1)\) | \(O(1)\) |
Binary Search Tree | \(O(log(n))\) | \(O(log(n))\) | \(O(log(n))\) | \(O(log(n))\) |
Red-Black Tree | \(O(log(n))\) | \(O(log(n))\) | \(O(log(n))\) | \(O(log(n))\) |
- Array Sorting Algorithms
Algorithm | Time Complexity: Best | Time Complexity: Average | Time Complexity: Worst | Space Complexity: Worst |
---|---|---|---|---|
Quicksort | \(\Omega(n log(n))\) | \(\Theta(n log(n))\) | \(O(n^2)\) | \(O(log(n))\) |
Mergesort | \(\Omega(n log(n))\) | \(\Theta(n log(n))\) | \(O(n log(n))\) | \(O(n)\) |
Heapsort | \(\Omega(n log(n))\) | \(\Theta(n log(n))\) | \(O(n log(n))\) | \(O(1)\) |
Insertion sort | \(\Omega(n)\) | \(\Theta(n^2)\) | \(O(n^2)\) | \(O(1)\) |
Selection sort | \(\Omega(n^2)\) | \(\Theta(n^2)\) | \(O(n^2)\) | \(O(1)\) |
Shell sort | \(\Omega(n log(n))\) | \(\Theta(n(log(n))^2)\) | \(O(n(log(n))^2)\) | \(O(1)\) |
Collections
You need to be aware of what the collections framework can do and, maybe even more importantly, what it cannot do.
What collection should you choose?
First: do you need a list, a set or a map?
Lists
- A list stores a sequence of objects
- Objects remain in order
- Elements are indexed via an integer
- Checking for a particular item in a list is slow
- Iterating through lists is relatively fast
- Can be sorted if you want to
// If you only want to add or remove items at the end of list, use ArrayList.
List<String> list1 = new ArrayList<String>();
// Removing or adding items elsewhere in the list?
List<String> list2 = new LinkedList<String>();
Sets
- Only store unique values
- Great for removing duplicates
- Not indexed, unlike lists
- Very fast to check if a particular object exists
- If you want to use your own objects, you must implement hashCode() and equals(). THIS IS VERY IMPORTANT, THIS NEEDS TO BE PRACTISED!
- These methods make it possible to remove duplicates; they are also how the set knows whether it already contains an equal object. It is therefore important that these methods are correct.
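The hashCode()/equals() contract mentioned above can be sketched with a small custom type. The `Person` class here is my own example, not from the course material; the key point is that equal objects must produce equal hash codes:

```java
import java.util.Objects;

// Sketch of a custom type usable in a HashSet (my own example class).
// equals() and hashCode() must agree: objects that are equal must
// return the same hash code, or the set cannot find duplicates.
class Person {
    final String name;
    final int birthYear;

    Person(String name, int birthYear) {
        this.name = name;
        this.birthYear = birthYear;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person p = (Person) o;
        return birthYear == p.birthYear && name.equals(p.name);
    }

    @Override
    public int hashCode() {
        // Objects.hash combines the same fields that equals() compares
        return Objects.hash(name, birthYear);
    }
}
```

With both methods in place, adding the "same" person twice leaves only one element in a HashSet; without them, the set would fall back to identity comparison and keep both.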
// Order is unimportant and OK if it changes?
// HashSet is not ordered
Set<String> set1 = new HashSet<String>();
// Sorted in natural order? Use TreeSet - must implement Comparable for custom types
// (1,2,3,...,a,b,c,...,etc.)
Set<String> set2 = new TreeSet<String>();
// Elements remain in order they were added
Set<String> set3 = new LinkedHashSet<String>();
Maps
- Key value pairs
- Like lookup tables
- Retrieving a value by key is fast
- Iterating over maps is (very) slow
- Maps are not optimised for iteration
- If you want to use your own objects as keys, you must implement hashCode() and equals().
// Keys not in any particular order, and order liable to change.
Map<String, String> map1 = new HashMap<String, String>();
// Keys sorted in natural order - must implement Comparable for custom types
Map<String, String> map2 = new TreeMap<String, String>();
// Keys remain in order added
Map<String, String> map3 = new LinkedHashMap<String, String>();
// There are also the SortedSet and SortedMap interfaces.
Minimum-spanning-tree
ONLY USED IN UNDIRECTED GRAPHS
MST algorithms are greedy: they repeatedly make the locally best choice/decision, ignoring the effect on the future.
Tree = connected acyclic graph.
Spanning Tree of G = subset of edges of G that form a tree & hit all vertices of G.
MST: Given a graph G(V,E) and edge weights w: E → R, find a spanning tree T of minimum total weight \(w(T) = \sum_{e \in T} w(e)\)
Greedy algorithm properties:
- Optimal substructure: Optimal solution to problem incorporates optimal solutions to subproblem(s).
- Greedy choice property: Locally optimal choices lead to globally optimal solution.
- Optimal substructure for MST:
if e = {u,v} is an edge of some MST
- contract e:
merge the endpoints (u & v)
https://www.youtube.com/watch?v=tKwnms5iRBU
17:00
(HOOOOLY FOOOK THIS IS SOME HARD STOOF)
if T′ is an MST of G/e (G with e contracted)
then T′ ∪ {e} is an MST of G
Dynamic program
- Guess edge e in an MST
- Contract e
- Recurse
- Decontract
- Add e to MST
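The notes state the greedy properties but don't spell out a concrete MST algorithm. As one example of "repeatedly make the locally best choice", here is a hedged sketch of lazy Prim's algorithm in Java; the class name, method name, and adjacency-list representation are all my own assumptions:

```java
import java.util.*;

// A lazy Prim's MST sketch for an undirected weighted graph (my own naming).
// adj.get(u) holds int[]{v, weight} pairs; both directions of every edge
// must be present, since MSTs are ONLY for undirected graphs.
class PrimSketch {
    static int mstWeight(List<List<int[]>> adj) {
        int n = adj.size();
        boolean[] inTree = new boolean[n];
        // Orders candidate edges by weight: the greedy, locally best choice.
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt((int[] a) -> a[1]));
        pq.add(new int[]{0, 0}); // start from vertex 0 at cost 0
        int total = 0;
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            int u = cur[0];
            if (inTree[u]) continue; // cheapest edge counts only if it reaches a new vertex
            inTree[u] = true;
            total += cur[1];
            for (int[] e : adj.get(u))
                if (!inTree[e[0]]) pq.add(new int[]{e[0], e[1]});
        }
        return total;
    }
}
```

Each `poll()` is the locally best decision; the optimal-substructure property above is why the sum of those local choices is a globally minimum spanning tree.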
Crash course
Exam
A combination of runtime analysis, implementation / problem solving and a bit of theory. Tracing the contents of data structures after an algorithm has run. In general it will resemble the assignments. Pinar's exams (2014 and earlier) are more relevant than Marc's. Very similar to the assignments (Big O quiz, tracing, etc.).
Runtime analysis
- How many atomic operations do we have to perform?
- (Almost) always measured in the worst case, occasionally average case.
- We are interested in how quickly the algorithm scales.
- Notations, e.g. for an operation count of 3N³ + 2N + 1000:
- Total: 3N³ + 2N + 1000
- Tilde: ~3N³
- Big Oh: O(N³)
A single loop is O(N):
for (int i = 0; i < N; i++) {
stuff();
}
Nested loops:
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
stuff();
}
}
As a rule this comes down to multiplication. This algorithm has runtime (O(N^2)).
for (int i = 0; i < N; i++) {
for (int j = 0; j < i; j++) {
stuff();
}
}
0 + 1 + 2 + … + (N-1) = N(N-1)/2, which is (O(N^2))
for (int i = 0; i < N; i++) {
for (int j = i; j < N; j++) {
stuff();
}
}
(N-1) + (N-2) + … + 1 + 0 = N(N-1)/2 ≈ N²/2, which is (O(N^2))
for (int i = 1; i < N; i *= 2) {
stuff();
}
log₂ N iterations, so O(log N)
for (int i = N; i > 0; i /= 2) {
stuff();
}
log₂ N iterations, so O(log N)
for (int i = 0; i < N; i++) {
for (int j = 1; j < N; j *= 2) {
stuff();
}
}
(N * log₂ N), i.e. O(N log N)
Watch out with logarithms and exponents!
If k and N are variables that can appear in the runtime, remember that N^(k+1) is not O(N^k), and O(log n) is not O(log_k n). Change of base: log_b(x) = log(x)/log(b).
Logarithmic | \(log N\) |
Linear | \(N\) |
Linearithmic | \(N log N\) |
Quadratic | \(N^2\) |
Exponential | \(1.0005^N\) |
Given four lists of names, design a linearithmic algorithm that finds the lexicographically first name that occurs in all four. Sort the arrays separately with Arrays.sort() → 4 * (N log N)
Remove duplicates from each list → 4N. Merge the lists into one long list → 4N. Start from the top and look for 4 equal names in a row → 4N
→ 4 * (N log N) + 4N + 4N + 4N ≈ 4(N log N) = O(N log N)
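The analysis above can be turned into code. This sketch realizes the same O(N log N) idea, but instead of physically merging the lists it advances four pointers through the sorted arrays in parallel; the class and method names are mine, and this parallel-scan variant is my substitution for the merge-and-scan step:

```java
import java.util.*;

// Sketch of the linearithmic solution (names are my own): sort each list,
// then scan all four in sorted order until one name matches in every list.
class CommonNameSketch {
    static String firstCommonName(String[] a, String[] b, String[] c, String[] d) {
        String[][] lists = {a, b, c, d};
        for (String[] l : lists) Arrays.sort(l); // 4 * (N log N)
        int[] idx = new int[4];
        while (true) {
            // Find the lexicographically largest of the four current names.
            String max = null;
            for (int i = 0; i < 4; i++) {
                if (idx[i] >= lists[i].length) return null; // a list ran out
                if (max == null || lists[i][idx[i]].compareTo(max) > 0)
                    max = lists[i][idx[i]];
            }
            // Advance every list up to that candidate; if all match, done.
            boolean allEqual = true;
            for (int i = 0; i < 4; i++) {
                while (idx[i] < lists[i].length && lists[i][idx[i]].compareTo(max) < 0)
                    idx[i]++;
                if (idx[i] >= lists[i].length) return null;
                if (!lists[i][idx[i]].equals(max)) allEqual = false;
            }
            if (allEqual) return max;
        }
    }
}
```

The scan after sorting is linear, so the sort dominates and the total stays O(N log N), matching the budget worked out above.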
Binary search
A plain search through a list has linear runtime, O(N).
If the list is sorted we can exploit this by doing a binary search.
Idea: we halve the search range until we find the value.
static int binarySearch(int [] list, int valueToSearchFor) { int hi = SEE THE SLIDES FOR THE REST OF THE CODE
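Since the fragment above is cut off, here is a complete sketch of how it might continue; the signature follows the fragment, but the body is my own standard implementation and the slides' version may differ in details:

```java
// A complete binary search sketch (body is mine; only the signature is
// from the notes). Assumes list is sorted ascending.
// Returns the index of the value, or -1 if it is not present.
class BinarySearchSketch {
    static int binarySearch(int[] list, int valueToSearchFor) {
        int lo = 0;
        int hi = list.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // halve the search range each step
            if (list[mid] == valueToSearchFor) return mid;
            if (list[mid] < valueToSearchFor) lo = mid + 1; // search right half
            else hi = mid - 1;                              // search left half
        }
        return -1; // not found
    }
}
```

Each iteration discards half of the remaining range, which is where the O(log N) runtime comes from.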
HashMap
- Map from key → value
- new HashMap<K,V>, e.g. new HashMap<String, Integer>
- Represented as a table (array)
- Uses .hashCode() to find the index in the table.
NB! DO NOT USE THE CODE IN THE NEXT TWO SRC BLOCKS
public void put(K key, V value) {
int index = Math.abs(key.hashCode() % array.length);
array[index] = value;
}
public V get(K key) {
int index = Math.abs(key.hashCode() % array.length);
return array[index];
}
- Hash collisions
- Separate chaining
- Each index in the array is a (linked) list.
- Linear probing
- If the index is occupied, insert at the next free index
- On lookup, scan from the index onwards until we find the element or an empty slot
- On delete, insert a dummy value
- M >> N, where M = array.length and N is the number of elements to insert
- Space/time trade-off
- Small M → many collisions → long search time, low memory use
- Large M → few collisions → high memory use, short search time
Generally O(1) runtime
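The blocks above are marked "do not use" because they silently overwrite on collisions. Here is a sketch of put/get with separate chaining that handles collisions; the class name, `Entry` helper, and the fixed table size of 16 are my own choices for illustration:

```java
import java.util.LinkedList;

// Sketch of put/get with separate chaining (my own naming, fixed size 16):
// each slot in the table holds a (linked) list of key-value pairs,
// so colliding keys no longer overwrite each other.
class ChainedHashMapSketch<K, V> {
    static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    @SuppressWarnings("unchecked")
    private final LinkedList<Entry<K, V>>[] table = new LinkedList[16];

    private int index(K key) {
        return Math.abs(key.hashCode() % table.length);
    }

    public void put(K key, V value) {
        int i = index(key);
        if (table[i] == null) table[i] = new LinkedList<>();
        for (Entry<K, V> e : table[i])
            if (e.key.equals(key)) { e.value = value; return; } // overwrite same key only
        table[i].add(new Entry<>(key, value)); // collision: append to the chain
    }

    public V get(K key) {
        int i = index(key);
        if (table[i] == null) return null;
        for (Entry<K, V> e : table[i])
            if (e.key.equals(key)) return e.value;
        return null;
    }
}
```

A real implementation would also resize the table to keep M >> N; this sketch keeps M fixed to stay focused on the chaining idea.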
Union find
- When we only need to know whether two elements are connected or not
- Union find has
- An int[] array that keeps track of each element's parent
- A symbol table
- public void union(int p, int q)
- public boolean isConnected(int p, int q)
- private int find(int p)
- Variants
- QuickFind (slow union)
- QuickUnion (slow find)
- Weighted UF
- QuickUnion-PathCompression
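The API bullets above can be sketched as a weighted quick-union with path compression (the last two variants combined). The class name and field names are mine; the method signatures follow the bullets:

```java
// Weighted quick-union with path compression (my own naming; the method
// signatures follow the API listed above). parent[i] points to i's parent;
// size[i] tracks tree sizes so the smaller tree always links under the larger.
class UnionFindSketch {
    private final int[] parent;
    private final int[] size;

    UnionFindSketch(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
    }

    private int find(int p) {
        while (parent[p] != p) {
            parent[p] = parent[parent[p]]; // path compression (halving)
            p = parent[p];
        }
        return p;
    }

    public boolean isConnected(int p, int q) {
        return find(p) == find(q);
    }

    public void union(int p, int q) {
        int rootP = find(p), rootQ = find(q);
        if (rootP == rootQ) return;
        if (size[rootP] < size[rootQ]) { int t = rootP; rootP = rootQ; rootQ = t; }
        parent[rootQ] = rootP;      // attach the smaller tree under the larger
        size[rootP] += size[rootQ];
    }
}
```

Weighting keeps the trees shallow and path compression flattens them further, which is why this variant beats plain QuickFind (slow union) and QuickUnion (slow find).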
Insertion sort
void sort(Comparable[] a) {
    int N = a.length;
    for (int i = 1; i < N; i++) {
        for (int j = i; j > 0 && less(a[j], a[j-1]); j--) {
            exch(a, j, j-1);
        }
    }
}
(less and exch are the usual course helper methods: compare two elements, and swap two positions in the array.)
Old exam problems - Fall 2014
Problem 1 (20%)
In this problem we look at runtime analysis. Assume that (n) is a positive integer. Give the runtime of the following code snippets as "order of growth". Justify all your answers.
a)
s = 0;
for (int i = 1; i < n; i++)
for (int j = i; j <= n; j++)
s = s + 1;
The inner loop runs n - i + 1 times, so the total is n + (n-1) + … + 2 ≈ n²/2, which is O(n²).
b)
c = 0;
for (int i = 1; i < n; i++) {
for (int j = i; j < n && j%10 != 3; j++) {
c++;
}
i = j-1;
}
As written this does not compile: j is declared inside the inner loop but used outside it. If j is instead declared before the outer loop, the program never terminates: once i reaches 3, the inner loop exits immediately (3 % 10 == 3) with j == 3, then i = j - 1 = 2 and the i++ brings i back to 3, forever.
c)
for (int i = 0; i < n; i = i + 2) {
j = 0;
while (j < i)
j = j + 1;
k = 0;
while (k < i)
k = k + 1;
}
The outer loop runs n/2 times, and for each i the two while loops do 2i work in total, so the sum is about n²/2, which is O(n²).
d)
for (int i = 0; i < n; i++) {
j = i;
while (j > 0)
j = j / 2;
}
The outer loop runs n times, and the inner while loop halves j each step, about log₂ i iterations, so the total is O(n log n).
e)
s = 0;
for (int i = 1; i <= n; i++)
for (int j = 2; j <= n; j++)
for (int k = 3; k <= n; k++)
s = i + j + k;
O(n³): three nested loops, each running about n times.
Problem 2
a) Explain how insertion sort and merge sort work.
Merge sort divides the array into two equally sized parts (there may be a difference of 1 in size) and keeps doing this until each sub-array has at most two elements. It then sorts those two elements into the correct order.
[2 , 5 , 3 , 7 , 8 , 9 , 8 , 2]
[2 , 5 , 3 , 7] [8 , 9 , 8 , 2]
[2 , 5] [3 , 7] [8 , 9] [8 , 2]
Sort the “sub-arrays”
[2 , 5][3 , 7] [8 , 9][2 , 8]
Merge the “sub-arrays” and sort them
[2 , 5][3 , 7] [8 , 9][2 , 8]
[2 , 3 , 5 , 7][2 , 8 , 8 , 9]
The merging behaves like this: 2 is smaller than 3, so put 2 at index 0. 5 is not smaller than 3, so put 3 at index 1. 5 is smaller than 7, so put 5 at index 2. The first sub-array is now empty, so put 7 at index 3. Then merge the last pair of sub-arrays the same way.
[2 , 3 , 5 , 7][2 , 8 , 8 , 9]
[2 , 2 , 3 , 5 , 7 , 8 , 8 , 9]
No more “sub-arrays” sorting is finished. Return sorted array.
Worst-case runtime is O(n log n).
Insertion sort walks through the array from left to right. For each element, the algorithm compares it with the element to its left and swaps it backwards until it is in sorted position.
Underlined numbers are sorted.
- ({2 , 8 , 5 , 3 , 9 , 4})
- ({2 , 8 , 5 , 3 , 9 , 4})
- ({2 , 5 , 8 , 3 , 9 , 4})
- ({2 , 5 , 3 , 8 , 9 , 4})({2 , 3 , 5 , 8 , 9 , 4})({2 , 3 , 5 , 8 , 9 , 4})
- ({2 , 3 , 5 , 8 , 9 , 4})
- ({2 , 3 , 5 , 8 , 4 , 9})({2 , 3 , 5 , 4 , 8 , 9})({2 , 3 , 4 , 5 , 8 , 9})({2 , 3 , 4 , 5 , 8 , 9})
Sorted. Insertion sort has a worst-case runtime of (O(n^2)).