elements of programming interviews

personal

Interviewing Strats

Clarify the question: if the question seems unnecessarily hard, then clarify it
work on concrete examples
spell out the brute force solution
think out loud
apply patterns
clarify assumptions that inputs are valid
test for corner cases
- write unit tests if there is time
- add code checking for the result
- syntax - lots of bad syntax shows you have no coding experience
memory management: avoid memory management opportunities all together
keep the solutions short

Bit Primitives

integers in Python3 are unbounded, can use max of memory size
- sys.maxsize can be used to find word-size
  - maximum value integer that can be stored in the word
- python does not have unsigned shifts, since integers have infinite precision
how to use bitwise operators?
how to use masks?
fast ways to clear the lowermost set bit
understand signedness and implications to shifting
consider using cache for brute-forcing small inputs
commutativity and associativity can be used to perform operations in parallel and reorder ops
key methods for numeric types
- abs, math.ciel, math.floor, min, max, pow, math.sqrt
floats are not infinite precision, and easy to refer to infinity as float('inf') and float('-inf')
key methods for random:
- random.randrange, random.randint, random.random(), random.shuffle(A) and random.choice(A)
parity - parity of a binary word is 1 if the word is odd, otherwise 0
- parity checks are used to detect single bit errors in data storage and communication

Arrays

bisect.bisect and bisect.bisect_left and bisect.bisect_right gives you ranges for sorted lists that tell you where to put new elements so the list remains sorted
- bisect_right and bisect_left are one off for cardinatity 1 lists
any([]) -> False

Strings

strings in python are immutable
- concatenating a single character n times to a string in a for loop has O(n^2) complexity
strings often have brute-force solutions that use O(n) but can be reduced to O(1)

Linked Lists

often conceptually simple, and more about cleanly coding what’s specified
consider using a dummy head to avoid checking for empty lists
easy to forget update next/previous for head and tail
algorithms that operate on singly linked lists often benefit from using two iterators
- example: hierarchical navigable small words, skip lists

Stacks and Queues

Stacks
- push/pop on a LL implemented stack has O(1) ops
- push/pop on an array implemented stack has O(1) ops, but maximum size, unless there is dynamic resizing, where the amortized is O(1)
- last-in/first-out semantics makes it good for making reverse iterators
  - parsing typically benefits from a stack
  - consider augmenting the basic stack or queue data structure to support additional ops
Queue
- queue’s with LL’s have O(1) ops for enqueue and dequeue
- often has a “peek” operation
- push/inject for front/back ops in a deque
  - note that you can use collections.deque()
    - which has popleft and popright functions
- queues are ideal when order needs to be preserved

Binary Trees

full binary tree is when every node other than the leaves have two children
perfect binary tree is a full binary tree where all the leaves are at the same depth, and every parent has two children
- 2^(h+1)-1 nodes -> 2^(h+1) are leaves
complete binary tree is a binary tree where every level is filled and all leaves are to the left as possible
- n nodes has height floor(log(n))
recursive algorithms are well suited to problems on binary trees
you might be able to reuse existing tree nodes to reduce space complexity to o(1)
consider left and right skewed trees when doing complexity analysis, O(h) complexity, where h is the tree height, translates into O(log n) complexity for balanced trees, and O(n) complexity for skewed trees
“parent” field makes code simpler
easy to make the mistake of treating a node that has a single child as a leaf

Heaps

Specialized binary tree
complete binary tree that satisfies the heap property - the key at each node is as great as the keys stored in its children
max-heap supports O(log n) insertions, O(1) time lookup, and O(log n) deletion of the max element. The extract max op is defined to delete and return max element
sometimes referred to as a priority queue, each element has a “priority”, and deletion removes the element with the highest priority
min-heap is a completely symmetric version of the data structure and supports an O(1) looked up for the minimum element
you can use a heap to keep track the k lognest string in a streaming string system, each insert is log n time, and if you have a min-heap, you can choose when to drop
use a heap when you care about the largest or smallest elements, and do not have to support fast lookup, delete, or search operations
a heap is a good choice when you need to compute K largest or K smallest elements, former should use a min-heap, latter should use a max heap
python has the heapq module to implement a heap, but it specifically only has a min-heap. Insert the negative for integers or floats, or for objects, implement __lt()__ appropriately

Searching

Search algorithms can be classifed in a number of ways. Is the underlying collection static or dynamic, and are inserts/deletes interleaved with searching?
- can you preprocess the data?
- are there special statistical properties of the data that can be exploited?
- should we operate directly on the data or transform it?

binary search

O(log n) search time, but requires a sort that takes at least O(n log n) time

how do you implement a binary search?

def bsearch(t, A):
    L, U = 0, len(a) - 1
    while L <= U:
        M = L + (U-L)/2
        if A[M] < t:
            L = M + 1
        elif A[M] == t:
            return M
        else:
            U = M - 1
    return -1

binary search is applicable than just sorted arrays, you can use it to search an interval of real numbers or integers
if we use sorting, and the computation performed after sorting is faster than sorting, look for solutions that do not perform a complete sort

general searching

objects have comparable __lt()__ and __gt()__ for user defined types
langauge already knows how to compare built-in types

Hash tables

Updating keys in hash tables -> don’t do it
- delete the key, then update it, then move it back
- keys should generally be not mutable
we want to aim for rolling hash functions
- hash functions should examine all the characters in a string, and not let one character dominate
- we also want to implement a rolling hash function, where if one character is removed from the start and added to the end, it should not effect the final result
Hash tables is a good data structure to represent a dictionary
- sometimes a trie gets used to store a dynamic set of strings, and has computational advantages when many letters are repeated
anagram example: to find anagrams, you can preprocess each trail and sort, and append a list on the other side
designing a hash table class
- you can implement __hash__, and hash is a builtin
hash tables have the best theoretical and real world performance for lookup, insert, and delete.
consider using a hash table as a signature to enhance performance
consider using a precomputed lookup table
when defining the own type that will be put in the hash table, make sure to know the relationship between logical equality and the fields the hash function inspects
- anytime equality is implemented, imperative that the hash function is also implemented
sometimes a multimap can be used, which contains multiple values for single key
hash table types set, dict, collections.defaultdict, collections.Counter
- set.remove raises an error if the value is not in the list
- set.discard does not alert if the value is not in the list
note that not every type is hashable, mutable containers are notably not hashable

Sorting

sorting is a common problem
naive sorting algorithms run in O(n^2) time, some run in O(nlog) time
sorting table:

sorting in place (O(1) space)? stable (duplicate items remain in the same order)?

heapsort yes no

merge sort no yes

quicksort yes yes
well implemented quicksort is usually the best choice
- for shorter arrays (<10), insertion sort is easier to code and faster
- if every element is known to be at most k places away from its final location, a min-heap can be used to get O(n log k) algorithm
- if there are a small amount of distinct keys, counting sort works well
  - if there are many duplicate keys, we add the keys to a BST, with linked list for elements which have the same key; the sorted result can be derived from an in-order traversal of the BST
most sorting algorithms are not stable, mergesort, carefully implemented, can be made stable.
most sorting routines are based on a compare function. It’s also possible to use numerical attributes directly with radix sort
sorted() returns a copy using __lt__
list.sort sorts in place
time complexity of any reasonable library is O(n log n), most sorting functionalities are based on quicksort, which has O(1) space
sorting problems usually come in two flavors:
- using sorting to make the subsequent steps in an algorithm simpler
  - use a library sort function, possibly with custom comparator
- desinging a custom subroutine
  - use a data structure like a BST, heap, or array indexed by values
certain problems become easier to understand, as well as solve, when the input is sorted
- best when inputs have natural order, or preprocessing step to speed up searching
for specialized input (very small range of values or small number of values), it’s possible to sort in O(n) time
sorting can be implemented with less space than required by a brute force approach
sometimes it is not obvious what to sort on
sort -> inplace for list
- implements stable in place sort for list objects, returning None.
- takes two arguments, both optional:
  - key - assumed to be a function which takes list elements and maps to objects which are comparable
  - reverse - whether to sort in descending order, by default sort does it in ascending order
to sort an iterable, use sorted()
- also take identical key and reverse objects

sorting	in place (O(1) space)?	stable (duplicate items remain in the same order)?
heapsort	yes	no
merge sort	no	yes
quicksort	yes	yes

Binary Search Trees

BSTs are workhorse data structures and can be used to solve almost every data structures problem reasonably efficiently, and offer the ability to efficiently search for a key as well as find min and max elements
Just a binary tree where the ndoes store keys that are comparable such as integers or strings
Key lookup, insertion, and deletion take time proportional to the height of the tree, which can be worst-case O(n)
- some implementations of insert and delete which guarentee that the tree heigh is O(log n)
Common mistake with BSTs -> object that’s present in the BST is not updated
- avoid putting mutable objects in the BST! If you must mutate, remove, update, then add it back
BSTs give you the ability to find min and max elements, and find the next largest/next smallest element
- implement some kind of ordering alongside this, finding the min, the max, next largest/smallest, and lookup, delete, and find take O(log n) time
- both BSTs and hash tables use O(n) space

BST implementation

class BSTNode:
    def __init__(self, data=None, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def search_bst(tree, key):
    return (tree
            if not tree or tree.data == key else search_bst(tree.left, key)
            if k < tree.data else search_bst(tree.right, key))

-Since the program that descends the tree with each step, and spends O(1) time per level, time complexity is O(h), where h is the height of the tree

We can iterate through the elements in sorted order in time O(n) regardless of balance
Some problems need a combination of a BST and a hashtable. If you need to order something, you can use the dict to contain additional, mutable information, and the dict can contain direct elements in the tree
Sometimes you need to augment a BST to make it possible to manipulate more complicated data such as intervals and give more complex queries, such as the number of elements in a range
BST property is a global property - a binary search tree may have the property that each node’s key is greater than the key at its left child and smaller than the key at its right child, but it may not be a BST
NO BST LIBRARY IN PYTHON
- sortedcontainers or bintrees can be used

Recursion

Recursion is an approach where the solution depends partially on solutions to smaller instances of related problems
Often appropriate when the input is expressed using recursive rules, such as computer grammar
Searching, enumeration, divide and conquer, decomposing a complex problem into a set of similar smaller instance are all scenarios where recursion may be suitable
A recursive function consists of base cases and calls to the same function with different arguments
Two main ingredients for successful recursion:
- identifying the base cases
- ensuring progress
A divide and conquer algorithm works by repeatedly decomposing a problem into two or more smaller independent subproblems of the same kind, until it gets to instances that are simple enough to be solved directly
- solutions to the subproblems are then combined to give a solution to the original problem
divide and conquer is not synonmyous with recursion
- the problem is divided into two or more independent smaller problems that are of the same type as the original problem, recursion is more general, there may be a single subproblem, but the subproblems may not be independent (dynamic programming), and they may not be as the same type as the original (regex matching).
In addition, sometimes to improve runtime and reduce space complexity, you can implement divide and conquer using iteration
use recursion as an alterative to deeply nested iteration loops
removing recursion can be done by mimicking call stack with the stack data structure
recursion can be easily removed from tail recursive programs by using the while loop, no stack is needed
if recursive functions may end up being called with the same arguments more than once, cache the results (aka dp)

DP

General technique for solving optimization, search, and counting problems that can be decomposed
consider using DP when you have to make choices to arrive at the solution, especially when it involves subproblems
key difference between divide-and-conquer and dp is that DP shows the same subproblem may reoccur

Greedy Algorithms and Invariants

greedy algorithms computes a solution in steps, at each step the algorithm makes a decision that is locally optimum, and it never changes the decision
does not necessarily produce the optimal solution. Consider making change for 48 pence in old British currency where the coins came in at 30, 24, 12, 6, 3, and 1

عجفت الغور