Cracking the Coding Interview4 Notes

Table of Contents

Site

Before the Interview

Resume screeners look for the same things that interviewers do: »»Are you smart? »»Can you code?

Writing Strong Bullets: For each role, try to discuss your accomplishments with the following approach: “Accomplished X by implementing Y which led to Z.” Here’s an example:

»»“Reduced object rendering time by 75% by applying Floyd’s algorithm, leading to a 10% reduction in system boot time.” Here’s another example with an alternate wording:

»»“Increased average match accuracy from 1.2 to 1.5 by implementing a new comparison algorithm based on windiff.”

Languages: Knowing which languages to list on your resume is always a tricky thing. Do you list everything you’ve ever worked with? Or only the ones that you’re more comfortable with (even though that might only be one or two languages)? I recommend the following compromise: list most languages you’ve used, but add your experience level. This approach is shown below:

»»“Languages: Java (expert), C++ (proficient), JavaScript (prior experience), C (prior experience)”

What questions should you ask the interviewer?

Most interviewers will give you a chance to ask them questions. The quality of your questions will be a factor, whether subconsciously or consciously, in their decisions. Some questions may come to you during the interview, but you can - and should - prepare questions in advance. Doing research on the company or team may help you with preparing questions. Questions can be divided into three different categories:

Genuine Questions: These are the questions you actually want to know. Here are a few ideas of questions that are valuable to many candidates:

  1. “How much of your day do you spend coding?”
  2. “How many meetings do you have every week?”
  3. “What is the ratio of testers to developers to product managers? What is the interaction like? How does project planning happen on the team?”

Insightful Questions: These questions are designed to demonstrate your deep knowledge of programming or technologies.

  1. “I noticed that you use technology X. How do you handle problem Y?”
  2. “Why did the product choose to use the X protocol over the Y protocol? I know it has benefits like A, B, C, but many companies choose not to use it because of issue D.”

Passion Questions: These questions are designed to demonstrate your passion for technology.

  1. “I’m very interested in scalability. Did you come in with a background in this, or what opportunities are there to learn about it?”
  2. “I’m not familiar with technology X, but it sounds like a very interesting solution. Could you tell me a bit more about how it works?”

The Interview and Beyond

Five Steps to a Technical Questions

A technical interview question can be solved utilizing a five step approach:

  1. Ask your interviewer questions to resolve ambiguity.
  2. Design an Algorithm
  3. Write pseudo-code first, but make sure to tell your interviewer that you’re writing pseudo-code! Otherwise, he/she may think that you’re never planning to write “real” code, and many interviewers will hold that against you.
  4. Write your code, not too slow and not too fast.
  5. Test your code and carefully fix any mistakes.

Step 1: Ask Questions

Good questions might be things like: What are the data types? How much data is there? What assumptions do you need to solve the problem? Who is the user? Example: “Design an algorithm to sort a list.”»»Question: What sort of list? An array? A linked list? »»Answer: An array. »»Question: What does the array hold? Numbers? Characters? Strings? »»Answer: Numbers. »Question: And are the numbers integers? »»Answer: Yes. »»Question: Where did the numbers come from? Are they IDs? Values of something? »»Answer: They are the ages of customers. »»Question: And how many customers are there? »»Answer: About a million. We now have a pretty different problem: sort an array containing a million integers between 0 and 130. How do we solve this? Just create an array with 130 elements and count the number of ages at each value.

Step 2: Design an Algorithm

Designing an algorithm can be tough, but our five approaches to algorithms can help you out (see pg 34). While you’re designing your algorithm, don’t forget to think about: »»What are the space and time complexities? »»What happens if there is a lot of data? »»Does your design cause other issues? (i.e., if you’re creating a modified version of a binary search tree, did your design impact the time for insert / find / delete?) »»If there are other issues, did you make the right trade-offs? »»If they gave you specific data (e.g., mentioned that the data is ages, or in sorted order), have you leveraged that information? There’s probably a reason that you’re given it.

Step 3: Pseudo-Code

Writing pseudo-code first can help you outline your thoughts clearly and reduce the number of mistakes you commit. But, make sure to tell your interviewer that you’re writing pseudo-code first and that you’ll follow it up with “real” code. Many candidates will write pseudo-code in order to ‘escape’ writing real code, and you certainly don’t want to be confused with those candidates.

Step 4: Code

You don’t need to rush through your code; in fact, this will most likely hurt you. Just go at a nice, slow methodical pace. Also, remember this advice: »»Use Data Structures Generously: Where relevant, use a good data structure or define your own. For example, if you’re asked a problem involving finding the minimum age for a group of people, consider defining a data structure to represent a Person. This shows your interviewer that you care about good object oriented design.

»»Don’t Crowd Your Coding: This is a minor thing, but it can really help. When you’re writing code on a whiteboard, start in the upper left hand corner – not in the middle. This will give you plenty of space to write your answer.

Step 5: Test

Yes, you need to test your code! Consider testing for: »»Extreme cases: 0, negative, null, maximums, etc »»User error: What happens if the user passes in null or a negative value? »»General cases: Test the normal case. If the algorithm is complicated or highly numerical (bit shifting, arithmetic, etc), consider testing while you’re writing the code rather than just at the end. Also, when you find mistakes (which you will), carefully think through why the bug is occuring. One of the worst things I saw while interviewing was candidates who recognized a mistake and tried making “random” changes to fix the error.

Five Algorithm Approaches

  • APPROACH I: EXAMPLIFY

    Description: Write out specific examples of the problem, and see if you can figure out a general rule.

  • APPROACH II: PATTERN MATCHING

    Description: Consider what problems the algorithm is similar to, and figure out if you can modify the solution to develop an algorithm for this problem.

  • APPROACH III: SIMPLIFY & GENERALIZE

    Description: Change a constraint (data type, size, etc) to simplify the problem. Then try to solve it. Once you have an algorithm for the “simplified” problem, generalize the problem again.

  • APPROACH IV: BASE CASE AND BUILD

    Description: Solve the algorithm first for a base case (e.g., just one element). Then, try to solve it for elements one and two, assuming that you have the answer for element one. Then, try to solve it for elements one, two and three, assuming that you have the answer to elements one and two.

  • APPROACH V: DATA STRUCTURE BRAINSTORM

    Description: This is hacky, but it often works. Simply run through a list of data structures and try to apply each one. Example: Numbers are randomly generated and stored into an (expanding) array. How would you keep track of the median? Data Structure Brainstorm: »»Linked list? Probably not – linked lists tend not to do very well with accessing and sorting numbers. »»Array? Maybe, but you already have an array. Could you somehow keep the elements sorted? That’s probably expensive. Let’s hold off on this and return to it if it’s needed. »»Binary tree? This is possible, since binary trees do fairly well with ordering. In fact, if the binary search tree is perfectly balanced, the top might be the median. But, be careful – if there’s an even number of elements, the median is actually the average of the middle two elements. The middle two elements can’t both be at the top. This is probably a workable algorithm, but let’s come back to it. »»Heap? A heap is really good at basic ordering and keeping track of max and mins. This is actually interesting – if you had two heaps, you could keep track of the biggest half and the smallest half of the elements. The biggest half is kept in a min heap, such that the smallest element in the biggest half is at the root. The smallest half is kept in a max heap, such that the biggest element of the smallest half is at the root. Now, with these data structures, you have the potential median elements at the roots. If the heaps are no longer the same size, you can quickly “rebalance” the heaps by popping an element off the one heap and pushing it onto the other. Note that the more problems you do, the better instinct you will develop about which data structure to apply.

Questions

Chapter 7

  • 7.9

    data block allocation -> “Practical File System Design” or B+ trees

Chapter 8

Chapter 11

Elevator algorithms could elimate random bouncing from track to track.

Chapter 13

  • 13.5 What is the significance of the keyword “volatile” in C?

    SOLUTION Volatile informs the compiler that the value of the variable can change from the outside, without any update done by the code. Declaring a simple volatile variable:

    volatile int x;
    int volatile x;
    

    Declaring a pointer variable for a volatile memory (only the pointer address is volatile):

    volatile int * x;
    int volatile * x;
    

    Declaring a volatile pointer variable for a non-volatile memory (only memory contained is volatile):

    int * volatile x;
    

    Declaring a volatile variable pointer for a volatile memory (both pointer address and memory contained are volatile):

    volatile int * volatile x;
    int volatile * volatile x;
    

    Volatile variables are not optimized, but this can actually be useful. Imagine this function:

    int opt = 1;
    void Fn(void) {
    
    start:
    
    if (opt == 1) goto start;
    
    else break;
    }
    

    At first glance, our code appears to loop infinitely. The compiler will try to optimize it to:

    void Fn(void) {
    
    start:
    
    int opt = 1;
    
    if (true)
    
    goto start;
    }
    

    This becomes an infinite loop. However, an external program might write ‘0’ to the location of variable opt. Volatile variables are also useful when multi-threaded programs have global variables and any thread can modify these shared variables. Of course, we don’t want optimization on them.

Chapter 14

Chapter 16

  • 16.1 Explain the following terms: virtual memory, page fault, thrashing.

    SOLUTION

    Virtual memory is a computer system technique which gives an application program the im- pression that it has contiguous working memory (an address space), while in fact it may be physically fragmented and may even overflow on to disk storage. Systems that use this tech- nique make programming of large applications easier and use real physical memory (e.g. RAM) more efficiently than those without virtual memory. http://en.wikipedia.org/wiki/Virtual_memory

    Page Fault: A page is a fixed-length block of memory that is used as a unit of transfer be- tween physical memory and external storage like a disk, and a page fault is an interrupt (or exception) to the software raised by the hardware, when a program accesses a page that is mapped in address space, but not loaded in physical memory. http://en.wikipedia.org/wiki/Page_fault

    Thrash is the term used to describe a degenerate situation on a computer where increas- ing resources are used to do a decreasing amount of work. In this situation the system is said to be thrashing. Usually it refers to two or more processes accessing a shared resource repeatedly such that serious system performance degradation occurs because the system is spending a disproportionate amount of time just accessing the shared resource. Resource access time may generally be considered as wasted, since it does not contribute to the ad- vancement of any process. In modern computers, thrashing may occur in the paging system (if there is not ‘sufficient’ physical memory or the disk access time is overly long), or in the communications system (especially in conflicts over internal bus access), etc. http://en.wikipedia.org/wiki/Thrash_(computer_science)

  • 16.2 What is a Branch Target buffer? Explain how it can be used in reducing bubble cycles in cases of branch misprediction.

    SOLUTION

    Branch misprediction occurs when the CPU mispredicts the next instruction to be executed. The CPU uses pipelining which allows several instructions to be processed simultaneously. But during a conditional jump, the next instruction to be executed depends on the result of the condition. Branch Prediction tries to guess the next instruction. However, if the guess is wrong, we are penalized because the instruction which was executed must be discarded. Branch Target Buffer (BTB) reduces the penalty by predicting the path of the branch, comput- ing the target of the branch and caching the information used by the branch. There will be no stalls if the branch entry found on BTB and the prediction is correct, otherwise the penalty will be at least two cycles.

  • 16.3 Describe direct memory access (DMA). Can a user level buffer / pointer be used by kernel or drivers?

    SOLUTION

    Direct Memory is a feature which provides direct access (read/write) to system memory with- out interaction from the CPU. The “DMA Controller” manages this by requesting the System bus access (DMA request) from CPU. CPU completes its current task and grants access by as- serting DMA acknowledgement signal. Once it gets the access, it reads/writes the data and returns back the system bus to the CPU by asserting the bus release signal. This transfer is faster than the usual transfer by CPU. Between this time CPU is involved with processing task which doesn’t require memory access. By using DMA, drivers can access the memory allocated to the user level buffer / pointer.

  • 16.4 Write a step by step execution of things that happen after a user presses a key on the keyboard. Use as much detail as possible.

    SOLUTION

    1. The keyboard sends a scan code of the key to the keyboard controller (Scan code for key pressed and key released is different).
    2. The keyboard controller interprets the scan code and stores it in a buffer.
    3. The keyboard controller sends a hardware interrupt to the processor. This is done by putting signal on “interrupt request line”: IRQ 1.
    4. The interrupt controller maps IRQ 1 into INT 9.
    5. An interrupt is a signal which tells the processor to stop what it was doing currently and do some special task.
    6. The processor invokes the “Interrupt handler”. CPU fetches the address of “Interrupt Service Routine” (ISR) from “Interrupt Vector Table” maintained by the OS (Processor use the IRQ number for this).
    7. The ISR reads the scan code from port 60h and decides whether to process it or pass the control to program for taking action.
  • 16.9 Write an aligned malloc & free function that takes number of bytes and aligned byte

    (which is always power of 2) EXAMPLE alignmalloc (1000,128) will return a memory address that is a multiple of 128 and that points to memory of size 1000 bytes. alignedfree() will free memory allocated by alignmalloc.

    SOLUTION

    We will use malloc routine provided by C to implement the functionality. Allocate memory of size (bytes required + alignment – 1 + sizeof(void*)) using malloc. alignment: malloc can give us any address and we need to find a multiple of align- ment. (Therefore, at maximum multiple of alignment, we will be alignment-1 bytes away from any location.) sizeof(sizet): We are returning a modified memory pointer to user, which is different from the one that would be returned by malloc. We also need to extra space to store the address given by malloc, so that we can free memory in alignedfree by calling free routine provided by C.

    1. If it returns NULL, then alignedmalloc will fail and we return NULL.
    2. Else, find the aligned memory address which is a multiple of alignment (call this p2).
    3. Store the address returned by malloc (e.g., p1 is just sizet bytes ahead of p2), which will be required by alignedfree.
    4. Return p2.
             void* aligned_malloc(size_t required_bytes, size_t alignment) {
             void* p1; // original block
             void** p2; // aligned block
             int offset = alignment - 1 + sizeof(void*);
             if ((p1 = (void*)malloc(required_bytes + offset)) == NULL) {
             return NULL;
             }
       p2 = (void**)(((size_t)(p1) + offset) & ~(alignment - 1));
       p2[-1] = p1;
       return p2;
    }
    void aligned_free(void *p) {
    free(((void**)p)[-1]);
    }
    

Author: Shi Shougang

Created: 2015-03-05 Thu 23:21

Emacs 24.3.1 (Org mode 8.2.10)

Validate