Interview Experience 1: System Design (Out of Tutorials)

SHIVAM SOURAV JHA
4 min read · Oct 25, 2022


Of all the interviews I have appeared for, this one stands out because it was a great learning experience. The result hasn’t come yet, and I’m pretty sure I’ll be rejected, but I learned so much that I thought I’d share it.

I was given an interview opportunity at one of India’s well-paying organisations, and I was given a prep guide, so I read a book, saw a few sample blogs/tutorials, and thought I was ready. But what happened during the interview was far from what I had anticipated.

The biggest mistake: Forgetting Murphy’s first law

The interview started and the interviewer explained the question, which goes like this:

dep A: scraping the internet in 24 hrs (processA)
dep B: processing the HTML files alone; 1 HTML file takes 0-2 hrs of processing
Output: Analytics, 100 lines in stdout (processB)
Machine: 16 GB RAM
Hard Disk: 100 GB
Cores: 2
Requirements:
1) Transfer data from processA to processB
2) Processing should also be completed within 24 hrs (the speed of processing should be equal to or greater than the speed of the scraper).

Then I was expected to ask questions, which I struggled with: I already had solutions in mind and felt I had all the information, so I couldn’t think of what to ask. What I needed to work out was which details would help me design the system. I was at a loss for a while, until it was suggested that I ask about the number of files that process B can generate. Following that, I asked about the amount of memory required by B, to determine how many parallel threads we could run. I also asked about the type of storage (SSD or HDD), in case we wanted to move some data from RAM to disk.

Some general conversation

Process A is running on one core while Process B is running on the other. Process A has a single thread, while Process B has multiple threads. We discussed the maximum size of the HTML file, and I made some educated guesses for the smallest and largest sizes possible.

Approach 1: Use Messaging Queues

Probably not the smartest solution: I considered adding a queue as a third component, where process A could dump data and from which it would be distributed to the parallel threads of process B.
We discussed whether to use push or pull between the broker and the consumer. With pull, process B fetches files from the broker, but if process B fails, it will be difficult to find the lost files. So I proposed a push mechanism in which the broker hands out a copy of each file and keeps a tag indicating whether it was processed successfully. If processing fails, the file goes back to the broker, ensuring that all unprocessed items return to the queue.
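The requeue-on-failure idea can be sketched in a few lines. This is only an illustration, not what was written in the interview: Python's in-memory `queue.Queue` stands in for the broker, and `process_html`, `drain`, and `max_attempts` are hypothetical names introduced here.

```python
import queue

def process_html(name, html):
    """Hypothetical stand-in for process B's real work; raises on bad input."""
    if not html:
        raise ValueError(f"empty file: {name}")
    return len(html)

def drain(broker, max_attempts=3):
    """Pull items until the broker is empty; requeue any item that fails."""
    results, dead = {}, []
    while True:
        try:
            name, html, attempts = broker.get_nowait()
        except queue.Empty:
            break
        try:
            # A stored result plays the role of the "processed" tag.
            results[name] = process_html(name, html)
        except Exception:
            if attempts + 1 < max_attempts:
                # Return the item to the broker so it is not lost.
                broker.put((name, html, attempts + 1))
            else:
                # Give up after max_attempts and record the file as failed.
                dead.append(name)
    return results, dead

broker = queue.Queue()
broker.put(("a.html", "<html>hello</html>", 0))
broker.put(("b.html", "", 0))  # this one always fails in process_html
results, dead = drain(broker)
```

The attempt counter is an addition of this sketch. Without it, a file that always fails would circulate in the queue forever.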

Demerits of this approach

The main disadvantage is that the third process introduces extra latency. It is also one more thing that can fail: to ensure that no data is lost, we would have to replicate our broker. Finally, it increases the amount of code we have to write and maintain.

Approach 2: Store the data of process A on RAM

I was told that we could remove the queue and simply keep the files in RAM. While a file was being processed, I could also store it on disk and send its address to process B, and once process B had successfully processed the file, it could return a tag indicating that it had been processed. We could also use a scheduled job to delete all of the processed files. This is a good strategy, but it is short-sighted.

Approach 3: The correct answer

It was then suggested to me that, since we’re only passing a location from A to B, why not just store everything in files? That is, A would keep running and store files in a folder, from which B could pick them up and process them. At this point, I realised what was going on.

To simplify: process A dumps files into a folder, and process B picks them up from there and processes them. If any thread of B fails, the file is still there and can be processed later. I was surprised at how simple the solution was. To be fair, this was the first system design round I had ever participated in (at a major organisation).
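A minimal sketch of the folder hand-off, under some assumptions that were not part of the interview: both directories sit on the same filesystem (so a rename is atomic on POSIX systems), and the names `publish`, `claim`, `work_once`, `inbox`, and `working` are hypothetical.

```python
import os
import tempfile

def publish(inbox, name, html):
    """Process A side: write under a temp name, then rename,
    so B never sees a half-written file."""
    tmp = os.path.join(inbox, "." + name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(html)
    os.replace(tmp, os.path.join(inbox, name))

def claim(inbox, working):
    """Process B side: claim one file by renaming it into the working directory."""
    for name in sorted(os.listdir(inbox)):
        if name.startswith("."):
            continue  # still being written by A
        dst = os.path.join(working, name)
        try:
            os.rename(os.path.join(inbox, name), dst)
            return dst
        except FileNotFoundError:
            continue  # another worker claimed it first
    return None

def work_once(inbox, working, analyse):
    """Claim a file, analyse it, and delete it; on failure put it back for a retry."""
    path = claim(inbox, working)
    if path is None:
        return None
    try:
        with open(path, encoding="utf-8") as f:
            result = analyse(f.read())
    except Exception:
        os.replace(path, os.path.join(inbox, os.path.basename(path)))
        raise
    os.remove(path)
    return result

# Usage: one file published by A, then picked up and analysed by B.
root = tempfile.mkdtemp()
inbox = os.path.join(root, "inbox")
working = os.path.join(root, "working")
os.makedirs(inbox)
os.makedirs(working)
publish(inbox, "page1.html", "<html>hi</html>")
count = work_once(inbox, working, len)
```

If a worker dies without running its `except` branch, the file stays in `working`. A periodic sweep that moves stale files back to `inbox` would cover that case.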

Fault Tolerance and threads

I was later informed that any thread of process B could fail at any time and would require 5 minutes to recover. So, in the worst-case scenario, we needed around 80–90 threads for B. Now, because a machine can fail at any time, I had to prepare for this by adding more threads. According to my calculations, we had effectively 16 hours left because the machine would fail three times per day. So, after some basic math, I came up with 125 possible threads for process B.
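The text does not give the file count or the derivation behind these figures. Both can be reproduced with a back-of-the-envelope calculation if one assumes roughly 1,000 files per day, each taking the worst-case 2 hours. That file count is an assumption made here for illustration only, and `threads_needed` is a hypothetical helper.

```python
import math

def threads_needed(num_files, worst_case_hours_per_file, available_hours):
    """Threads required if every file takes the worst-case time."""
    # Whole files a single thread can finish in the available window.
    files_per_thread = math.floor(available_hours / worst_case_hours_per_file)
    return math.ceil(num_files / files_per_thread)

ASSUMED_FILES_PER_DAY = 1000  # assumption, not stated in the post

full_day = threads_needed(ASSUMED_FILES_PER_DAY, 2, 24)      # no machine downtime
with_downtime = threads_needed(ASSUMED_FILES_PER_DAY, 2, 16)  # 16 effective hours
print(full_day, with_downtime)
```

Under that assumption, a full 24-hour window lands inside the 80-90 range, and a 16-hour window gives the 125 mentioned above.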

My learnings from this interview

Instead of chasing after complicated or fancy things, consider the simplest solutions and understand the core job that is expected. This wasn’t the worst interview I’d ever done, but it was certainly the most important. Given how many hints I took, the outcome is easy to guess. If I pass this round, I’ll post an update (though the chances are very slim).
