The workplace ‘happy hour’ is a beloved ritual. Meeting coworkers in a relaxed setting creates space for informal discussions of recent projects and ideas. It was in these surroundings that I first found myself creating beverage analogies. My theory in approaching this exercise is that the building blocks of modern web applications are intentionally designed to be understood at a high level of abstraction, even by those without technical backgrounds. In that spirit, this blog will introduce five technical concepts via happy-hour analogies before bringing them together in a functional web architecture.
- Analogy One: Consistency versus Availability
- Analogy Two: Sharding
- Analogy Three: Load Balancing and Queueing
- Analogy Four: Caching
- Analogy Five: Web Sockets
- Example Web Application
An introductory thought exercise
Each week many co-workers meet for a happy-hour gathering. People enjoy various drinks, but since I’m writing this from the Pacific Northwest, I’ll pick Kombucha as our running example. Our office currently has a few people and a couple of fridges, so it’s easy for everyone to get a drink when they want one. Imagine scaling that up by tens, hundreds, maybe millions of people:
- What sort of challenges would we face and how should we deal with them?
- How many unique drinks and flavours can (or should) we offer?
- Which types of drinks are offered, and which aren’t?
- How many people wanting drinks at the same time can we comfortably accommodate?
- If too many people reach a fridge at once, how should they queue (e.g. strictly first-in-first-out, or with a VIP lane)?
- What’s an acceptable/unacceptable wait time for people to get a drink?
- How does the demand for drinks compare at 8am on a Monday versus 4pm on a Friday?
Analogy One: Consistency versus Availability
My workspace is an office with multiple rooms and a fridge located a short walk away. It’s not uncommon for someone to venture to another room to pick up drinks for other people. Often, nobody is certain of the exact fridge inventory, so drink requests are based on presumptions and the fridge’s most recently known state (e.g. “There should be an orange Kombucha in the fridge. Please get one for me if it’s available”). In this scenario we sacrifice the guarantee that our drink inventory is correct (or ‘consistent’) in favour of being able to take people’s orders immediately (being ‘available’) the moment people want drinks. Alternatively, we could check the fridge first to ensure the inventory is correct, but that would slow things down, sacrificing availability for consistency.
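The trade-off can be sketched in a few lines of code. This is a hypothetical toy model (the `Fridge` class and its methods are made up for illustration): one read path answers instantly from a possibly-stale snapshot, the other pays a delay to refresh from the ground truth.

```python
import time

class Fridge:
    """Toy model of the consistency/availability trade-off."""

    def __init__(self, inventory):
        self.inventory = dict(inventory)   # ground truth (the actual fridge)
        self.last_known = dict(inventory)  # possibly-stale snapshot

    def take_drink(self, flavour):
        """Mutate ground truth without updating the snapshot."""
        if self.inventory.get(flavour, 0) > 0:
            self.inventory[flavour] -= 1

    def available_read(self, flavour):
        """AP-style: answer immediately from the last-known inventory."""
        return self.last_known.get(flavour, 0)

    def consistent_read(self, flavour):
        """CP-style: walk to the fridge (slow), refreshing the snapshot."""
        time.sleep(0.01)  # simulate the walk
        self.last_known = dict(self.inventory)
        return self.last_known.get(flavour, 0)

fridge = Fridge({"orange": 1})
fridge.take_drink("orange")             # someone grabbed the last one
print(fridge.available_read("orange"))  # 1 -- fast but stale
print(fridge.consistent_read("orange")) # 0 -- slow but correct
```

Neither read is “wrong”; they simply prioritise different guarantees, which is exactly the choice distributed databases face.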
Analogy Two: Sharding
Have you ever found yourself blindly rummaging through a fridge or a bucket of drinks to find your desired beverage? It can be inefficient and time-consuming. In our running analogy we currently have a fridge containing multiple drinks. Imagine the number of people and drinks tripled and required additional fridges. With this scale come new possibilities and new challenges for categorising and distributing drinks across fridges. In database terms, splitting data across multiple machines like this is called ‘sharding’, and there are a few common strategies:
- Regional: Usually partitioned by a static location identifier in the data, based on where the data is normally generated and consumed (eg. US East, US West, Canada etc.)
- Range: Usually based on a primary or composite identifier in the data, and partitioned based on groupings within this range (e.g. A-C, D-F, G-J etc.)
- Hashing: A hashing algorithm is used to randomly distribute data across the available database shards, which can reduce the risk of a disproportionate concentration of traffic (a ‘hotspot’) being served in one location.
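The hashing strategy is the easiest to demonstrate concretely. Below is a minimal sketch (fridge names and the drink keys are hypothetical): each drink name is hashed, and the hash deterministically picks one of the available “fridges”, so the same key always lands on the same shard while load spreads roughly evenly.

```python
import hashlib

FRIDGES = ["fridge-0", "fridge-1", "fridge-2"]  # our database shards

def shard_for(drink_name: str) -> str:
    """Hash-based sharding: a stable hash keeps each key on one shard."""
    digest = hashlib.sha256(drink_name.encode()).hexdigest()
    index = int(digest, 16) % len(FRIDGES)
    return FRIDGES[index]

for drink in ["orange kombucha", "ginger kombucha", "lavender kombucha"]:
    print(drink, "->", shard_for(drink))
```

Note the `% len(FRIDGES)` step: if the number of fridges changes, most keys remap, which is why real systems often use consistent hashing instead.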
Analogy Three: Load Balancing and Queueing
Think about the last time you went to an arena sports game or concert. How did you get inside? Chances are you entered through one of many entrances based on your ticket, or perhaps you were directed to the least-busy entrance. This segregation distributes the fans and increases the rate (or ‘throughput’) at which they can enter and exit the venue. Just as in the physical world, load balancers disperse web traffic across multiple web servers, reducing the risk of congestion in a single spot. Continuing the game/concert analogy: once inside, you might head to the bar and queue again. Were there different queue types? Probably a first-in-first-out queue, but maybe also a VIP queue for certain ticket holders.
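Both ideas fit in one short sketch. This is a hypothetical illustration (server names and the `BarQueue` class are made up): round-robin load balancing hands each arrival to the next server in rotation, and a priority queue gives the VIP lane precedence while falling back to first-in-first-out order within each lane.

```python
import heapq
from itertools import count

SERVERS = ["server-a", "server-b", "server-c"]
_rotation = count()

def next_server() -> str:
    """Round-robin load balancing: each arrival goes to the next server."""
    return SERVERS[next(_rotation) % len(SERVERS)]

class BarQueue:
    """A queue with a VIP lane: VIPs (priority 0) are served before
    general admission (priority 1); ties break by arrival order (FIFO)."""

    def __init__(self):
        self._heap = []
        self._arrival = count()

    def join(self, name, vip=False):
        priority = 0 if vip else 1
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def serve(self):
        return heapq.heappop(self._heap)[2]

print([next_server() for _ in range(4)])  # wraps back around to server-a

q = BarQueue()
q.join("alice")
q.join("bob", vip=True)
print(q.serve())  # bob -- the VIP lane jumps the queue
print(q.serve())  # alice
```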
Analogy Four: Caching
Call it what you like, but everyone hates having to wait; everyone hates ‘latency’. Imagine one flavour of Kombucha were significantly more popular than the others. It would make sense to make it the most easily accessible drink in our fridge. What if the fridge were a long walk away? Perhaps we’d put a mini fridge closer to our desks and stock it with the most popular drinks. In web terms, this is ‘caching’: keeping frequently requested data in a smaller, faster store closer to where it’s needed.
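The mini fridge maps neatly onto a least-recently-used (LRU) cache. Here is a hypothetical sketch (the `MiniFridge` class is invented for this analogy): it holds only a few drinks, and when space runs out, the drink that hasn’t been asked for in the longest time gets evicted back to the big fridge.

```python
from collections import OrderedDict

class MiniFridge:
    """An LRU cache: a small 'mini fridge' that keeps only the most
    recently requested drinks, evicting the least recently used."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._drinks = OrderedDict()

    def get(self, flavour):
        if flavour in self._drinks:            # cache hit: no long walk
            self._drinks.move_to_end(flavour)  # mark as recently used
            return self._drinks[flavour]
        return None                            # cache miss: walk to the big fridge

    def stock(self, flavour, count):
        self._drinks[flavour] = count
        self._drinks.move_to_end(flavour)
        if len(self._drinks) > self.capacity:
            self._drinks.popitem(last=False)   # evict least recently used

cache = MiniFridge(capacity=2)
cache.stock("orange", 3)
cache.stock("ginger", 2)
cache.get("orange")          # touch orange so it stays "popular"
cache.stock("lavender", 1)   # over capacity: ginger is evicted
print(cache.get("ginger"))   # None -- evicted
print(cache.get("orange"))   # 3 -- still cached
```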
Analogy Five: Web Sockets
Let’s suppose you wanted a drink from another room and needed to phone someone to request it. You could place a new phone call each time you had a question, or you could stay on the same call until the drink is delivered, permitting a two-way conversation with real-time updates. Both options have pros and cons. If the consumer valued real-time updates on their order, a single two-way conversation would beat placing multiple short calls. But maintaining a long call isn’t ideal either if, say, the messenger has only one phone and must take drink orders from multiple people. Like so many choices in systems design, it’s a case of weighing the options for a given scenario.
Example Web Application
Finally, we can merge all of the previous analogies into a single architecture. Let’s create a simple app, with the following requirements, to coordinate the tasks of requesting and delivering drinks:
- Users can use the app to both request drinks and to accept other users’ drink requests.
- We will refer to the users requesting drinks as ‘consumers’ and those delivering drinks as ‘messengers’.
- From within the app, users can toggle their availability off and on. If they are available to fetch drinks, they will be sent drink requests that they can accept or refuse via the app.
- If a request is not successfully paired with a messenger within three minutes, the request will terminate with the drink consumer notified to try again later.
- The maximum order size is four drinks. This limit is intentional, so that orders can be easily carried by the drink messengers.

- Users are first routed through a load balancer, which relays their traffic to an available web server.
- Information on locally available drinks can be loaded via the cache.
- Precise drink inventory is tracked via a database which is sharded based on the regions that drinks are located in.
- A WebSocket connection is favoured to prioritise real-time updates for drink requests.
- New drink requests are placed into a queue which is consumed by two ‘matchmaker’ services that attempt to match drink consumers with drink messengers. A request is removed from the queue when it is completed, cancelled, or three minutes pass without a messenger accepting it.
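The matchmaking step above can be sketched as follows. This is a simplified, hypothetical model (the `Matchmaker` class and its injectable clock are inventions for this post): requests wait in a FIFO queue, and any request older than the three-minute limit is skipped rather than matched, at which point the consumer would be notified to try again later.

```python
import time
from collections import deque

REQUEST_TTL = 180  # seconds: three minutes before a request expires

class Matchmaker:
    """FIFO matchmaking with a time-to-live on each drink request."""

    def __init__(self, ttl=REQUEST_TTL, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock      # injectable clock keeps this testable
        self.queue = deque()    # entries of (consumer, enqueued_at)

    def request(self, consumer):
        self.queue.append((consumer, self.clock()))

    def match(self, messenger):
        """Pair the messenger with the oldest non-expired request."""
        while self.queue:
            consumer, enqueued_at = self.queue.popleft()
            if self.clock() - enqueued_at <= self.ttl:
                return (consumer, messenger)
            # expired: the consumer would be notified to try again later
        return None

now = [0.0]
mm = Matchmaker(clock=lambda: now[0])
mm.request("alice")
now[0] = 200                 # more than three minutes pass
mm.request("bob")
print(mm.match("carol"))     # ('bob', 'carol') -- alice's request expired
```

A real deployment would run this behind the queue service with the two matchmaker workers pulling concurrently, but the expiry logic is the same.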
Conclusion
An appropriate quote to draw us toward a conclusion comes from the philosopher Alfred Korzybski, who stated that “the map is not the territory”. This blog’s diagrams and analogies are all visual and logical abstractions of topics that could be dissected at far greater granularity. The beauty here is that the building blocks of the internet, while certainly complicated at a more granular level, can be understood at multiple levels of abstraction. Whether you come from a technical or non-technical background, I hope you have gained a new perspective from this blog, or at the very least some new ideas for your next workplace happy hour.