Cloud computing for non-technical product managers
Clouds and crepuscular rays (Photo by Sam Schooler on Unsplash)


This explainer article is an experiment for me. If you consider yourself to be a non-technical product manager, I would love to know in the comments below whether this article has been helpful and pitched at the right level for you.

To understand how cloud computing works, we’re going to start with the basic building blocks and work our way up.


Main bits of a computer

While much faster, smaller and more powerful, the laptop and smartphone you use today still share many characteristics with their ancestors that used to fill a room:

they have a CPU (central processing unit) that receives and carries out the instructions you send it;

they have faster memory for the CPU to ‘remember’ snippets of information that it will need to access again quickly while the system is still running;

they have slower storage media to house larger amounts of information that doesn’t need to be accessed as quickly, and which will not disappear even if the system is shut down and restarted.

Each of these main components can talk to the others.

When you start your laptop or smartphone, it reads information from the storage media to load the operating system into memory (think Windows, macOS or Linux for laptops, or Android or iOS for phones). You can think of the operating system as a very long and complex set of instructions that allow you to interact with your laptop or smartphone. The CPU then reads these instructions from memory and runs them. You then interact with the operating system using your keyboard, mouse or touchscreen to start up and use apps.

Running software on servers

A picture of a Dell PowerEdge 850 server with its case open (Credit: Rodzilla at English Wikipedia) CC-BY-SA 3.0

Now imagine your laptop was 10 times more powerful. Let’s call it a server. Because it’s more powerful, it’s going to use more electricity and generate more heat, so we put it in a special rack (a vertical, physical housing for servers) in a data centre to provide the power and cooling it needs to keep running without overheating and damaging itself.

Although we could if we wanted to, we won’t be plugging in a monitor, keyboard and mouse directly to our server. Instead, we’re going to access the software it’s running remotely over the internet.

We could stick with the same model of using that server as an individual unit to run software for us (like your laptop or smartphone). It’s more powerful, sure, but there’s always an upper limit to how much stuff you can run on one machine.

The other consideration is if the one and only server you have fails, you lose everything at once. So rather than having one gigantic server running everything, let’s start with lots of smaller servers instead.

Clustering servers

30U server rack diagram with colours showing clustering
Nice rack

Now imagine you can fit 40 of these servers into one rack. You can run lots of software, but you’re increasing the administrative effort of having to keep each of those 40 individual servers running well, and you still have to remember which server is running which bits of software. Plus, if any one of those servers fails, you lose the bit of software it was running, which may break the whole setup. For example, one server might be running a database, which other software in your rack may depend on.

So you might decide to cluster together two or more of your servers and devote them to a specific bit of software. In a cluster, all the servers share information and work together to spread the workload across them, depending on which of the physical servers in the rack are more or less busy. That way if one fails, there are others to keep that bit of software running. This is called redundancy.

We’ll use bigger clusters (more servers) for our more compute-intensive software and for software we know is going to be used by more people at the same time.
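To make the idea of spreading work across a cluster concrete, here’s a toy Python sketch (purely illustrative, not how real load balancers are written): requests are handed to each server in turn, and if a server fails, the rest simply keep serving.

```python
# Toy sketch of a cluster spreading work across its servers.
# Requests go to each server in turn ("round robin"), so no single
# machine does everything and losing one doesn't lose the service.
class Cluster:
    def __init__(self, servers):
        self.servers = list(servers)

    def handle(self, request):
        # Rotate through the healthy servers for each new request.
        server = self.servers.pop(0)
        self.servers.append(server)
        return f"{server} handled {request}"

    def fail(self, server):
        # If a server dies, the others keep serving: redundancy.
        self.servers.remove(server)

cluster = Cluster(["server-1", "server-2", "server-3"])
print(cluster.handle("req-a"))  # server-1 handled req-a
print(cluster.handle("req-b"))  # server-2 handled req-b
cluster.fail("server-3")
print(cluster.handle("req-c"))  # server-3 is gone; server-1 steps in
```

Real clusters are cleverer than this: they check which servers are actually busy rather than taking strict turns, but the principle is the same.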

Automation and orchestration

Let’s take that concept further. At the moment, our 40 servers are divided up into small, redundant clusters, each dedicated to a particular bit of software. Each machine in a cluster is identically set up (same operating system, same software, same configuration etc.) so that it doesn’t matter which of the servers in a given cluster you’re talking to, they all behave identically.

It’s becoming a bit of a chore to manage all these servers individually and to make sure that each machine in its respective cluster is set up identically. Plus, if one of the servers fails, we have to physically replace it, reload it with the operating system and software, and configure it again. When you upgrade your smartphone, it’s always a bit of an effort to get the new one all set up again to your liking, even if it helps you transfer apps across from the old phone.

It would be rather helpful if we could automate that process. Imagine you could take a perfect snapshot of all the apps, configuration and logins that your old phone was using and simply clone that snapshot to your new phone. Et voilà, you can pick up using your new phone exactly where you left off with your old one.

That’s basically how we’re going to manage our rack of servers. We could have a library of server snapshots corresponding to each of the bits of software we want to run. If we need to add or replace a server in a cluster, we clone the corresponding snapshot to the new server, fire it up, and off we go again. That certainly saves us a lot of manual work.

Having to physically replace failed servers is still a bit of a chore, so let’s keep a few of our 40 servers aside as spares. That way, when we need to provision a replacement server, we’ve already got one ready to be deployed. And providing we always have working spares available, the orchestration software can automatically detect which type of server failed, push the right kind of snapshot to the spare server, then add it to the cluster to replace the failed unit.
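The orchestration logic just described can be sketched in a few lines of Python. All the names here are made up for illustration; real orchestration software is vastly more sophisticated, but the shape of the loop is similar: spot a failed server, take a spare, clone the right snapshot onto it, and slot it into the cluster.

```python
# Hypothetical sketch of an orchestrator replacing failed servers.
# Each cluster has a snapshot; spares get that snapshot cloned on.
def replace_failed_servers(clusters, spares, snapshots):
    for name, cluster in clusters.items():
        failed = [s for s in cluster if s["status"] == "failed"]
        for server in failed:
            cluster.remove(server)
            if not spares:
                break  # no spares left; a human needs to rack more
            spare = spares.pop()
            spare["image"] = snapshots[name]  # clone the snapshot
            spare["status"] = "running"
            cluster.append(spare)

clusters = {
    "database": [
        {"id": "srv-01", "status": "running"},
        {"id": "srv-02", "status": "failed"},
    ],
}
spares = [{"id": "srv-39", "status": "idle"}]
snapshots = {"database": "db-snapshot-v7"}

replace_failed_servers(clusters, spares, snapshots)
print([s["id"] for s in clusters["database"]])  # ['srv-01', 'srv-39']
```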

We can go a little further still. Some bits of software may be in more demand (= under higher load) than others, and this may change over time. We might want to respond in real-time to an increase in demand by automatically provisioning more servers and adding them to the heavily loaded cluster. And when demand for that bit of software dies down again, we could either return the extras to our spares pool to be reused, or re-provision them for use in other clusters as needed.
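That scaling decision boils down to simple arithmetic. Here’s a minimal sketch, assuming (purely for illustration) that we aim to keep each server at no more than 70% of its capacity, always keep at least two servers for redundancy, and can’t exceed the physical servers we own:

```python
# Toy autoscaling rule: how many servers does this cluster need?
# The 70% target and the floor of 2 are illustrative assumptions.
def desired_servers(total_load, capacity_per_server, maximum):
    usable = int(capacity_per_server * 0.7)  # headroom per server
    target = -(-total_load // usable)        # ceiling division
    return max(2, min(maximum, target))

# A spike of 1,000 requests/sec, servers handling 100 each:
print(desired_servers(1000, 100, 40))  # 15 servers
# A quiet period still keeps 2 servers for redundancy:
print(desired_servers(50, 100, 40))    # 2 servers
```

When demand rises, the orchestrator runs something like this, compares the answer to the current cluster size, and provisions or releases servers to close the gap.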

Software containers

Shipping containers stacked up (Photo by Guillaume Bolduc on Unsplash)

So far, so good. We’re able to automatically scale each bit of software up and down in response to demand, and we can cope with a few server failures without losing service. But we’re still working in increments of physical servers. What if we only needed smaller increments to run a piece of software?

It used to be the case that software was monolithic (literally ‘single stone’ in ancient Greek), meaning it was designed only to be run on a single, usually powerful server. Modern software typically runs as a collection of smaller, independent components.

Even a relatively modest server is powerful enough to run several bits of software alongside each other. As before, we could install all the different bits of software onto a single server, then clone the whole thing as needed. However, we start to run into problems when one bit of software needs a particular set of components which conflicts with what a different piece of software needs. It can become overly complex to make the two co-exist on the same physical server, so what we really need is a way to divide up the server to keep each bit of software plus its dependent components in ‘boxes’ separate from each other. A common way of doing this is with containers.

Containers allow us to run several bits of software, each with their own dependent components, alongside each other without interfering with one another. Software containers rely on and share the underlying operating system, so a container of Windows software has to run on a Windows host, a Linux container on a Linux host and so on.

We can run one or more separate containers (instances) of the same bit of pre-configured software on the same server. More usefully, we can easily transfer a container as a unit to another server and run the software there. We’re no longer worried about what other software that server happens to be running because each container is self-contained.

This means we can take advantage of our clustering and orchestration again for redundancy and workload management. Only this time we’re not provisioning whole server snapshots (operating system, software and configuration), we’re deploying software containers (just software and configuration, no operating system).

Before, we ran a particular software app across a cluster of physical servers for redundancy and to scale performance. Now we’re running the software across a cluster of containers spread across different physical servers. We can take more granular advantage of which physical servers are more or less busy, adding and removing containers as needed, wherever we want.

Once we start doing this, it makes it a lot harder to point to any given physical server and say what software it happens to be running – each bit of software is running as a cluster of containers, and those containers could be spread across any of our 40 servers at that particular moment. The orchestration software takes care of moving the containers around as needed and ensuring they continue to operate together as a cluster.
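The scheduling idea behind this can be sketched very simply. This toy Python example (an illustration, not a real scheduler, which would weigh memory, CPU and much else) places each new container on whichever server currently has the fewest, so software ends up smeared across the rack rather than tied to particular machines:

```python
# Toy container scheduler: put each container on the least-busy server.
def place_containers(servers, containers):
    placement = {name: [] for name in servers}
    for container in containers:
        least_busy = min(placement, key=lambda s: len(placement[s]))
        placement[least_busy].append(container)
    return placement

placement = place_containers(
    ["srv-1", "srv-2", "srv-3"],
    ["db-1", "db-2", "web-1", "web-2", "web-3"],
)
print(placement)
# {'srv-1': ['db-1', 'web-2'], 'srv-2': ['db-2', 'web-3'], 'srv-3': ['web-1']}
```

Notice that neither database container knows or cares which physical server it landed on, which is exactly why pointing at a server and asking “what does this one run?” stops being a meaningful question.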

Moving to cloud scale

CERN’s server room – big, but nowhere near cloud provider scale (Photo by Florian Hirzinger on Wikipedia) CC-BY-SA 3.0

To recap where we’ve got to so far: we have 40 servers in our rack which are collectively acting as an amorphous lump of server processing power. We have lots of different software applications running inside containers, which the orchestrator is deploying as needed across the 40 servers. Everything scales up and down according to demand, and on failure of a physical server, the orchestrator simply deploys more containers elsewhere to handle the workload, and brings spare physical servers online when more processing power is needed.

I’ve been using the example of 40 servers because it is roughly equivalent to a typical server rack stuffed full to the gills. 40 servers is a massive amount of processing power if we’re comparing it to our individual use of our laptops or smartphones. Unless you’re handling AI workloads (which are far more compute intensive), you’re unlikely to need all that processing power all of the time. You might need it to handle the occasional spike in demand, but for the rest of the time it’s sitting idle, waiting patiently for a bit of action. What if we could rent out the spare processing capacity on demand to paying customers?

That’s largely what cloud computing companies are doing. They allow you to run containers and virtual machines on their physical server hardware, securely partitioned off from each other, and charge you for usage or ongoing rental. You access them remotely across the internet and only have a vague idea of where they’re actually running, maybe a geographical region like London, West Canada or Malaysia. (In network diagrams, the internet was traditionally represented as a fluffy cloud for some reason, hence ‘cloud’ computing.)

This is great if you don’t have the capital or in-house expertise to spin up and run your own servers, or if you need a vast amount of processing power for a very short time, very infrequently (as one of my clients needed in order to sequence genomes over a weekend). But convenience comes with a cost: beyond a certain amount of regular usage, it will usually work out cheaper to run your own kit, even accounting for the overheads and cost of acquiring in-house expertise to maintain it.

Cloud computing providers are operating at a scale that makes our example of 40 servers seem minuscule. Each data centre these cloud computing providers operate can easily house several thousand racks, each of 40 servers or more. And the big players operate multiple data centres (again, for redundancy) in different geographical regions worldwide. But if you were able to nip into one of those vast data centres and open up a single physical server in one of their racks, you’d still be able to identify the basic building blocks we started with.

Here be dragons: a simplification

I’ve deliberately glossed over a lot of detail for the sake of brevity and clarity of concepts. Treat this as a starting point, the broad brushstrokes if you like. At each level we’ve skimmed over, you could happily dive down a rabbit hole and spend a significant part of your working life specialising in any one of the concepts I’ve described. I spent the first six years of my career specialising in load balancer software (the bit that spreads workload across a software cluster) and I didn’t even call that bit out explicitly :-)

Where can I learn more?

Glad you asked. Wikipedia is a good starting point for topics like this, though it can rapidly descend into deeply technical discussion. Other useful reads:

“What is server clustering?”, Melanie Purkis, Liquid Web blog (retrieved 15 May 2025)

“What is orchestration?”, Red Hat blog (retrieved 15 May 2025)

“What is a container?”, Docker blog (retrieved 15 May 2025)

“What is virtualization?”, Opensource.com (retrieved 15 May 2025)

(I’m not affiliated with any of the above organisations nor endorse their products or services.)

If you got this far, thank you for reading. Let me know whether you found this explainer helpful (or not) in the comments below.

Jock Busuttil is a product management and leadership coach, product leader and author. He has spent over two decades working with technology companies to improve their product management practices, from startups to multinationals. In 2012 Jock founded Product People Limited, which provides product management consultancy, coaching and training. Its clients include BBC, University of Cambridge, Ometria, Prolific and the UK’s Ministry of Justice and Government Digital Service (GDS). Jock holds a master’s degree in Classics from the University of Cambridge. He is the author of the popular book The Practitioner’s Guide To Product Management, which was published in January 2015 by Grand Central Publishing in the US and Piatkus in the UK. He writes the blog I Manage Products and weekly product management newsletter PRODUCTHEAD. You can find him on Mastodon, X (formerly Twitter) and LinkedIn.
