Previous Episode: Containers?!
Next Episode: What is the cloud?

Today on the show: hyperscale datacenters. After this episode, you'll know what they are, what makes them special, and why they are important for the cloud.


#Episode transcript:#


##Prologue##

As the use of computers grew rapidly in the 1990s, so did the need for servers and datacenters. Back in the day, network connections were slow and expensive, so datacenters had to be built close to the companies and users using them. Usually that meant building the datacenter in the office building’s basement.

There was this Nordic company whose business model relied heavily on using a lot of servers. So naturally, they also had to have quite a massive basement. This essentially meant the basement was business-critical: if the computers were harmed, the company would lose their reputation, their business, everything. The office was in an area with low natural disaster risk; for example, there had been no recorded earthquakes in modern history.

However, the basement of this company's office was flooded a few years ago. This wasn’t just an inconvenience for the office workers. The flooding was a serious threat to the future of the company, as the server room was completely under water. As everyone knows, computers and water don't mix well. The situation seemed dire: the company could lose all their data, and their business could go under. At this darkest of hours, the friendly neighborhood sysadmin jumped in and saved the day by swimming to the servers and rescuing them.

In the end the flood affected their business, but they avoided a catastrophe. So how could this situation have been avoided? That's what we're discussing in today's episode: hyperscale datacenters.


##Introduction##

Hi, and welcome to Cloud Gossip. I'm Annie, and I'm a cloud marketing expert and a startup coach. Hey, my name is Teemu. I'm a cloud developer, DevOps trainer, and an international speaker. And I'm Karl, and I'm a cloud and security consultant for enterprise customers; I also moonlight as an international speaker. Today on the show: hyperscale datacenters. After this episode, you'll know what they are, what makes them special, and why they are important for the cloud. This podcast is part of a four-part series, which you can find on Apple Podcasts, Android podcast apps, or on our website, CloudGossip.net.


##History of datacenters##

Hi, this is Karl again. So, what is the cloud? The cloud, as we know it, is a network of modern hyperscale datacenters. These hyperscale datacenters of today are different from the datacenters we've had previously, so let's look at the history of datacenters leading up to the cloud. Before modern hyperscale datacenters, we used a single server at a time.

The first datacenters actually had only a single server, and it filled the whole room. As server sizes came down, we started to have datacenters in the modern sense: multiple servers connected to each other.

The idea was that pretty much every company with computing needs would build their own datacenter. A datacenter is a purpose-built space that hosts multiple servers and takes care of all their needs, such as electricity, heating, ventilation, air conditioning, and networking.


As all these companies were building their own datacenters, they had to maintain physical security themselves. This meant installing locks, keycard readers, and any other security measures their customers required. The physical location had to be picked carefully, and deals had to be made with energy providers.

When companies were running their own datacenters, it was a big deal that they were responsible for building, installing, updating, and end-of-lifing all the servers in their use. End-of-lifing means that when a physical server is so old that replacing broken parts is no longer feasible and buying a new server is cheaper, the old server is disposed of in a secure way.

The hard drives are wiped clean in a secure way, so that there is no way somebody could recover data from them. After that, they are physically destroyed.
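For the curious, here is a minimal sketch in Python of the idea behind that wiping step: overwriting the data several times with random bytes. It is a toy illustration on a regular file (the function name and parameters are ours); real disposal uses dedicated wiping tools and, as mentioned, ends with physical destruction.

```python
import os

def overwrite_file(path: str, passes: int = 3, chunk_size: int = 1024 * 1024) -> None:
    """Overwrite a file in place with random data, making several passes."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))  # cryptographically strong random bytes
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # force this pass onto the physical disk
```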

When a server eventually had a hardware failure, it would be out of use. This is called an outage. Preparing for outages meant taking care of spare parts: the datacenter owner had to purchase enough spares for their own use, or make sure they could get the needed parts when needed.

These tasks of running a datacenter required a lot of personnel. Once up and running, a typical datacenter could need one administrator per two dozen servers. A typical midsize company could easily have 1,000 servers in their datacenter, which meant having over 40 people on payroll just to keep the lights on and the servers running.


##Problems with traditional datacenters##

Hi, it’s Annie again. Running their own datacenter caused companies a lot of headaches. A major problem was outages: when an outage occurs, the services are not available for use.

There are two kinds of outages: planned and unplanned. Planned outages include, for example, migrating the whole datacenter to another physical location, or performing regular updates on the servers.

Let's talk about unplanned outages, as they are the ones that cause grey hair and sleepless nights. An unplanned outage means that something unexpected has happened. That something can be a connectivity outage: the servers and applications are still running, but the end users cannot connect to them. This usually means a network failure, such as a broken network cable or fried network equipment.

Unexpected outages can also be caused by hardware failures. To minimize outage times, datacenter owners always bought the best hardware they could afford, in the hope that money can buy happiness. They bought the hardware that promised the longest lifetime; a single piece of network equipment could cost the same as a Lamborghini.

Another cause of unexpected outages is heat. If not properly cooled down, a server can malfunction or even melt. A legendary solution to this problem was introduced by Google: in their early days, fresh out of the garage, they used cardboard boxes to isolate their servers instead of investing in expensive air conditioning.

Those were the issues around outages. Now let's move on and talk about issues related to physical security. Physical security issues range from someone stealing datacenter equipment for monetary gain to corporate espionage. An example of corporate espionage would be someone stealing hard drives with trade secrets on them. So it's not just about who has access to the servers, but also about making sure the servers and data actually stay within the datacenter.

This meant there was a need for electronic keycards, mantraps, and access control logs, which required even more manual work. Another problem was that lead times were long.

Whenever you needed a new server, you started by ordering the hardware, making sure it fit your budget, electricity, cooling, and network capacity. You set up the needed firewall rules and made sure there was physical space in the datacenter.

After this, you installed the operating system and the needed applications on the server. All of these steps required comprehensive knowledge of the system and specific skills for each area.

These tasks couldn't all be performed by the same people; instead, each task was handled by a dedicated team, which created bottlenecks, as one team had to wait for another to finish before starting their own work.

For example, a server's operating system cannot be installed and configured before network access is in place. All of this meant it could take weeks or even months to get everything up and running.


##Hyperscale datacenters##

This is Teemu again. The solution to these problems is hyperscale datacenters, also known as the cloud. Hyperscale datacenters are massive in size, mostly automated, and operate largely without human interaction. By massive we mean really MASSIVE: a typical datacenter building in the hyperscale world is large enough to cover two jumbo jets, which are among the largest aircraft in the world.

A datacenter building may host up to one million servers, and a datacenter usually has half a dozen buildings. In the hyperscale world, a lot of the human work is replaced by automation. A single administrator can now maintain 5,000 to 50,000 servers, compared to the couple dozen servers per administrator in a traditional datacenter.
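To see why that ratio matters, here is a quick back-of-the-envelope calculation in Python, using the numbers from this episode:

```python
servers_per_building = 1_000_000  # up to a million servers in one building
traditional_ratio = 24            # roughly one admin per two dozen servers
hyperscale_ratio = 5_000          # conservative end of the 5,000-50,000 range

print(servers_per_building / traditional_ratio)  # ~41,667 admins the old way
print(servers_per_building / hyperscale_ratio)   # ~200 admins with automation
```

At the traditional ratio, a single building would need a small town's worth of administrators, which is exactly why automation is not optional at this scale.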

Upgrades, that is, replacing servers, also happen much faster in the cloud. Instead of buying Lamborghini-priced hardware with a life expectancy of six years, hyperscale cloud providers buy the cheapest possible hardware with a life expectancy of two years and replace servers more often, gaining better energy efficiency and performance.


So, the hard work of preparing for outages no longer happens in the physical world, but rather in automation and software. Because no human work is involved, servers in the hyperscale cloud can be deployed in minutes, instead of the weeks or months it took in traditional datacenters.
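As an illustration of what "deployed in minutes" looks like from the user's side, here is a sketch against a made-up provisioning API. The function and its parameters are hypothetical, invented for this example; every real provider has its own equivalent, but the shape is the same: one authenticated call instead of weeks of tickets.

```python
import time

def create_server(name: str, cpus: int, memory_gb: int) -> dict:
    """Hypothetical stand-in for a cloud provider's API call. This one
    call replaces ordering hardware, racking, cabling, firewall tickets,
    and operating system installation."""
    return {
        "name": name,
        "cpus": cpus,
        "memory_gb": memory_gb,
        "status": "provisioning",     # the provider's automation moves this
        "requested_at": time.time(),  # to "running" within minutes
    }

server = create_server("web-01", cpus=2, memory_gb=8)
print(server["status"])
```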

The differences from traditional datacenters don’t end there. For a traditional datacenter owner, connectivity problems could be solved by calling the local network carrier. For a hyperscale cloud provider, connectivity problems are related to the speed of light: data simply takes a longer time to reach me from Australia than from Canada.

Solving those issues could include hiring your own submarine captains to lay new network cables on the bottom of the oceans for better intercontinental connectivity.
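To put rough numbers on the speed-of-light point: light in optical fiber travels at about two thirds of its vacuum speed, roughly 200,000 km per second, which sets a hard floor on latency. The cities and great-circle distances below are our own illustrative picks:

```python
FIBER_SPEED_KM_PER_S = 200_000  # light in glass: about two thirds of c

def one_way_latency_ms(distance_km: float) -> float:
    """Best-case one-way latency over a straight fiber run."""
    return distance_km / FIBER_SPEED_KM_PER_S * 1_000

# Approximate great-circle distances from Helsinki (illustrative):
print(one_way_latency_ms(15_000))  # Sydney: ~75 ms one way, ~150 ms round trip
print(one_way_latency_ms(6_600))   # Toronto: ~33 ms one way, ~66 ms round trip
```

No amount of money removes that floor; it can only be lowered by laying straighter, shorter cables, hence the submarine fleets.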


The massive scale and automation have revolutionized the world of datacenters as we know it.