A new year has come and gone, and we’ve wasted no time in putting our New Year’s resolutions to work. Take a look at what the Google Cloud Platform team has been up to in the month of January.

Digging into containers

Containers continue to be a hot topic across the software development universe, including here at Google. In fact, two of Google’s open source projects – Kubernetes and cAdvisor – center on containers and how they are run in clusters, and both projects were named Open Source Rookies of the Year by Black Duck Software this month.

To keep you up to speed, we launched a blog series diving into the technology and explaining the new paradigms. For a primer, start with “An introduction to containers, Kubernetes, and the trajectory of modern cloud computing.” In a nutshell, the post explains what a container provides that a VM does not. Read the piece to see why this matters, and see the relationship between single-instance containers, Docker (an open platform for distributed applications), Kubernetes (clusters of intelligently managed containers), and Google Container Engine (containers-as-a-service hosted on Google).

If you’re looking to dive deeper into the world of containers and Google’s rationale for creating Kubernetes, read “What makes a container cluster,” which talks about the ingredients of a great container cluster manager and the benefits of running containers in large-scale clusters, and “Everything you wanted to know about Kubernetes but were afraid to ask.”

And we weren’t just talking about containers this month. Following on the launch of the alpha version of Google Container Engine in November at Google Cloud Platform Live, we announced the beta release of Google Container Registry, a new service that’s designed to provide secure and private Docker image storage on Google Cloud Platform.

Demystifying cloud pricing

Speaking of hot topics, this month we also unpacked a topic that tops the list: pricing. Pricing is a critical consideration for users trying to make the best decision about infrastructure systems design, but it’s also complex and sometimes cloudy (pun intended). Learn what exactly you get for your money through an analysis of Google Cloud Platform pricing compared to Amazon Web Services.

Tech tips on tap: Dataflow big data pipelines, verifying MongoDB backups, diagnosing bottlenecks...

A few tech tips and other tidbits you may have missed in the past month:

From genomics to website design: how to’s with our customers

The true benefit of this quickly evolving cloud technology really shines through in our customer stories. This month, we heard from customers spanning industries and geographies, including:

  • Alacris Theranostics, a Berlin-based spin-off of the Max Planck Institute for Molecular Genetics, is using Google Cloud Platform to better match cancer patients with the most promising drug therapies.
  • Aucor, based in Finland, transitioned customer websites onto Google Cloud Platform, providing them the capacity to scale with their expanding customer base and focus on what they do best: design awesome websites.
  • Shine Technologies, a digital consultancy based in Australia, uses Google BigQuery to help businesses make sense of the billions of ad clicks, ad impressions and other data that guide business decisions.
  • Aerospike, an open-source NoSQL database company based here in Mountain View, pushes the limits of Local SSD technology to offer blazing performance: fully 95% of local SSD reads complete in under 1 ms. In fact, benchmarks show that Aerospike delivers a 15x price advantage in storage costs with Local SSD compared with RAM.

New year, new series: Introducing the Learn with Google Cloud Platform Webinar Series

We’ve kicked off a monthly webinar series featuring use cases and real-time Twitter and Google+ Q&A sessions to pull back the curtain on solving complex business challenges in the cloud and nurturing business growth. This month’s webinar discussed how high-growth online retailer zulily leveraged big data to offer a uniquely tailored product and customer experience to a mass market around the clock.

It’s been an exciting month, and February promises to bring more discussion of container clusters and more tips, news and stories from the cloud. Stay tuned, and Happy Friday!

-Posted by Charlene Lee, Product Marketing Manager

In the previous weeks, Miles Ward, Google Cloud Platform’s Global Head of Solutions, kicked off the Kubernetes blog series with a post about the overarching concepts around containers, Docker, and Kubernetes, and Joe Beda, Senior Staff Engineer and Kubernetes co-founder, articulated the key components of a container cluster management tool based on Google’s ten years’ experience in running its entire business on containers. This week, Martin Buhr, Product Manager for the Kubernetes open source project, answers many of your burning questions about Kubernetes and our support for containers on Google Cloud Platform.

Everything you wanted to know about Kubernetes but were afraid to ask

When we announced the Kubernetes open source project in June of 2014, we were thrilled with the large community of customers and partners it quickly created. Red Hat, VMware, CoreOS, and others are helping to grow and mature Kubernetes at a remarkable pace. There is also a growing community of users who are not only utilizing Kubernetes to manage their container clusters but in many cases also contributing to the project itself.
I’ve been fortunate to be able to engage with many in our community, and we consistently hear many of the same questions:

  • Given that Google already has its own mature, robust cluster management systems (which handle around two billion new containers a week), why did you create Kubernetes?
  • How does Kubernetes relate to Docker? How does it differ from Docker Swarm?
  • What ensures that Google is committed to the Kubernetes open source project over the long run?
  • How does Kubernetes fit in with and augment your overarching strategy for Google Cloud Platform?
  • What incentive does Google have to make Kubernetes great outside of Google Cloud Platform, for deployment on-premises or on other public clouds?
  • What is the relationship between Kubernetes and Google Container Engine, now and in the future?

This post will answer these questions, and we’d love to field others we may have missed via the Kubernetes G+ page.

Why Kubernetes?

Given that Google already has its own mature, robust cluster management systems, many wonder why we created Kubernetes. There are actually two reasons for this.

First, there is the altruistic motive. We have enjoyed amazing benefits by moving to the model embodied by Kubernetes over the past ten years. It enabled us to dramatically scale developer productivity and the number of services we were able to offer without investing in a corresponding increase in operational overhead. It also gave us fantastic workload portability, enabling us to quickly “drain” applications from one resource pool and move to another. As with many other technologies and concepts that we’ve shared with the community over the years, we think Kubernetes will help make the world a better place and help others enjoy similar benefits. Other examples include Android, Chromium, and many of the technologies that underpin the rising popularity of Linux containers (including memcg, the Go programming language in which Docker is written, cgroups, and cAdvisor).

Second, there is the practical reason grounded in our desire to make Google Cloud Platform the best platform on the web for customers to build and host their applications. As Urs Hölzle, Senior Vice President for Technical Infrastructure at Google noted last March, we’re unifying Google’s core infrastructure and Google Cloud Platform and see a significant business opportunity for Google in Google Cloud Platform. By enabling customers to start using the same patterns and best practices Google has developed for its own container based workloads, we make it easy for customers to move those workloads around to where they make the most sense based on factors like latency, cost, and adjacent services. We think over time that our deep, comprehensive support for containers on Google Cloud Platform will create a gravity well in the market for container based apps and that a significant percentage of them will end up with us.

How does Kubernetes relate to Docker? How does it differ from Docker Swarm?

When referring to “Docker,” we’re specifically talking about using the Docker container image format and Docker Engine to run Docker images (as opposed to Docker Inc., the company that has popularized these concepts). These Docker containers are then managed by Kubernetes.

Imagine individual Docker containers as packing boxes. The boxes that need to stay together because they need to go to the same location or have an affinity to each other are loaded into shipping containers. In this analogy, the packing boxes are Docker containers, and the shipping containers are Kubernetes pods.

Ultimately, all these pods make up your application.
You don’t want this ship adrift on the stormy seas of the Internet. Kubernetes acts as ship captain – adeptly steering the ship along a smooth path, and ensuring that the applications under its supervision are effectively managed and stay healthy.
Once you move beyond working with a handful of containers, and especially when your application grows beyond a single physical host, we strongly advise that you use Kubernetes (for reasons we’ve highlighted recently).
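To ground the analogy, here’s a rough sketch of a pod’s shape: a named group of containers that are always scheduled onto the same host together. This is plain Python mirroring the structure of a pod manifest rather than the actual Kubernetes API, and the application and image names are hypothetical:

```python
# A pod groups containers that must be co-located, sketched here as a
# plain dict that mirrors the shape of a pod manifest.
pod = {
    "kind": "Pod",
    "metadata": {"name": "photo-api"},  # hypothetical application name
    "spec": {
        "containers": [
            # "Packing boxes" that must travel together:
            {"name": "api-server", "image": "example/api:1.0"},
            {"name": "log-shipper", "image": "example/logger:1.0"},
        ]
    },
}

# Kubernetes schedules every container in a pod onto the same host.
colocated = [c["name"] for c in pod["spec"]["containers"]]
```

In a real cluster you would declare this in a manifest file and let Kubernetes keep the pod healthy across failures.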

As for how Kubernetes differs from other container management systems out there, such as Swarm: Kubernetes is the third iteration of cluster managers that Google has developed. It incorporates the cumulative lessons of over a decade of experience in production container management. It embodies the cluster-centric model, which we’ve found works best for developing, deploying, and managing container based applications. Swarm and similar systems embody the single-node model and may work well for some use cases, but there are several critical architectural patterns missing that customers will ultimately need as they move to production use cases (these were highlighted in Joe’s post last week).

Is Google committed to Kubernetes?

Both customers and partners are asking variations of the following question: “Given that I’m considering betting the future of my project/app/business on the long term viability of Kubernetes, what assurance do I have that Google will not lose interest over the long term, causing the project to wither?”

First, as outlined above, we view Kubernetes as core to our cloud strategy, and we’re internally committed to making Google Cloud Platform a significant part of Google’s overall business. Our deep experience in running containerized workloads is a big competitive advantage for Google Cloud Platform, so it makes sense for us to continue to invest in making Kubernetes robust and mature. As an expression of this, we have some of our most experienced engineering talent working on the project, including Googlers with years of experience developing and refining our internal cluster management systems and processes.

Second, we’ve been very fortunate to have a vibrant, experienced community of contributors form around Kubernetes. Many of them have incorporated Kubernetes into their own products, resulting in a vested interest in the health and sustainability of Kubernetes. For example, Red Hat made Kubernetes an integral part of OpenShift version 3, and as of the time of this post, two of the top ten contributors are from the growing team Red Hat has working on Kubernetes. Thus, even if Google were to get taken out by a meteorite, a significant community of contributors would remain to carry it forward.

How does Kubernetes fit into Google’s cloud strategy?

As we mentioned, Google Cloud Platform is a key business for Google, and we are confident (based on ten years of experience using containers to run our business and the significant technical and operational depth we’ve acquired in doing so) that we can make Google Cloud Platform the best place on the web for containers. Kubernetes embodies the best practices and patterns based on this hard won experience for creating and running container based workloads.

We think that Kubernetes will help developers create better container based applications that require less operational overhead to run, thereby accelerating the trend toward container adoption. Given the inherent portability of container based applications managed by Kubernetes, every new one created is another candidate to run on Google Cloud Platform.

Our hope is that container based apps will be made even more awesome through the use of Kubernetes (regardless of where they reside), and our goal is to ensure that Kubernetes based apps will be exceptionally awesome on Google Cloud Platform. How much of the market moves to containers and how much of this load we’re able to attract to Google Cloud Platform remains to be seen, but we’ve placed our bets on wide-scale adoption.

Kubernetes on other clouds? On-premise?

For our strategy to be successful, we need Kubernetes to be awesome everywhere, even for customers who will run their apps on other clouds or in their own datacenters. Thus, our goal for Kubernetes is ubiquity. Wherever you run your container based app, our hope is that you do so using Kubernetes so that you can benefit from all the things Google has gotten right over the years (as well as the numerous lessons we’ve learned from the things we got wrong). Even if you never plan on moving beyond your own datacenters, or plan on sticking with your current cloud provider exclusively into the foreseeable future¹, we would still love to talk to you about why Kubernetes makes sense as a foundational piece of your container strategy.

Kubernetes and Google Container Engine?

This brings us to Google Container Engine, our managed container hosting offering and the embodiment of Kubernetes on Google Cloud Platform. We want everyone to use Kubernetes based on its own merits and develop container based apps based on proven patterns battle tested at Google. In parallel, we’re making Google Cloud Platform a fantastic place to develop and run container based applications, giving customers the benefits of not only Google’s experience in operating and maintaining container clusters, but also of all the adjacent services on Google Cloud Platform. At present, Google Container Engine is simply hosted Kubernetes, but look for us to start introducing features and linkages to other Google Cloud Platform services to further enhance its utility.

We're Stoked!

It’s an exciting time to be an application developer! As you’ve seen above, Google is deeply committed to Kubernetes, and we and our ecosystem of contributors are working hard to make sure it’s the best tool for creating and managing container clusters regardless of where these clusters run. From our perspective, the first and best option is that you run your container based apps on Google Container Engine, second best is that you run them on Google Compute Engine using Kubernetes, and third best is that you run them someplace else using Kubernetes.

The thing that most excites me about Kubernetes is the frequency at which I see customers rolling up their sleeves and contributing to the project itself. While I’m very proud of what our extended team has created in Kubernetes, I think Joe Beda said it best in his most recent blog post:

While we have a lot of experience in this space, Google doesn't have all the answers. There are requirements and considerations that we don't see internally. With that in mind, please check out what we are building and get involved!

Try it out, file bug reports, ask for help or send a pull request (PR).

-Posted by Martin Buhr, Product Manager, Kubernetes

¹ The theories of supply chain diversification and vendor risk management both recommend against relying on a single supplier for any critical component of one’s business or infrastructure. This has been borne out by the experience of numerous customers over the years with large vendors of proprietary IT systems and software. Part of the appeal of Docker and Kubernetes is the degree to which they significantly lower the friction involved in moving applications between various resource pools (laptop to server, server to server, data center to data center, cloud to cloud, etc.).

(Cross-posted on the Google for Work Blog)

Many businesses around the world rely on VMware solutions to virtualize their infrastructure and optimize the agility and efficiency of their data centers. Today we’re excited to announce that we are teaming up with VMware to make select Google Cloud Platform services available to VMware customers via vCloud Air, VMware’s hybrid cloud platform. We know how valuable flexibility is to a business when determining its total infrastructure solution, and with today’s announcement, enterprise businesses leveraging VMware’s datacenter virtualization solutions gain the flexibility to easily integrate Google Cloud Platform.

Businesses can now use Google Cloud Platform tools and services – including Google BigQuery and Google Cloud Storage – to increase scale, productivity, and functionality. VMware customers will benefit from the security, scalability, and price performance of Google’s public cloud, built on the same infrastructure that allows Google to return billions of search results in milliseconds, serve 6 billion hours of YouTube video per month and provide storage for 425 million Gmail users.

With Google BigQuery, Google Cloud Datastore, Google Cloud Storage, and Google Cloud DNS directly available via VMware vCloud Air, VMware customers will benefit from a single point of purchase and support for both vCloud Air and Google Cloud Platform:

  • vCloud Air customers will have access to Google Cloud Platform under their existing service contract and existing network interconnect with vCloud Air, and will simply pay for the Google Cloud Platform services they consume.
  • Google Cloud Platform services will be available under the VMware vCloud Air terms of service, and will be fully supported by VMware’s Global Support and Services (GSS) team.
  • Certain Google Cloud Platform services are also fully covered by VMware’s Business Associate Agreement (BAA) for US customers who require HIPAA-compliant cloud service.

Google Cloud Platform services will be available to VMware customers beginning later this year, so we’ll have more information very soon. In the near future, VMware is also exploring extended support for Google Cloud Platform as part of its vRealize Cloud Management Suite, a management tool for hybrid clouds.

Today’s announcement bolsters our joint value proposition to customers and builds on our strong, existing relationship around Chromebooks and VMware View and also around the recently announced Kubernetes open-source project. We look forward to welcoming VMware customers to Google Cloud Platform.

-Posted by Murali Sitaram, Managing Director, Global Partner Strategy & Alliances, Google for Work

Today’s guest blog comes from Graham Polley, Senior Consultant for Shine Technologies, a digital consultancy in Melbourne, Australia. Shine builds custom enterprise software for companies in many industries, including online retailers, telecom providers, and energy businesses.

Wrestling with large data sets reminds me of that memorable line from Jaws when police chief Brody sees the enormous great white shark for the first time: “You’re gonna need a bigger boat”. That line pops into my head whenever we have a new project at Shine Technologies that involves processing and reporting on massive amounts of client data. Where do we get that ‘bigger boat’ we need to help businesses make sense of the billions of ad clicks, ad impressions, and other data that can guide business decisions?

Four or five years ago, without any kind of ‘bigger boat’ available, we simply couldn’t grind through terabytes of data without plenty of expensive hardware, and a lot of time. We’d have to provision new servers, which could take weeks or even months, not to mention costs for licensing and system administration. We could rarely analyze all the data at hand because it would overwhelm network resources; we’d usually end up analyzing just 10% or 20%, which didn’t give us complete answers to client questions or provide any discernible insights.

When one of our biggest clients, a national telecommunications provider in Australia, needed to analyze a large amount of their business data in real time, we chose Google’s DoubleClick for Publishers product. We realized we could configure DoubleClick to store the data in Google Cloud Storage, and then point Google BigQuery to those files for analysis, with just a couple of clicks.
Finally, we thought, we’ve found something that can scale effortlessly, keep costs down, and (most importantly) allow us to analyze all of our client’s data as opposed to only small chunks of it. BigQuery boasts impressive speeds, is easy to use, and comes with a very short learning curve. We don’t need to provision any hardware, or spin up complex Hadoop clusters, and it comes with a really nice SQL-like interface that even makes it possible for non-technical people, such as business analysts, to easily interrogate and draw insights from the data.

When the same client came to us with a particularly complex problem, we immediately knew that BigQuery had our backs. They wanted us to stream millions of ad impressions from their large portfolio of websites into a database, and generate analytics about that data using some visually compelling charts, in real time. Using its streaming functionality, we started pumping data in, which went off without a hitch, and we sat back and watched as millions of rows flowed into BigQuery. When it came to interrogating and analyzing the data, we experienced consistent results in the 20-25 second range for grinding through our massive data set of 2 billion rows using relatively complex queries to aggregate the data.
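The queries involved are standard SQL-style aggregations. As an illustration (the table and column names below are invented, not our client’s actual schema), here is what the roll-up looks like conceptually, shown alongside a tiny in-memory sample:

```python
from collections import Counter

# The BigQuery SQL is a standard aggregation; the table and column
# names here are invented for illustration.
query = """
SELECT campaign_id, COUNT(*) AS impressions
FROM ad_events
GROUP BY campaign_id
ORDER BY impressions DESC
"""

# The same roll-up over a tiny in-memory sample of streamed rows:
rows = [
    {"campaign_id": "spring-sale", "site": "news.example.com"},
    {"campaign_id": "spring-sale", "site": "blog.example.com"},
    {"campaign_id": "brand", "site": "news.example.com"},
]
impressions = Counter(r["campaign_id"] for r in rows)
top_campaign = impressions.most_common(1)[0]  # ("spring-sale", 2)
```

BigQuery runs exactly this kind of GROUP BY across billions of rows, returning in seconds what the toy version does over three.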

The streaming capability of BigQuery lets us analyze our client’s data instantly and empowers them with real-time insights, rather than waiting for slower batch jobs to complete. The client can now instantly see how ad campaigns are performing, and change the ad creative or target audience on the fly in order to achieve better results.

Simply put, without BigQuery it just would not have been possible to pull this off. This is bleeding-edge technology; the idea of doing something similar in the past with a relational database management system (RDBMS) was simply inconceivable.

The success of this project opened up a lot of doors for us. After we blogged about it, we received several requests from prospective clients wanting to know if we could apply the same technology to their own big data projects, and Google invited us to become a Google for Work Services partner. Our clients are continuously coming up with more ideas for driving insights from their data, and by using BigQuery we can easily keep up with them.

Big data can seem like that great white shark in Jaws: unmanageable and wild unless you have the right tools at your disposal to tame it. BigQuery has become our go-to solution for reeling in data, processing it, and discovering the value within.

Contributed by Graham Polley, Senior Consultant, Shine Technologies

Learn more about Shine Technologies and the business impact of BigQuery. Watch as BigQuery takes on Shine Technologies' 30 Billion Row, 30 Terabyte Challenge.


Part 1 - Virtual Compute

When designing infrastructure systems, whether creating new applications or deploying existing software, it’s crucial to manage cost. Costs come from a variety of sources, and every approach to delivering infrastructure has its own tradeoffs and complexities. Cloud infrastructure systems create a whole new range of variables in these complex equations.

In addition, no two clouds are the same! Some bundle components while others offer more granular purchasing. Some bill in different time increments, and many offer a variety of payment structures, each with differing economic ramifications. How do you figure out what each costs and make a choice?

To help you work through this, we’ve created an example: a fairly common scenario, a mobile application with its backend in the cloud. This application shares pictures in some way, and has about 5 million monthly active users. Let’s go through what instance types this application will need to meet that user-driven workload, then price out what that will cost in an average month on Google Cloud Platform and compare against Amazon Web Services.

Our example application has 4 components:

  • An API frontend that mobile devices will contact for requests and actions. This portion will consume the majority of the compute cycles.
  • A static marketing and blog front end.
  • An application layer that will process and store images as they come in or are accessed.
  • A Cassandra cluster on the back end to store operational metadata.

For capacity planning, we have scoped as follows:

  • The API frontend instances can respond to roughly 80 requests per second. We expect about 350 requests per second given this number of users. Therefore we should only need four regular instances for this layer.
  • The marketing front end shouldn’t need more than two instances for redundancy.
  • The application layer will need four instances for image processing and storage control.
  • The Cassandra cluster will need five instances with a higher memory footprint. Let’s assume for now that the workload is entirely static, and autoscaling isn’t being used (oh don’t worry, we’ll add that and more back in later).

Our example application’s logical architecture is shown in Figure 1.

To explain the nuances of cloud pricing, let’s use Google Cloud Platform and Amazon Web Services as the example cloud infrastructure providers, and start with the simplest model: on-demand pricing. We can use the calculators that each provider offers to find correct pricing quickly:

Please note that we completed these calculations on January 12, 2015, and have included the output prices in this post. Any discrepancies are likely due to pricing or calculator changes following the publishing of this post.

Here is the output of the pricing calculators:

Google Cloud Platform estimate:
Monthly: $2,610.90

Amazon Web Services estimate:
Monthly: $4,201.68

Right away, things don’t look equivalent: Google’s pricing is 38% lower. Why? Google includes an automatic discount called the Sustained Use Discount, which reduces the cost of long-running instances. Since we didn’t autoscale or otherwise vary our system over the course of the month, the full 30% discount applies. Even without that discount, Google’s pricing comes in at $3,729.86, an 11% discount off Amazon’s on-demand rates. Over the course of a year, going with Google would save you just over $19,000!
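The arithmetic behind those figures is easy to check directly, using the calculator outputs quoted above:

```python
# Calculator outputs (January 12, 2015), in dollars per month.
aws_on_demand = 4201.68
gcp_pre_discount = 3729.86  # GCP total before any discount

# The full 30% sustained use discount applies to a steady workload.
gcp_monthly = round(gcp_pre_discount * 0.70, 2)  # 2610.90

pct_lower = 1 - gcp_monthly / aws_on_demand          # ~38% lower
pct_lower_pre = 1 - gcp_pre_discount / aws_on_demand  # ~11% lower

annual_savings = round((aws_on_demand - gcp_monthly) * 12, 2)  # just over $19,000
```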

Reserved Instances

Amazon Web Services has an alternate payment model, where you can make a commitment to run infrastructure for a longer period of time (either 1 or 3 years), and opt to pay some portion of the costs up front, which they call Reserved Instances. Here are the costs for our example app with Amazon’s Reserved Instance pricing:

Amazon Web Services, no-upfront, 1 year estimate:
Monthly: $2,993.00

Over a one-year term with Amazon, if you commit to pay for the instances for that entire period and opt for the “no-upfront” option, you still end up with a 13% higher cost than making no commitment at all to Google.

Amazon Web Services, partial upfront, 1 year estimate:
Upfront: $18,164.00
Monthly: $1,093.54
Effective monthly: $2,607.21

If you opt to pay over $18k up front using the “partial upfront” model, you arrive at a lower price, saving $44 (not thousands) over the course of the year.

Amazon Web Services, all upfront, 1 year estimate:
Upfront: $30,649.00
Monthly: $0.00
Effective monthly: $2,554.08

If you choose instead to pay 100% of the yearly cost up front, you’d end up saving $681.78 over the course of the year versus Google Cloud Platform, or 2.3%. As you can see, however, the upfront payment is over $30,000!

Similarly, Amazon offers three-year options for the partial upfront and all upfront models:

Partial upfront, 3 year estimate:
Upfront: $27,585.00
Monthly: $897.90
Effective monthly: $1,664.15

All upfront, 3 year estimate:
Upfront: $56,303.00
Monthly: $0.00
Effective monthly: $1,563.97

If you’re willing to part with just over $56,000 for the three-year, all upfront Reserved Instance, you’d receive a 40% discount off of Google’s rate, for a total projected gap of over $37k.

However, as I’m sure you can surmise, a significant up-front commitment and payment create several risks. The bottom line: you’re locked into a long-term pricing contract, and you risk missing out on substantial savings. Let’s look at why:
  1. Infrastructure prices will drop, either for Google (which has happened 3 times in the last 12 months, as we've reintroduced Moore’s law to the cloud), or for Amazon (which has happened 2 times in the last 12 months). For 2014, this worked out to an average of a 4.85% price reduction per month on Google Cloud Platform. Due to on-demand pricing, any reduction in prices is something you automatically receive on GCP.
  2. Also, don’t forget, capital is expensive! Most businesses pay a ~7% per year cost of capital, which reduces the value of these up-front purchases significantly. For this example, that adds an effective $11,823.63 to the 3-year all up-front Reserved Instance price from Amazon.
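That cost-of-capital figure is simple interest on the upfront payment (a simplification; annual compounding would make it slightly larger):

```python
# AWS 3-year all-upfront Reserved Instance payment, from the estimate above.
upfront = 56303.00
cost_of_capital = 0.07  # ~7% per year
years = 3

# Simple-interest opportunity cost of tying the capital up for three years:
effective_extra = round(upfront * cost_of_capital * years, 2)  # 11823.63
```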

So, let’s revisit that $37,689.40 gap. Adding in the cost of capital and subtracting likely instance price reductions, even at the most aggressive discount AWS offers, AWS costs $60,244.21 while Google Cloud Platform costs $57,959.57, which equates to a 3.9% cost advantage for Google.

Under even conservative assumptions about public cloud pricing dynamics (3% per month price reductions, 7% cost of capital), 3-year all-upfront RIs from AWS are not cost-efficient compared to on-demand pricing with Sustained Use Discounts from Google Cloud Platform.


Committing to specific usage choices also creates cost risks:

  1. New instance types might make your old choices inefficient (c3 instances from AWS are substantially more cost efficient for some workloads than older m3 instances, for example).
  2. Your software might change. For example, what if you improve the efficiency of your software to reduce your infrastructure requirements by 50%? What if you re-platform from Windows to Linux (Reserved Instances require a commitment to an OS type)? Or what if your memory needs grow, and instances need to switch from standard to high-memory variants?
  3. Your needs might change. For example, what if a new competitor arrives and takes half of your customers, reducing the load on your infrastructure by 50%?
  4. What if you picked everything right but the geography, and your app is suddenly popular in Asia or Europe?

The “on-demand” agility and flexibility of cloud computing is supposed to be a huge financial benefit, especially when your requirements change. Let’s imagine that in the second month several of the risks above actually happen: you move to the Asian market, resize a few instances to better map to actual workload, and shrink the Cassandra cluster’s redundancy a bit, thanks to how reliable instances with live migration are. That would look something like Figure 2.
Google Compute Engine estimate:
Monthly: $909.72

Amazon Web Services Partial upfront, 1 year, estimate:
Upfront: $6350.00
Monthly: $331.42
Effective monthly: $860.59

This system costs less than half of what the original system costs, and is on an entirely different continent. But what does it cost to change your plan? At Google, very little: you don’t pay any direct penalty for changing your infrastructure design. Your only cost is however long the two systems run simultaneously to facilitate a zero-downtime cut-over.

In stark contrast, the cost of changing the Amazon system is essentially the total loss of whatever committed funds you applied to earn the discount, plus a new requirement for upfront funds to get an efficient price (and a re-commitment!) in your new configuration, on top of the above-mentioned dual-system usage (which costs more per hour...)

Let’s look at this from a cash flow perspective, not even in the worst case, but just assuming you wanted to break even with Google pricing on Amazon and chose the partial-upfront, one-year Reserved Instance.

Google: Month 1 usage: $2,610.90 + Months 2-13 usage: $909.72 x 12 = $13,527.54

Amazon: Month 1 commit: $18,164.00 + Month 1 usage: $1,093.54 + Month 2 commit: $6,350.00 + Months 2-13 usage: $331.42 x 12 = $29,584.58
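To double-check the arithmetic, here is a quick Python sketch that recomputes both 13-month totals from the figures quoted above:

```python
# 13-month cash flow totals, using the figures quoted above
google = 2610.90 + 909.72 * 12                            # month 1, then months 2-13
amazon = 18164.00 + 1093.54 + 6350.00 + 331.42 * 12       # commits plus usage

print(f"Google: ${google:,.2f}")          # $13,527.54
print(f"Amazon: ${amazon:,.2f}")          # $29,584.58
print(f"Gap:    ${amazon - google:,.2f}")
```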

That’s a big gap, even before figuring in the cost of capital! You can see how risky those commitments can be. AWS does offer a service to mitigate some of that risk, the RI Marketplace, which lets you attempt to sell Reserved Instance units back to other AWS customers. But as I’m sure you can imagine, this process presents a few risks of its own:
  1. Are the RIs you’re selling for instance types that are now clearly inefficient for many workloads, and therefore not desirable to other customers?
  2. Will your RIs sell for full price, or at some discount to encourage a sale?
  3. How many buyers are there in the marketplace, and how quickly will your RIs sell, if at all?
  4. What if you didn’t start out in the US? The RI Marketplace is only available for customers with a US bank account.
One risk that’s a guaranteed loss: every sale on the RI Marketplace carries a 12% fee, payable to Amazon. Let’s say you have great luck and are able to sell 10 months of your original 12-month RI (they have to be sold in whole-month increments, rounding down) at full original price, which nets you back $13,320.27 after fees. Now your 13-month total is $16,083.19, so you’ve only lost $2,555.65 compared to what you would have paid using Google. But what a hassle, and how much risk did you take on? What if the RIs didn’t sell for a few months? Every month they don’t, you lose $1,332. Ouch!
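For reference, the sell-back arithmetic works out like this in Python (assuming a sale at full original price and the 12% marketplace fee):

```python
RI_UPFRONT = 18164.00   # the original 12-month reserved commitment from above
FEE = 0.12              # RI Marketplace fee, payable to Amazon

months_sold = 10        # whole-month increments, rounded down
gross = RI_UPFRONT * months_sold / 12   # pro-rated value of the months sold
net = gross * (1 - FEE)                 # what you actually get back

print(f"Net after fees: ${net:,.2f}")                          # $13,320.27
print(f"Value lost per unsold month: ${net / months_sold:,.2f}")  # ~ $1,332
```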

Automatic Scaling

“But this is a backwards example,” you say. “Cloud isn’t intended for this kind of static sizing; you’re supposed to be autoscaling to tightly follow load.” True! So let’s imagine that the above reflects the requirements of our steady-state load, and that we have four small peaks during the day: morning rush, lunch peak, after-work, and midnight madness, each of which spikes at 10x the above workload. (Our application passes the toothbrush test!) Our backend handles these spikes fine, but our web and API tiers need to autoscale dramatically. Let’s say each of these peaks onsets very rapidly, over the course of five minutes, and lasts for 15 minutes. Note that we see systems that spike at 100x or more, so this scenario isn’t extreme!

This kind of system is pretty easy to build efficiently on Google. Instances take roughly a minute to launch, so we can easily autoscale to accommodate load, and since we bill in per-minute increments with only a 10-minute minimum, this adds just $110.77 a month to our bill, even with 10x peaks!

Google Compute Engine estimate:
Monthly additional: $110.77

Building this on AWS is just not as efficient. Because instances take more than five minutes on average to launch, we need to pre-trigger our instance boots (read: timing logic or manual maintenance). Also, AWS bills for instances in full-hour increments, so we pay for 60 minutes when we only use ~20 for each of our four peaks. This brings the total additional cost to $341.60, and without any ability to discount appropriately via Reserved Instances, that’s a number an AWS customer can’t bring down today.

Amazon Web Services estimate:
Monthly additional: $341.60
            + instance launch management logic (manual ops or development)
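The billing-granularity difference drives most of that gap. Here is a minimal Python sketch of the two billing models assumed in this example (per-minute with a 10-minute minimum vs. full-hour increments); it is a simplification, not official rate-card logic:

```python
import math

def billed_minutes_per_minute(used_min: int, minimum: int = 10) -> int:
    """Per-minute billing with a 10-minute minimum (the Google model above)."""
    return max(used_min, minimum)

def billed_minutes_hourly(used_min: int) -> int:
    """Billing in full-hour increments (the AWS model above)."""
    return math.ceil(used_min / 60) * 60

# A ~20-minute spike (5-minute onset + 15 minutes at peak):
spike = 20
print(billed_minutes_per_minute(spike))  # 20 minutes billed
print(billed_minutes_hourly(spike))      # 60 minutes billed, 3x the actual usage
```

That roughly 3x multiple on every short spike is consistent with the ~$341.60 vs. ~$110.77 monthly figures above.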

While this spike example is one utilization behavior we see frequently, we also see basic diurnal (day/night) variability of anywhere from 2x to 5x in utilization on almost every customer-facing service. If that natural variation isn’t being followed using Autoscaler or other automated resource management, you are definitely leaving money on the table!


While there are many more dimensions to evaluate, hopefully this is a helpful analysis of how pricing systems differ between Google and Amazon. We’re not stopping here; look forward to more comparisons with more cloud providers and more workloads to help you understand exactly what you get for your money.

We are hyper-focused on driving cost out of cloud services, leading the way with innovations such as Sustained Use Discounts and per-minute billing. As Christian F. Howes, VP of Engineering at our customer StarMaker Interactive, said, “App Engine's minute-by-minute scaling and billing saves us as much as $3,000 USD per month.”

We think pricing considerations are critical for users trying to make the best decision they can about infrastructure systems design. I’d love to hear your thoughts: what matters to you in cloud pricing? What areas are confusing, hard to analyze, or hard to predict? What ideas do you have? Reach out!

-Posted by Miles Ward, Global Head of Solutions, Google Cloud Platform

Interested in cloud computing with containers? Join us for an evening with the experts on Kubernetes, the open source container cluster orchestration platform. There will be talks, demos, a panel discussion, and refreshments sponsored by Intel.

Many Kubernetes contributors will be attending, including folks from Google, Red Hat, CoreOS, and others.

Time: 6:00PM-10:00PM PST
Location: San Francisco, CA

Detailed agenda coming soon. Register here.

Today, Black Duck Software announced their annual Open Source Rookie of the Year awards. We’re very excited that two of our open source projects, Kubernetes and cAdvisor, were amongst those selected! The award recognizes the top new open source projects of the past year. Both projects center on containers and how they’re run in clusters. Kubernetes is a container cluster manager and cAdvisor analyzes the performance of running containers. Read on to learn more about these projects.

Developers want to focus on writing code, and IT operations want to focus on running applications efficiently. Using Docker containers helps to define the boundaries and improve portability. Kubernetes takes that one step further and lets users deploy, manage, and orchestrate a container cluster as a single system.

Kubernetes is designed to be portable across any infrastructure, which allows application owners to deploy on laptops, servers, or cloud, including Google Cloud Platform, Amazon Web Service and Microsoft Azure.

It lets you break applications down into small sets of containers that can be reused. It then schedules these containers onto machines and actively manages them. These can be logically grouped to make it even easier for users to manage and discover them. Kubernetes is lightweight, portable, and extensible. You can start running your own clusters today.

Kubernetes started about a year ago as a small group of Googlers who wanted to bring our internal cluster management concepts to the open source containers ecosystem. Drawing from Google’s 10+ years of experience running container clusters at massive scale, the group developed the first few prototypes of Kubernetes. Six months and lots of work later, the first version of Kubernetes was released as an open source project. We were all humbled and excited to see the overwhelmingly positive response the project received. Although it started as a Google project, it quickly gained owners from Red Hat and CoreOS, and many, many contributors. In November, we announced Google Container Engine, which offers a hosted Kubernetes cluster running on Google Cloud Platform. This makes it even easier to run Kubernetes by letting us manage the cluster for you.

What’s next for Kubernetes? The team and community are furiously working toward version 1.0, the first production-ready release. Expect to see a slew of improvements in user experience, reliability, and integration with other open source tools.

cAdvisor analyzes the resource usage and performance characteristics of running containers. It aims to give users and automated systems a deep understanding of how their containers are performing. The information it gathers is exposed via a live-updating UI (see a screenshot below) and through an API for processing by systems like InfluxDB and Google’s BigQuery. cAdvisor was released alongside Kubernetes back in June and has since become a de facto standard for monitoring Docker containers. Today, it’s run on all Kubernetes clusters and can monitor any type of Linux container. cAdvisor has even become one of the most downloaded images on the Docker Hub.

Below is a screenshot of part of the cAdvisor UI showing the live-updating resource usage of a container. The screenshot shows total CPU and memory consumption over time as well as the instantaneous breakdown of memory usage.

Continuously updating view of a container's resource usage

The cAdvisor team is working to make it even easier to understand your running containers by surfacing events that let you know when your containers are not getting enough resources. Alongside these come suggestions on actions you can take to remedy the problem. Events and suggestions can be integrated into systems like Kubernetes to allow for auto-scaling, resizing, overcommitment, and quality-of-service guarantees for containers.

We’re extremely grateful to the open source community for embracing both of these projects so widely. Our aim was to address a need we saw in the open source containers community and start a dialogue around containers and how they should be run. And as we continue to collaborate with the open source community, we look forward to evolving these projects. We invite you to join us in making Kubernetes and cAdvisor better! Try them out, open issues, send patches, and start discussions. Happy hacking!

-Posted by Greg DeMichillie, Director of Product Management