Logs generated by applications and services can provide an immense amount of information about how your deployment is running and the experiences your users are having as they interact with your products and services. But as deployments grow more complex, gleaning insights from this data becomes more challenging. Logs come from an increasing number of sources, so they can be hard to collate and query for useful information. And building, operating and maintaining your own infrastructure to analyze log data at scale requires extensive expertise in running distributed systems and storage. Today, we’re introducing a new solution paper and reference implementation that shows how you can process logs from multiple sources and extract meaningful information using Google Cloud Platform and Google Cloud Dataflow.

Log processing typically involves some combination of the following activities:

  • Configuring applications and services
  • Collecting and capturing log files
  • Storing and managing log data
  • Processing and extracting data
  • Persisting insights

Each of those components has its own scaling and management challenges, and often calls for different approaches at different times. These sorts of challenges can slow down the generation of meaningful, actionable information from your log data.

Cloud Platform provides a number of services that can help you address these challenges. You can use Cloud Logging to collect logs from applications and services, then store them in Google Cloud Storage buckets or stream them to Pub/Sub topics. Dataflow can read from Cloud Storage or Pub/Sub (among other sources), process log data, extract and transform metadata, and compute aggregations. You can persist the output from Dataflow in BigQuery, where it can be analyzed or reviewed at any time. Because these are managed services, they scale when needed, and you don't need to provision resources up front.
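To make the processing step concrete, here's a minimal, illustrative sketch (plain Python, not the solution's actual Dataflow code) of the kind of transform such a pipeline applies to each log line, assuming a hypothetical Common Log Format source:

```python
import re
from collections import Counter

# Hypothetical Common Log Format line, e.g.:
# 127.0.0.1 - - [10/Oct/2015:13:55:36] "GET /index.html HTTP/1.1" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)

def parse_line(line):
    """Extract structured fields from one raw log line, or None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = int(record["size"])
    return record

def count_by_status(lines):
    """Aggregate parsed records into a per-status-code count."""
    records = (parse_line(line) for line in lines)
    return Counter(r["status"] for r in records if r is not None)
```

In the actual solution, the parse and aggregation run as Dataflow transforms over input read from Cloud Storage or Pub/Sub, with the results written to BigQuery.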

The solution paper and reference implementation describe how you can use Dataflow to process log data from multiple sources and persist findings directly in BigQuery. You’ll learn how to configure Cloud Logging to collect logs from applications running in Container Engine, how to export those logs to Cloud Storage, and how to execute the Dataflow processing job. In addition, the solution shows you how to reconfigure Cloud Logging to use Pub/Sub to stream data directly to Dataflow, so you can process logs in real-time.

Check out the Processing Logs at Scale using Cloud Dataflow solution to learn how to combine logging, storage, processing and persistence into a scalable log processing approach. Then take a look at the reference implementation tutorial on Github to deploy a complete end-to-end working example. Feedback is welcome and appreciated; comment here, submit a pull request, create an issue, or find me on Twitter @crcsmnky and let me know how I can help.

- Posted by Sandeep Parikh, Google Solutions Architect

(Cross-posted on the Google for Work Blog.)

Since the launch of our first product for businesses, the Google Search Appliance, in 2002, we’ve been building more and more products that help make businesses more productive. From Gmail to Docs to Chromebooks and Google Cloud Platform, we are now helping millions of businesses transform and support their operations with our Cloud products. In fact, more than 60% of the Fortune 500 are actively using a paid Google for Work product. And all of Google’s own businesses run on our cloud infrastructure. Including our own services, Google has significantly larger data center capacity than any other public cloud provider, which is part of what makes it possible for customers to receive the best price and performance for compute and storage services.

All of this demonstrates great momentum, but it’s really just the beginning. In fact, only a tiny fraction of the world’s data is currently in the cloud; most businesses and applications aren’t cloud-based yet. This is an important and fast-growing area for Google, and we’re investing for the future.

That’s why we’re so excited that Diane Greene will lead a new team combining all our cloud businesses, including Google for Work, Cloud Platform, and Google Apps. This new business will bring together product, engineering, marketing and sales and allow us to operate in a much more integrated, coordinated fashion.

As a long-time industry veteran and co-founder and CEO of VMware, Diane needs no introduction. Cloud computing is revolutionizing the way people live and work, and there is no better person to lead this important area. We’re also lucky that Diane has agreed to remain on Google’s Board of Directors (she has already served three years here), as she has a huge amount of operational experience that will continue to help the company.

I’m equally excited that Google has entered into an agreement to acquire a company founded by Diane: bebop is a new development platform that makes it easy to build and maintain enterprise applications. We think this will help many more businesses find great applications, and reap the benefits of cloud computing. bebop and its stellar team will help us provide integrated cloud products at every level: end-user platforms like Android and Chromebooks, infrastructure and services in Google Cloud Platform, developer frameworks for mobile and enterprise users, and end-user applications like Gmail and Docs. Both Diane and the bebop team will join Google upon close of the acquisition.

With these announcements, we’re excited to take the next step in helping businesses take advantage of the cloud to work better, operate more securely, run more efficiently and grow faster.

- Posted by Sundar Pichai, CEO

Google’s global network is a key piece of our foundation, enabling all of Google Cloud Platform’s services. Our customers have reiterated to us the critical importance of business continuity and service quality for their key processes, especially around network performance given today’s media-rich web and mobile applications.

We’re making several important announcements today: the general availability of HTTPS Load Balancing, and sustained performance gains from our software-defined network virtualization stack Andromeda, from which customers gain immediate benefits. We’re also introducing Cloud Router and Subnetworks, which together enable fine-grained network management and control demanded by our leading enterprise customers.

In line with our belief that speed is a feature, we’re also extremely pleased to welcome Akamai into our CDN Interconnect program. Origin traffic from Google egressing out to select Akamai CDN locations will take a private route on Google’s edge network, helping to reduce latency and egress costs for our joint customers. Akamai’s peering with Google at a growing number of points-of-presence across Google’s extensive global networking footprint enables us to deliver to our customers the responsiveness they expect from Google’s services.

General Availability of HTTPS Load Balancing. Google’s private fiber network connects the data centers where your applications run to more than 70 global network points of presence. HTTPS Load Balancing deployed at these key points across the globe dramatically reduces latency and increases availability for your customers, which is critically important to achieving the responsiveness users expect from today’s most demanding web and mobile apps. For full details, see the documentation.
Figure 1: Our global load balancing locations

Andromeda. Over the past year, we’ve written about the innovations made in Google’s data centers and networking to serve world-class services like Search, YouTube, Maps and Drive. The Cloud Platform team ensures that the benefits of these gains are passed on to customers with no additional effort on their part. Andromeda, Google’s software-defined network virtualization stack, delivers many of these gains, especially around performance. The chart below shows network throughput gains in Gbits/sec: in a little over a year, throughput has doubled for both single-stream and 200-stream benchmarks.

Subnetworks. Subnetworks allow you to segment IP space into regional prefixes. As a result, you gain fine-grained control over the full logical range in your private IP space, avoiding the need to create multiple networks, and providing full flexibility to create your desired topology.
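As a rough illustration of the idea, here's how you might plan regional prefixes from a single private range. The region names and IP ranges below are hypothetical, and this is planning arithmetic only, not a GCP API call:

```python
import ipaddress

# Hypothetical plan: carve one private /16 into per-region /20 subnetworks.
network = ipaddress.ip_network("10.240.0.0/16")
regions = ["us-central1", "europe-west1", "asia-east1"]

# Assign the first len(regions) /20 prefixes from the parent range.
subnets = dict(zip(regions, network.subnets(new_prefix=20)))

for region, subnet in subnets.items():
    print(f"{region}: {subnet} ({subnet.num_addresses} addresses)")
```

Each region gets its own contiguous prefix, while the full /16 remains one logical network.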

Additionally, if you’re a VPN customer, you’ll see immediate enhancement as subnetworks allow you to configure your VPN gateway with different destination IP ranges per-region in the same network. In addition to providing more control over VPN routes, regional targeting affords lower latency compared to a single IP range spanning across all regions. Get started with subnetworks here.

Cloud Router. With Cloud Router, your enterprise-grade VPN to Google gets dynamic routing. Network topology changes on either end propagate automatically using BGP, eliminating the need to configure static routes or restart VPN tunnels. You get seamless connectivity with no traffic disruption. Learn more here.

Akamai and CDN Interconnect. Cloud Platform traffic egressing to select Akamai CDN locations travels over direct peering links and is priced based on Google Cloud Interconnect rates. More information on using Akamai as a CDN Interconnect provider can be found here.

We’ll continue to invest and innovate in our networking capabilities, and pass the benefits of Google’s major networking enhancements to Cloud Platform customers. We always appreciate feedback and would love to learn how we can support your mission-critical workloads. Contact the Cloud Networking team to get started!

Posted by Morgan Dollard, Cloud Networking Product Management Lead

When choosing a virtual machine type, major cloud providers force you to overbuy: because VM sizes usually come in powers of two, you have to buy 8 vCPUs even when you only need 6. Today, this ends. With Custom Machine Types, you can create virtual machines with the shapes (i.e., vCPU and memory) that are right for your workloads.

You can create machine types with a single vCPU, or with any even number of vCPUs up to 32. For memory, you can choose up to 6.5 GiB per vCPU. Combine different numbers of vCPUs with different memory sizes to get the best possible price/performance fit for your workload. If your needs change, you can move your application to another configuration.
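A quick sketch of those constraints, assuming only the rules stated above (real validation may include additional limits, such as a per-vCPU memory minimum):

```python
def is_valid_shape(vcpus, memory_gib):
    """Check a custom machine shape against the constraints described above:
    1 vCPU or an even vCPU count up to 32, and at most 6.5 GiB of memory
    per vCPU. Illustrative only; other platform limits may also apply."""
    if vcpus != 1 and (vcpus % 2 != 0 or vcpus > 32):
        return False
    return memory_gib <= 6.5 * vcpus
```

For example, a 6 vCPU / 32 GiB shape is valid, while 7 vCPUs (odd) or 4 vCPUs with 30 GiB (over the per-vCPU ceiling) are not.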

For example, let's say you have a workload that works best on a machine type that is somewhere between the predefined n1-standard-8 and n1-standard-16 machine types. Instead of rounding up, you can create a custom machine type. The following table shows the monthly price of your Custom Machine Type compared to the other two:
*This is the price per month (current as of 11/18/2015), with full sustained-use discounts applied, when the instance runs for 100% of the month. To get a feel for how sustained-use pricing applies in your use cases, use the pricing calculator. Hourly rates are based on 730 hours a month, including the full sustained-use discount.

This is just one example of a recurring theme: the workload you run will rarely fit into a pre-defined shape. Once you "round up", you'll end up paying up to twice as much for just one more vCPU! Custom machine types solve this problem, letting you fit the VM to your workload, saving you money.

Custom Machine Types are priced based on hourly usage per vCPU and per GiB of memory. An 8 vCPU, 20 GiB memory VM costs twice as much as a 4 vCPU, 10 GiB memory VM. You also get our standard customer-friendly pricing, like per-minute billing and sustained use discounts, with Custom Machine Types.
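To illustrate the linear pricing model, here's a sketch using purely hypothetical rates (see the pricing page for real numbers):

```python
# Hypothetical per-resource rates, for illustration only.
VCPU_RATE = 0.03   # $ per vCPU-hour
MEM_RATE = 0.004   # $ per GiB-hour

def hourly_cost(vcpus, memory_gib):
    """Custom machine pricing is linear in each resource."""
    return vcpus * VCPU_RATE + memory_gib * MEM_RATE

# Doubling both resources exactly doubles the price:
small = hourly_cost(4, 10)   # 4*0.03 + 10*0.004 = 0.16
large = hourly_cost(8, 20)   # 8*0.03 + 20*0.004 = 0.32
```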

Give Custom Machine Types a try today! Custom Machine Types are supported by the gcloud command line tool and through our API. Creating a VM is as easy as:

$ gcloud components update
$ gcloud compute instances create my-custom-vm --custom-cpu 12 --custom-memory 45 --zone us-central1-f

For more info on Custom Machine Types, see our documentation here.

We’re rolling out Custom Machine Type support in Google Developers Console over the next few days. Visit the Google Compute Engine section of Google Developers Console and click Create Instance. In the Create instance page, you'll notice Machine Type now has a Basic and Customize view. Click Customize and build a virtual machine to fit your needs.

Custom Machine Types are available in beta and will work with CentOS, CoreOS, Debian, OpenSUSE and Ubuntu. Additional operating systems will be supported in the future. See the documentation for details.

Custom Machine Types and resource-based pricing are two more steps in our quest to create a cloud platform you can fit to your needs, instead of the other way around. Let us know what you think! You can reach us at

- Posted by Sami Iqram, Product Manager, Google Cloud Platform

Today’s guest post comes from Dale Humby, CTO of Nomanini, an enterprise payments platform provider based in South Africa that enables transactions in the cash-based retail sector. With Google Cloud Platform, Nomanini has seen a 20 percent boost in productivity since developers can focus on rolling out new features instead of focusing on infrastructure. Read how Humby and his team are enabling the delivery of essential services in far-flung locations.

About 50 percent of Africa’s population, or half a billion people, live on less than $1 per day. Many people don't have reliable access to electricity and telecommunications. Nomanini and our partners are helping change this by providing access to these essential services. Our network of merchants, equipped with point of sale terminals and a financial backend powered by Google Cloud Platform, allows us to distribute pre-paid airtime and electricity to far-flung, rural villages in an affordable and reliable way.
Vendors in Maputo

Our custom built, ruggedized point of sale terminals are known for their speed and reliability, and our financial processing backend has to be just as reliable. Merchants use Nomanini’s platform to make a living by earning commissions on their sales. Even a single incorrect sale or a few minutes of downtime costs merchants customers and money.

Many people choose Google App Engine because it automatically scales to hundreds of servers and beyond. While this is important to us as we grow exponentially, we initially chose App Engine for its reliability: Downtime puts merchants’ livelihoods at risk. We make extensive use of Task Queues for all processing, and Datastore, GCP’s high-availability NoSQL database, as our financial transaction store. In addition, all data is streamed in real-time to BigQuery for customer analytics and reporting. Nightly reconciliation jobs export data from Datastore to Google Cloud Storage for long-term backup.

The financial technology space is highly competitive. We have to continually improve our product or get left behind. Over the past four years Nomanini has built an engineering process that allows us to continually innovate, while simultaneously ensuring our product is (almost) bug free. Our team has doubled from three developers to six, while the number of deployments to production has increased from one per month to more than six per day, representing an increase in development velocity of 120 times. To do this we’ve been using Kanban, a production methodology developed by Toyota and more recently adopted by the software industry as a way to remain agile.
Team size and deploys to production per month since 2011

By continually improving our development process, we can build better products for our customers and provide updates that help them grow their own returns, like rolling out firmware with new products for our clients to sell and improving battery life.

Once code for the embedded system has been committed, it’s pushed to our cloud-hosted Mercurial source code repository. The continuous integration server checks out the code, builds all the artifacts, including executable binaries for the point of sale terminals, and uploads them to Google Cloud Storage. We run a service on App Engine that manages our build pipeline, tracking each commit through CI and test, Alpha, Beta and Stable phases.
Release pipeline: New versions in CI at top, stable version in production below

Production terminals are subscribed to either the Beta or Stable channels, and when a new version of the application is available they download the binaries directly from Cloud Storage over the GSM network, then install and upgrade automatically. Using Cloud Storage means we don’t have to manage our own FTP or HTTP file server, reducing operational complexity and cost and increasing reliability of the upgrade service.
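The phase progression described above can be sketched as a tiny state machine. This is an illustration of the idea, not Nomanini's actual pipeline service:

```python
# Ordered release phases each commit moves through, as described above.
PHASES = ["ci", "alpha", "beta", "stable"]

class Build:
    """Minimal sketch of tracking one commit through the release pipeline."""

    def __init__(self, commit):
        self.commit = commit
        self.phase = "ci"   # every build starts in continuous integration

    def promote(self):
        """Advance to the next phase; stable builds stay stable."""
        index = PHASES.index(self.phase)
        if index < len(PHASES) - 1:
            self.phase = PHASES[index + 1]
        return self.phase
```

A build service like the one described would promote each commit only after the checks for its current phase pass.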

Terminals collect diagnostic information and stream it to BigQuery. We run statistical tests comparing metrics between versions of code, and have automated alerts for when code in Beta deviates significantly from the Stable version, indicating a potential issue with the new version. (For more details, I’ve written a blog post on Monitoring distributed systems with BigQuery and R.)
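A drastically simplified stand-in for that kind of check (the real analysis uses BigQuery and R) might compare a beta metric against the stable baseline like this:

```python
import statistics

def beta_deviates(stable_samples, beta_samples, threshold=3.0):
    """Flag the beta channel when its mean metric drifts more than
    `threshold` stable standard deviations from the stable mean.
    A simplified sketch of the statistical tests described above."""
    baseline_mean = statistics.mean(stable_samples)
    baseline_stdev = statistics.stdev(stable_samples)
    drift = abs(statistics.mean(beta_samples) - baseline_mean)
    return drift > threshold * baseline_stdev
```

An alert fires only when the beta metric moves well outside the stable version's normal variation, rather than on every small difference.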

We make extensive use of Google Cloud Monitoring to view performance and business metrics on TVs around the office, and receive SMS and email alerts if Cloud Monitoring detects issues. All server logs are streamed to BigQuery where we can investigate issues, store logs for auditing and run analyses, for example, to highlight areas for performance improvement.

Cloud Platform offers a cohesive set of services that would be difficult to build with our small team, and almost impossible to host as reliably and securely as Google. By building on top of Cloud Platform, we can release features for production within a few hours of completing development, delighting our customers and giving us a significant competitive advantage. But what truly drives our product innovation is the knowledge that the better our products, the easier it is for people in remote rural villages to switch the lights on or get in touch with a loved one who’s far away.

To learn more about how Nomanini uses GCP, read our case study.

Posted by Dale Humby, CTO, Nomanini


Earlier today, Netflix announced Spinnaker, an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence.

Continuous delivery (CD) integrates software build, test and release processes, helping organizations to move quickly in response to market needs and customer demands. Spinnaker enables you to define your own CD workflows by linking together a series of steps into pipelines that can be triggered manually or automatically, for example, based on specific events such as the completion of a pipeline or a Jenkins job. Pipelines can be reused across workflows.
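The pipeline idea can be sketched in a few lines. This is an illustration of the concept, not Spinnaker's actual API:

```python
class Pipeline:
    """Conceptual sketch: an ordered series of steps, triggered manually
    or automatically when an upstream pipeline or CI job completes."""

    def __init__(self, name, steps):
        self.name = name
        self.steps = steps        # ordered list of callables
        self.downstream = []      # pipelines triggered on completion

    def trigger(self, artifact):
        # Run each step in order, threading the artifact through.
        for step in self.steps:
            artifact = step(artifact)
        # Completion of this pipeline triggers any downstream pipelines.
        for pipeline in self.downstream:
            pipeline.trigger(artifact)
        return artifact
```

In Spinnaker terms, the steps correspond to pipeline stages and the downstream links to pipeline-completion triggers, so pipelines compose and can be reused across workflows.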

We are pleased to announce that you can now use Spinnaker to manage your CD workflows on Google Cloud Platform. To get started quickly, you can deploy Spinnaker using Google Cloud Launcher, or follow the instructions to install and run Spinnaker.

A late 2013 study by Computer Associates indicates that CD yields an average 21% increase in new features delivered, a 22% improvement in product quality, a 19% increase in revenue and 50% fewer failures.1 Today, about 50% of organizations have implemented CD for some or all of their projects, according to a DZone 2015 survey.2 With these kinds of benefits, it’s only a matter of time before CD becomes the norm.

To learn more, check out our tutorial on managing deployments on Google Cloud Platform with Spinnaker, ask a question on Stack Overflow, or chat with us on the Spinnaker Slack channel. In the coming months, we plan to add support for containers, and to support CD on Kubernetes and Google Container Engine. If you’re new to Google Cloud Platform, redeem free credit to try out Spinnaker on Google Cloud Platform.

Posted by Rick Buskens, Engineering Manager at Google

1 CA Technologies, “DevOps Driving 20 Percent Faster Time-to-Market for New Services, Global IT Study Reveals,” 9/12/13, survey of 1,300 senior IT decision-makers worldwide.

2 DZone 2015 Continuous Delivery Survey.

Our startup series shines the spotlight on some of the most innovative and game-changing companies using the Google Cloud Platform for Startups program, designed to help startups build and scale. Today we hear from Mark Johnson, CEO of Descartes Labs, a deep-learning satellite image analysis company that reveals insights about the Earth’s resources. Learn how Descartes Labs can make quick and accurate predictions about U.S. agriculture yield, spin up more than 30,000 processor cores and sustain 23 gigabytes per second for 16 hours using Google Cloud Platform.

For the past four decades, NASA satellites have been capturing images of our planet from space. However, until recently, that archive of images was not easily accessible because of the limitations of desktop computers. Descartes Labs analyzes all of the images from NASA and other satellites to gain invaluable insights about human populations and natural resources, such as the growth and health of crops, the growth of cities, the spread of forest fires and the state of available drinking water across the globe. Our first product tracks corn (maize) crops across the world; based on an 11-year historical backtest, we’re now predicting the U.S. corn yield faster and more accurately than the United States Department of Agriculture.

If we had to set up our own data center and physical infrastructure, we could not have started this company. The storage and bandwidth alone, to be able to download over a petabyte (8×10^15 bits) of satellite images from NASA’s servers, would have been prohibitive. Fortunately, Google hosts a repository for all the public Landsat imagery in its Google Earth Engine project. We were granted full access to that data and set up high-bandwidth links between our Google Compute Engine applications running in containers and the satellite data. Otherwise, we would have spent all our time moving data.

With Google Cloud Platform, we benefitted by having a virtual supercomputer on demand, without having to deal with all the usual space, power, cooling and networking issues. Just a few years ago, we would have needed to use the largest supercomputers on the planet to do what we’re now able to do with Google. In a recent analysis run, we spun up more than 30,000 processor cores — something unimaginable a few years ago — and sustained throughput of 23 gigabytes per second for 16 hours. That’s an incredible feat considering that, within seconds after our big run was over, those computational resources were released and made available to other customers for their own needs.
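The back-of-envelope arithmetic behind that run is straightforward:

```python
# Sustained throughput of the run described above.
throughput_gb_per_s = 23
hours = 16

total_gb = throughput_gb_per_s * 3600 * hours   # GB moved over the run
total_pb = total_gb / 1e6                       # 1 PB = 10^6 GB

# ~1.32 PB moved in a single run, consistent with processing
# a petabyte-scale satellite image archive.
```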

When we learned about the Google Cloud Platform for Startups program from an analyst report, we immediately applied through Crosslink Capital, our primary venture funder. We already relied on Compute Engine and Google Cloud Storage. The startup program provided us a great opportunity to evaluate Cloud Platform services such as Google Container Engine, Google Container Registry, Google BigQuery and Google Cloud Monitoring, all of which we now use with our customers. We wouldn’t have been able to do our petabyte run if we didn’t have the one-year credit from the startups program.

Google was very generous to give us a sandbox to play in before we started paying. From a cost perspective, Cloud Platform is very competitive with other cloud solutions — and it has the capacity to scale. What’s more, Google is innovative in areas like containers, which makes it easy to move workloads to the Google stack.

For startups evaluating cloud solutions, make sure the solution you choose gives you the freedom to experiment; lets your team focus on product development, not IT management; and aligns with your company’s budget.

Most importantly for a small startup like us with dreams to change the world is the partnership we’ve formed with the GCP technical team. Google truly went out of its way to help us get up and running and maximize the value of the platform. We look forward to working with the Google team to push the limits of cloud computing.

Posted by Mark Johnson, CEO, Descartes Labs