Google Cloud Platform Core Services:
Google divides the services into the following logical groups:
- Computing and hosting services
- Storage services
- Networking services
- Big data services
- Machine learning (ML) services
- Identity services
We will have a look at each of these:
Computing and hosting services:
We are given a variety of options when it comes to computing in GCP. Depending on our requirements and flexibility, we can choose from one of the following four options.
- Infrastructure as a Service (IaaS): Google Compute Engine (GCE)
- Container as a Service (CaaS): Google Kubernetes Engine (GKE)
- Platform as a Service (PaaS): Google App Engine (GAE)
- Function as a Service (FaaS): Cloud Functions
The choice we make can depend on several factors. For example, do we need full control over our infrastructure, or do we want a fully managed service?
Starting with Compute Engine, we have control from the virtual machine (VM) level upward.
This gives us the most flexibility but also implies that we need to take care of the stack above it.
The advantage of Cloud Functions is that you don’t need to worry about infrastructure and scaling; you concentrate only on developing your functions.
Keep in mind, however, that Cloud Functions supports only a limited set of languages, so it is not an option if your code requires an unsupported runtime.
The computing options in GCP are as shown in the following diagram:
Let’s have a look at each of the compute options and see what Google manages for us versus the flexibility we are given:
GCE: GCE is an IaaS offering. It allows the most flexibility as it provides compute infrastructure to provision VM instances. This means that you have full control of the instance hardware and operating system.
You can use standard GCP images or your own custom image. You can control where your VMs and storage are located in terms of regions and zones.
You have granular control over the network, including firewalls and load balancing. With the use of an instance group, you can autoscale your capacity as needed. Compute Engine is suitable in most cases, but might not always be the optimal solution.
GKE: GKE is a CaaS offering. It allows you to create Kubernetes clusters on demand, which takes away all of the heavy lifting of installing the clusters yourself.
It leverages Compute Engine for hosting the cluster nodes, but the customer does not need to bother with the infrastructure and can concentrate on writing the code.
The provisioned cluster can be automatically updated and scaled. The GCP software-defined networks are integrated with GKE and allow users to create network objects, such as load balancers, on demand when the application is deployed.
Several services integrate with GKE, such as Container Registry, which allows you to store and scan your container images.
GAE: GAE is a PaaS offering. It allows you to concentrate on writing your code, while Google takes care of hosting, scaling, monitoring, and updates.
It is targeted at developers who do not need to understand the complexity of the infrastructure.
GAE offers two types of environments, as follows:
Standard: Supports a set of common language runtimes.
Flexible: Supports even more languages, with the possibility of creating a custom runtime. With the flexible environment, you lose some out-of-the-box integration, but you gain more flexibility.
GAE is tightly integrated with GCP services including databases and storage. It allows the versioning of your application for easy rollouts and rollbacks.
Cloud Functions: Cloud Functions is a FaaS offering. It allows you to concentrate on writing your functions in one of the supported languages.
It is ideal for executing simple tasks for data processing, mobile backends, and IoT.
This service is completely serverless and all of the layers below it are managed by Google.
The functions can be executed using an event trigger or an HTTP endpoint.
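To make the event-trigger model concrete, here is a minimal sketch of a background-style function handler. The event shape is a simplified stand-in for what Google delivers (Pub/Sub-triggered functions receive their payload base64-encoded in the event's `data` field); the function name is hypothetical, and real functions are deployed and wired to triggers via the gcloud CLI.

```python
import base64


def process_message(event: dict, context=None) -> str:
    """Hypothetical handler for a Pub/Sub-triggered Cloud Function.

    Pub/Sub events carry their payload base64-encoded in event["data"].
    """
    payload = base64.b64decode(event["data"]).decode("utf-8")
    return f"processed: {payload}"


# Simulating an incoming event locally:
event = {"data": base64.b64encode(b"order-42").decode("ascii")}
result = process_message(event)
```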
Storage services:
Storage is an essential part of Google Cloud Platform Core Services and cloud computing as it saves the data and state of your applications. GCP offers a wide variety of storage, from object storage to managed databases.
The different storage services that we will be looking at are as follows:
Cloud Storage: Cloud Storage is a fully managed object storage service with virtually unlimited capacity. It allows the creation of buckets that store your data, which can be accessed through APIs and tools such as gsutil.
It comes with different flavors to best suit your needs in terms of how often your data will be accessed and where it should be located.
Keep in mind that the price differs for each tier.
Making a conscious decision will allow you to cut costs. You can choose from the following options:
- Multi-regional: The highest availability in multiple geolocations
- Regional: High availability with fixed locations
- Nearline: Low-cost, for data accessed less than once a month
- Coldline: The lowest cost, for backup and disaster recovery
With Cloud Storage, you do not need to worry about running out of capacity.
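The tier choice above boils down to access frequency and location needs. The helper below is a hypothetical illustration of that decision, using thresholds that mirror the descriptions in this section rather than official Google guidance:

```python
def pick_storage_class(accesses_per_month: float,
                       multi_region: bool = False) -> str:
    """Illustrative mapping from access frequency to a storage class.

    Thresholds are assumptions based on the tier descriptions above,
    not official pricing guidance.
    """
    if accesses_per_month >= 1:        # hot data, accessed at least monthly
        return "Multi-Regional" if multi_region else "Regional"
    if accesses_per_month >= 1 / 12:   # touched only a few times a year
        return "Nearline"
    return "Coldline"                  # backup / disaster recovery
```

For example, data read daily from several geographies would land in Multi-Regional, while a yearly backup would land in Coldline.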
Filestore: Cloud Filestore is a managed file storage service. It allows users to provision a Network Attached Storage (NAS) service that can be integrated with GCE and GKE.
It comes with two performance tiers—standard and premium—which offer different Input/Output Operations Per Second (IOPS) and throughput.
Cloud SQL: Cloud SQL is a fully-managed relational database service providing either a MySQL or PostgreSQL database. It offers data replication, backups, data exports, and monitoring.
It is ideal when you need to move your current instances from on-premises and want to delegate the maintenance of the database to Google.
Cloud Datastore: Cloud Datastore is a fully managed NoSQL database. It is ideal for applications that rely on highly available structured data at scale.
The scaling and high availability are achieved through a distributed architecture and are abstracted from the user. Only one database is available per project. Cloud Datastore offers a SQL-like language to query your data.
Firestore: Firestore is the next generation of Cloud Datastore with several enhanced features. It can run in Native or Datastore mode. The former is compatible with Cloud Datastore.
Google has announced that all Datastore clients will be automatically moved to Firestore without any downtime or any user intervention. All new projects should be created in Firestore instead of Datastore.
Cloud Spanner: Cloud Spanner is a fully managed, globally distributed, and highly consistent database service. It is a strong and consistent relational database with non-relational database scaling capabilities.
Users can define a schema and leverage industry-standard American National Standards Institute (ANSI) 2011 SQL. It is very high-performing, with a 99.999% availability Service Level Agreement (SLA), meaning there is almost no downtime.
Cloud Spanner is aimed at use cases such as financial trading, insurance, global call centers, telecoms, gaming, and e-commerce. Global consistency makes it ideal for globally accessible applications.
Bigtable: Bigtable is a fully managed, massive-scale NoSQL database with sub-10 ms latency. It is used by Google to deliver services such as Gmail and Google Maps.
It is ideal for fintech, IoT, and ML storage use cases. It integrates easily with big data product families such as Dataproc and Dataflow. It is based on open-source Apache HBase, enabling the use of its API.
The cost of Bigtable is much higher than Datastore, so the database should be chosen with great care.
Custom databases: You can also choose to use Compute Engine to install a database of your choice, such as MongoDB; however, that would be an unmanaged service.
Networking services:
Google Cloud Platform networking is based on Software-Defined Networks (SDNs), which allows users to deliver all networking services programmatically.
All of the services are fully managed, leaving users with the task of configuring them according to their requirements. The networking services in the Google Cloud Platform Core Services that we will be looking at are as follows:
Virtual Private Cloud (VPC): The VPC is the foundation of GCP networking. Each GCP project has a default VPC network created, but the user can also create new networks. You can think of it as a cloud version of a physical network.
A VPC can contain one or more regional subnets. A VPC creates a global logical boundary that allows communication between VMs within the same VPC. To allow communication between VPCs, traffic needs to traverse the internet or use VPC peering.
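The relationship between a VPC and its regional subnets can be sketched as a set of CIDR ranges and a lookup that tells you which subnet owns a given internal IP. The subnet names and ranges below are made-up examples, not defaults:

```python
import ipaddress
from typing import Optional

# Hypothetical VPC with two regional subnets (example CIDR ranges).
SUBNETS = {
    "subnet-us-east1": ipaddress.ip_network("10.0.0.0/20"),
    "subnet-europe-west1": ipaddress.ip_network("10.0.16.0/20"),
}


def subnet_for(ip: str) -> Optional[str]:
    """Return the name of the subnet containing this internal IP, if any."""
    addr = ipaddress.ip_address(ip)
    for name, network in SUBNETS.items():
        if addr in network:
            return name
    return None
```

Two VMs whose addresses resolve to subnets of the same VPC can reach each other directly; an address outside every range (returning `None` here) would have to come in over the internet or via VPC peering.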
Load balancer: The load balancer allows the distribution of traffic between your workloads. It is available for GCE, GAE, and GKE. For GCE, you can choose from load balancers with global or regional scopes.
The choice will also depend on the network type. The following load balancers are available to choose from:
- HTTP(S) load balancer
- SSL proxy load balancer
- TCP proxy load balancer
- Network load balancer
- Internal TCP/UDP load balancer
Virtual Private Network (VPN): VPNs allow a connection between your on-premises network and a GCP VPC through an IPsec tunnel over the internet. Only site-to-site VPNs are supported.
To establish a VPN connection, there needs to be a gateway on each side of the tunnel. The traffic in transit is encrypted. Both static and dynamic routing are supported, with the latter requiring a Cloud Router.
Using a VPN should be the first method of connecting your environment to GCP as it entails the lowest cost. If there are low-latency and high-bandwidth requirements, then Cloud Interconnect should be considered.
Cloud Interconnect: If there is a need for low latency and a highly available connection, then interconnect should be considered. In this case, the traffic does not traverse the internet.
There are two interconnect options, which are as follows:
- Dedicated Interconnect: 10 Gbps piped directly to a Google datacenter.
- Partner Interconnect: 50 Mbps-10 Gbps piped through a Google partner.
Cloud Router: Cloud Router is a service that allows for dynamic routing exchange between Compute Engine, VPNs, and external networks. It eliminates the need for the creation of static routes.
Cloud DNS: Cloud DNS is a managed DNS service with a 100% SLA. It translates domains into IP addresses. Millions of zones and records can be managed.
Cloud DNS can also host private zones accessible only from your GCP network. It can be integrated with your on-premises environment, where your local DNS remains authoritative and Cloud DNS is responsible for caching.
Cloud Content Delivery Network (CDN): Cloud CDN is a service that allows the caching of HTTP(S) load balanced content, including Cloud Storage bucket objects. Caching reduces content delivery time and cost.
It can also protect you from a Distributed Denial-of-Service (DDoS) attack. Data is cached on Google’s globally distributed edge points. On the first request, when content is not cached, data is retrieved from a backend service.
The next call data will be served directly from the cache until the expiration time is reached.
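The miss-then-hit behavior described above can be sketched as a tiny TTL cache. This is a toy in-process simulation of the idea, not how Cloud CDN's edge network is implemented; the class and parameter names are illustrative:

```python
import time


class EdgeCache:
    """Toy model of a CDN edge: serve from cache until the TTL expires."""

    def __init__(self, ttl_seconds: float, backend):
        self.ttl = ttl_seconds
        self.backend = backend   # callable that fetches origin content
        self._store = {}         # url -> (content, fetched_at)

    def get(self, url, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(url)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0], "HIT"           # served from the edge cache
        content = self.backend(url)          # first request: go to backend
        self._store[url] = (content, now)
        return content, "MISS"


cache = EdgeCache(ttl_seconds=60, backend=lambda url: f"body:{url}")
first = cache.get("/index.html", now=0)      # miss, fetched from backend
second = cache.get("/index.html", now=30)    # hit, within the TTL
third = cache.get("/index.html", now=120)    # TTL expired, miss again
```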
Cloud NAT: Cloud NAT is a regional service that allows VMs without external IPs to communicate with the internet.
It is a fully managed service with built-in auto-scaling. It works with both GCE and GKE. It is a better alternative to NAT instances, which would otherwise need to be managed by users.
Firewall: GCP Firewall is a service that allows for micro-segmentation. Firewall rules are created per VPC and can be based on IPs, IP ranges, tags, and service accounts. Several firewall rules are created by default but can be modified.
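The kind of evaluation a firewall rule performs—matching a source range, a target tag, and a port—can be sketched as follows. The rule data here is entirely made up for illustration; real rules are configured per VPC in GCP:

```python
import ipaddress

# Hypothetical rule: allow SSH from internal addresses to tagged VMs.
RULES = [
    {
        "name": "allow-ssh-internal",
        "source_range": ipaddress.ip_network("10.0.0.0/8"),
        "target_tag": "ssh-enabled",
        "port": 22,
    },
]


def is_allowed(src_ip: str, vm_tags: set, port: int) -> bool:
    """Allow traffic only if some rule matches source, tag, and port."""
    addr = ipaddress.ip_address(src_ip)
    return any(
        addr in rule["source_range"]
        and rule["target_tag"] in vm_tags
        and rule["port"] == port
        for rule in RULES
    )
```

Because rules key off tags and service accounts rather than individual machines, the same policy follows a VM wherever it is created, which is what enables micro-segmentation.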
Identity Aware Proxy (IAP): IAP is a service that replaces the VPN when a user is working from an untrusted network. It controls access to your application based on user identity, device status, and IP address. It is part of Google’s BeyondCorp security model.
Cloud Armor: Cloud Armor is a service that allows protection against infrastructure DDoS attacks using Google’s global infrastructure and security systems.
It integrates with global HTTP(S) load balancers and blocks traffic based on IP addresses or ranges. Preview mode allows users to analyze the attack pattern without cutting off regular users.
Big data services:
Big data services enable the user to process large amounts of data to provide answers to complex problems. GCP offers many services that tightly integrate to create an End-to-End (E2E) data analysis pipeline. These services are as follows:
BigQuery: BigQuery is a highly scalable and fully managed cloud data warehouse. It allows users to perform analytics operations with built-in ML. BigQuery is completely serverless and can host petabytes of data.
The underlying infrastructure scales seamlessly and allows parallel data processing. The data can be stored in BigQuery Storage, Cloud Storage, Bigtable, Sheets, or Google Drive. The user defines datasets containing tables. BigQuery uses familiar ANSI-compliant SQL for queries and provides ODBC and JDBC drivers.
Users can choose from two types of payment models—one is flexible and involves paying for storage and queries, and the other involves a flat rate with stable monthly costs. It is ideal for use cases such as predictive analysis, IoT, and log analysis, and integrates with GCP’s big data product family.
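Since BigQuery accepts familiar ANSI-style SQL, a representative aggregation looks like ordinary SQL. The sketch below runs such a query against an in-memory SQLite table purely as a local stand-in for the SQL semantics; against BigQuery, the same kind of statement would be submitted through its client libraries or drivers, and the table name and data here are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (url TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("/home", 120), ("/docs", 80), ("/home", 30)],
)

# A typical analytics-style aggregation: total views per URL.
rows = conn.execute(
    "SELECT url, SUM(views) AS total FROM page_views "
    "GROUP BY url ORDER BY total DESC"
).fetchall()
# rows -> [("/home", 150), ("/docs", 80)]
```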
Pub/Sub: This is a fully managed asynchronous messaging service that allows you to loosely couple your application components. It is serverless with global availability.
Your application can publish messages to a topic or subscribe to it to pull messages. Pub/Sub can also push messages to Webhooks.
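The publish/subscribe pattern itself—publishers fan messages out to every subscription on a topic, and subscribers pull from their own queue—can be shown with a minimal in-process model. This only illustrates the pattern Pub/Sub provides as a managed, global service; the class and topic names are illustrative:

```python
from collections import defaultdict, deque


class MiniPubSub:
    """Tiny in-process model of topic-based publish/subscribe."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic -> list of subscriber queues

    def subscribe(self, topic: str) -> deque:
        queue = deque()
        self._topics[topic].append(queue)
        return queue

    def publish(self, topic: str, message: str) -> None:
        for queue in self._topics[topic]:  # fan out to every subscriber
            queue.append(message)


bus = MiniPubSub()
orders = bus.subscribe("orders")
audit = bus.subscribe("orders")            # a second, independent subscriber
bus.publish("orders", "order-1")
```

The point of the pattern is the loose coupling: the publisher never knows how many subscribers exist, so components can be added or removed independently.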
Dataproc: Dataproc is a fully managed Apache Spark and Hadoop cluster. It allows users to create clusters on demand and use them only when data processing is needed.
It is billed per second. It allows users to move already existing, on-premises clusters to the cloud without refactoring the code. The use of preemptible instances can further lower the cost.
Dataflow: Cloud Dataflow is a fully managed service for processing data in streams and batches. It is based on open-source Apache Beam, is completely serverless, and offers almost limitless capacity.
It will manage resources and job balancing for the user. It can be used for use cases such as online fraud analytics, IoT, healthcare, and logistics.
Dataprep: This is a tool that can be used to perform data visualization and exploration without any coding skills being required. Data can be interactively prepared for further analysis.
Datalab: Datalab is a tool built on Jupyter (formerly IPython) that allows users to explore, analyze, and transform data. It also allows users to build ML models and leverages Compute Engine.
Data Studio: This is a tool that allows you to consume data from various sources and visualize it in the form of reports and dashboards.
Cloud Composer: This is a fully managed service based on open source Apache Airflow. It allows you to create and orchestrate big data pipelines.
Machine learning (ML) services:
One of the strongest points of Google is its long-term experience with ML. GCP offers several services around ML. You can choose between a pre-trained model or training the model yourself. The various services included under ML are as follows:
Cloud ML Engine: ML Engine is a managed service that allows you to train and host your ML models in GCP. It leverages TensorFlow for the training process.
The underlying infrastructure is managed by Google, while users can choose from different hardware options. The trained model can be accessed through APIs to perform predictions.
Pretrained APIs: ML APIs are services that allow you to leverage several pre-trained models, enabling you to analyze video, speech, images, and text. Currently, the following APIs are available:
- Google Cloud Video Intelligence
- Google Cloud Speech
- Google Cloud Vision
- Google Cloud Natural Language
- Google Cloud Translation
These models can be used without any background knowledge of how they work. As an example, we can analyze text for sentiment analysis.
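Real sentiment analysis would be done by calling the Cloud Natural Language API; the toy lexicon scorer below only illustrates the idea of turning text into a positive/negative score. The word lists, function name, and scoring scheme are all made up for illustration:

```python
# Made-up word lists; a real model learns these associations from data.
POSITIVE = {"great", "good", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def toy_sentiment(text: str) -> float:
    """Score text in [-1, 1]: above 0 leans positive, below 0 negative."""
    words = text.lower().split()
    score = sum((word in POSITIVE) - (word in NEGATIVE) for word in words)
    return score / max(len(words), 1)
```

The pre-trained API returns a comparable score without you ever building word lists or training anything, which is exactly the appeal described above.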
AutoML: AutoML is a service that can be used by developers to train custom models without extensive data science knowledge. As an example, by providing labeled samples to AutoML, it can be trained to recognize objects that the Vision API does not recognize. The following AutoML products are available:
- AutoML Translation
- AutoML Natural Language
- AutoML Vision
Dialogflow: This is a service that allows you to build conversation applications that can interact with human beings.
The interface can interact with many compatible platforms, such as Slack or Google Assistant. It can also integrate with Firebase functions to integrate with third-party platforms using common APIs.
Identity services:
Identity and Access Management (IAM) is one of the most important aspects of any cloud. It allows you to control who has access to the cloud, but it can also provide identity services to your applications. In short, this is achieved by a combination of roles and permissions.
The roles are assigned to either users or groups. Let’s have a look at the options we have in GCP:
IAM: IAM allows the GCP admin to control authorization to GCP services. Administrators can create roles with granular permissions. Roles can then be assigned to users, or preferably, a group of users.
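The role/permission model can be sketched as two mappings: a role bundles granular permissions, and bindings attach roles to members (users or, preferably, groups). The role names, permission strings, and group below are simplified illustrations, not GCP's exact predefined roles:

```python
# Hypothetical roles bundling granular permissions.
ROLES = {
    "roles/storage.viewer": {
        "storage.objects.get", "storage.objects.list",
    },
    "roles/storage.admin": {
        "storage.objects.get", "storage.objects.list",
        "storage.objects.create", "storage.objects.delete",
    },
}

# Bindings attach roles to members; granting to a group is preferred.
BINDINGS = {
    "group:devs@example.com": {"roles/storage.viewer"},
}


def has_permission(member: str, permission: str) -> bool:
    """A member holds a permission if any bound role includes it."""
    return any(
        permission in ROLES[role]
        for role in BINDINGS.get(member, ())
    )
```

Granting the viewer role to the group gives every member read access at once, while destructive permissions stay confined to the admin role.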
Cloud Identity: Cloud Identity is an Identity as a Service (IDaaS) offering. It sits outside of GCP but can be easily integrated with GCP. It allows you to create organizations, groups, and users, and manage them centrally.
If you already have an existing user catalog, you can synchronize it with Cloud Identity.