Training

Data Driven Transformation with Google Cloud

Mode:

Online or in person

At Matza Education, each training course is designed to offer practical and relevant knowledge, connecting theory and application in real-life scenarios. Our aim is to prepare professionals for the challenges of the market, strengthening technical and strategic skills in different areas of technology and management.

By taking part in one of our programs, you will have access to up-to-date content, experienced instructors and a results-oriented methodology. Whether delivered in person or online, we aim to create a dynamic, accessible and high-impact learning experience.

More than just a course, each training program is an opportunity for professional and personal development, helping you gain certifications, expand your skills and stand out in an increasingly competitive market.

Important: you must confirm the e-mail you received after registering to validate your participation.

What you will learn:

  • Extract, load, transform, clean and validate data
  • Design pipelines and architectures for data processing
  • Create and maintain machine learning and statistical models
  • Query datasets, visualize query results and create reports
  • Design and build data processing systems on Google Cloud Platform
  • Process batch and streaming data by implementing autoscaling data pipelines in Cloud Dataflow
  • Derive business insights from extremely large datasets using Google BigQuery
  • Train, evaluate and predict with machine learning models using TensorFlow and Cloud ML
  • Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
  • Provide instant insights from streaming data

Prerequisites:

  • Completion of the Google Cloud Fundamentals: Big Data & Machine Learning course, or equivalent experience
  • Basic proficiency in a common query language such as SQL
  • Experience with data modeling and extract, transform, load (ETL) activities
  • Experience developing applications in a common programming language such as Python
  • Familiarity with machine learning and/or statistics

4 days - 32 class hours - In person or online

  • Module 1: Overview of Google Cloud Dataproc
    • Cluster creation and management
    • Using custom machine types and preemptible worker nodes
    • Scaling and deleting clusters
    • Lab: How to create Hadoop clusters with Google Cloud Dataproc
  • Module 2: Running Dataproc jobs
    • Running Pig and Hive jobs
    • Separation of storage and compute
    • Lab: How to run Hadoop and Spark jobs with Dataproc
    • Lab: Submitting and monitoring jobs
  • Module 3: Integrating Dataproc with Google Cloud Platform
    • Customizing clusters with initialization actions
    • BigQuery support
    • Lab: How to take advantage of Google Cloud Platform services
  • Module 4: Solution for unstructured data with Google's Machine Learning APIs
    • Google Machine Learning APIs
    • Common ML use cases
    • Invoking ML APIs
    • Lab: How to add Machine Learning features to Big Data analysis
  • Module 5: Serverless data analysis with BigQuery
    • What is BigQuery
    • Queries and functions
    • Lab: How to write queries in BigQuery
    • Loading data into BigQuery
    • Exporting data from BigQuery
    • Lab: How to load and export data
    • Nested and repeated fields
    • Querying several tables
    • Lab: Complex queries
    • Performance and pricing
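The nested and repeated fields covered in this module can be pictured with a small pure-Python sketch of what BigQuery's UNNEST does to a repeated field. The sample records and column names below are invented for illustration:

```python
# Illustrative only: how BigQuery flattens a repeated field with UNNEST,
# modeled in pure Python on hypothetical order records.
orders = [
    {"order_id": 1, "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
    {"order_id": 2, "items": [{"sku": "A", "qty": 5}]},
]

# Equivalent in spirit to:
#   SELECT order_id, item.sku, item.qty
#   FROM orders, UNNEST(items) AS item
flat = [
    {"order_id": o["order_id"], "sku": it["sku"], "qty": it["qty"]}
    for o in orders
    for it in o["items"]
]

for row in flat:
    print(row)
```

Each parent row is repeated once per element of its repeated field, which is why nested schemas can replace many-table joins in BigQuery.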
  • Module 6: Autoscaling, serverless data pipelines with Dataflow
    • The Beam programming model
    • Data pipelines in Beam Python
    • Data pipelines in Beam Java
    • Lab: How to write a Dataflow pipeline
    • Scalable Big Data processing with Beam
    • Lab: MapReduce in Dataflow
    • Incorporating additional data
    • Lab: Secondary inputs
    • Streaming data processing
    • GCP reference architecture
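The MapReduce lab in this module follows the classic map, shuffle, reduce pattern. As a rough sketch (pure Python, no Beam dependency, with made-up input lines), the same word-count logic looks like this:

```python
from collections import defaultdict

# Map -> shuffle -> reduce, the pattern the Dataflow MapReduce lab
# implements with Beam transforms; here in plain Python for illustration.
lines = ["the cat sat", "the cat ran"]

# Map: emit (word, 1) pairs.
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle: group values by key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: sum counts per word.
counts = {word: sum(vals) for word, vals in groups.items()}
print(counts)  # {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}
```

In Beam, the map step corresponds to a ParDo/Map transform and the shuffle-plus-reduce to a grouping and combining transform, with the runner handling distribution and scaling.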
  • Module 7: Getting started with Machine Learning
    • What is machine learning (ML)
    • Effective ML: concepts, types
    • ML data sets: generalization
    • Lab: Explore and create ML datasets
  • Module 8: Creating ML models with TensorFlow
    • Getting started with TensorFlow
    • Lab: How to use tf.learn
    • TensorFlow graphs and loops + lab
    • Lab: How to use low-level TensorFlow + early stopping
    • Monitoring ML training
    • Lab: Charts and graphs of TensorFlow training
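The early-stopping idea exercised in this module's labs is framework-independent: stop training once validation loss has not improved for a fixed number of epochs. A minimal sketch, with invented loss values and a hypothetical `patience` setting:

```python
# Early stopping sketch: halt when validation loss fails to improve for
# `patience` consecutive epochs. Loss values are made up for illustration.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.51, 0.52, 0.53]

patience = 2
best = float("inf")
epochs_without_improvement = 0
stopped_at = None

for epoch, loss in enumerate(val_losses):
    if loss < best:
        best = loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            stopped_at = epoch
            break

print(best, stopped_at)
```

Here training stops two epochs after the minimum, keeping the best model seen so far rather than the last one trained.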
  • Module 9: Scaling ML models with CloudML
    • Why use Cloud ML?
    • Packaging a TensorFlow model
    • Complete training
    • Lab: Running an ML model locally and in the cloud
  • Module 10: Feature engineering
    • Creating good features
    • Input transformation
    • Synthetic features
    • Pre-processing with Cloud ML
    • Lab: Feature engineering
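Two of the techniques in this module, one-hot encoding a categorical input and building a feature cross (a synthetic feature combining two categoricals), can be sketched in a few lines of pure Python. The column names and values below are hypothetical:

```python
# Illustrative feature-engineering sketch: one-hot encoding plus a
# feature cross, on invented sample rows.
rows = [
    {"city": "SP", "hour": "morning"},
    {"city": "RJ", "hour": "evening"},
]

# Build the vocabulary for the categorical column.
cities = sorted({r["city"] for r in rows})

def one_hot(value, vocab):
    return [1 if value == v else 0 for v in vocab]

for r in rows:
    r["city_onehot"] = one_hot(r["city"], cities)
    # Feature cross: combine two categoricals into one synthetic feature.
    r["city_x_hour"] = f'{r["city"]}_{r["hour"]}'

print(rows[0]["city_onehot"], rows[0]["city_x_hour"])
```

In Cloud ML this kind of pre-processing runs as part of the training pipeline; the sketch only shows the transformations themselves.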
  • Module 11: Architecture of streaming analytics pipelines
    • Streaming data processing: challenges
    • Processing variable data volumes
    • Processing out-of-order/late data
    • Lab: How to create streaming pipelines
  • Module 12: Ingesting variable volumes
    • What is Cloud Pub/Sub?
    • How it works: topics and subscriptions
    • Lab: Simulator
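The topic-and-subscription model covered here can be illustrated with a toy in-memory version: a topic fans each published message out to every subscription, and each subscription is consumed independently. The class and names below are hypothetical, not the Pub/Sub client API:

```python
from collections import deque

# Toy model of Pub/Sub semantics for illustration only.
class Topic:
    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name):
        self.subscriptions[name] = deque()

    def publish(self, message):
        # Fan out: every subscription gets its own copy of the message.
        for queue in self.subscriptions.values():
            queue.append(message)

topic = Topic()
topic.subscribe("dashboard")
topic.subscribe("archive")
topic.publish("sensor-reading-1")

# Pulling from one subscription does not affect the other.
msg = topic.subscriptions["dashboard"].popleft()
print(msg, len(topic.subscriptions["archive"]))
```

This decoupling is what lets Pub/Sub absorb variable ingest volumes: publishers never wait for subscribers, and each consumer drains its subscription at its own pace.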
  • Module 13: Implementing streaming channels
    • Streaming processing challenges
    • Handling late data: watermarks, triggers, accumulation
    • Lab: Streaming data processing pipeline for real-time traffic data
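The watermark idea in this module, that a window only closes once the system believes no more data for it will arrive, can be sketched with a simple simulation. The timestamps, window size and watermark heuristic below are all invented for illustration:

```python
# Watermark-based windowing sketch: events arrive out of order and a
# window closes only once the watermark passes its end.
WINDOW = 10  # seconds

# (event_time, value) pairs, arriving out of order.
events = [(1, "a"), (12, "b"), (4, "c"), (15, "d")]

windows = {}
watermark = 0
closed = []

for event_time, value in events:
    # Assign the event to its window by window start time.
    start = (event_time // WINDOW) * WINDOW
    windows.setdefault(start, []).append(value)
    # Toy heuristic watermark: trail 5 s behind the max event time seen.
    watermark = max(watermark, event_time - 5)
    # Close any window whose end is now behind the watermark.
    for s in sorted(windows):
        if s + WINDOW <= watermark:
            closed.append((s, windows.pop(s)))

print(closed, windows)
```

Note that the late event at t=4 still lands in the first window because the watermark had not yet passed that window's end; triggers and accumulation modes in Beam control what happens when data arrives even later than that.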
  • Module 14: Dashboards and streaming analysis
    • Streaming analytics: from data to decisions
    • Querying streaming data with BigQuery
    • What is Google Data Studio?
    • Lab: Create a real-time dashboard to visualize processed data
  • Module 15: High capacity and low latency with Bigtable
    • What is Cloud Bigtable?
    • Creating Bigtable schemas
    • Processing data in Bigtable
    • Lab: How to stream into Bigtable
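A core idea behind the Bigtable schema design covered in this last module is that rows are stored sorted by row key, so the key should encode the query pattern. One common pattern is entity ID plus a reversed timestamp, so the most recent readings sort first. A small sketch with invented sensor data:

```python
# Bigtable row-key design sketch: entity + reversed timestamp so newer
# events get lexicographically smaller (earlier-sorting) keys.
MAX_TS = 10**10  # hypothetical upper bound on the epoch timestamp

def row_key(sensor_id, timestamp):
    # Zero-pad so string ordering matches numeric ordering.
    return f"{sensor_id}#{MAX_TS - timestamp:010d}"

# Lexicographic sort puts the newest reading (ts=300) first.
keys = sorted(row_key("sensor42", ts) for ts in [100, 300, 200])
print(keys)
```

With keys shaped like this, "latest N readings for a sensor" becomes a cheap prefix scan from the start of that sensor's key range, which is what gives Bigtable its low-latency reads at high throughput.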