Cloudera Apache Spark Programmer Training in Hangzhou

Training provider: Hangzhou Boxue International Education and Training Center (杭州博學國際教育培訓中心)

Course price: contact customer service

Class location: contact customer service

Start date: rolling enrollment

Inquiry hotline: 400-850-8622

Last updated: 2025-02-12

Cloudera Apache Spark Programmer Training

Class type: public course, in-house training
Course length: 3 days / 18 hours
Training dates: to be determined
Certification exam: none at present
Training location: Boxue International Education and Training Center
Facility requirements: projector, whiteboard, flip-chart paper
Training format: instruction through worked examples, live demonstration and practice, and timely discussion
Training materials: course textbook

Course: Cloudera Developer Training for Apache Spark

Course overview: Use Apache Spark to build complete, unified big data applications that combine batch processing, streaming, and interactive analytics. Learn to write sophisticated parallel applications that support faster, better decisions and real-time action across a wide range of use cases, architectures, and industries.

Audience: Developers and engineers who want to improve the speed, ease of use, and sophistication of their applications. Participants need a background in Python or Scala; basic Linux knowledge is a plus.

Training objectives:
• Using the Spark shell for interactive data analysis
• The features of Spark’s Resilient Distributed Datasets
• How Spark runs on a cluster
• How Spark parallelizes task execution
• Writing Spark applications
• Processing streaming data with Spark

Course content:

Introduction to Spark
• What is Spark?
• Review: From Hadoop MapReduce to Spark
• Review: HDFS
• Review: YARN
• Spark Overview

Spark Basics
• Using the Spark Shell
• RDDs (Resilient Distributed Datasets)
• Functional Programming in Spark

Working with RDDs in Spark
• Creating RDDs
• Other General RDD Operations

Aggregating Data with Pair RDDs
• Key-Value Pair RDDs
• Map-Reduce
• Other Pair RDD Operations

Writing and Deploying Spark Applications
• Spark Applications vs. Spark Shell
• Creating the SparkContext
• Building a Spark Application (Scala and Java)
• Running a Spark Application
• The Spark Application Web UI
• Hands-On Exercise: Write and Run a Spark Application
• Configuring Spark Properties
• Logging

Parallel Processing
• Review: Spark on a Cluster
• RDD Partitions
• Partitioning of File-based RDDs
• HDFS and Data Locality
• Executing Parallel Operations
• Stages and Tasks

Spark RDD Persistence
• RDD Lineage
• RDD Persistence Overview
• Distributed Persistence

Basic Spark Streaming
• Spark Streaming Overview
• Example: Streaming Request Count
• DStreams
• Developing Spark Streaming Applications

Advanced Spark Streaming
• Multi-Batch Operations
• State Operations
• Sliding Window Operations
• Advanced Data Sources

Common Patterns in Spark Data Processing
• Common Spark Use Cases
• Iterative Algorithms in Spark
• Graph Processing and Analysis
• Machine Learning
• Example: k-means

Improving Spark Performance
• Shared Variables: Broadcast Variables
• Shared Variables: Accumulators
• Common Performance Issues
• Diagnosing Performance Problems

Spark SQL and DataFrames
• Spark SQL and the SQL Context
• Creating DataFrames
• Transforming and Querying DataFrames
• Saving DataFrames
• DataFrames and RDDs
• Comparing Spark SQL, Impala and Hive-on-Spark
