8.1 Uber mode
By default, every Task starts its own JVM. When each Task only processes a small amount of data, the Tasks of a single Job can instead share one JVM, so that a new JVM does not have to be started for every Task.
(1) With uber mode disabled, upload several small files to the /input path and run the wordcount program
[Tom@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output2
(2) Observe the console output
2021-06-26 16:18:07,607 INFO mapreduce.Job: Job job_1613281510851_0002 running in uber mode : false
(3) Observe http://hadoop103:8088/cluster
(4) Enable uber mode by adding the following configuration to mapred-site.xml
<!-- Enable uber mode (disabled by default) -->
<property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
</property>
<!-- Maximum number of map tasks allowed in uber mode; may only be adjusted downward -->
<property>
    <name>mapreduce.job.ubertask.maxmaps</name>
    <value>9</value>
</property>
<!-- Maximum number of reduce tasks allowed in uber mode; may only be adjusted downward -->
<property>
    <name>mapreduce.job.ubertask.maxreduces</name>
    <value>1</value>
</property>
<!-- Maximum input size allowed in uber mode; defaults to the value of dfs.blocksize and may only be adjusted downward -->
<property>
    <name>mapreduce.job.ubertask.maxbytes</name>
    <value></value>
</property>
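Uber mode can also be toggled for a single run without editing mapred-site.xml: the examples jar parses generic options, so a -D override on the command line should work. This is a sketch; /output3 is a hypothetical output path:
[Tom@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount \
  -D mapreduce.job.ubertask.enable=true \
  /input /output3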
(5) Distribute the configuration
[Tom@hadoop102 hadoop]$ xsync mapred-site.xml
(6) Run the wordcount program again
[Tom@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output2
(7) Observe the console output
2021-06-27 16:28:36,198 INFO mapreduce.Job: Job job_1613281510851_0003 running in uber mode : true
(8) Observe http://hadoop103:8088/cluster
8.2 Benchmarking MapReduce compute performance
Use the Sort example to benchmark MapReduce. Note: avoid running this test on a virtual machine with less than 150 GB of disk space.
(1) Use RandomWriter to generate random data: each node runs 10 map tasks, and each map produces roughly 1 GB of binary random data
[Tom@hadoop102 mapreduce]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar randomwriter random-data
(2) Run the Sort program
[Tom@hadoop102 mapreduce]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar sort random-data sorted-data
(3) Verify that the data is actually sorted
[Tom@hadoop102 mapreduce]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.3-tests.jar testmapredsort -sortInput random-data -sortOutput sorted-data
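On machines with little disk, the generated data can be scaled down through RandomWriter's configuration knobs before attempting the full 10 × 1 GB-per-node run. The keys below (mapreduce.randomwriter.mapsperhost and mapreduce.randomwriter.bytespermap) and the output path random-data-small are my assumptions for Hadoop 3.1.x and worth checking against the examples source:
[Tom@hadoop102 mapreduce]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar randomwriter \
  -D mapreduce.randomwriter.mapsperhost=2 \
  -D mapreduce.randomwriter.bytespermap=10485760 \
  random-data-small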
8.3 An enterprise tuning case study
8.3.1 Requirements
(1) Requirement: count the occurrences of each word in 1 GB of data, on 3 servers, each with 4 GB of RAM and a 4-core, 4-thread CPU.
(2) Analysis: 1 GB / 128 MB = 8 MapTasks, plus 1 ReduceTask and 1 mrAppMaster, for 10 containers in total; spread over 3 nodes, that averages about 3 tasks per node (distributed 4 / 3 / 3).
8.3.2 HDFS parameter tuning
(1) Modify hadoop-env.sh
export HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx1024m"
export HDFS_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -Xmx1024m"
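To confirm that the new heap ceilings took effect after restarting HDFS, one option is to look up the daemon PID with jps and inspect the heap with jmap; this is a sketch that assumes JDK 8 tooling (jmap -heap was removed in later JDKs):
[Tom@hadoop102 hadoop-3.1.3]$ jps | grep -E 'NameNode|DataNode'
[Tom@hadoop102 hadoop-3.1.3]$ jmap -heap <pid>    # MaxHeapSize should report 1024 MB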
(2) Modify hdfs-site.xml
<!-- The NameNode has a pool of worker threads for handling RPC requests; the default is 10 -->
<property>
    <name>dfs.namenode.handler.count</name>
    <value>21</value>
</property>
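The value 21 comes from a commonly cited rule of thumb: dfs.namenode.handler.count = 20 × ln(cluster size). For a 3-node cluster:
[Tom@hadoop102 ~]$ python3 -c 'import math; print(int(20 * math.log(3)))'
21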
(3) Modify core-site.xml
<!-- Keep deleted files in the trash for 60 minutes -->
<property>
    <name>fs.trash.interval</name>
    <value>60</value>
</property>
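With the trash interval set, a file removed through the fs shell is moved under the current user's .Trash directory instead of being deleted outright, and can be restored within the 60-minute window; a sketch with hypothetical paths:
[Tom@hadoop102 hadoop-3.1.3]$ hadoop fs -rm /input/word.txt
[Tom@hadoop102 hadoop-3.1.3]$ hadoop fs -mv /user/Tom/.Trash/Current/input/word.txt /input/
Note that the trash only intercepts fs shell deletes; files removed programmatically through the FileSystem API bypass it unless the code uses the Trash class explicitly.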
(4) Distribute the configuration
[Tom@hadoop102 hadoop]$ xsync hadoop-env.sh hdfs-site.xml core-site.xml
8.3.3 MapReduce parameter tuning
(1) Modify mapred-site.xml
<!-- Size of the circular (in-memory sort) buffer; default 100 MB -->
<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>100</value>
</property>
<!-- Spill threshold of the circular buffer; default 0.8 -->
<property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.80</value>
</property>
<!-- Number of spill files merged at once; default 10 -->
<property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>10</value>
</property>
<!-- MapTask memory; default 1 GB. The MapTask heap size (mapreduce.map.java.opts) defaults to match this value -->
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each map task. If this is not specified or is non-positive, it is inferred from mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.
    </description>
</property>
<!-- Number of CPU cores per MapTask; default 1 -->
<property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
</property>
<!-- Number of retries after a MapTask failure; default 4 -->
<property>
    <name>mapreduce.map.maxattempts</name>
    <value>4</value>
</property>
<!-- Number of parallel copiers each Reduce uses to fetch map output; default 5 -->
<property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>5</value>
</property>
<!-- Fraction of Reduce memory usable as shuffle buffer; default 0.7 -->
<property>
    <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
    <value>0.70</value>
</property>
<!-- Buffer fill fraction at which data starts spilling to disk; default 0.66 -->
<property>
    <name>mapreduce.reduce.shuffle.merge.percent</name>
    <value>0.66</value>
</property>
<!-- ReduceTask memory; default 1 GB. The ReduceTask heap size (mapreduce.reduce.java.opts) defaults to match this value -->
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each reduce task. If this is not specified or is non-positive, it is inferred from mapreduce.reduce.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.
    </description>
</property>
<!-- Number of CPU cores per ReduceTask; default 1, raised here to 2 -->
<property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>2</value>
</property>
<!-- Number of retries after a ReduceTask failure; default 4 -->
<property>
    <name>mapreduce.reduce.maxattempts</name>
    <value>4</value>
</property>
<!-- Fraction of MapTasks that must complete before ReduceTasks may request resources; default 0.05 -->
<property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.05</value>
</property>
<!-- A task that reports no progress within this timeout (default 10 minutes) is forcibly killed -->
<property>
    <name>mapreduce.task.timeout</name>
    <value>600000</value>
</property>
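With the memory values left at -1 and no java.opts set, each task container falls back to 1024 MB, as the descriptions above state. When a memory size is set explicitly, the task heap is derived from it via mapreduce.job.heap.memory-mb.ratio (0.8 by default in mapred-default.xml, as I read it) unless -Xmx is pinned by hand. Both can also be overridden per job; this is a sketch, and /output-tuned is a hypothetical output path:
[Tom@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount \
  -D mapreduce.map.memory.mb=2048 \
  -D mapreduce.map.java.opts=-Xmx1638m \
  /input /output-tuned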
(2) Distribute the configuration
[Tom@hadoop102 hadoop]$ xsync mapred-site.xml
8.3.4 YARN parameter tuning
(1) Modify yarn-site.xml as follows:
<!-- Which scheduler to use; the default is the Capacity Scheduler -->
<property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<!-- Number of ResourceManager threads handling scheduler requests; default 50. Raise it if more than 50 jobs are submitted concurrently, but stay below 3 nodes * 4 threads = 12 threads (in practice at most 8 once other processes are accounted for) -->
<property>
    <description>Number of threads to handle scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>8</value>
</property>
<!-- Whether YARN auto-detects hardware for its configuration; default false. Configure manually if the node hosts many other applications; auto-detection is fine on dedicated nodes -->
<property>
    <description>Enable auto-detection of node capabilities such as memory and CPU.</description>
    <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
    <value>false</value>
</property>
<!-- Whether to count logical processors (hyper-threads) as cores; default false, i.e. use the physical core count -->
<property>
    <description>Flag to determine if logical processors (such as hyperthreads) should be counted as cores. Only applicable on Linux when yarn.nodemanager.resource.cpu-vcores is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true.</description>
    <name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
    <value>false</value>
</property>
<!-- Multiplier from physical cores to vcores; default 1.0 -->
<property>
    <description>Multiplier to determine how to convert physical cores to vcores. This value is used if yarn.nodemanager.resource.cpu-vcores is set to -1 (which implies auto-calculate vcores) and yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.</description>
    <name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
    <value>1.0</value>
</property>
<!-- Memory available to the NodeManager; default 8 GB, lowered here to 4 GB -->
<property>
    <description>Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux). In other cases, the default is 8192 MB.</description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
<!-- NodeManager CPU cores; defaults to 8 when not auto-detected from hardware, lowered here to 4 -->
<property>
    <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of CPUs used by YARN containers. If it is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically determined from the hardware in case of Windows and Linux. In other cases, number of vcores is 8 by default.</description>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
</property>
<!-- Minimum container memory; default 1 GB -->
<property>
    <description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>
<!-- Maximum container memory; default 8 GB, lowered here to 2 GB -->
<property>
    <description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>
<!-- Minimum container vcores; default 1 -->
<property>
    <description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
</property>
<!-- Maximum container vcores; default 4, lowered here to 2 -->
<property>
    <description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
</property>
<!-- Virtual-memory check; enabled by default, disabled here -->
<property>
    <description>Whether virtual memory limits will be enforced for containers.</description>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
<!-- Ratio of virtual to physical memory; default 2.1 -->
<property>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.</description>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
(2) Distribute the configuration
[Tom@hadoop102 hadoop]$ xsync yarn-site.xml
8.3.5 Running the job
(1) Restart the YARN cluster
[Tom@hadoop102 hadoop-3.1.3]$ sbin/stop-yarn.sh
[Tom@hadoop103 hadoop-3.1.3]$ sbin/start-yarn.sh
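With these settings each node offers 4 GB / 4 vcores to containers, and a single container may use at most 2 GB / 2 vcores; with the default 1 GB / 1 vcore tasks, a node can run up to four containers at once, which matches the 4 / 3 / 3 split from 8.3.1. After the restart, the registered node capacities can be checked from the command line; the node ID argument below is whatever the list command prints:
[Tom@hadoop102 hadoop-3.1.3]$ yarn node -list -all
[Tom@hadoop102 hadoop-3.1.3]$ yarn node -status <node-id-from-list>    # reports memory and vcore capacity per node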
(2) Run the WordCount program
[Tom@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output
(3) Observe the YARN application page at http://hadoop103:8088/cluster/apps