Your cluster's mapred-site.xml includes the following parameters,
and your cluster's yarn-site.xml includes the following parameters.
What is the maximum amount of virtual memory allocated for each map task before YARN will kill its Container?
A. 4 GB
B. 17.2 GB
C. 8.9 GB
D. 8.2 GB
E. 24.6 GB
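(The exhibit with the actual parameter values is not reproduced above. As a rough sketch of how the limit is derived: YARN kills a map container once its virtual memory exceeds the container's physical-memory request multiplied by yarn.nodemanager.vmem-pmem-ratio. The values below are hypothetical, not the ones from the exhibit.)

    <!-- mapred-site.xml (hypothetical value) -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>4096</value>          <!-- physical memory requested for each map container -->
    </property>

    <!-- yarn-site.xml (hypothetical value) -->
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>2.1</value>           <!-- allowed virtual memory per unit of physical memory -->
    </property>
    <!-- With these values the virtual-memory ceiling is 4096 MB x 2.1 = roughly 8.6 GB;
         a map task whose container exceeds it is killed by the NodeManager. -->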
Which three basic configuration parameters must you set to migrate your cluster from MapReduce v1 (MRv1) to MapReduce v2 (MRv2)? (Choose three)
A. Configure the NodeManager to enable MapReduce services on YARN by setting the following property in yarn-site.xml:
B. Configure the NodeManager hostname and enable node services on YARN by setting the following property in yarn-site.xml:
C. Configure a default scheduler to run on YARN by setting the following property in mapred-site.xml:
D. Configure the number of map tasks per job on YARN by setting the following property in mapred-site.xml:
E. Configure the ResourceManager hostname and enable node services on YARN by setting the following property in yarn-site.xml:
F. Configure MapReduce as a framework running on YARN by setting the following property in mapred-site.xml:
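(For reference, a minimal sketch of the properties these options describe; the hostname is a placeholder and the values would come from your own cluster.)

    <!-- yarn-site.xml -->
    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>resourcemanager.example.com</value>   <!-- placeholder ResourceManager hostname -->
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>             <!-- enables the MapReduce shuffle service on each NodeManager -->
    </property>

    <!-- mapred-site.xml -->
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>                          <!-- run MapReduce as a framework on YARN -->
    </property>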
Which YARN daemon or service monitors a Container's per-application resource usage (e.g., memory, CPU)?
A. ApplicationMaster
B. NodeManager
C. ApplicationManagerService
D. ResourceManager
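(As a sketch of the NodeManager-side enforcement the question alludes to; whether these checks are enabled by default varies by distribution.)

    <!-- yarn-site.xml -->
    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>true</value>   <!-- NodeManager kills containers that exceed their physical-memory allocation -->
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>true</value>   <!-- NodeManager kills containers that exceed their virtual-memory allocation -->
    </property>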
You are working on a project where you need to chain together MapReduce and Pig jobs. You also need the ability to use forks, decision points, and path joins. Which ecosystem project should you use to perform these actions?
A. Oozie
B. ZooKeeper
C. HBase
D. Sqoop
E. HUE
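(A minimal, hypothetical Oozie workflow.xml skeleton showing a fork/join around a MapReduce action and a Pig action, plus a decision node; node names are illustrative and the action bodies are omitted.)

    <workflow-app name="example-wf" xmlns="uri:oozie:workflow:0.5">
      <start to="branch-out"/>
      <fork name="branch-out">                    <!-- run the two actions in parallel -->
        <path start="mr-step"/>
        <path start="pig-step"/>
      </fork>
      <action name="mr-step">
        <map-reduce> <!-- job-tracker, name-node, configuration omitted --> </map-reduce>
        <ok to="join-paths"/> <error to="fail"/>
      </action>
      <action name="pig-step">
        <pig> <!-- script and parameters omitted --> </pig>
        <ok to="join-paths"/> <error to="fail"/>
      </action>
      <join name="join-paths" to="check-output"/> <!-- both paths must finish before continuing -->
      <decision name="check-output">
        <switch>
          <case to="end">${fs:exists(outputDir)}</case>   <!-- outputDir is a hypothetical job property -->
          <default to="fail"/>
        </switch>
      </decision>
      <kill name="fail"><message>Workflow failed</message></kill>
      <end name="end"/>
    </workflow-app>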
During the execution of a MapReduce v2 (MRv2) job on YARN, where does the Mapper place the intermediate data of each Map Task?
A. The Mapper stores the intermediate data on the node running the Job's ApplicationMaster so that it is available to YARN ShuffleService before the data is presented to the Reducer
B. The Mapper stores the intermediate data in HDFS on the node where the Map tasks ran, in the HDFS /usercache/<user>/appcache/application_<appid> directory for the user who ran the job
C. The Mapper transfers the intermediate data immediately to the reducers as it is generated by the Map Task
D. YARN holds the intermediate data in the NodeManager's memory (a container) until it is transferred to the Reducer
E. The Mapper stores the intermediate data on the underlying filesystem of the local disk, in the directories specified by yarn.nodemanager.local-dirs
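(The property the last option refers to is sketched below; the paths are illustrative, not required values.)

    <!-- yarn-site.xml: local directories where each NodeManager keeps container working data,
         including the map tasks' intermediate (shuffle) output -->
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/data/1/yarn/local,/data/2/yarn/local</value>
    </property>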
Your cluster is running MapReduce version 2 (MRv2) on YARN. Your ResourceManager is configured to use the FairScheduler. Now you want to configure your scheduler such that a new user on the cluster can submit jobs into their own queue at application submission. Which configuration should you set?
A. You can specify a new queue name when the user submits a job, and the new queue is created dynamically if the property yarn.scheduler.fair.allow-undeclared-pools = true
B. yarn.scheduler.fair.user-as-default-queue = false and yarn.scheduler.fair.allow-undeclared-pools = true
C. You can specify a new queue name when the user submits a job, and the new queue is created dynamically if yarn.scheduler.fair.user-as-default-queue = false
D. You can specify a new queue name per application in the allocations.xml file and have new jobs automatically assigned to the application queue
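(For reference, a sketch of the two FairScheduler placement properties these options mention; the values shown are illustrative.)

    <!-- yarn-site.xml -->
    <property>
      <name>yarn.scheduler.fair.allow-undeclared-pools</name>
      <value>true</value>   <!-- a queue named at submission time is created on the fly if it does not exist -->
    </property>
    <property>
      <name>yarn.scheduler.fair.user-as-default-queue</name>
      <value>true</value>   <!-- jobs submitted without a queue name land in a queue named after the user -->
    </property>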
A slave node in your cluster has four 2 TB hard drives installed (4 x 2 TB). The DataNode is configured to store HDFS blocks on all disks. You set the value of the dfs.datanode.du.reserved parameter to 100 GB. How does this alter HDFS block storage?
A. 25GB on each hard drive may not be used to store HDFS blocks
B. 100GB on each hard drive may not be used to store HDFS blocks
C. All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node
D. A maximum of 100 GB on each hard drive may be used to store HDFS blocks
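(A sketch of the setting the question describes; the value is expressed in bytes and applies to each configured data directory, i.e. each disk.)

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.datanode.du.reserved</name>
      <value>107374182400</value>   <!-- 100 GB, in bytes, reserved for non-HDFS use on every DataNode volume -->
    </property>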
Which two steps must you take if you are running a Hadoop cluster with a single NameNode and six DataNodes, and you want to change a configuration parameter so that it affects all six DataNodes? (Choose two)
A. You must modify the configuration files on the NameNode only. DataNodes read their configuration from the master nodes
B. You must modify the configuration files on each of the DataNodes machines
C. You don't need to restart any daemon, as they will pick up changes automatically
D. You must restart the NameNode daemon to apply the changes to the cluster
E. You must restart all six DataNode daemons to apply the changes to the cluster
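(A minimal sketch of pushing a changed file out by hand, assuming no cluster-management tooling; the hostnames, config path, and daemon script are placeholders for your environment.)

    # copy the edited config to every DataNode, then restart its DataNode daemon
    for host in datanode1 datanode2 datanode3 datanode4 datanode5 datanode6; do
      scp /etc/hadoop/conf/hdfs-site.xml "$host":/etc/hadoop/conf/
      ssh "$host" 'hadoop-daemon.sh stop datanode; hadoop-daemon.sh start datanode'
    done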
Given:
You want to clean up this list by removing jobs where the State is KILLED. Which command do you enter?
A. yarn application -refreshJobHistory
B. yarn application -kill application_1374638600275_0109
C. yarn rmadmin -refreshQueue
D. yarn rmadmin -kill application_1374638600275_0109
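(For reference, the yarn CLI subcommands these options are built from take a leading dash, for example:)

    yarn application -list -appStates KILLED                 # list applications currently in the KILLED state
    yarn application -kill application_1374638600275_0109    # kill the application with the given ID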
Assume you have a file named foo.txt in your local directory. You issue the following three commands:
hadoop fs -mkdir input
hadoop fs -put foo.txt input/foo.txt
hadoop fs -put foo.txt input
What happens when you issue the third command?
A. The write succeeds, overwriting foo.txt in HDFS with no warning
B. The file is uploaded and stored as a plain file named input
C. You get a warning that foo.txt is being overwritten
D. You get an error message telling you that foo.txt already exists, and asking you if you would like to overwrite it.
E. You get an error message telling you that foo.txt already exists. The file is not written to HDFS
F. You get an error message telling you that input is not a directory
G. The write silently fails
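(For reference, hadoop fs -put refuses to overwrite an existing destination file unless the -f option is given:)

    hadoop fs -put foo.txt input        # fails if input/foo.txt already exists
    hadoop fs -put -f foo.txt input     # -f forces the overwrite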