Which of the following statements about stages is correct?
A. Different stages in a job may be executed in parallel.
B. Stages consist of one or more jobs.
C. Stages ephemerally store transactions, before they are committed through actions.
D. Tasks in a stage may be executed by multiple machines at the same time.
E. Stages may contain multiple actions, narrow, and wide transformations.
Which of the following code blocks creates a new one-column, two-row DataFrame dfDates with column date of type timestamp?
A. dfDates = spark.createDataFrame(["23/01/2022 11:28:12","24/01/2022 10:58:34"], ["date"])
   dfDates = dfDates.withColumn("date", to_timestamp("dd/MM/yyyy HH:mm:ss", "date"))
B. dfDates = spark.createDataFrame([("23/01/2022 11:28:12",),("24/01/2022 10:58:34",)], ["date"])
   dfDates = dfDates.withColumnRenamed("date", to_timestamp("date", "yyyy-MM-ddHH:mm:ss"))
C. dfDates = spark.createDataFrame([("23/01/2022 11:28:12",),("24/01/2022 10:58:34",)], ["date"])
   dfDates = dfDates.withColumn("date", to_timestamp("date", "dd/MM/yyyy HH:mm:ss"))
D. dfDates = spark.createDataFrame(["23/01/2022 11:28:12","24/01/2022 10:58:34"], ["date"])
   dfDates = dfDates.withColumnRenamed("date", to_datetime("date", "yyyy-MM-ddHH:mm:ss"))
E. dfDates = spark.createDataFrame([("23/01/2022 11:28:12",),("24/01/2022 10:58:34",)], ["date"])
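For reference, to_timestamp takes the column first and the format pattern second, and a list of one-element tuples plus a single column name yields a one-column DataFrame. A minimal sketch, assuming a SparkSession named spark is already available:

from pyspark.sql.functions import to_timestamp

# two single-element tuples -> two rows; one column name -> one column of type string
dfDates = spark.createDataFrame([("23/01/2022 11:28:12",), ("24/01/2022 10:58:34",)], ["date"])
# overwrite the string column with its parsed timestamp equivalent
dfDates = dfDates.withColumn("date", to_timestamp("date", "dd/MM/yyyy HH:mm:ss"))
dfDates.printSchema()  # root |-- date: timestamp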
Which of the following code blocks reduces a DataFrame from 12 to 6 partitions and performs a full shuffle?
A. DataFrame.repartition(12)
B. DataFrame.coalesce(6).shuffle()
C. DataFrame.coalesce(6)
D. DataFrame.coalesce(6, shuffle=True)
E. DataFrame.repartition(6)
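For context, coalesce can only reduce the partition count and avoids a full shuffle, while repartition always performs a full shuffle and can either increase or decrease the count. A minimal sketch, assuming a DataFrame df that currently has 12 partitions:

# repartition(6) triggers a full shuffle and yields exactly 6 partitions
df_shuffled = df.repartition(6)
print(df_shuffled.rdd.getNumPartitions())  # 6

# coalesce(6) also yields 6 partitions, but merges existing ones without a full shuffle
df_merged = df.coalesce(6)
print(df_merged.rdd.getNumPartitions())  # 6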
The code block shown below should return a DataFrame with two columns, itemId and col. For each element in the column attributes of DataFrame itemsDf there should be a separate row in which the column itemId contains the associated itemId from itemsDf. The new DataFrame should only contain rows for those rows of itemsDf whose column attributes contains the element cozy.
A sample of DataFrame itemsDf is below.
Code block:
itemsDf.__1__(__2__).__3__(__4__, __5__(__6__))
A. 1. filter  2. array_contains("cozy")  3. select  4. "itemId"  5. explode  6. "attributes"
B. 1. where  2. "array_contains(attributes, 'cozy')"  3. select  4. itemId  5. explode  6. attributes
C. 1. filter  2. "array_contains(attributes, 'cozy')"  3. select  4. "itemId"  5. map  6. "attributes"
D. 1. filter  2. "array_contains(attributes, cozy)"  3. select  4. "itemId"  5. explode  6. "attributes"
E. 1. filter  2. "array_contains(attributes, 'cozy')"  3. select  4. "itemId"  5. explode  6. "attributes"
Which of the following code blocks returns a DataFrame that matches the multi-column DataFrame itemsDf, except that integer column itemId has been converted into a string column?
A. itemsDf.withColumn("itemId", convert("itemId", "string"))
B. itemsDf.withColumn("itemId", col("itemId").cast("string"))
C. itemsDf.select(cast("itemId", "string"))
D. itemsDf.withColumn("itemId", col("itemId").convert("string"))
E. spark.cast(itemsDf, "itemId", "string")
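As a reference point, Column.cast accepts a type name string, and withColumn replaces an existing column when given its name while leaving all other columns untouched. A minimal sketch, assuming an itemsDf with an integer column itemId:

from pyspark.sql.functions import col

itemsDfStr = itemsDf.withColumn("itemId", col("itemId").cast("string"))
itemsDfStr.printSchema()  # itemId is now string; all other columns are unchanged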
Which of the following describes Spark's way of managing memory?
A. Spark uses a subset of the reserved system memory.
B. Storage memory is used for caching partitions derived from DataFrames.
C. As a general rule for garbage collection, Spark performs better on many small objects than few big objects.
D. Disabling serialization potentially greatly reduces the memory footprint of a Spark application.
E. Spark's memory usage can be divided into three categories: Execution, transaction, and storage.
Which of the following code blocks prints out in how many rows the expression Inc. appears in the string-type column supplier of DataFrame itemsDf?
A. counter = 0

   for index, row in itemsDf.iterrows():
       if 'Inc.' in row['supplier']:
           counter = counter + 1

   print(counter)
B. counter = 0

   def count(x):
       if 'Inc.' in x['supplier']:
           counter = counter + 1

   itemsDf.foreach(count)
   print(counter)
C. print(itemsDf.foreach(lambda x: 'Inc.' in x))
D. print(itemsDf.foreach(lambda x: 'Inc.' in x).sum())
E. accum = sc.accumulator(0)

   def check_if_inc_in_supplier(row):
       if 'Inc.' in row['supplier']:
           accum.add(1)

   itemsDf.foreach(check_if_inc_in_supplier)
   print(accum.value)
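For context, foreach runs the supplied function on the executors, so an ordinary Python counter incremented inside it is never sent back to the driver; an accumulator is the mechanism Spark provides for driver-visible counts. A minimal sketch, assuming a SparkContext sc and an itemsDf with a supplier column:

accum = sc.accumulator(0)

def check_if_inc_in_supplier(row):
    # executed on the executors; accumulator updates are shipped back to the driver
    if 'Inc.' in row['supplier']:
        accum.add(1)

itemsDf.foreach(check_if_inc_in_supplier)
print(accum.value)  # number of rows whose supplier contains 'Inc.'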
Which of the following options describes the responsibility of the executors in Spark?
A. The executors accept jobs from the driver, analyze those jobs, and return results to the driver.
B. The executors accept tasks from the driver, execute those tasks, and return results to the cluster manager.
C. The executors accept tasks from the driver, execute those tasks, and return results to the driver.
D. The executors accept tasks from the cluster manager, execute those tasks, and return results to the driver.
E. The executors accept jobs from the driver, plan those jobs, and return results to the cluster manager.
The code block displayed below contains an error. The code block should merge the rows of DataFrames transactionsDfMonday and transactionsDfTuesday into a new DataFrame, matching column names and inserting null values where column names do not appear in both DataFrames. Find the error.
Sample of DataFrame transactionsDfMonday:
+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            5|     null| null|   null|        2|null|
|            6|        3|    2|     25|        2|null|
+-------------+---------+-----+-------+---------+----+
Sample of DataFrame transactionsDfTuesday:
+-------+-------------+---------+-----+
|storeId|transactionId|productId|value|
+-------+-------------+---------+-----+
|     25|            1|        1|    4|
|      2|            2|        2|    7|
|      3|            4|        2| null|
|   null|            5|        2| null|
+-------+-------------+---------+-----+
Code block:
sc.union([transactionsDfMonday, transactionsDfTuesday])
A. The DataFrames' RDDs need to be passed into the sc.union method instead of the DataFrame variable names.
B. Instead of union, the concat method should be used, making sure to not use its default arguments.
C. Instead of the Spark context, transactionsDfMonday should be called with the join method instead of the union method, making sure to use its default arguments.
D. Instead of the Spark context, transactionsDfMonday should be called with the union method.
E. Instead of the Spark context, transactionsDfMonday should be called with the unionByName method instead of the union method, making sure to not use its default arguments.
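For context, DataFrame.union matches columns purely by position, whereas unionByName matches them by name and, with allowMissingColumns=True (available since Spark 3.1), fills columns that exist in only one DataFrame with nulls. A minimal sketch, assuming both DataFrames from the question exist:

combined = transactionsDfMonday.unionByName(transactionsDfTuesday, allowMissingColumns=True)
# columns present in only one DataFrame (e.g. predError, f) are null for the other DataFrame's rows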
The code block shown below should return a new 2-column DataFrame that shows one attribute from column attributes per row next to the associated itemName, for all suppliers in column supplier whose name includes Sports. Choose the answer that correctly fills the blanks in the code block to accomplish this.
Sample of DataFrame itemsDf:
+------+----------------------------------+-----------------------------+-------------------+
|itemId|itemName                          |attributes                   |supplier           |
+------+----------------------------------+-----------------------------+-------------------+
|1     |Thick Coat for Walking in the Snow|[blue, winter, cozy]         |Sports Company Inc.|
|2     |Elegant Outdoors Summer Dress     |[red, summer, fresh, cooling]|YetiX              |
|3     |Outdoors Backpack                 |[green, summer, travel]      |Sports Company Inc.|
+------+----------------------------------+-----------------------------+-------------------+
Code block:
itemsDf.__1__(__2__).select(__3__, __4__)
A. 1. filter  2. col("supplier").isin("Sports")  3. "itemName"  4. explode(col("attributes"))
B. 1. where  2. col("supplier").contains("Sports")  3. "itemName"  4. "attributes"
C. 1. where  2. col(supplier).contains("Sports")  3. explode(attributes)  4. itemName
D. 1. where  2. "Sports".isin(col("Supplier"))  3. "itemName"  4. array_explode("attributes")
E. 1. filter  2. col("supplier").contains("Sports")  3. "itemName"  4. explode("attributes")