DATABRICKS-MACHINE-LEARNING-ASSOCIATE Online Practice Questions and Answers

Questions 4

Which of the following hyperparameter optimization methods automatically makes informed selections of hyperparameter values based on previous trials for each iterative model evaluation?

A. Random Search

B. Halving Random Search

C. Tree of Parzen Estimators

D. Grid Search

Questions 5

A data scientist is working with a feature set with the following schema:

The customer_id column is the primary key in the feature set. Each of the columns in the feature set has missing values. They want to replace the missing values by imputing a common value for each feature.

Which of the following lists all of the columns in the feature set that need to be imputed using the most common value of the column?

A. customer_id, loyalty_tier

B. loyalty_tier

C. units

D. spend

E. customer_id
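
As an illustration of what "imputing the most common value" means, here is a minimal sketch in plain Python. The schema is not reproduced above, so the column contents below are hypothetical, and `impute_with_mode` is a made-up helper rather than a Databricks or Spark API. Mode imputation is typically applied to categorical columns, while numeric columns more often use the mean or median.

```python
from collections import Counter


def impute_with_mode(values):
    """Replace None entries with the most frequent non-missing value."""
    observed = [v for v in values if v is not None]
    most_common_value, _ = Counter(observed).most_common(1)[0]
    return [most_common_value if v is None else v for v in values]


# Hypothetical values for a categorical column; None marks a missing entry.
loyalty_tier = ["gold", None, "silver", "gold", None, "bronze"]
imputed = impute_with_mode(loyalty_tier)
print(imputed)  # ['gold', 'gold', 'silver', 'gold', 'gold', 'bronze']
```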

Questions 6

A data scientist uses 3-fold cross-validation and the following hyperparameter grid when optimizing model hyperparameters via grid search for a classification problem:

Hyperparameter 1: [2, 5, 10]
Hyperparameter 2: [50, 100]

Which of the following represents the number of machine learning models that can be trained in parallel during this process?

A. 3

B. 5

C. 6

D. 18
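
The arithmetic behind this question can be sketched by enumerating the grid. The snippet below is only an illustration: the hyperparameter names are placeholders, and whether the fits actually run in parallel depends on the available compute. It counts the candidate configurations and the independent model fits that 3-fold cross-validation produces.

```python
from itertools import product

# Grid from the question; the hyperparameter names are placeholders.
grid = {
    "hyperparameter_1": [2, 5, 10],
    "hyperparameter_2": [50, 100],
}
n_folds = 3

# Every combination of values is one candidate configuration.
combinations = list(product(*grid.values()))

# Each candidate is fit once per fold, and no fit depends on another.
fits = [(combo, fold) for combo in combinations for fold in range(n_folds)]

print(len(combinations))  # 6
print(len(fits))          # 18
```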

Questions 7

A health organization is developing a classification model to determine whether or not a patient currently has a specific type of infection. The organization's leaders want to maximize the number of positive cases identified by the model.

Which of the following classification metrics should be used to evaluate the model?

A. RMSE

B. Precision

C. Area under the residual operating curve

D. Accuracy

E. Recall
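
To make the difference between precision and recall concrete, the following sketch computes both from a small set of hypothetical labels (the data is invented for illustration). Recall is the share of actual positive cases the model identifies; precision is the share of predicted positives that are correct.

```python
# Hypothetical labels: 1 = infected, 0 = not infected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positive cases
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms

recall = tp / (tp + fn)     # fraction of actual positives that were found
precision = tp / (tp + fp)  # fraction of predicted positives that are real

print(recall)     # 0.75
print(precision)  # 0.6
```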

Questions 8

A machine learning engineer has grown tired of needing to install the MLflow Python library on each of their clusters. They ask a senior machine learning engineer how their notebooks can load the MLflow library without installing it each time. The senior machine learning engineer suggests that they use Databricks Runtime for Machine Learning.

Which of the following approaches describes how the machine learning engineer can begin using Databricks Runtime for Machine Learning?

A. They can add a line enabling Databricks Runtime ML in their init script when creating their clusters.

B. They can check the Databricks Runtime ML box when creating their clusters.

C. They can select a Databricks Runtime ML version from the Databricks Runtime Version dropdown when creating their clusters.

D. They can set the runtime-version variable in their Spark session to "ml".

Questions 9

An organization is developing a feature repository and is electing to one-hot encode all categorical feature variables. A data scientist suggests that the categorical feature variables should not be one-hot encoded within the feature repository.

Which of the following explanations justifies this suggestion?

A. One-hot encoding is a potentially problematic categorical variable strategy for some machine learning algorithms.

B. One-hot encoding is dependent on the target variable's values, which differ for each application.

C. One-hot encoding is computationally intensive and should only be performed on small samples of training sets for individual machine learning problems.

D. One-hot encoding is not a common strategy for representing categorical feature variables numerically.
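
For reference, one-hot encoding can be sketched in a few lines of plain Python. The `one_hot` helper and the sample values are hypothetical. The sketch shows that the encoding is derived only from the feature's own values and that one column expands into as many columns as there are categories, a representation that suits some algorithms better than others.

```python
def one_hot(values):
    """Return the sorted categories and one indicator row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical categorical feature.
tiers = ["gold", "silver", "gold", "bronze"]
categories, encoded = one_hot(tiers)

print(categories)  # ['bronze', 'gold', 'silver']
print(encoded)     # [[0, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
```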

Questions 10

A data scientist has defined a Pandas UDF function predict to parallelize the inference process for a single-node model:

They have written the following incomplete code block to use predict to score each record of Spark DataFrame spark_df:

Which of the following lines of code can be used to complete the code block to successfully complete the task?

A. predict(*spark_df.columns)

B. mapInPandas(predict)

C. predict(Iterator(spark_df))

D. mapInPandas(predict(spark_df.columns))

E. predict(spark_df.columns)

Questions 11

A machine learning engineering team has a Job with three successive tasks. Each task runs a single notebook. The team has been alerted that the Job has failed in its latest run.

Which of the following approaches can the team use to identify which task is the cause of the failure?

A. Run each notebook interactively

B. Review the matrix view in the Job's runs

C. Migrate the Job to a Delta Live Tables pipeline

D. Change each Task's setting to use a dedicated cluster

Questions 12

A data scientist has produced three new models for a single machine learning problem. In the past, the solution used just one model. All four models have nearly the same prediction latency, but a machine learning engineer suggests that the new solution will be less time efficient during inference.

In which situation will the machine learning engineer be correct?

A. When the new solution requires if-else logic determining which model to use to compute each prediction

B. When the new solution's models have an average latency that is larger than the size of the original model

C. When the new solution requires the use of fewer feature variables than the original model

D. When the new solution requires that each model computes a prediction for every record

E. When the new solution's models have an average size that is larger than the size of the original model

Questions 13

A data scientist is attempting to tune a logistic regression model logistic using scikit-learn. They want to specify a search space for two hyperparameters and let the tuning process randomly select values for each evaluation.

They attempt to run the following code block, but it does not accomplish the desired task:

Which of the following changes can the data scientist make to accomplish the task?

A. Replace the GridSearchCV operation with RandomizedSearchCV

B. Replace the GridSearchCV operation with cross_validate

C. Replace the GridSearchCV operation with ParameterGrid

D. Replace the random_state=0 argument with random_state=1

E. Replace the penalty=['l2', 'l1'] argument with penalty=uniform('l2', 'l1')
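
The idea of randomly selecting values from a search space for each evaluation can be sketched in plain Python. This is not scikit-learn's implementation and not the code block from the question, which is not reproduced above. The hyperparameter `C`, its range, and the number of iterations are assumptions chosen for illustration; only the penalty values and the seed of 0 come from the answer options.

```python
import random

rng = random.Random(0)  # fixed seed so the sampled candidates are reproducible

# Search space: a continuous range for C (assumed) and two penalty choices.
search_space = {
    "C": lambda: rng.uniform(0.0, 4.0),
    "penalty": lambda: rng.choice(["l2", "l1"]),
}

n_iter = 5  # number of randomly sampled candidates (assumed)

# Unlike a grid search, which enumerates every combination, each candidate
# here is an independent random draw from the search space.
candidates = [
    {name: draw() for name, draw in search_space.items()}
    for _ in range(n_iter)
]

for candidate in candidates:
    print(candidate)
```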

Exam Code: DATABRICKS-MACHINE-LEARNING-ASSOCIATE

Exam Name: Databricks Certified Machine Learning Associate

Last Update: Jul 01, 2025

Questions: 74
