Databricks CERTIFIED MACHINE LEARNING PROFESSIONAL Exam Questions

Questions for the CERTIFIED MACHINE LEARNING PROFESSIONAL were updated on: Jan 11, 2025

Page 1 out of 6. Viewing questions 1-10 out of 57

Question 1

A data scientist is utilizing MLflow to track their machine learning experiments. After completing a series of runs for the experiment with experiment ID exp_id, the data scientist wants to programmatically work with the experiment run data in a Spark DataFrame. They have an active MLflow Client client and an active Spark session spark.
Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark DataFrame?

  • A. client.list_run_infos(exp_id)
  • B. spark.read.format("delta").load(exp_id)
  • C. There is no way to programmatically return row-level results from an MLflow Experiment.
  • D. mlflow.search_runs(exp_id)
  • E. spark.read.format("mlflow-experiment").load(exp_id)
Answer:

e
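
A minimal sketch of the read in option E, assuming a Databricks runtime where the `mlflow-experiment` Spark data source is available. The helper name `load_experiment_runs` is hypothetical; `spark` and `exp_id` are the names used in the question.

```python
def load_experiment_runs(spark, exp_id):
    """Return run-level data for an MLflow experiment as a Spark DataFrame.

    Assumption: the "mlflow-experiment" data source is registered on the
    cluster, as it is on Databricks runtimes.
    """
    # The experiment ID is passed as the "path" of the data source.
    return spark.read.format("mlflow-experiment").load(exp_id)
```

For comparison, `mlflow.search_runs` returns a pandas DataFrame by default rather than a Spark DataFrame.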

Question 2

A machine learning engineer wants to move their model version model_version for the MLflow Model Registry model model from the Staging stage to the Production stage using MLflow Client client. At the same time, they would like to archive any model versions that are already in the Production stage.
Which of the following code blocks can they use to accomplish the task?

  • D. None
  • E. It is not possible to transition models from Production to Archived.
Answer:

c
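
A minimal sketch of the transition described in the question, assuming the `transition_model_version_stage` method of `MlflowClient` and its `archive_existing_versions` flag, which exist in MLflow releases that support registry stages. The helper name `promote_to_production` is hypothetical; `client`, `model`, and `model_version` are the names from the question.

```python
def promote_to_production(client, model, model_version):
    """Move one model version to Production and archive the current ones.

    `client` is expected to be an mlflow.tracking.MlflowClient (or any object
    exposing the same method), so the helper can be tried without a server.
    """
    return client.transition_model_version_stage(
        name=model,
        version=model_version,
        stage="Production",
        # Archive versions that are already in the target stage.
        archive_existing_versions=True,
    )
```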

Question 3

A data scientist has developed and logged a scikit-learn random forest model model, and then they ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the feature_importances_ of the original model object.
Which of the following lines of code can be used to restore the model object so that feature_importances_ is available?

  • A. mlflow.load_model(model_uri)
  • B. client.list_artifacts(run_id)["feature-importances.csv"]
  • C. mlflow.sklearn.load_model(model_uri)
  • D. This can only be viewed in the MLflow Experiments UI
  • E. client.pyfunc.load_model(model_uri)
Answer:

c
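
A minimal sketch of restoring the native scikit-learn object, assuming the model was logged with the scikit-learn flavor. The helper name `restore_feature_importances` and its `loader` parameter are hypothetical; the default loader is `mlflow.sklearn.load_model`, which returns the original estimator rather than a generic pyfunc wrapper.

```python
def restore_feature_importances(model_uri, loader=None):
    """Load a logged scikit-learn model and return its feature importances.

    `loader` is an illustrative hook for testing; by default the native
    scikit-learn flavor loader from MLflow is used.
    """
    if loader is None:
        # Imported lazily so the helper can be defined without MLflow installed.
        import mlflow.sklearn
        loader = mlflow.sklearn.load_model
    model = loader(model_uri)
    # Available because the native estimator object is restored.
    return model.feature_importances_
```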

Question 4

A machine learning engineer needs to deliver predictions of a machine learning model in real-time. However, the feature values needed for computing the predictions are available one week before the query time.
Which of the following is a benefit of using a batch serving deployment in this scenario rather than a real-time serving deployment where predictions are computed at query time?

  • A. Batch serving has built-in capabilities in Databricks Machine Learning
  • B. There is no advantage to using batch serving deployments over real-time serving deployments
  • C. Computing predictions in real-time provides more up-to-date results
  • D. Testing is not possible in real-time serving deployments
  • E. Querying stored predictions can be faster than computing predictions in real-time
Answer:

e
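
The trade-off in this scenario can be sketched in plain Python: because the features are known ahead of time, predictions can be computed in a batch job and stored, so query time reduces to a lookup. All names here (`precompute_predictions`, `serve_prediction`, the in-memory dictionary used as a store) are hypothetical stand-ins, not a Databricks API.

```python
from typing import Callable, Dict, List


def precompute_predictions(
    feature_rows: Dict[str, List[float]],
    predict: Callable[[List[float]], float],
) -> Dict[str, float]:
    """Batch step: score every known entity once and keep the results."""
    return {key: predict(features) for key, features in feature_rows.items()}


def serve_prediction(store: Dict[str, float], key: str) -> float:
    """Query step: return the stored prediction without running the model."""
    return store[key]
```

In a real deployment the store would typically be a table or key-value database rather than a dictionary.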

Question 5

A machine learning engineer is converting a Hyperopt-based hyperparameter tuning process from manual MLflow logging to MLflow Autologging. They are trying to determine how to manage nested Hyperopt runs with MLflow Autologging.
Which of the following approaches will create a single parent run for the process and a child run for each unique combination of hyperparameter values when using Hyperopt and MLflow Autologging?

  • A. Starting a manual parent run before calling fmin
  • B. Ensuring that a built-in model flavor is used for the model logging
  • C. Starting a manual child run within the objective_function
  • D. There is no way to accomplish nested runs with MLflow Autologging and Hyperopt
  • E. MLflow Autologging will automatically accomplish this task with Hyperopt
Answer:

a

Question 6

A machine learning engineer wants to log feature importance data from a CSV file at path importance_path with an MLflow run for model model.
Which of the following code blocks will accomplish this task inside of an existing MLflow run block?

  • B. None
  • C. mlflow.log_data(importance_path, "feature-importance.csv")
  • D. mlflow.log_artifact(importance_path, "feature-importance.csv")
  • E. None of these code blocks can accomplish the task.
Answer:

d
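
A minimal sketch of logging the CSV file with `mlflow.log_artifact`, intended to be called inside an active run. The helper name `log_feature_importance` and its `mlflow_module` parameter are hypothetical. Note that in MLflow's API the second argument, `artifact_path`, names the destination directory inside the run's artifact store.

```python
def log_feature_importance(importance_path, mlflow_module=None):
    """Attach the file at `importance_path` to the active MLflow run.

    `mlflow_module` is an illustrative hook for testing; by default the real
    `mlflow` package is imported and used.
    """
    if mlflow_module is None:
        # Imported lazily so the helper can be defined without MLflow installed.
        import mlflow as mlflow_module
    # Second argument is the artifact_path (destination directory) in the run.
    mlflow_module.log_artifact(importance_path, "feature-importance.csv")
```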

Question 7

Which of the following is a probable response to identifying drift in a machine learning application?

  • A. None of these responses
  • B. Retraining and deploying a model on more recent data
  • C. All of these responses
  • D. Rebuilding the machine learning application with a new label variable
  • E. Sunsetting the machine learning application
Answer:

c

Question 8

Which of the following tools can assist in real-time deployments by packaging software with its own application, tools, and libraries?

  • A. Cloud-based compute
  • B. None of these tools
  • C. REST APIs
  • D. Containers
  • E. Autoscaling clusters
Answer:

d

Question 9

Which of the following Databricks-managed MLflow capabilities is a centralized model store?

  • A. Models
  • B. Model Registry
  • C. Model Serving
  • D. Feature Store
  • E. Experiments
Answer:

b

Question 10

After a data scientist noticed that a column was missing from a production feature set stored as a Delta table, the machine learning engineering team has been tasked with determining when the column was dropped from the feature set.
Which of the following SQL commands can be used to accomplish this task?

  • A. VERSION
  • B. DESCRIBE
  • C. HISTORY
  • D. DESCRIBE HISTORY
  • E. TIMESTAMP
Answer:

d
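
A minimal sketch of running `DESCRIBE HISTORY` from Python, assuming an active Spark session with Delta Lake support. The helper name `table_history` and the table name in the usage note are hypothetical. Each row of the result describes one table version, including its timestamp and the operation that produced it, which is what allows a schema change to be dated.

```python
def table_history(spark, table_name):
    """Return the Delta transaction history of `table_name` as a DataFrame.

    `table_name` is interpolated directly into the SQL text, so it should come
    from a trusted source.
    """
    return spark.sql(f"DESCRIBE HISTORY {table_name}")
```

For example, `table_history(spark, "features.customer_features")` could then be inspected version by version to find the write that removed the column.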
