You have been assigned to study the effect of a pricing model for online transactions on daily revenue. When is the analytics lifecycle considered complete?
A. When written documentation has been produced and the code has been handed off to the DBA/operations.
B. When a model has been completely developed and its results are statistically acceptable.
C. When the results of the model have been presented to both the internal analytics team and the business owner of the project.
D. When a model has been completely developed based on both a sample of the data and the entire set of data available.
Refer to the exhibit.
You have created a density plot of purchase amounts from a retail website as shown. What should you do next?
A. Recreate the plot using the barplot() function
B. Use the rug() function to add elements to the plot
C. Recreate the density plot using a log normal distribution of the purchase amount data
D. Reduce the sample size of the purchase amount data used to create the plot
Which functionality do regular expressions provide?
A. text pattern matching
B. underflow prevention
C. increased numerical precision
D. decreased processing complexity
When is a Wilcoxon Rank-Sum test used?
A. When an assumption about the distribution of the populations cannot be made
B. When the data can be easily sorted
C. When the populations represent the sums of other values
D. When the data cannot be easily sorted
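For reference, the statistic behind the rank-sum test named in the question above can be sketched in a few lines. This is a minimal illustration, not a full test: it only computes the rank sum W for one sample within the pooled data, using average ranks for ties. The function names and the purchase amounts are hypothetical and are not taken from any exhibit.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_statistic(sample_a, sample_b):
    """Sum of the ranks of sample_a within the pooled data (the W statistic)."""
    pooled = list(sample_a) + list(sample_b)
    ranks = average_ranks(pooled)
    return sum(ranks[: len(sample_a)])


# Hypothetical purchase amounts for two customer groups (illustration only).
group_a = [12.0, 15.5, 9.0, 30.0]
group_b = [22.0, 41.0, 35.5, 28.0, 15.5]
w = rank_sum_statistic(group_a, group_b)
```

Only the ordering of the pooled observations enters the statistic; the raw magnitudes are used solely to rank them.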
A data scientist is asked to implement an article recommendation feature for an online magazine. The magazine does not want to use client tracking technologies such as cookies or reading history. Therefore, only the style and subject matter of the current article are available for making recommendations. All of the magazine's articles are stored in a database in a format suitable for analytics.
Which method should the data scientist try first?
A. K Means Clustering
B. Naive Bayesian
C. Logistic Regression
D. Association Rules
Which word or phrase completes the statement? Structured data is to OLAP data as quasi-structured data is to ____
A. Clickstream data
B. XML data
C. Text documents
D. Image files
In R, functions like plot() and hist() are known as what?
A. generic functions
B. virtual methods
C. virtual functions
D. generic methods
Refer to the exhibit.
You are building a decision tree. In this exhibit, four variables are listed with their respective values of info-gain.
Based on this information, on which attribute would you expect the next split to be in the decision tree?
A. Credit Score
B. Age
C. Income
D. Gender
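As background for the info-gain values mentioned in the question above, the following sketch shows one common way information gain is computed: the entropy of the class labels minus the weighted entropy of the groups produced by splitting on an attribute. The toy labels and the two attributes are hypothetical and do not correspond to the variables in the exhibit.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(labels, attribute_values):
    """Entropy of labels minus the weighted entropy after splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (illustration only).
labels = ["yes", "yes", "no", "no"]
separating_attr = ["a", "a", "b", "b"]    # splits the classes cleanly
uninformative_attr = ["a", "b", "a", "b"]  # leaves each group evenly mixed

gain_separating = information_gain(labels, separating_attr)
gain_uninformative = information_gain(labels, uninformative_attr)
```

In this toy case the cleanly separating attribute recovers the full one bit of label entropy, while the uninformative attribute recovers none.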
Refer to the exhibit.
The graph represents an ROC space with four classifiers labelled A through D. Which point in the graph represents a perfect classification?
A. S
B. P
C. Q
D. R
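As a reminder of how a classifier is placed in ROC space, the sketch below converts confusion-matrix counts into a (false positive rate, true positive rate) pair. The function name and the counts are hypothetical; they are not read from the exhibit.

```python
def roc_point(tp, fp, tn, fn):
    """Return (false positive rate, true positive rate) for one classifier."""
    fpr = fp / (fp + tn)  # share of actual negatives flagged as positive
    tpr = tp / (tp + fn)  # share of actual positives correctly flagged
    return fpr, tpr


# Hypothetical confusion-matrix counts (illustration only).
no_errors = roc_point(tp=50, fp=0, tn=50, fn=0)
coin_flip = roc_point(tp=25, fp=25, tn=25, fn=25)
```

A classifier that makes no errors has a false positive rate of 0 and a true positive rate of 1, while one that behaves like a coin flip falls on the diagonal where the two rates are equal.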
Refer to the exhibit.
In the exhibit, a correlogram is provided based on an autocorrelation analysis of a sample dataset. What can you conclude from only this exhibit?
A. There is significant autocorrelation through lag 3
B. There is no structure left to model in the data
C. Lag 7 has a significant negative autocorrelation
D. Differencing is required before proceeding with any analysis
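For context on what a correlogram plots, the sketch below computes the sample autocorrelation at a given lag together with the approximate 95% bounds (plus or minus 1.96 divided by the square root of the series length) that are commonly drawn as dashed lines. The series is a hypothetical alternating sequence chosen so the values are easy to check by hand; it is not the dataset behind the exhibit.

```python
from math import sqrt


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return numer / denom


def significance_bound(n):
    """Approximate 95% bound for the autocorrelation of white noise of length n."""
    return 1.96 / sqrt(n)


# Hypothetical alternating series (illustration only).
series = [1.0, -1.0] * 10
bound = significance_bound(len(series))
r0 = autocorrelation(series, 0)
r1 = autocorrelation(series, 1)
r2 = autocorrelation(series, 2)
```

Lags whose autocorrelation falls outside the band from -bound to +bound are the ones usually read as significant.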