A00-255: SAS Predictive Modeling Using SAS Enterprise Miner 14

Successful candidates should have the ability to:

  • Prepare data
  • Build predictive models
  • Assess and implement models
  • Perform pattern analysis.

SAS A00-255 Exam Summary:

Exam Name  SAS Predictive Modeling Using SAS Enterprise Miner 14
Exam Code  A00-255
Exam Duration  165 minutes
Exam Questions  55-60
Passing Score  725/1000
Exam Price  $250 (USD)
Books  Predictive Modeling with SAS Enterprise Miner: Practical Solutions for Business Applications, Third Edition
Sample Questions  SAS Predictive Modeler Certification Sample Question
Practice Exam  SAS Predictive Modeler Certification Practice Exam

SAS A00-255 Exam Topics:

Objective Details 
Data Sources – 20-25%
Create data sources from SAS tables in Enterprise Miner – Use the Basic Metadata Advisor
– Use the Advanced Metadata Advisor
– Customize the Advanced Metadata Advisor
– Set Role and Level metadata for data source variables
– Set the Role of the table (raw, scoring, transactional, etc.)
Explore and assess data sources – Create and interpret plots, including histograms, pie charts, scatter plots, time series plots, and box plots
– Identify distributions
– Find outlying observations
– Find number (or percent) of missing observations
– Find levels of nominal variables
– Explore associations between variables using plots by highlighting and selecting data
– Compare balanced and actual response rates when oversampling has been performed
– Explore data with the STAT EXPLORE node
– Explore input variable sample statistics
– Browse data set observations (cases)
Modify source data – Replace zero values with missing indicators using the REPLACEMENT node
– Use the TRANSFORMATION node to correct problems with input data sources, such as variable distribution or outliers.
– Use the IMPUTE node to impute missing values and create missing value indicators
– Reduce the levels of a categorical variable
– Use the FILTER node to remove cases
Prepare data to be submitted to a predictive model – Select a portion of a data set using the SAMPLE node
– Partition data with the PARTITION Node
– Use the VARIABLE SELECTION node to identify important variables to be included in a predictive model.
– Use the PARTIAL LEAST SQUARES node to identify important variables to be included in a predictive model.
– Use the DECISION TREE or REGRESSION node to identify important variables to be included in a predictive model.
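
As a study aid for the data preparation objectives above, the sketch below illustrates the ideas behind partitioning and imputation in plain Python. It is not Enterprise Miner code and does not reproduce the behavior of the PARTITION or IMPUTE nodes; the function names, the 50/30/20 split, and the seed are illustrative assumptions.

```python
import random
from statistics import median


def partition(rows, fractions=(0.5, 0.3, 0.2), seed=12345):
    """Shuffle rows and split them into training, validation, and test lists.

    The fractions are hypothetical; the test set receives whatever remains
    after the training and validation rows are taken.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def impute_median(values):
    """Replace None with the median of the observed values.

    Returns the imputed list and a parallel list of 0/1 missing-value
    indicators, so a model can still see which values were originally missing.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    imputed = [fill if v is None else v for v in values]
    indicators = [1 if v is None else 0 for v in values]
    return imputed, indicators


train, valid, test = partition(range(10))
imputed, was_missing = impute_median([1, None, 3, 5])
```

In practice the fill value would be computed on the training partition only and then reused for the validation and test partitions, so that no information leaks from the holdout data.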
Building Predictive Models – 35-40%
Describe key predictive modeling terms and concepts – Data partitioning: training, validation, test data sets
– Observations (cases), independent (input) variables, dependent (target) variables
– Measurement scales: interval, ordinal, nominal (categorical), binary variables
– Prediction types: decisions, rankings, estimates
– Dimensionality, redundancy, irrelevancy
– Decision trees, neural networks, regression models
– Model optimization, overfitting, underfitting, model selection
– Describe ensemble models
Build predictive models using decision trees – Explain how decision trees identify split points
– Build decision trees in interactive mode
– Change splitting rules
– Explain how missing values can be handled by decision trees
– Assess probability using a decision tree
– Prune decision trees
– Adjust properties of the DECISION TREE node, including: Subtree Method, Number of Branches, Leaf Size, Significance Level, Surrogate Rules, Bonferroni Adjustment
– Interpret results of the decision tree node, including: trees, leaf statistics, treemaps, score rankings overlay, fit statistics, output, variable importance, subtree assessment plots
– Explore model output (exported) data sets
Build predictive models using regression – Explain the relationship between target variable and regression technique
– Explain linear regression
– Explain logistic regression (Logit link function, maximum likelihood)
– Explain the impact of missing values on regression models
– Select inputs for regression models using forward, backward, stepwise selection techniques
– Adjust thresholds for including variables in a model
– Interpret a logistic regression model using log odds
– Interpret the results of a REGRESSION node (Output, Fit Statistics, Score Ranking Overlay charts)
– Use fit statistics and iteration plots to select the optimum regression model for different decision types
– Add polynomial regression terms to regression models.
– Determine when to add polynomial terms to linear regression models.
Build predictive models using neural networks – Explain the theory of neural networks (hidden units, tanh function, bias vs. intercept, variable standardization)
– Build a neural network model
– Use regression models to select inputs for a neural network
– Explain how neural networks optimize their model (stopped training)
– Recognize overfit neural network models.
– Interpret the results of a NEURAL NETWORK node, including: Output, Fit Statistics, Iteration Plots, and Score Rankings Overlay charts
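
The objectives above on the logit link function and on interpreting a logistic regression model using log odds can be illustrated with a few lines of arithmetic. The sketch below is plain Python, not Enterprise Miner output; the function names, the intercept, and the coefficient value are hypothetical and chosen only for illustration.

```python
import math


def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(log_odds):
    """Convert log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def predict_probability(intercept, coefficients, inputs):
    """Logistic regression prediction: the linear predictor is on the log-odds scale."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return inverse_logit(log_odds)


def odds_ratio(coefficient, delta=1.0):
    """Multiplicative change in the odds when an input increases by delta."""
    return math.exp(coefficient * delta)


# Hypothetical model: intercept -2.0 and a single input with coefficient 0.7.
p = predict_probability(-2.0, [0.7], [1.0])
ratio = odds_ratio(0.7)  # odds multiply by exp(0.7) per unit increase in the input
```

The point to take away is that a coefficient is additive on the log-odds scale and multiplicative on the odds scale; its effect on the probability itself depends on where the case sits on the curve.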
Predictive Model Assessment and Implementation – 25-30%
Use the correct fit statistic for different prediction types – Misclassification
– Average Square Error
– Profit/Loss
– Other standard model fit statistics
Use decision processing to adjust for oversampling (separate sampling) – Explain reasons for oversampling data
– Adjust prior probabilities 
Use profit/loss information to assess model performance – Build a profit/loss matrix
– Add a profit/loss matrix to a predictive model
– Determine an appropriate value to use for expected profit/loss for primary outcome
– Optimize models based on expected profit/loss 
Compare models with the MODEL COMPARISON node – Model assessment statistics
– ROC Chart
– Score Rankings Chart, including (cumulative) % response chart, (cumulative) Lift chart, gains chart.
– Total expected profit
– Effect of oversampling 
Score data sets within Enterprise Miner – Configure a data set to be scored in Enterprise Miner
– Use the SCORE node to score new data
– Save scored data to an external location with the SAVE DATA node
– Export SAS score code 
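
Two of the assessment objectives above, adjusting prior probabilities after oversampling and choosing decisions from a profit/loss matrix, come down to short calculations. The sketch below shows one common form of each in plain Python. It is not the code that Enterprise Miner generates; the function names, the priors, and the profit values are assumptions made for illustration.

```python
def adjust_for_priors(p_sample, sample_prior, true_prior):
    """Rescale an event probability estimated on oversampled data.

    p_sample     : predicted event probability from the model fit to the sample
    sample_prior : proportion of events in the (oversampled) training data
    true_prior   : proportion of events in the population
    """
    w_event = true_prior / sample_prior
    w_nonevent = (1.0 - true_prior) / (1.0 - sample_prior)
    numerator = p_sample * w_event
    return numerator / (numerator + (1.0 - p_sample) * w_nonevent)


def best_decision(p_event, profit_matrix):
    """Pick the decision with the highest expected profit.

    profit_matrix maps each decision to a pair
    (profit if the event occurs, profit if it does not).
    """
    expected = {
        decision: p_event * if_event + (1.0 - p_event) * if_nonevent
        for decision, (if_event, if_nonevent) in profit_matrix.items()
    }
    choice = max(expected, key=expected.get)
    return choice, expected[choice]


# Hypothetical setting: events are 5% of the population but 50% of the sample.
p_adjusted = adjust_for_priors(0.5, sample_prior=0.5, true_prior=0.05)

# Hypothetical profit matrix: soliciting costs 0.5 and a response earns 15.
profits = {"solicit": (14.5, -0.5), "ignore": (0.0, 0.0)}
decision, expected_profit = best_decision(p_adjusted, profits)
```

With a profit matrix in place, the cutoff for acting is no longer 0.5: a case is worth soliciting whenever its adjusted probability makes the expected profit of that decision exceed the alternative.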
Pattern Analysis – 10-15% 
Identify clusters of similar data with the CLUSTER and SEGMENT PROFILE nodes – Select variables to use to define the clusters
– Standardize variable scales
– Explore clusters with results output and plots
– Compare distribution of variables within clusters
Perform association and sequence analysis (market basket analysis) – Explain association concepts (Support, confidence, expected confidence, lift, difference between association and sequence rules)
– Create a data set for association analysis
– Interpret the results and graphs of the ASSOCIATION node.
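
The association concepts listed above (support, confidence, expected confidence, and lift) can be computed by hand for a rule A => B. The sketch below does so in plain Python on a tiny made-up set of baskets. It is not the ASSOCIATION node's implementation; the function name and the data are hypothetical, and the antecedent and consequent are each assumed to occur in at least one transaction.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute association measures for the rule antecedent => consequent.

    support             : share of transactions containing both sides
    confidence          : share of antecedent transactions that also contain the consequent
    expected_confidence : share of all transactions containing the consequent
    lift                : confidence divided by expected confidence
    """
    a, b = set(antecedent), set(consequent)
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    n_a = sum(1 for t in baskets if a <= t)
    n_b = sum(1 for t in baskets if b <= t)
    n_ab = sum(1 for t in baskets if (a | b) <= t)
    confidence = n_ab / n_a
    expected_confidence = n_b / n
    return {
        "support": n_ab / n,
        "confidence": confidence,
        "expected_confidence": expected_confidence,
        "lift": confidence / expected_confidence,
    }


# Made-up market baskets for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
metrics = rule_metrics(baskets, {"bread"}, {"milk"})
```

A lift above 1 suggests the consequent is more likely when the antecedent is present than it is overall; a lift below 1 suggests the opposite. Sequence rules add an ordering requirement (A occurs before B), which this sketch does not model.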