Price Predictions: How to Build Online Probabilistic Attribution Models for DSP Optimization

Optimizing dynamic bidding to improve lower-funnel metrics, such as purchases and sign-ups, is a core task for Demand-Side Platforms (DSPs). A key component of this optimization is accurate attribution: determining which interactions and touchpoints contribute to conversions. Probabilistic attribution models estimate each touchpoint’s contribution statistically rather than by fixed rules, and those estimates can feed directly into bidding strategies. In this post, we’ll explore how to build and implement these models to optimize dynamic bidding and drive lower-funnel metrics.

Understanding Probabilistic Attribution

Attribution models help marketers understand which channels and touchpoints contribute most effectively to conversions. Traditional attribution models, such as last-click or linear attribution, assign credit by fixed rules (all of it to the final touchpoint, or an equal share to every touchpoint) and so often fail to capture the complexity of user journeys. Probabilistic attribution models, on the other hand, use statistical methods to estimate the probability that each touchpoint contributes to a conversion. This approach accounts for the randomness and uncertainty inherent in user behavior, providing a more accurate and comprehensive view of marketing effectiveness.

Steps to Build Online Probabilistic Attribution Models

1. Data Collection

The foundation of any attribution model is robust data collection. This involves gathering data from various sources, including:

  • Impressions: Records of ads served to users.
  • Clicks: Data on ads clicked by users.
  • Conversions: Actions taken by users that are deemed valuable (e.g., purchases, sign-ups).
  • User Journey Data: Sequence of interactions a user has with different touchpoints before conversion.
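
As a minimal sketch of how these four data sources might fit together, the snippet below stores every interaction as one event record and groups the records into per-user journeys. The `Event` fields, the channel labels, and the `build_journeys` helper are all hypothetical names chosen for illustration; a production DSP log will have its own schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One logged interaction (hypothetical schema)."""
    user_id: str
    timestamp: float   # seconds since epoch
    event_type: str    # "impression", "click", or "conversion"
    channel: str       # illustrative labels such as "display" or "video"


def build_journeys(events):
    """Group events by user and order each user's events by time."""
    journeys = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.timestamp):
        journeys[ev.user_id].append(ev)
    return dict(journeys)


events = [
    Event("u1", 100.0, "impression", "display"),
    Event("u2", 50.0, "impression", "video"),
    Event("u1", 160.0, "click", "display"),
    Event("u1", 400.0, "conversion", "site"),
]
journeys = build_journeys(events)
for user, path in journeys.items():
    print(user, [(e.event_type, e.channel) for e in path])
```

Keeping impressions, clicks, and conversions in one event stream keyed by user makes the journey a simple sorted list, which is the input the later modeling steps assume.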

2. Data Preprocessing

Before feeding data into the model, it needs to be cleaned and preprocessed:

  • Deduplication: Remove duplicate records to ensure accuracy.
  • Sessionization: Group user interactions into sessions based on predefined rules (e.g., a session ends after 30 minutes of inactivity).
  • Feature Engineering: Create relevant features that capture the essence of user interactions, such as time spent on each touchpoint, device type, and geographic location.
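
The deduplication and sessionization steps can be sketched in a few lines. This example assumes events are plain tuples of `(user_id, timestamp_seconds, event_type, channel)` and that exact duplicates are the only kind to remove; both are simplifying assumptions, and the 30-minute gap mirrors the example rule above.

```python
SESSION_GAP_SECONDS = 30 * 60  # inactivity threshold from the rule above


def deduplicate(events):
    """Drop exact duplicate records while preserving first-seen order."""
    seen = set()
    unique = []
    for ev in events:
        if ev not in seen:
            seen.add(ev)
            unique.append(ev)
    return unique


def sessionize(user_events, gap=SESSION_GAP_SECONDS):
    """Split one user's events into sessions at gaps longer than `gap`."""
    sessions = []
    for ev in sorted(user_events, key=lambda e: e[1]):
        # Compare against the timestamp of the last event in the open session.
        if sessions and ev[1] - sessions[-1][-1][1] <= gap:
            sessions[-1].append(ev)
        else:
            sessions.append([ev])
    return sessions


raw = [
    ("u1", 0, "impression", "display"),
    ("u1", 0, "impression", "display"),   # duplicate log line
    ("u1", 600, "click", "display"),
    ("u1", 4000, "impression", "video"),  # more than 30 minutes later
]
clean = deduplicate(raw)
sessions = sessionize(clean)
print(len(clean), "events in", len(sessions), "sessions")
```

Real logs often contain near-duplicates (the same impression reported twice with slightly different timestamps), which would need a keyed comparison rather than exact equality.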

3. Model Selection

Choosing the right probabilistic model is crucial. Some popular choices include:

  • Markov Chains: Used to model user journeys as sequences of states (touchpoints). The probabilities of transitions between states are estimated from the data.
  • Bayesian Models: Incorporate prior knowledge and update beliefs based on observed data. Useful for modeling uncertainty and incorporating expert knowledge.
  • Hidden Markov Models (HMM): Extend Markov Chains by introducing hidden states, capturing the latent factors influencing user behavior.
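
To make the Markov chain option concrete, here is a minimal sketch that estimates first-order transition probabilities from a handful of journeys and turns them into channel credit using the removal effect: the relative drop in overall conversion probability when a channel is taken out of the chain. The removal effect is one common way to read attribution off a Markov chain, not the only one, and the `START`, `CONV`, and `NULL` state names and the toy paths are assumptions made for this example.

```python
from collections import defaultdict

START, CONV, NULL = "START", "CONV", "NULL"


def transition_probs(paths):
    """Estimate transition probabilities from (channels, converted) paths."""
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = [START, *channels, CONV if converted else NULL]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {
        a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
        for a, nexts in counts.items()
    }


def conversion_prob(probs, removed=None, iters=500):
    """Probability of reaching CONV from START, found by fixed-point iteration.

    A removed channel is treated as a dead end (value 0), like NULL.
    """
    value = defaultdict(float)
    value[CONV] = 1.0
    for _ in range(iters):
        for state, nexts in probs.items():
            if state == removed:
                continue
            value[state] = sum(
                p * value[nxt] for nxt, p in nexts.items() if nxt != removed
            )
    return value[START]


def removal_effects(paths):
    """Normalized removal effect per channel (shares sum to 1)."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = [s for s in probs if s != START]
    effects = {c: 1 - conversion_prob(probs, removed=c) / base for c in channels}
    total = sum(effects.values())
    return {c: e / total for c, e in effects.items()}


paths = [
    (["display", "search"], True),
    (["display"], False),
    (["search"], True),
    (["video", "display"], False),
]
shares = removal_effects(paths)
print(shares)
```

The sketch assumes at least one converting path, since otherwise the baseline conversion probability is zero and the ratio is undefined. A fixed number of iterations is used for simplicity; a production version would check convergence or solve the linear system directly.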

4. Model Training

Training the model involves estimating the parameters that best fit the observed data. This can be done using various techniques:

  • Maximum Likelihood Estimation (MLE): Finds parameter values that maximize the likelihood of the observed data.
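  • Expectation-Maximization (EM) Algorithm: Alternates between inferring the expected values of the hidden variables and re-estimating the parameters given them; particularly useful for HMMs, where it is known as the Baum-Welch algorithm.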
  • Bayesian Inference: Uses methods like Markov Chain Monte Carlo (MCMC) to sample from the posterior distribution of the model parameters.
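
For a single row of a Markov transition matrix, the difference between MLE and a Bayesian estimate can be shown without any sampling. The sketch below assumes a symmetric Dirichlet prior over next states, which is conjugate to the transition counts, so the posterior mean has a closed form; MCMC becomes necessary only for models without such a shortcut. The counts and state names are hypothetical.

```python
def mle_transition_row(counts):
    """Maximum-likelihood estimate: normalize observed transition counts."""
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def posterior_mean_row(counts, states, alpha=1.0):
    """Posterior mean under a symmetric Dirichlet(alpha) prior over next states."""
    total = sum(counts.get(s, 0) for s in states) + alpha * len(states)
    return {s: (counts.get(s, 0) + alpha) / total for s in states}


# Hypothetical counts of what followed a "display" touchpoint.
counts_from_display = {"search": 3, "CONV": 1}
states = ["search", "CONV", "NULL"]

mle = mle_transition_row(counts_from_display)
smoothed = posterior_mean_row(counts_from_display, states)
print("MLE:      ", mle)
print("Bayesian: ", smoothed)
```

The MLE gives zero probability to any transition that has not been observed yet, while the Bayesian estimate keeps a small probability for it. That matters for sparse channels, where a handful of journeys would otherwise produce overconfident estimates.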

5. Model Validation

Validate the model to ensure it accurately captures the attribution dynamics:

  • Cross-Validation: Split the data into several folds, train on all but one fold, and evaluate on the held-out fold, rotating until every fold has served as the validation set.
  • Lift Analysis: Compare the predicted impact of touchpoints on conversions with actual outcomes to assess model accuracy.
  • A/B Testing: Implement controlled experiments to test the effectiveness of model-driven bidding strategies.
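
The cross-validation step can be sketched with a simple fold splitter and a scoring metric. Log loss is used here as an assumed metric for predicted conversion probabilities, and the "model" is only a baseline that predicts the training-set conversion rate; the attribution model's own predictions would take its place, and should beat this baseline to be worth deploying.

```python
import math


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size


def log_loss(y_true, y_prob, eps=1e-12):
    """Mean negative log-likelihood of binary outcomes; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical journey outcomes: 1 = converted, 0 = did not convert.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

fold_losses = []
for train_idx, val_idx in k_fold_indices(len(outcomes), k=5):
    base_rate = sum(outcomes[i] for i in train_idx) / len(train_idx)
    y_val = [outcomes[i] for i in val_idx]
    fold_losses.append(log_loss(y_val, [base_rate] * len(y_val)))

print("mean validation log loss:", sum(fold_losses) / len(fold_losses))
```

In practice the items being split should be users rather than individual events, so that one user's journey never appears in both the training and validation folds.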

6. Real-Time Implementation

Integrate the trained model into the DSP for real-time bidding optimization:

  • Scoring Algorithm: Develop a scoring algorithm that assigns value to each impression based on the model’s attribution probabilities.
  • Bid Optimization Engine: Adjust bids dynamically in real time, prioritizing impressions with higher attribution probabilities to maximize conversions.
  • Continuous Learning: Continuously update the model with new data to adapt to changing user behavior and market conditions.
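
A rough sketch of how the three pieces could connect: an online estimator that is updated after every observed outcome, and a bid function that converts the current estimate into a capped bid. Everything here is an assumption made for illustration, including the Beta-Bernoulli estimator, the exponential `decay` used to forget old data, the per-impression bid units, and the names `OnlineConversionRate` and `compute_bid`. A real DSP would also account for auction dynamics, pacing, and budget constraints.

```python
class OnlineConversionRate:
    """Beta-Bernoulli estimate of an attributed conversion rate.

    `decay` < 1 down-weights older observations so the estimate can
    follow changing user behavior (an assumed forgetting scheme).
    """

    def __init__(self, alpha=1.0, beta=1.0, decay=0.999):
        self.alpha = alpha
        self.beta = beta
        self.decay = decay

    def update(self, converted):
        """Fold one observed outcome into the running estimate."""
        self.alpha = self.decay * self.alpha + (1.0 if converted else 0.0)
        self.beta = self.decay * self.beta + (0.0 if converted else 1.0)

    @property
    def mean(self):
        return self.alpha / (self.alpha + self.beta)


def compute_bid(p_attributed_conversion, conversion_value, max_bid):
    """Bid the expected attributed value of the impression, capped at max_bid."""
    return min(p_attributed_conversion * conversion_value, max_bid)


estimator = OnlineConversionRate()
for outcome in [True, False, False, False]:
    estimator.update(outcome)

bid = compute_bid(estimator.mean, conversion_value=12.0, max_bid=5.0)
print(f"estimated rate={estimator.mean:.3f}, bid={bid:.2f}")
```

Because each update is a constant-time operation on two numbers, this kind of estimator can run inside the bidding path, while heavier retraining of the full attribution model happens offline on a schedule.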

Benefits of Probabilistic Attribution Models

  • Accuracy: Provides a more precise estimate of each touchpoint’s impact than rule-based models such as last-click.
  • Optimization: Enhances bidding strategies by identifying high-value impressions.
  • Adaptability: Adapts to evolving user behaviors and market trends.
  • Insights: Offers deeper insights into user journeys and marketing effectiveness.


Building and implementing online probabilistic attribution models for DSPs is a powerful way to optimize dynamic bidding and drive lower-funnel metrics. By attributing conversions to the touchpoints that actually influenced them, marketers can make more informed decisions, refine their bidding strategies, and ultimately improve ROI. As user journeys grow more complex, probabilistic attribution becomes an increasingly important tool for staying competitive.