Statistical Noise and Sales Spreads

Most machine learning systems predict outcomes from historical data. Unfortunately, most historical datasets contain errors, omissions, and statistical noise, which makes users skeptical of the predictions. When users distrust a system, they engage with it less, and the data become noisier still. Machine learning works best on data with as little noise as possible, and noise is tied to user engagement. We believe there is a gap in research on the connection between user behavior and data quality.


To address these issues, we have launched research focused on identifying spreads between sales targets and actuals, a major form of statistical noise. The research has three goals: 1) improve individual sales performance with apps that automatically compare each person's performance to that of higher performers, 2) track engagement rates and apply different nudges to see which ones sustain engagement, and 3) change awareness and behaviors to reduce noise in the data. We will also use the interactions in this research to train our machine learning algorithms to adapt to the factors that drive peak engagement. All our findings will be shared with research participants.
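As a rough illustration of what a target-versus-actual spread could look like in practice, here is a minimal Python sketch. It is not the method used in this research: the record layout, the function names, the choice of a relative spread, and the sample figures are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class SalesRecord:
    rep: str       # hypothetical identifier for a salesperson
    target: float  # sales target for the period
    actual: float  # sales actually closed in the period


def relative_spread(record):
    """Signed gap between actual and target, as a fraction of the target."""
    if record.target == 0:
        raise ValueError("target must be non-zero")
    return (record.actual - record.target) / record.target


def summarize(records):
    """Mean and population standard deviation of the relative spreads."""
    spreads = [relative_spread(r) for r in records]
    return {"mean_spread": mean(spreads), "spread_stdev": pstdev(spreads)}


def gap_to_top(records, rep, top_n=2):
    """Average attainment of the top_n performers minus the given rep's attainment."""
    attainment = {r.rep: r.actual / r.target for r in records}
    top = sorted(attainment.values(), reverse=True)[:top_n]
    return mean(top) - attainment[rep]


# Invented figures, for illustration only.
records = [
    SalesRecord("A", target=100.0, actual=120.0),
    SalesRecord("B", target=100.0, actual=80.0),
    SalesRecord("C", target=200.0, actual=200.0),
]
summary = summarize(records)
gap_for_b = gap_to_top(records, "B")
```

Under these assumptions, a wide standard deviation of spreads would suggest noisy targets or noisy reporting, and the gap to the top performers is one simple basis for the kind of comparison described in goal 1.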


To ensure data integrity, we are inviting sales leaders and individuals to sector-specific cohort groups by careful selection. Our first three groups will focus on consulting, technology, and financial services. Each group is invitation-only and drawn from pre-selected organizations in its sector. Access to the system is password protected, and all data are encrypted both in transit and at rest. All individual and firm findings will be kept strictly confidential. However, sales leaders and individuals alike will receive both personalized and general statistical findings from the research.