
This guest post is written by Dr. Julian Runge, an Assistant Professor of Marketing at Northeastern University who specializes in behavioral data science.
As digital ad targeting performance continues to deteriorate, strong post-download personalization of app experiences becomes all the more relevant and valuable to drive higher customer lifetime value (CLV) and better unit economics. The Engagement Engineering framework that I present in this article assists with exactly that. In the service of sustainable customer engagement, it starts from a holistic model of human motivation.
Figure 1: Self-determination theory is the backbone of the Engagement Engineering framework. Humans need to experience competence, autonomy, and relatedness to feel well and sustainably motivated. App users do, too. ML-based personalization can help you provide this to them. (source)
Self-determination theory (SDT), developed since the 1980s by Edward Deci and Richard Ryan, provides us with such a model. Basic needs theory, a sub-theory of SDT, posits that humans need to experience autonomy, relatedness, and competence to feel well and stay sustainably motivated (see Figure 1). These three needs frame our exploration of effective app personalization.
Note: This article is adapted from my forthcoming workshop on ML-based app personalization. The workshop draws on more than a decade of data science and research work at leading companies (Meta/Facebook, Wooga, N3TWORK) and top-tier universities (Stanford, Duke). This article aims to give a high-level introduction.
1. Engineering Competence: Personalization to Fuel User Skill
Tutorials, notifications, welcome videos, and many other parts of the UX all serve to familiarize users with an app and help them build skill in exploring the environment and its affordances. These design elements can, and should, be tuned, refined, and personalized using analytics and product iteration. When it comes to identifying effective levers for ML-based personalization, it pays to go as close to the core engagement loop of an app as possible. In games, this is mostly the core game itself, and the lever for personalization can be the difficulty of the core game. From a user’s perspective, the value of personalization here is receiving the right amount of reward (game progress) for exerted effort, and hence the sense that their effort is building competence.
Let’s use the example of a puzzle game to illustrate this further: Say the core game loop consists of clearing game boards by connecting puzzle pieces of the same color, and game progression happens by beating sequential levels arranged on a map (very similar to many of the most successful casual games on the market). Such a core game offers a powerful mechanism for controlling the difficulty a player faces: the distribution of colors among the puzzle pieces on a board. The fewer the colors, and the more pieces of one color relative to the others, the more pieces an average player can connect.
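To make the mechanism concrete, here is a minimal sketch of such a difficulty knob, assuming a single `skew` parameter that shifts probability mass toward one color (all names and values are illustrative, not an actual game implementation):

```python
import numpy as np

def generate_board(rows, cols, num_colors, skew, rng=None):
    """Sample a puzzle board where `skew` controls difficulty.

    skew = 0.0 -> colors are uniformly distributed (hardest);
    skew -> 1.0 -> one color dominates the board (easiest).
    """
    rng = rng or np.random.default_rng()
    # Start from a uniform distribution over colors ...
    probs = np.full(num_colors, 1.0 / num_colors)
    # ... then move a `skew` share of probability mass to the first color.
    probs = (1.0 - skew) * probs
    probs[0] += skew
    return rng.choice(num_colors, size=(rows, cols), p=probs)

# An easier board for a struggling player, e.g.:
# generate_board(9, 9, num_colors=5, skew=0.4)
```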
Figure 2: Engineering Competence. Left panel: Use ML to understand users’ expected level of motivation (e.g., measured as rounds played) and skill (e.g., measured as rounds won over rounds played), and then personalize difficulty by making the game easier for users with lower skill and lower expected motivation. Darker blue means a more challenging game experience. Right panel: By supporting players who have less skill (lower capacity to act) and lower motivation (lower opportunity to act), we move players toward a more intense flow state, increasing their enjoyment of play and likelihood to retain. Darker blue here means a more intense expected flow state.
Using this mechanism, we can help users achieve feelings of competence by making the game easier for players who (i) have lower motivation and (ii) are less skilled and hence experience less progress per unit of effort. The left panel of Figure 2 captures this idea: We use ML to understand users’ level of motivation and skill and then personalize difficulty, making the game easier for users with lower skill and lower expected motivation. Future engagement (a proxy for expected motivation to engage) can be predicted with ML for casual F2P games and apps. This lets less skilled (lower capacity to act, see Figure 2) and less motivated (lower opportunity to act) users keep learning, because they are rewarded more per unit of exerted effort than more skilled and motivated players. That, in turn, raises capacity and opportunity to act for all players, making flow experiences more likely (see the right panel of Figure 2).
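As a rough illustration of this pipeline, the sketch below fits a regressor on early behavioral features to predict future engagement and combines the prediction with observed win rate to set the `skew` parameter from the board-generation sketch above. The feature set, model choice, and normalization are assumptions for illustration, not the system we actually shipped:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_motivation_model(X_train, y_train):
    """Fit a regressor predicting future engagement (e.g., rounds played
    next week) from early behavioral features such as session counts."""
    return GradientBoostingRegressor().fit(X_train, y_train)

def assign_skew(model, user_features, rounds_won, rounds_played,
                max_skew=0.5, motivation_cap=100.0):
    """Lower predicted motivation and lower observed skill -> easier boards."""
    motivation = model.predict(np.asarray(user_features).reshape(1, -1))[0]
    skill = rounds_won / max(rounds_played, 1)  # win rate as skill proxy
    # Normalize both to [0, 1]; in production, percentile ranks over the
    # player population are more robust than these toy caps.
    need = 1.0 - 0.5 * (min(motivation / motivation_cap, 1.0) + min(skill, 1.0))
    return max_skew * need  # 0.0 = unmodified game, max_skew = easiest boards
```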
Building on this rationale, we built a difficulty personalization system for a puzzle game and evaluated it in an A/B test. Compared to a condition without personalized difficulty, it increased the number of rounds played by 22.4% and the chance of a player logging in the next day by 9.2% (both effects are highly statistically significant). While not the aim, the system also achieved a substantial lift in average revenue per user, as many more players paid for new content packs, and players bought more content packs. (You can find details on the implementation and results here. Note that we will return to the question of whether such a personalized difficulty system is problematic from a player’s perspective in Section 4.)
2. Engineering Relatedness: Personalizing to Create Engaging Social Experiences
Many apps and games rely on social elements to drive engagement, either at the core (e.g., PvP, social media), as meta systems (e.g., guilds/teams in all kinds of games, comments sections in news apps), or as a mix of both. Where possible, we can support users in finding relatedness by matching them into environments with similar skill levels and engagement styles. Doing so strongly supports players in finding the right opportunity and capacity to act (see the right panel of Figure 2) and can be operationalized as aiming to maximize realized social interaction.
As a first simple step, that can mean pairing users who have a high propensity to engage with highly engaged social environments, something we explored with a large app publisher. We predicted the propensity to engage for newly arriving users who had just downloaded an app for the first time and classified existing in-game teams by a “community health score” that combined several input variables (for details, see here). We then matched high-propensity users into engaged social environments.
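A sketch of the matching step might look as follows, assuming a fitted binary classifier with `predict_proba` and a precomputed health score per team; the cutoffs and the team data structure are illustrative:

```python
import numpy as np

def match_user_to_team(user_features, propensity_model, teams,
                       propensity_cutoff=0.7, health_cutoff=0.8):
    """Route predicted-high-propensity new users into healthy teams.

    `teams` is a list of dicts like
    {"id": ..., "health": ..., "open_seats": ...}, with `health` the
    community health score scaled to [0, 1].
    """
    p_engage = propensity_model.predict_proba(
        np.asarray(user_features).reshape(1, -1))[0, 1]
    if p_engage >= propensity_cutoff:
        candidates = [t for t in teams
                      if t["health"] >= health_cutoff and t["open_seats"] > 0]
    else:
        candidates = [t for t in teams if t["open_seats"] > 0]
    # Fill the healthiest eligible team first.
    return max(candidates, key=lambda t: t["health"], default=None)
```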
A field experiment with our system in a highly successful top-grossing mobile game showed that the system could deliver substantial increases in engagement (+12% in rounds played), socialization (+16% in messages sent), and even mild but statistically significant increases in monetization (+3% in revenue per new user). These effects were sustained over several weeks. However, we also saw that less engaged communities became even less engaged, a point we will return to in Section 4.
3. Engineering Monetization: Personalizing Offers Not Prices
Price personalization is largely a no-no in gaming, where large and engaged communities do not take it well when companies try to charge different prices for one and the same good. Personalized offers, on the other hand, are probably the most widely understood and used tool to drive engagement and monetization in apps and F2P games. The freemium pricing model is a big part of why this works so well.
Figure 3: Engineering Monetization. A simple RFM-based offer personalization can go a long way in driving large amounts of extra revenue in freemium apps (+20% in this study). In the table, the vertical dimension captures how long it has been since a user’s last purchase (recency), and the horizontal dimension captures the maximum amount spent by a user in the past (monetary value). Each cell indicates the offer price targeted to users in the respective segment.
Beyond offers, commonly accepted and practiced monetization personalization includes the sorting of offers in the shop and possibly quantity discounts. Even a simple RFM-based (recency, frequency, monetary value) personalization, as shown in Figure 3, can go a long way in driving large amounts of extra revenue (see the sketch below). ML works very well here, too. I have tried reinforcement learning for this in the past but would likely recommend a “simple” supervised learning approach: you can, e.g., predict CLV and then assign different offers based on users’ expected CLV. Really, getting the engineering of competence and relatedness right sets a strong baseline from which personalized monetization is the easier exercise. All three dimensions also tend to exhibit strong complementarities with each other.
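A minimal sketch of the grid lookup behind Figure 3 follows; the bucket boundaries and prices here are made up for illustration and do not reproduce the study’s actual values. A supervised CLV model could replace the hand-made grid by mapping predicted CLV bands to offers in the same way:

```python
# Illustrative offer-price grid indexed by (recency bucket, spend bucket);
# Figure 3's actual values differ.
OFFER_PRICE = {
    ("recent",  "low"): 4.99, ("recent",  "mid"): 9.99, ("recent",  "high"): 19.99,
    ("lapsing", "low"): 2.99, ("lapsing", "mid"): 4.99, ("lapsing", "high"):  9.99,
    ("lapsed",  "low"): 0.99, ("lapsed",  "mid"): 2.99, ("lapsed",  "high"):  4.99,
}

def offer_for(days_since_purchase, max_spend):
    """Pick an offer price from the RFM grid for one user."""
    recency = ("recent" if days_since_purchase <= 7
               else "lapsing" if days_since_purchase <= 30
               else "lapsed")
    spend = "low" if max_spend < 5 else "mid" if max_spend < 20 else "high"
    return OFFER_PRICE[(recency, spend)]
```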
4. Engineering Autonomy: Enabling Choice, Fairness, and Inclusion
Many consumers, especially gamers, would worry that systems like the above are unfair and possibly manipulative. And rightfully so. As product designers and managers, we need to be careful when using powerful technologies such as ML-based personalization in the background. To avoid undue manipulation and unfair outcomes, it is important to work from clearly defined and spelled-out principles when designing such systems. In the end, to achieve sustainable engagement and relationships, the aim has to be to support autonomy and free choice in consumers’ decision-making.
In my example for competence engineering and personalized difficulty, we aspired to achieve that by setting the following ground rules:
- Build the system to drive retention, not monetization
- Build the system to drive extensive margin, not intensive margin (= to include marginal players, not to “addict” engaged players)
We “complied” with these rules by (1) leaving the base game untouched for the top 50% most engaged players, who accounted for almost all revenue generation, and (2) supporting players more the more marginal / less skilled they were. We hence did not change the gameplay experience for engaged, spending players, which in turn ensured fair gameplay, e.g., in leaderboard rankings.
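Encoded as a hard guardrail on top of the difficulty sketch from Section 1, these rules could look like this (the 50% cutoff mirrors rule (1); everything else is illustrative):

```python
def skew_for(engagement_percentile, skill, max_skew=0.5):
    """Rule (1): the top 50% most engaged players keep the unmodified
    base game. Rule (2): below the cutoff, support grows as engagement
    and skill decrease."""
    if engagement_percentile >= 0.5:
        return 0.0  # base game, untouched
    need = 1.0 - 0.5 * (engagement_percentile / 0.5 + min(skill, 1.0))
    return max_skew * need
```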
In the case study about engineering relatedness, we discussed fairness considerations at length. If the ML system misclassified a new user as “less likely to engage,” this user might not be offered a seat in an engaged social environment. What happens if the system was wrong and the user actually had a high likelihood to engage? In that case, that user would probably go on to quit the offered environment and seek out a more engaging one. So, in our system, a user who did so would always be offered a seat in a highly engaged social environment on the second attempt.
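In code, this second-attempt rule could sit on top of the matching sketch from Section 2, overriding the model’s prediction for anyone who left their first assigned team (again a sketch with hypothetical names):

```python
def place_user(user_features, left_first_team, propensity_model, teams,
               health_cutoff=0.8):
    """A user who quit their first assigned team gets a seat in a highly
    engaged environment, regardless of the original propensity score."""
    if left_first_team:
        healthy = [t for t in teams
                   if t["health"] >= health_cutoff and t["open_seats"] > 0]
        return max(healthy, key=lambda t: t["health"], default=None)
    # First attempt: propensity-based matching (see Section 2's sketch).
    return match_user_to_team(user_features, propensity_model, teams)
```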
The results of our field experiment still showed that more engaged social environments became even more engaged, and less engaged ones became even less engaged. Is that problematic? Possibly. In the case of a social game as in our study, we agreed that this outcome is acceptable (especially as players who wanted to engage could easily find a better social environment, see above). In a case where an algorithmic system matches users not on local engagement (in an app) but on global (= real-life) political preferences, such a system could lead to bad outcomes that reach beyond the app in question. Polarization and division may then result from such algorithmic systems, and the respective designers and managers need to take additional actions to ensure informed choice, fairness, and inclusion.
Figure 4: The Engagement Engineering framework is platform-agnostic and can inform your personalization initiatives across desktop, laptop, mobile, console, wearables, and VR.
For monetization personalization, facilitate fairness and inclusion by not price-discriminating (= not charging different prices for the same good) and instead using differently sized and priced offers with attractive discounts. Advertise these offers using popups targeted at appropriate contextual points and showing the offer best matched to a user’s preferences. To facilitate inclusion further, you can make all offers available to all users in the app shop.
The topic of appropriately enabling autonomy, choice, and inclusion on digital platforms is clearly much more complex and involved than I can cover here. For the purposes of this article, I urge you, in the service of sustainable outcomes for you and your customers, to (a) invest dedicated effort in “autonomy engineering” and (b) set ground rules that would satisfy you if you were your own customer.
Concluding Remark
With this article, I’m sharing what I have found to be a powerful framework for thinking about ML-based app personalization. The experiences I researched and helped design, produce, and monetize were mostly mobile games, gamified apps, and social media sites (details here). However, I believe the framework applies across computing platforms, ranging from desktop- and laptop-based (early-day Facebook apps and games) through mobile-based (modern-day mobile apps) and console-based to wearable and more immersive (VR headset) experiences that may gain increasing footing in the future (Figure 4).
I hope the framework can be helpful to you in designing personalized app experiences as well. It is a work in progress, and I am always looking for feedback and enriching conversations. So, please do reach out.
Dr. Julian Runge is a behavioral economist and data scientist. After years of research on game data science and digital marketing at leading companies and universities such as Facebook (Meta), Stanford, and Duke, he is now an assistant professor at Northeastern University in Boston. Julian has published extensively in outlets ranging from machine learning conference proceedings to Harvard Business Review and Information Systems Research.