Token economies (sticker charts, star jars, point systems, in-app reward stars) are among the most-tested behavior-change tools in applied psychology. They reliably increase the frequency of a target behavior in the short term. They also have a documented failure mode in which they reduce a child’s intrinsic motivation for the very behavior they were meant to encourage. Both findings are real. The design choices in between decide which one shows up in your kid’s life.
This post goes through what the research says, names the failure mode specifically, and describes the design choices that separate a reward system that supports learning from one that quietly poisons it.
Where the technique came from
The first formal token economies were built in the 1960s: Ayllon and Azrin’s work in psychiatric inpatient settings, then Kazdin and others extending the approach into classrooms for kids with behavioral challenges. The technique works through a simple loop: a desired behavior occurs, a token is delivered immediately, and accumulated tokens are exchanged for backup reinforcers (privileges, items, time). Decades of replications have made the basic finding boring: the contingency reliably increases the target behavior while the system is in place (Kazdin, 1982, Journal of Applied Behavior Analysis).
The interesting question has always been what happens when the system is removed.
The motivation caveat
Self-determination theory — Deci and Ryan’s research program starting in the 1970s — produced a finding that looks at first like a contradiction. In a series of experiments, children who were intrinsically interested in an activity (drawing, solving puzzles) and were then offered tangible rewards for doing it spent less time on the activity once the rewards stopped, compared to a control group that was never rewarded.
That finding has been re-tested constantly, and the meta-analytic verdict (Deci, Koestner & Ryan, 1999, Psychological Bulletin, 128 experiments) is consistent: tangible rewards reduce intrinsic motivation when they are perceived as controlling. The effect is robust. It is also conditional. Cameron, Banko & Pierce (2001, The Behavior Analyst) showed the same body of evidence supports a more precise statement: rewards that signal competence, are unexpected, or are tied to effort rather than to outcomes do not reduce intrinsic motivation, and often increase it.
The conclusion isn’t “rewards are bad.” It’s that the reward’s meaning to the child — controlling vs. informational — is what determines the outcome. A star that means “you showed up today” reads differently than a star that means “now do ten more of these or you don’t get the next thing.”
Six design choices that decide the outcome
The research literature is consistent enough to name specific design properties that separate reward systems that help from ones that hurt:
Immediacy. The token shows up the moment the behavior happens. Delay between behavior and reward weakens the contingency and pushes the system toward feeling like a deferred bargain rather than feedback.
Effort-based, not outcome-based. Rewarding “you did 10 flashcards” supports the habit. Rewarding “you got 9 of 10 correct” creates pressure to perform and, in younger kids, often suppresses risk-taking on harder material.
Informational language. “You read three topics today” reads as a signal of what happened. “If you read three more you’ll unlock the next thing” reads as a controlling bargain. The same star can be either, depending on how the surrounding UI talks about it.
Variable rather than fixed schedule once the habit forms. A classic finding from behaviorism (Ferster & Skinner, 1957) is that variable-ratio reinforcement produces behavior that persists longer once rewards stop than fixed-ratio reinforcement does. Most app reward systems default to fixed because it’s easier to implement; this is a missed opportunity.
Gradual fading. A reward system that quietly reduces its frequency as the habit becomes self-sustaining respects the kid more than one that keeps the pressure on indefinitely. Behavior-therapy programs explicitly include a fade-out phase. App reward systems rarely do.
No content gating. The moment reward currency is required to access content, the reward system has stopped being a habit-supporter and started being a paywall in disguise. This is the single design choice most likely to convert a reward system from helpful to actively harmful.
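For readers who build these systems, here is a minimal Python sketch of how immediacy, effort-based rewards, a variable schedule, and gradual fading might fit together. Everything in it is hypothetical: the class name `RewardSchedule`, the 14-completion threshold, the fade rate, and the floor are assumptions chosen for illustration, not values from the research or from any real app. It uses a per-completion reward probability (a random-ratio schedule, one simple way to approximate variable-ratio reinforcement) and lowers that probability as the habit ages.

```python
import random
from dataclasses import dataclass, field


@dataclass
class RewardSchedule:
    """Hypothetical token schedule: always reward early, then fade.

    All defaults are illustrative assumptions, not research-derived values.
    """
    habit_threshold: int = 14          # completions before fading begins
    fade_per_completion: float = 0.05  # probability drop per later completion
    floor_probability: float = 0.1     # never fade below this chance
    completions: int = 0
    rng: random.Random = field(default_factory=random.Random)

    def reward_probability(self) -> float:
        """Chance that the NEXT completion earns a token."""
        extra = self.completions - self.habit_threshold
        if extra <= 0:
            # Habit still forming: reward every completion.
            return 1.0
        # Habit formed: random-ratio schedule that gradually thins out.
        return max(self.floor_probability,
                   1.0 - self.fade_per_completion * extra)

    def record_completion(self) -> bool:
        """Call the moment a task is finished (immediacy).

        Only completion is recorded, never accuracy (effort-based).
        Returns True if a token should be delivered right now.
        """
        probability = self.reward_probability()
        self.completions += 1
        return self.rng.random() < probability


# Usage: simulate 40 daily completions with a seeded generator.
schedule = RewardSchedule(rng=random.Random(7))
stars = [schedule.record_completion() for _ in range(40)]
print(f"Stars in first 14 days: {sum(stars[:14])}")
print(f"Stars in days 15-40:    {sum(stars[14:])}")
```

Note what the sketch leaves out on purpose: there is no function that checks a star balance before showing content. Tokens are handed out, and nothing is locked behind them.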
How we apply this in Encyclopedia: Kids Learning
Disclosure: this section is about our own app, so the relevant skepticism applies.
The star system in Encyclopedia is a literal token economy, and its design choices map onto the research above. Parents define the tasks — homework, chores, reading habits — and decide what the stars buy. Stars land the moment a child taps “I did it!”, at a fixed value per task: completion is what gets rewarded, never accuracy, so there is no pressure to perform — only to show up. No content is gated behind reward currency; every topic, video, and flashcard deck stays reachable at all times. And the whole layer is optional: the Parent Dashboard has a per-child off switch for families who would rather the reading carry itself.
What the system does not try to do is generate intrinsic curiosity where none exists. No reward design can. What a careful one can do is support the formation of a short, calm, daily reading habit. If a kid stops needing the stars after a few weeks because the reading itself has become the reason to open the app, the reward system has done its job and should fade into the background. That’s the design intent.
What to look for in any kids’ app
Whether the app is ours or someone else’s, the research-grounded checklist is the same:
- Are rewards immediate?
- Are they tied to effort, not outcome?
- Is the language describing what happened, or pushing toward more?
- Does the app gate content behind currency?
- Is there a fade-out path, or is the loop on forever?
- Can a parent turn the whole thing off?
An app that fails on the last three is one to be cautious about. An app that fails on all six is one whose reward system is doing the opposite of what its marketing claims.
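If you review several apps and want to keep notes consistently, the checklist can be written down as a small Python sketch. The names `RewardChecklist` and `assess` are hypothetical, and the "mixed" verdict for partial failures is an assumption added here, since the rule of thumb above only covers two cases.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RewardChecklist:
    """One True/False answer per checklist question (True = good)."""
    immediate: bool
    effort_based: bool
    informational_language: bool
    no_content_gating: bool
    has_fade_out: bool
    parent_can_disable: bool


# The "last three" questions from the checklist.
LAST_THREE = {"no_content_gating", "has_fade_out", "parent_can_disable"}


def assess(checklist: RewardChecklist) -> str:
    """Turn the six answers into a rough verdict."""
    failed = [f.name for f in fields(checklist)
              if not getattr(checklist, f.name)]
    if len(failed) == 6:
        return "reward system likely works against its stated purpose"
    if LAST_THREE.issubset(failed):
        return "be cautious"
    if failed:
        # Assumed category: the post gives no rule for partial failures.
        return "mixed: failed " + ", ".join(failed)
    return "passes all six checks"


# Usage: an app with immediate, effort-based, informational rewards
# that gates content, never fades, and cannot be switched off.
print(assess(RewardChecklist(True, True, True, False, False, False)))
```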