Avid Weekly Ideas #5 – When to Explore vs. When to Exploit
What is Avid Weekly Ideas?
Each week I publish an idea or two either from a book I am currently reading or from my backlog of book notes. 📚
Of course, this isn’t intended to replace book notes or reading the book itself. Instead the aim is to extract a key idea or two from a book that struck me as insightful and share-worthy. Then I package it into a bite-sized digestible chunk. I hope these little nuggets of insight will spark some inspiration or ideas in your own mind. 💡
This week’s key concept/idea/passage is from Algorithms to Live By: The Computer Science of Human Decisions (by Brian Christian and Tom Griffiths).
See Amazon page for book reviews.
See Goodreads page for book reviews.
Today’s idea is a follow-up to my last post, as I feel the two ideas are somewhat related.
I came across this concept while reading Algorithms to Live By, a fascinating book that explores how insights from computer algorithms can be applied to our everyday lives and help us solve day-to-day problems. I thought that was a very creative linkage, combining two topics that don’t prima facie relate to each other.
In particular, the book explores 11 big ideas from computer science and I will focus on one of them here, Explore vs Exploit.
Explore vs. Exploit
Let’s start with a working definition.
In English, the words “explore” and “exploit” come loaded with completely opposite connotations. But to a computer scientist, these words have much more specific and neutral meanings. Simply put, exploration is gathering information, and exploitation is using the information you have to get a known good result.
So exploration = gathering information and exploitation = using/exploiting information. Which raises the question: when should you be doing which?
Does that sound familiar? If so, don’t be surprised because this is a dilemma faced by people all around the world every single day.
Should you keep searching, exploring or gathering more information? Or do you already have enough information at your disposal and should just make a decision? The aim of understanding this trade-off is to help you make the optimal decision.
Sometimes it is not clear what the balance should be. Certainly both exploring and exploiting seem important, maybe equally so, so which should you prioritize?
It’s fairly intuitive that never exploring is no way to live. But it’s also worth mentioning that never exploiting can be every bit as bad. In the computer science definition, exploitation actually comes to characterize many of what we consider to be life’s best moments. A family gathering together on the holidays is exploitation. So is a bookworm settling into a reading chair with a hot cup of coffee and a beloved favorite, or a band playing their greatest hits to a crowd of adoring fans, or a couple that has stood the test of time dancing to “their song.”
The authors go on to give a concrete example by introducing the “multi-armed bandit problem.”
In computer science, the tension between exploration and exploitation takes its most concrete form in a scenario called the “multi-armed bandit problem.” The odd name comes from the colloquial term for a casino slot machine, the “one-armed bandit.” Imagine walking into a casino full of different slot machines, each one with its own odds of a payoff. The rub, of course, is that you aren’t told those odds in advance: until you start playing, you won’t have any idea which machines are the most lucrative (“loose,” as slot-machine aficionados call it) and which ones are just money sinks.
Naturally, you’re interested in maximizing your total winnings. And it’s clear that this is going to involve some combination of pulling the arms on different machines to test them out (exploring), and favoring the most promising machines you’ve found (exploiting).
But this is more subtle than it seems. For example, suppose you have a choice between two machines.
Machine 1: played 15 times and 9 times it paid out (9/15 success rate)
Machine 2: played 2 times and once it paid out (1/2 success rate)
Which is the better machine to play on?
If you simply compare the observed win rates, Machine 1 comes out on top with a 60% win rate versus 50% for Machine 2. Does this mean you should definitely pick Machine 1?
Well, it turns out there’s more to it than that: “after all, just two pulls aren’t really very many. So there’s a sense in which we just don’t yet know how good the second machine might actually be.”
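To make that intuition a bit more concrete, here is a rough Python sketch that compares the two machines in two ways: by raw win rate, and by a score that adds a bonus for uncertainty. The bonus formula is the standard UCB1 rule from the bandit literature; it is my own choice for illustration, not something taken from the passages quoted in this post, and the function name `ucb1_score` is made up for this example.

```python
import math


def ucb1_score(wins: int, pulls: int, total_pulls: int) -> float:
    """Observed win rate plus an uncertainty bonus (UCB1 rule, assumed here).

    The bonus shrinks as a machine is played more often, so rarely
    tried machines get the benefit of the doubt.
    """
    win_rate = wins / pulls
    bonus = math.sqrt(2 * math.log(total_pulls) / pulls)
    return win_rate + bonus


# Records from the example above: (payouts, plays)
machines = {"Machine 1": (9, 15), "Machine 2": (1, 2)}
total_pulls = sum(pulls for _, pulls in machines.values())

win_rates = {name: w / n for name, (w, n) in machines.items()}
scores = {name: ucb1_score(w, n, total_pulls) for name, (w, n) in machines.items()}

best_by_rate = max(win_rates, key=win_rates.get)
best_by_ucb = max(scores, key=scores.get)

print("Best by raw win rate:", best_by_rate)
print("Best once uncertainty is counted:", best_by_ucb)
```

Under this particular rule the barely tested Machine 2 gets the higher score, even though Machine 1 has the better track record. A different rule could weigh things differently, so treat this as one way of putting a number on "we just don’t know yet" rather than the final word.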
And that’s really the heart of the problem, isn’t it? The possibility that there are greener pastures on the other side. That you never know if you never try. One can easily go mad with the endless possibilities.
So how do we resolve this?
Time to Exploit is Key
When balancing favorite experiences against exploring new ones, “nothing matters as much as the interval over which we plan to enjoy them.”
For example, if you are leaving a city permanently, it makes sense to go back to all your favorites rather than trying out new stuff, because trying something new carries an inherent risk and reward. The risk with trying a new restaurant is that it isn’t to your liking. The reward is that you might discover a great new place. But since you are leaving the city soon, the potential value of the reward is diminished because you only get to go back to the restaurant once or twice. So logically speaking, why take the risk?
This also applies to human age and might help explain a natural tendency for younger people to be more risk-seeking, while older people are generally more risk-averse. If you’ve ever wondered why that is the case, I think this is a part of it. Older people have already spent more time on earth exploring, so it logically makes sense that they now “enjoy the fruits of their labor” (exploit). Younger people, on the other hand, often “don’t know what they want,” and that could easily just be because they haven’t spent enough time exploring to know what they want or like.
So hopefully you are starting to see that the time-interval-remaining-to-exploit is the key consideration here.
A sobering property of trying new things is that the value of exploration, of finding a new favorite, can only go down over time, as the remaining opportunities to savor it dwindle. Discovering an enchanting café on your last night in town doesn’t give you the opportunity to return.
The flip side is that the value of exploitation can only go up over time. The loveliest café that you know about today is, by definition, at least as lovely as the loveliest café you knew about last month. (And if you’ve found another favorite since then, it might just be more so.) So explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.
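If you like to see ideas in numbers, here is a toy Python sketch of the restaurant example. Everything in it is an assumption of mine rather than something from the book: enjoyment is scored from 0 to 1, your current favorite scores 0.8, and an untried place is equally likely to land anywhere between 0 and 1. The sketch compares always going to the favorite against trying one new place first and then sticking with whichever turned out better.

```python
def exploit_only(favorite: float, visits: int) -> float:
    """Expected total enjoyment from going to the known favorite every time."""
    return favorite * visits


def explore_once_then_exploit(favorite: float, visits: int) -> float:
    """Expected total enjoyment from trying one new place, then picking the best.

    Assumes the new place's quality is uniform on [0, 1], so its expected
    quality is 0.5 and the expected best of (new, favorite) is (1 + f^2) / 2.
    """
    expected_new = 0.5
    expected_best = (1 + favorite ** 2) / 2
    return expected_new + (visits - 1) * expected_best


def should_explore(favorite: float, visits: int) -> bool:
    """True when one exploratory visit beats pure exploitation on average."""
    return explore_once_then_exploit(favorite, visits) > exploit_only(favorite, visits)


# Same favorite (0.8), different amounts of time left in town
for visits_left in (1, 2, 10, 50):
    choice = "explore" if should_explore(0.8, visits_left) else "exploit"
    print(f"{visits_left:>2} visits left: {choice}")
```

With only a handful of visits left, the sketch says to stick with the favorite; with dozens left, trying something new pays off on average. The exact crossover depends entirely on the made-up numbers, but the pattern is the one the authors describe: the interval makes the strategy.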
TL;DR of key ideas
The Explore/Exploit trade-off is a common risk/reward dilemma faced by people around the world every single day, whether or not you know it by that name.
The problem is “should I take a risk and explore and potentially discover something better or should I just stick to what I know and exploit that?”
One key factor that may help you resolve that dilemma is the remaining Time to Exploit: “explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.”
If you liked this week’s idea, please consider reading the entire book to get the full context, meaning and nuance. As always, happy reading! 📚
Thanks for reading. If you enjoy these weekly bite-sized chunks of ideas from books and would like to support them, there are a couple of ways you could do that.
The easiest way is to share these posts with anyone who you think would find them useful for developing a reading habit. So feel free to pass these on to your friends or family. ✉
If you have the means you could subscribe to my paid Substack publication. It would go a long way towards helping me devote more time towards my pursuit of lifelong learning and enable me to create more content. 📚
Either way, I really appreciate the time you spent reading this article. 🙏
Let’s stay in touch. You can find me on:
Twitter | The Avid Bookreader Blog | The Avid Bookreader Substack