Temporal Difference Learning

How can humans or machines interact with an environment and learn a strategy for selecting actions that are beneficial to their goals? Answers to this question fall under the artificial intelligence category of reinforcement learning. Here I am going to provide an introduction to temporal difference (TD) learning, which is the algorithm at the heart of reinforcement learning.

I will be presenting TD learning from a computational neuroscience background. My post has been heavily influenced by Dayan and Abbott Ch 9, but I have added some additional points. The ultimate reference for reinforcement learning is the book by Sutton and Barto, and their chapter 6 dives into TD learning.


To start, let’s review conditioning. The most famous example of conditioning is Pavlov’s dogs. The dogs naturally salivated upon the delivery of food, but Pavlov realized that he could condition the dogs to associate the ringing of a bell with the delivery of food. Eventually, the ringing of the bell on its own was enough to cause the dogs to salivate.

The specific example of Pavlov’s dogs is an example of classical conditioning. In classical conditioning, no action needs to be taken. However, animals can also learn to associate actions with rewards and this is called operant conditioning.

Before I introduce some specific conditioning paradigms, here are the important definitions:

  • s = stimulus
  • r = reward
  • x = no reward
  • v = value, or expected reward (generally a function of r, x)
  • u = binary indicator variable for the stimulus (1 if the stimulus is present, 0 otherwise)

Here are the conditioning paradigms I want to discuss:

  • Pavlovian
  • Extinction
  • Blocking
  • Inhibitory
  • Secondary

For each of these paradigms, I will introduce the necessary training stages and the final result. The statement, a \rightarrow b, means that a becomes associated (\rightarrow) with b.


Pavlovian

Training: s \rightarrow r. The stimulus is trained with a reward.

Results: s \rightarrow v[r]. The stimulus is associated with the expectation of a reward.


Extinction

Training 1: s \rightarrow r. The stimulus is trained with a reward. This eventually leads to successful Pavlovian training.

Training 2: s \rightarrow x. The stimulus is trained with no reward.

Results: s \rightarrow v[x]. The stimulus is associated with the expectation of no reward. Extinction of the previous Pavlovian training.


Blocking

Training 1: s_1 \rightarrow r. The first stimulus is trained with a reward. This eventually leads to successful Pavlovian training.

Training 2: s_1 + s_2 \rightarrow r. The first stimulus and a second stimulus are trained together with a reward.

Results: s_1 \rightarrow v[r], and s_2 \rightarrow v[x]. The first stimulus completely explains the reward and hence “blocks” the second stimulus from being associated with the reward.


Inhibitory

Training: s_1+s_2 \rightarrow x, and s_1 \rightarrow r. The combination of two stimuli leads to no reward, but the first stimulus on its own is trained with a reward.

Results: s_1 \rightarrow v[r], and s_2 \rightarrow -v[r]. The first stimulus is associated with the expectation of the reward while the second stimulus is associated with the negative of the reward.


Secondary

Training 1: s_1 \rightarrow r. The first stimulus is trained with a reward. This eventually leads to successful Pavlovian training.

Training 2: s_2 \rightarrow s_1. The second stimulus is trained with the first stimulus.

Results: s_2 \rightarrow v[r]. Eventually the second stimulus is associated with the reward despite never being directly associated with the reward.

Rescorla-Wagner Rule

How do we turn the various conditioning paradigms into a mathematical framework of learning? The Rescorla-Wagner (RW) rule is a very simple model that can explain many, but not all, of the above paradigms.

The RW rule is a linear prediction model that requires these three equations:

  1. v=w \cdot u
  2. \delta = r-v
  3. w_{new} = w_{old}+\epsilon \delta u

and introduces the following new terms:

  • w = weights associated with stimuli state
  • \epsilon = learning rate, with 0 \le \epsilon \le 1

What does each of these equations actually mean?

  1. The expected reward, v, is the dot product of the vector of weights, w, with the vector of stimulus indicators, u (one weight per stimulus).
  2. But there may be a mismatch, or error, between the actual reward, r, and the expected reward, v.
  3. Therefore we should update the weight of each stimulus. We do this by adding a term that is proportional to the learning rate \epsilon, the error \delta, and the stimulus indicator u.
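The three equations can be sketched in a few lines of Python. This is only an illustration: the function names rw_predict and rw_update and the default learning rate of 0.1 are assumptions, not something taken from Dayan and Abbott.

```python
def rw_predict(w, u):
    """Expected reward v = w . u for weights w and stimulus indicators u."""
    return sum(wi * ui for wi, ui in zip(w, u))


def rw_update(w, u, r, epsilon=0.1):
    """Apply one Rescorla-Wagner trial and return (new_weights, delta)."""
    v = rw_predict(w, u)                                          # equation 1: prediction
    delta = r - v                                                 # equation 2: prediction error
    new_w = [wi + epsilon * delta * ui for wi, ui in zip(w, u)]   # equation 3: weight update
    return new_w, delta


# One stimulus, paired with a reward of 1 on a single trial, starting from w = 0.
w, delta = rw_update([0.0], [1], r=1.0, epsilon=0.1)
print(w, delta)
```

Because the update is multiplied by u, only the weights of stimuli that were actually present on a trial change.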

During a Pavlovian pairing of stimulus with reward, the RW rule predicts an exponential approach of the weight to w = \langle ru\rangle over the course of several trials for most values of \epsilon (if \epsilon=1, it would instantly update to the final value. Why is this usually bad?). Then, if the reward stops being paired with the stimulus, the weight will exponentially decay over the course of the next trials.
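To see where the exponential shape comes from, here is a short worked derivation for the simplest case, assuming a single stimulus that is present on every trial (u = 1) and a fixed reward r. The trial index n and the starting weight w_0 are notation introduced only for this sketch.

```latex
\begin{align*}
w_{n+1} &= w_n + \epsilon\,(r - w_n) \\
r - w_{n+1} &= (1-\epsilon)\,(r - w_n) \\
w_n &= r - (1-\epsilon)^n\,(r - w_0) \approx r - e^{-\epsilon n}\,(r - w_0) \quad \text{for small } \epsilon
\end{align*}
```

The remaining error shrinks by a factor of (1-\epsilon) on every trial. Setting r = 0 gives the extinction case, w_n = (1-\epsilon)^n w_0, which is the decay described above.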

The RW rule will also continue to work when the reward/stimulus pairing is stochastic instead of deterministic, and the weight will still approach (and then fluctuate around) the final value of w = \langle ru\rangle.
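A quick simulation makes the stochastic case concrete. This is a minimal sketch under stated assumptions: the stimulus is present on every trial, the reward is delivered with probability p_reward, and the function name, seed, and parameter values are all made up for illustration.

```python
import random


def simulate_stochastic_rw(p_reward=0.5, r=1.0, epsilon=0.05, n_trials=2000, seed=0):
    """Run the RW rule with u = 1 on every trial and a reward on a fraction p_reward of trials."""
    rng = random.Random(seed)
    w = 0.0
    weights = []
    for _ in range(n_trials):
        reward = r if rng.random() < p_reward else 0.0
        delta = reward - w        # u = 1 on every trial, so v = w
        w += epsilon * delta
        weights.append(w)
    return weights


weights = simulate_stochastic_rw()
late_average = sum(weights[-1000:]) / 1000
print(round(late_average, 2))  # should sit near p_reward * r, with trial-to-trial noise
```

A smaller \epsilon reduces the trial-to-trial fluctuations around the average, at the cost of slower learning.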

How does blocking fit into this framework? Well, the RW rule says that after the first stage of training, the weights are w_1 = r and w_2 = 0 (since we have not presented stimulus two). When we start the second stage of training and try to associate stimulus two with the reward, we find that we cannot learn that association. The reason is that there is no error (hence \delta = 0) and therefore w_2 = 0 forever. If instead we had only imperfectly learned the weight of the first stimulus, then there is still some error and hence some learning is possible.
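Blocking is easy to reproduce numerically. The sketch below assumes two stimuli, a reward of 1, and a learning rate of 0.2; the helper names and trial counts are illustrative choices, not part of the original paradigm description.

```python
def rw_update(w, u, r, epsilon):
    """One Rescorla-Wagner trial: returns the updated weight list."""
    v = sum(wi * ui for wi, ui in zip(w, u))
    delta = r - v
    return [wi + epsilon * delta * ui for wi, ui in zip(w, u)]


def run_blocking(stage1_trials, stage2_trials=200, epsilon=0.2, r=1.0):
    """Stage 1: s_1 alone with reward. Stage 2: s_1 and s_2 together with reward."""
    w = [0.0, 0.0]
    for _ in range(stage1_trials):
        w = rw_update(w, [1, 0], r, epsilon)
    for _ in range(stage2_trials):
        w = rw_update(w, [1, 1], r, epsilon)
    return w


full = run_blocking(stage1_trials=200)    # s_1 fully learned before s_2 appears
partial = run_blocking(stage1_trials=3)   # s_1 only partly learned
print(full, partial)
```

With full pre-training, w_2 stays at essentially zero. With only three pre-training trials, the leftover error is split equally between the two stimuli, so s_2 picks up part of the association.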

One thing that the RW rule incorrectly predicts is secondary conditioning. In this case, during the learning of the first stimulus, s_1, the learned weight becomes w_1 >0. The RW rule then predicts that the weight of the second stimulus, s_2, will become w_2 <0. This is because, according to the RW rule, this paradigm is exactly the same as inhibitory conditioning: in the second stage the two stimuli appear together without a reward, so the prediction error is negative. Therefore, a more complicated rule is required to successfully capture secondary conditioning.
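The failure can be checked directly. The sketch below assumes that, because the RW rule has no notion of time, the second stage is encoded as s_1 and s_2 being present together with no reward; the learning rate and trial counts are arbitrary illustrative values.

```python
def rw_update(w, u, r, epsilon):
    """One Rescorla-Wagner trial: returns the updated weight list."""
    v = sum(wi * ui for wi, ui in zip(w, u))
    delta = r - v
    return [wi + epsilon * delta * ui for wi, ui in zip(w, u)]


w = [0.0, 0.0]

# Stage 1: s_1 alone is paired with a reward of 1.
for _ in range(200):
    w = rw_update(w, [1, 0], r=1.0, epsilon=0.2)
w_after_stage1 = list(w)

# Stage 2: s_2 is paired with s_1, and no reward is delivered.
for _ in range(200):
    w = rw_update(w, [1, 1], r=0.0, epsilon=0.2)
w1, w2 = w
print(w_after_stage1, w1, w2)
```

The weight for s_2 ends up negative, the opposite of what animals show in secondary conditioning.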

One final note. The RW rule can provide an even better match to biology by assuming a non-linear relationship between v and the animal’s behavior. This function is often something that exponentially saturates at the maximal reward (i.e., an animal is much more motivated to go from 10% to 20% of the max reward than from 80% to 90% of the max reward). While this provides a better fit to many biological experiments, it still cannot explain the secondary conditioning paradigm.

Temporal Difference Learning

To properly model secondary conditioning, we need to explicitly add in time to our equations. For ease, one can assume that time, t, is discrete and that a trial lasts for total time T and therefore 0 \le t \le T.

The straightforward (but wrong) extension of the RW rule to time is:

  1. v[t]=w[t-1] \cdot u[t]
  2. \delta[t] = r[t]-v[t]
  3. w[t] = w[t-1]+\epsilon \delta[t] u[t]

where we will say that it takes one time unit to update the weights.

Why is this naive RW with time wrong? Well, psychology and biology experiments show that an animal’s expected reward does NOT reflect the past history of rewards, nor just the reward at the next time step, but instead reflects the expected rewards during the WHOLE REMAINDER of the trial. Therefore a better match to biology is:

  1. v[t]=w[t-1] \cdot u[t]
  2. R[t]= \langle \sum_{\tau=0}^{T-t} r[t+\tau] \rangle
  3. \delta[t] = R[t]-v[t]
  4. w[t] = w[t-1]+\epsilon \delta[t] u[t]

where R[t] is the full reward expected over the remainder of the trial while r[t] remains the reward at a single time step. This is closer to biology, but we are still missing a key component. Not all future rewards are treated equally. Instead, rewards that happen sooner are valued higher than rewards in the distant future (this is called discounting). So the best match to biology is the following:

  1. v[t]=w[t-1] \cdot u[t]
  2. R[t]= \langle \sum_{\tau=0}^{T-t} \gamma^\tau r[t+\tau] \rangle
  3. \delta[t] = R[t]-v[t]
  4. w[t] = w[t-1]+\epsilon \delta[t] u[t]

where 0 \le \gamma \le 1 is the discounting factor for future rewards. A small discounting factor implies we prefer rewards now while a large discounting factor means we are patient for our rewards.
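To get a feel for the discounting factor, here is a small Python sketch that computes the discounted sum for a single trial (so without the averaging over trials). The function name and the example reward sequence are assumptions chosen for illustration.

```python
def discounted_returns(rewards, gamma):
    """Return R[t] = sum over tau of gamma**tau * r[t + tau], for every t in one trial."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards from the end of the trial, accumulating the discounted sum.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


rewards = [0.0, 0.0, 0.0, 1.0]   # a single reward at the last time step
patient = discounted_returns(rewards, gamma=0.9)
impatient = discounted_returns(rewards, gamma=0.5)
print(patient)
print(impatient)
```

With \gamma = 0.5 the reward three steps away is worth only 0.125 at the start of the trial, while with \gamma = 0.9 it is still worth about 0.73.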

We have managed to write down a set of equations that accurately summarize biological reinforcement. But how can we actually learn with this system? As currently written, we would need to know the average reward over the remainder of the whole trial. Temporal difference learning makes the following assumptions in order to solve for the expected future rewards:

  1. Future rewards are Markovian
  2. Current observed estimate of reward is close enough to the typical trial

A Markov process is memoryless in that the next future step only depends on the current state of the system and has no other history dependence. By assuming rewards follow this structure, we can make the following approximation:

  • R[t]= \langle r[t] \rangle + \gamma \langle \sum_{\tau=1}^{T-t} \gamma^{\tau-1} r[t+\tau] \rangle
  • R[t]= \langle r[t] \rangle + \gamma R[t+1]

The second assumption is called bootstrapping. We will use the currently observed reward r[t] and the current estimate v[t+1] rather than the full average over future rewards. (Sutton and Barto label the same reward r[t+1], since it arrives after time step t; the equations are otherwise identical.) So finally we end up at the temporal difference learning equations:

  1. v[t]=w[t-1] \cdot u[t]
  2. R[t] =  r[t] + \gamma v[t+1]
  3. \delta[t] =r[t] + \gamma v[t+1]-v[t]
  4. w[t] = w[t-1]+\epsilon \delta[t] u[t]
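The TD equations can be tried out on a toy trial. The Python sketch below makes several assumptions that are not in the equations themselves: r[t] is taken to be the reward delivered at time step t (matching the definition of R[t]), the value after the end of the trial is zero, and u[t] is a one-hot "time since stimulus onset" code so that w \cdot u[t] can differ from one time step to the next. The names run_td, t_stim, and t_reward are hypothetical.

```python
def run_td(n_trials=500, T=10, t_stim=2, t_reward=6, gamma=0.9, epsilon=0.3):
    """Simulate TD learning over repeated trials.

    Returns (deltas on the first trial, deltas on the last trial, learned values v[t]).
    """
    n_features = T - t_stim          # one weight per time step since stimulus onset
    w = [0.0] * n_features

    def features(t):
        # Assumed one-hot "time since stimulus onset" code; all zeros before the stimulus.
        u = [0.0] * n_features
        if t_stim <= t < T:
            u[t - t_stim] = 1.0
        return u

    def value(t):
        # v[t] = w . u[t]; the value beyond the end of the trial is taken to be zero.
        if t >= T:
            return 0.0
        return sum(wi * ui for wi, ui in zip(w, features(t)))

    history = []
    for _ in range(n_trials):
        deltas = []
        for t in range(T):
            r = 1.0 if t == t_reward else 0.0
            delta = r + gamma * value(t + 1) - value(t)   # TD error
            u = features(t)
            for k in range(n_features):
                w[k] += epsilon * delta * u[k]            # weight update
            deltas.append(delta)
        history.append(deltas)

    values = [value(t) for t in range(T)]
    return history[0], history[-1], values


first, last, values = run_td()
print([round(d, 3) for d in first])
print([round(d, 3) for d in last])
print([round(v, 3) for v in values])
```

On the first trial the only prediction error is at the reward. After training, the error at the reward has vanished and the remaining error sits at the step leading into the stimulus, while the learned values ramp up toward the reward time. This is the same qualitative shift shown in Dayan and Abbott's figure, although the encoding used here is only one possible choice.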



Dayan and Abbott, Figure 9.2. This illustrates TD learning in action.

I have included an image from Dayan and Abbott showing how TD learning evolves over consecutive trials. Please read their Chapter 9 for full details.

Finally, I should mention that in practice, people often use the TD-Lambda algorithm. This version introduces a new parameter, lambda, which controls how far back in time one can make adjustments. Lambda 0 implies one time step only, while lambda 1 implies all past time steps. This allows TD learning to excel even if the full system is not Markovian.
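One common way to implement this is with an eligibility trace, which is the accumulating-trace formulation described by Sutton and Barto. The sketch below is an illustration under assumptions, not a reference implementation: it reuses a hypothetical one-hot "time since stimulus onset" code for u[t], takes r[t] as the reward at time step t, and runs a single trial starting from zero weights.

```python
def td_lambda_trial(w, lam, T=10, t_stim=2, t_reward=6, gamma=1.0, epsilon=0.3):
    """Run one trial of TD(lambda) with accumulating eligibility traces; return new weights."""
    w = list(w)
    n = len(w)
    trace = [0.0] * n

    def features(t):
        # Assumed one-hot "time since stimulus onset" code.
        u = [0.0] * n
        if t_stim <= t < T:
            u[t - t_stim] = 1.0
        return u

    def value(t):
        return 0.0 if t >= T else sum(wi * ui for wi, ui in zip(w, features(t)))

    for t in range(T):
        r = 1.0 if t == t_reward else 0.0
        delta = r + gamma * value(t + 1) - value(t)
        u = features(t)
        # The trace remembers recently active features, decaying by gamma * lambda per step.
        trace = [gamma * lam * e + ui for e, ui in zip(trace, u)]
        w = [wi + epsilon * delta * e for wi, e in zip(w, trace)]
    return w


one_step = td_lambda_trial([0.0] * 8, lam=0.0)
all_steps = td_lambda_trial([0.0] * 8, lam=1.0)
print(one_step)
print(all_steps)
```

With lambda 0, the first rewarded trial only changes the weight for the time step of the reward itself. With lambda 1, the same single reward updates every time step since the stimulus appeared, so credit reaches earlier steps without waiting for many trials.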

Dopamine and Biology’s TD system

So does biology actually implement TD learning? Animals definitely utilize reinforcement learning and there is strong evidence that temporal difference learning plays an essential role. The leading contender for the reward signal is dopamine. This is a widely used neurotransmitter that evolved in early animals and remains widely conserved. There are a relatively small number of dopamine neurons (in the basal ganglia and VTA in humans) that project widely throughout the brain. These dopamine neurons can produce an intense sensation of pleasure (and in fact the “high” of drugs often comes about either through stimulating dopamine production or preventing its reuptake).

There are two great computational neuroscience papers that highlight the important connection between TD learning and dopamine, each analyzing a different biological system.

Both of these papers deserve to be read in detail, but I’ll give a brief summary of the bee foraging paper here. Experiments were done that tracked bees in a controlled environment consisting of “yellow flowers” and “blue flowers” (which were basically just different colored cups). These flowers had the same amount of nectar on average, but were either consistent or highly variable. The bees quickly learned to only target the consistent flowers. These experimental results were very well modeled by assuming the bee was performing TD learning with a relatively small discount factor (driving it to value recent rewards).

TD Learning and Games

Playing games is the perfect test bed for TD learning. A game has a final objective (win), but throughout play it can be difficult to determine your probability of winning. TD learning provides a systematic framework to associate the value of a given game state with the eventual probability of winning. Below I highlight the games that have most significantly showcased the usefulness of reinforcement learning.


Backgammon

Backgammon is a two-person game of perfect information (neither player has hidden knowledge) with an element of chance (rolling dice to determine one’s possible moves). Gerald Tesauro’s TD-Gammon was the first program to showcase the value of TD learning, so I will go through it in more detail.

Before getting into specifics, I need to point out that there are actually two (often competing) branches in artificial intelligence: symbolic logic and the connectionist approach.

Symbolic logic tends to be a set of formal rules that a system needs to follow. These rules need to be designed by humans. The connectionist approach uses artificial neural networks and other approaches like TD learning that attempt to mimic biological neural networks. The idea is that humans set up the overall architecture and model of the neural network, but the specific connections between “neurons” are determined by the learning algorithm as it is fed real data examples.

Tesauro actually created two versions of a backgammon program. The first was called Neurogammon. It was trained using supervised learning where it was given expert games as well as games Tesauro played against himself and told to learn to mimic the human moves. Neurogammon was able to play at an intermediate human level.

Tesauro’s next backgammon program was called TD-Gammon since it used the TD learning rule. Instead of trying to mimic the human moves, TD-Gammon used the TD learning rule to assign a score to each move throughout a game. The additional innovation is that TD-Gammon was trained by playing games against itself. This initial version of TD-Gammon soon matched Neurogammon (i.e., intermediate human level). TD-Gammon was then able to beat experts by using both a supervised phase on expert games and a reinforcement phase.

Despite being able to beat experts, TD-Gammon still had a weakness in the endgame. Since it only looked two moves ahead, it could miss key moves that would have been found by a more thorough analytical approach. This is where symbolic logic excels, and hence TD-Gammon was a great demonstration of the complementary strengths and weaknesses of symbolic vs connectionist logic.


Go

Go is a two-person game of perfect information with no element of chance. Despite this perfect knowledge, the game is complex enough that there are around 10^170 legal board positions, and far more possible games (for reference, there are only about 10^80 atoms in the whole universe). So despite the perfect information, there are just too many possible games to determine the optimal move.

Recently AlphaGo made a huge splash by beating one of the world’s top players of Go. Most Go players, and even many artificial intelligence researchers, thought an expert-level Go program was years away. So the win was just as surprising as when Deep Blue beat Kasparov in chess. AlphaGo is a large program with many different parts, but at the heart of it is a reinforcement learning module that utilizes TD learning (see here or here for details).


Poker

The final frontier in gaming is poker, specifically multi-person No-Limit Texas Hold’em. The reason this is the toughest game left is that it is a multi-player game with imperfect information and an element of chance.

Last winter the computer systems won against professionals for the first time in a series of heads-up matches (computer vs only one human). Further improvements are needed to actually beat the best professionals at a multi-person table, but these results seem encouraging for future successes. The interesting thing to me is that both AI systems seem to have used only a limited amount of reinforcement learning. I think that fully embracing reinforcement and TD learning should be the top priority for these research teams and might provide the necessary leap in ability. And they should hurry since others might beat them to it!

Research Experience for Undergrads (REU)

This National Science Foundation program is designed to give undergraduates, especially those from smaller schools, a chance to gain real research experience for a summer. Personally I participated in one official REU and one program modeled on REUs. I learned a lot (and they were tons of fun!). The best part is not the specific topic you research, but the opportunity to learn how to be a researcher.
Most of the applications are due in February. Check out the official NSF REU website for the latest details.
When you are ready to apply, go here to search for programs of REUs in various subjects. Also, search the internet for other research opportunities; Harvard has a nice list of research programs for undergrads. For more detailed tips on applications, I recommend this site.
If you want to get an idea of what an REU is like, here are some interviews of past Math REU participants. And also keep in mind these research tips for undergrads if you do get an REU.

QFT Resources

Quantum Field Theory is a notoriously difficult subject to learn, but I found the following resources to be extremely helpful when I took the course a few years ago. I just learned about a few resources that I wish I had then, so here are my current tips for learning QFT. 
Tony Zee’s book QFT in a Nutshell provides a great intuition into what QFT is all about. If you actually want to do calculations, then Peskin and Schroeder’s book is a nice complement. These two books were the heart of my studies into QFT.
David Tong’s Notes:
Great set of lecture notes that provides a different perspective.
Sidney Coleman’s Lectures:
Apparently, all modern QFT books are based on Coleman (since all the authors learned QFT from him or his students), and you can still see the original videos. For years there was a set of hand-written notes that served as a transcript of the videos, but this was recently LaTeXed and shared on the arXiv.

Deep Learning Seminar Course

This semester Terry Sejnowski is teaching a graduate seminar course that is focused on Deep Learning. The course meets weekly for two hours to discuss papers. Here I’ll just outline the course and in later posts I’ll add some thoughts on each specific week.

Week 1: Perceptrons

Week 2: Hopfield Nets and Boltzmann Machines

Week 3: Backprop

Week 4: Independent Component Analysis (ICA)

Week 5: Convolutional Neural Networks (CNN)

Week 6: Recurrent Neural Networks (RNN)

Week 7: Reinforcement Learning

Week 8: Information and Control Theory

2016 Election Thoughts: Part 2/2

This post has absolutely nothing to do with science and is just some of my thoughts on the recent US Presidential election. I started writing up my thoughts and I realized it was easiest to organize my thoughts by things I would like to say to Anti-Trump vs Trump voters. In reality, both posts are relevant to either side, but it was a convenient way to cleanly separate my points. Since I respect everyone’s right to a private vote, I’m writing these thoughts as open letters to both sides.

Dear Trump Voters Who Love Me,

I cried.

I’m scared and I cried.

I need you to understand that. This fear of Trump has not gotten better since the election. In fact, it took me until Friday November 11th at 8PM PT for the full implications of the election of President Trump to set in. I finally truly understood what this election meant to me.

I need you to know that when I fully understood what this election meant to me, I cried. Uncontrollable sobbing. It hit me while walking down the hallway towards my apartment. I held it together long enough to go inside, sit down in the dark, and sob uncontrollably by myself. I cried because I was scared. I cried because of innocence lost, both my own and my future children’s. I cried because I didn’t do enough to prevent me from crying. I cried for being naive and stupid and taking this long to truly see the world. I cried for not figuring it out in time to communicate my viewpoint with Trump voters. I cried because I was crying. I cried out of despair and frustration because I realized my future children, at a much younger age, would feel a much worse pain. I cried because I had entered the Dark Forest.

I need you to know that I will remember that cry for a long time. I cry rarely enough that I am pretty sure I can name every event since my teenage years. This is something I won’t forget anytime soon.

And I realized, that more than anything else, I need you to understand why I cried. I need you to understand why President Donald J. Trump can never be just another politician to me. I need you to realize that you have unleashed a political weapon on me that scares the shit out of me. I need you to understand why this just became a defining point in my life. I need you to understand that I have entered the Dark Forest and what it means for me.

First, what is the Dark Forest? I am stealing this from a science fiction series, the Three Body Problem. While the book focuses on interactions between alien civilizations, I think it is also a useful analogy for politics today since both sides seem to be alien to each other. The Dark Forest translated to democracy is this:

Axiom 1: A voter’s goal is to survive
Axiom 2: Resources are finite
Axiom 3: Voters and politicians have limited communication
Axiom 4: Strangers have limited communication

Consequence 1, The Light Forest: The combination of axioms 1 and 2 means that we are all hunters in a forest, competing for resources. This by itself is a perfectly fine world and democracy. Yes we are competing with each other, but since we have plenty of light, we can stay safe. We don’t need to worry that we will mistake each other for the animals we are hunting.

Consequence 2, Chains of Suspicion: The combination of axioms 3 and 4 leads to Chains of Suspicion. The extreme distances between strangers create an insurmountable ‘Chain of Suspicion’ where the two strangers cannot communicate fast enough to relieve mistrust, making conflict inevitable.

Consequence 3, The Dark Forest: The Chains of Suspicion cast a dark shadow over the Forest, turning it dark. In the Dark Forest, other hunters become threats. I no longer know if the noise I hear in the dark is an animal or another hunter. I also know that the other hunter has the same problem. I know that this other hunter may shoot me, either by accident, out of fear, or worse, on purpose. Therefore, I can only guarantee my safety if I shoot first and ask questions later.

I need you to realize that politicians’ words matter. Trump and I will never talk in person. I will never be able to truly get to know Trump. That means that when Trump says or tweets authoritarian or racist things, I will never know his true intent. It means that Trump and I have an insurmountable Chain of Suspicion.

Looking back, Trump and I have had this Chain of Suspicion for a long time. This Chain did not directly drive me into the Dark Forest of distrust largely because of you. I love and trust you. I know that we may have political differences, but I am confident we can work them out. But you and I are not the issue. You and I are not strangers.

What drove me into the Dark Forest is that Chains of Suspicion multiply like a virus. In the Dark Forest, Trump’s words matter because they are him broadcasting his potential future actions. Maybe Trump’s threats are just a bluff. Maybe those words won’t lead to actions. But I need you to understand, there are others that scare me to my core and I am afraid that Trump has given them more power. Trump has reinforced their terrible ideas and made them seem slightly more normal.

I need you to know that Trump is not a standard politician to me. Trump successfully won election despite doing two things that I thought individually would be disqualifying in modern society:

  1. disregard for democracy
  2. explicit racism

I need you to understand that when Trump combined those two together, he crossed a line that should never be crossed in a functioning democracy. Trump crossed the safety tape separating democracy and fascism. Trump himself has NOT taken us to fascism. But I am afraid he made fascism seem just a little more mainstream to extremists.

One major reason words speak louder than actions is that there are certain words that can’t be unsaid. Trump proclaimed in a nationally televised debate that he may not accept the outcome of the election if he does not win. I need you to really think about the future consequences of that. You need to understand what those words mean to me and my insurmountable Chain of Suspicion with Trump.

Imagine this scenario that scares the shit out of me and needs to scare you too. Trump in 4 years, as the sitting President (maybe with a Republican House and Senate) says in a presidential debate that he may not accept the outcome of the election if he doesn’t win.

What am I supposed to believe if Trump wins again by a small margin like this year? Should I believe that the election was fair? Or should I worry that Trump used his power as president to ensure his own victory?

If you don’t understand this fear, and why the MERE POSSIBILITY of this fear itself should scare you too, please reconsider. Learn more about history. You need to understand the Dark Forest that I am in now. Talk to me until you understand my fear. A democracy CANNOT survive long if even a small percentage of voters fear the integrity of future votes. I have this fear. This fear leads to a Dark Forest where democracy will struggle.

This fear needs to be extinguished now because when it combines with my next issue, I am afraid it leads to an even Darker Forest where democracy is guaranteed to die. Trump has created an insurmountable racial Chain of Suspicion with me. Trump has engaged in a variety of terrible racial rhetoric but there are two things that especially stick with me. The first is Trump’s attack on Judge Curiel which even Paul Ryan called “the textbook definition of a racist comment.”

I need you to know that since I have a Chain of Suspicion with Trump, I cannot avoid taking that attack personally. Trump attacked Judge Curiel for his Mexican heritage despite Curiel being born in the United States. Judge Curiel is clearly not American enough for Trump. It doesn’t matter that Tina has Chinese heritage. I need you to know that I see an attack on one minority as an attack on all. I need you to know that I see it as an attack against Tina and our future kids. Will they be American enough for Trump? I just don’t know.

But I really need you to understand the final realization that made me break down crying and pushed me deep into the Dark Forest. I had managed to forget about Trump’s strange relationship with David Duke (KKK member), see here for details. Trump’s refusal to disavow David Duke in 2016 despite doing so in 2000 scares me. I realized I truly don’t understand Trump.

What drove me to tears was that I realized, even if Trump made an innocent mistake, the damage is done. Trump broadcast a message to David Duke and other racists that can never be unsaid. Trump (unintentionally or intentionally) screamed to them: I can win the presidency despite authoritarian and racist rhetoric. It is not Trump I am scared of. It is the dark hunters he just empowered. I had no illusions that racial extremists did not exist, but now, due to Chains of Suspicion, I am no longer optimistic that their numbers are small.

I need you to realize that this is when I personally entered the Dark Forest. I was walking back from my car to the apartment when I walked past a large group of white men. I unconsciously started doing some math, trying to calculate what are the odds that they voted for Trump and specifically voted for Trump because of his racial rhetoric. Before I could finish the math, I realized I was deciding if I was safe around them and started tearing up. This is when I cried uncontrollably. This is when I realized that I had been naive and living in a false world. I thought I was realistic and understood the darkness that existed in the world. But I was living in a Light Forest that was only a product of many factors including but not limited to me being: male, white, upper middle class, well-educated, etc. I truly saw the Dark Forest.

I cried because I got the tiniest possible sliver of understanding of what it truly means to be a minority and I couldn’t handle the truth. As minorities, they live in the Dark Forest. They have heard and felt the racism. They know that not everyone can be trusted. They know that people can attack them when least expected and they must be suspicious. But I cried because it’s worse: minorities live in the Dark Forest but have a permanent spotlight on them. They are emitting light into this darkness. They don’t blend in. They always stand out in this vast darkness. That means they are always a target for those that hunt minorities.

I cried because I realized that I live in a Dark Forest and that Tina and our future children will always have a spotlight on them. I cried because the tiny glimpse of the darkness scared me. I cried because I realized that my future children will learn the nature of the Dark Forest at an age that is much too young. I cried because I know the Dark Forest my children will live in is worse than the one I am in. I cried because I am scared of hunters like David Duke. I cried because President Trump doesn’t seem to understand that his words empower these hunters. I cried because I was too stupid to put this all into words sooner. I cried because I don’t know how to protect Tina and our future children. I cried because my natural response to that helplessness was to lash out at others in the same way they want to attack Tina. And I had one final burst of tears when I realized the deep irony that David Duke had just made me into an inverse of himself and made me racist against random white people. I laughed, probably like a maniac, because I realized that after that, I am so far lost in the Dark Forest of distrust that I had managed to become the type of hunter that probably scares David Duke the most.

But most of all, I need you to understand that I love you and look forward to working with you to end the Dark Forest of distrust. I am sorry for not communicating better with you. I don’t know why you voted for Trump. Maybe you are already in the Dark Forest of distrust. Maybe you hated Hillary and had an insurmountable Chain of Suspicion with her. Maybe you thought Trump was a standard Republican candidate.

I know you didn’t mean to scare me. But I need you to realize that Trump is not a standard candidate to me. I need you to realize that I can never personally trust Trump based on the words he has said. I need you to realize that I am especially scared of Trump and the people he might either intentionally or accidentally empower.

And I especially need you to realize that what I am actually more scared about is the fact that I am scared. The part of me that remembers the Light Forest thinks the fear is irrational. But the part of me that has seen the Dark Forest of distrust thinks the fear is rational and maybe that I am not scared enough. I see how the Chains of Distrust multiply. If even a few people share my distrust, it must be extinguished now before it grows too strong.

We have to break taboos. We need to talk about politics. We need to establish ground rules for the type of political discourse and political tactics that are allowed in America. We need to talk about race and discrimination. The only way to turn the Dark Forest into the Light Forest is to break Chains of Suspicion by better communication. We can’t wait four years to discuss these issues. We had a deep divide in this country before the election and Trump made the divide wider. We can only heal this distrust if we start soon.

And finally, I want you to know that I have made peace with this election. I want to sincerely thank you for voting for President Trump. I can now see the world clearer than before. My naivety was dangerous to Tina and our future children. I was complacent. I assumed my children would grow up in a Light Forest. I now realize that they cannot. But I will fight to make the Dark Forest just a little bit brighter. I will fight to extend the time that my children think they are only in a Light Forest. And I now realize the true depths of the Dark Forest, and that I can only fight it with your help. I look forward to working with you to bring Light to the Dark Forest.

With all the love in my heart,

PS. This is not the world’s weirdest baby announcement. These children I discuss are still in the future. But I still cried for the hypothetical children.

PPS. Dave Chappelle and SNL are very wise. I admit thinking I was more realistic about the US than the people in the skit, but I was just in a slightly different bubble than they were.

2016 Election Thoughts: Part 1/2

This post has absolutely nothing to do with science and is just some of my thoughts on the recent US Presidential election. I started writing up my thoughts and I realized it was easiest to organize my thoughts by things I would like to say to Anti-Trump vs Trump voters. In reality, both posts are relevant to either side, but it was a convenient way to cleanly separate my points. Since I respect everyone’s right to a private vote, I’m writing these thoughts as open letters to both sides.

Dear Anti-Trump Voters Who Love Me,

We fucked up.

Don’t get me wrong, I voted against Trump and you voted against Trump, but that doesn’t mean I don’t still have issues with both you and myself. We didn’t do enough. You can read my letter to Trump Voters to realize the pain I felt.

I have several central ideas and several additional points later.

1. Don’t Disrespect Democracy

We lost and we lost fair and square. I am 100% in support of electoral college reform for 2020 and beyond. I am 0% in support of any attempt to change it in 2016. Don’t sow seeds of doubt. Accept the results and move forward.

2. Think Long and Hard about WHY People Supported Trump

Spend a lot of time thinking about the chart in this article. The automation and elimination of jobs is real and will only accelerate. The pain and despair are real. Trump addressed the anger and angst felt by people in these counties. These issues are not going away. I don’t claim to have an answer, but if you want to win over the hearts of Trump supporters, this is a great starting point. Also, despite being on a comedy website, this article makes many serious points. It’s time to win over Trump supporters, not demonize them.

3. Words Matter: Stop Crying Wolf

In a recent conversation, a wise office mate of mine and I reminisced about the good old days with Mitt Romney, when we only had to worry about binders full of women and terrible renditions of Who Let the Dogs Out. Those were not real issues, but we cried wolf. Well, the real wolf just got elected and we blew all credibility too soon.

Trump must be opposed. But it’s time to reserve the harsh words for him and others who are truly racist, sexist, etc. Don’t use the same rhetoric on other Republicans. The false equivalence will continue to cause a credibility gap in the future.

4. Governance Reform Starts Now and MUST Continue When Democrats Win

The political system is broken and we were part of the problem. It doesn’t matter who did it first, last, or most. Both sides have abused weird technicalities in our process of government and that must stop.

I have ideas for more sweeping reforms, but for now, I will just focus on a few of the major problems I see.

A. President: Limit Executive Power
Executive power is like heroin. Might feel great while you are high and in charge but it sucks the rest of the time. We let Obama do too much. The withdrawal is going to suck majorly.

B. House of Representatives: Gerrymandering
The Republicans are going to win just over 52% of the two party vote but around 55% of the seats. Not all of that is due to gerrymandering, but at least part of it is. Check out the Texas districts. Both Democrats and Republicans should learn about California’s new redistricting commission. I can attest that the districts seem more reasonable and that the “jungle” primary is quite fun.

C. Senate: Filibuster

Let’s all agree to just end the filibuster now. Just because the Republicans successfully used the filibuster to block a Supreme Court nominee for nearly a year does not mean that Democrats should turn around and do the same. It is time to end the filibuster and just let the majority of the Senate govern. This will really hurt in the short term. But it will be much better in the long term.

D. Electoral College Reform

Again, I am 100% in support of electoral college reform for 2020 and beyond. I am 0% in support of any attempt to change it in 2016.

Any argument in favor of the electoral college has to explain this fact for me: Hillary Clinton will probably win the popular vote by about 1% and lose the electoral college by 6.5%. That huge discrepancy goes against every principle of one person, one vote. Look back at past elections: the popular vote is way out of sync with the electoral college.


PS Points:

1. #TrumpIsOurPresident

While I understand the spirit of #NotOurPresident is that you disagree with Trump, no one gets to pretend that Trump isn’t truly our president. We are all responsible for Trump. I know I personally didn’t do enough to oppose him, since I honestly didn’t truly think he would win. But Trump did win and this is on everyone now.

2. Please Protest Peacefully
I am 1000% behind everyone’s right to protest. Just please don’t turn violent. That will only play into Trump’s hand and give his paranoid rants more legitimacy.

3. Stop Crashing Canada’s Immigration Website
Back to PS point 1, Trump is our president. Deal with it here. You don’t get to flee.

4. Stop Imagining Alternative Pasts
What if Bernie Sanders was the nominee? What if the third party vote was different? Etc, etc, etc. The election is done. Now don’t get me wrong, it is worth learning from mistakes. But learn from the past to make the future you dream of a reality, instead of only dreaming about the past.

5. California Doesn’t Get to Secede
Just stop, it’s stupid.

NSF GRFP 2016-2017

For a couple of years now, I have had a website with my thoughts on the National Science Foundation Graduate Research Fellowship (NSF GRFP) and examples of successful essays. The popularity of the site in the past few years has grown well beyond what I expected, so this year I’m going to use this blog to try out a few new things.


Questions from You

I end up getting lots of emails asking for advice. While sometimes the advice really does merit an individualized response, many of the questions are applicable to everyone. So in the interest of efficiently answering questions, here is my plan this year.

  1. Before asking me, make sure you’ve read my advice, checked out the NSF GRFP FAQ, skimmed GradCafe, read my FAQ (next section), and checked out the comments for this blog post.
  2. I will not answer any questions about eligibility due to gaps in graduate school because I am honestly clueless on it.
  3. If you feel comfortable asking the question publicly, post it by commenting below.
  4. If you want to ask me privately, send me an email (my full name at gmail.com, include NSF GRFP Question in subject line). I will try and answer you and also work with you on a public question/answer that I can include here.



Here are some past questions I have been asked and/or questions I anticipate being asked this year.

  • My research is closely related to medicine. Am I still eligible?
    • I think the best test for this is to ask your advisor if they would apply to NSF or NIH for grants on this topic. If NSF you are definitely good, but if NIH, you will need to reframe the research to fit into NSF.
  • I am a first year graduate student. Should I apply this year or wait until my second year? (New issue this year since incoming graduate students can only apply once).
    • This is the toughest question for me since no one has had to make this choice yet. However, here is how I would personally decide. The important thing to remember is that undergrads, first year grads, and second year grads are each separately graded relative to their respective years. So you really need to decide how you currently rank relative to your peers versus how you will rank next year. If you did a bunch of undergrad research, have papers, etc, definitely apply as a first year. If you didn’t, it might pay off to wait, but only if your program lets you get right into research. If you will just be taking classes, I’m less confident your relative standing will improve. Good luck to everyone with this tough choice!


Requests for Essay Reading

Unfortunately, I now get more requests to read essays than I can reasonably accomplish. But I am still willing to read over a few and here is how I will decide on the essays to read.

  1. If you are in San Diego, and you think I am a better fit for you than the other local people on the experienced resource list, send me an email with the subject NSF GRFP Experienced Resource List.
  2. If you are not in San Diego, first check out the experienced resource list and also ask around your school for other resources.
  3. If you can’t find anyone to read your essays, fill out this form. I will semi-randomly select essays to read.

What do I mean by semi-randomly? Well, in the interest of supporting the NSF GRFP’s goal of increasing the diversity of graduate school, I will give priority to undergrads who are without a local person on the experienced resource list and/or are from underrepresented groups. The NSF GRFP specifically “encourages women, members of underrepresented minority groups, persons with disabilities, and veterans to apply”, and I am willing to extremely loosely define minority group by race, ethnicity, sexual orientation, family socio-economic status, geography, colleges that traditionally send few students to graduate school, etc. The form is fill in the blank, so feel free to justify your inclusion in any other underrepresented group that I did not explicitly list.

I’ll then take the prioritized list and make some random selection. The number of people I select this way will depend on the number of local people I end up advising, but I will definitely read at least 2 non-local applications.


Here is my timeline for essay reading:

  • Sept 16th – Random drawing number 1
  • Sept 30th (extended to Oct 5th) – Random drawing number 2 (I’ll include everyone again, so early birds get double the chances of being selected)
  • Oct 21st – Last day I will help people (sorry I’m traveling near the deadline)