An important characteristic of policy gradient algorithms is that they are on-policy. Their on-policy nature follows from formula (6.4), which depends on the current policy: the expectation is taken over trajectories generated by that same policy. Thus, unlike off-policy algorithms such as DQN, on-policy methods cannot reuse experience collected by older policies.
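As a reminder, the policy gradient can be written in its standard form (the exact notation of (6.4) may differ slightly):

$$
\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\left[\sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, G_t\right]
$$

The expectation is over trajectories $\tau$ sampled from $\pi_\theta$ itself. Data generated by an older policy follows a different distribution, so using it would bias the gradient estimate.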
This means that all the experience collected with a given policy must be discarded once the policy changes. As a side effect, policy gradient algorithms are less sample efficient: they need more experience to reach the same performance as their off-policy counterparts. Moreover, they tend to generalize slightly worse.
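To make the contrast concrete, here is a minimal sketch of the two training regimes. This is not the book's code; collect_trajectories, collect_step, and update_policy are hypothetical placeholders standing in for environment interaction and the gradient update.

```python
import random
from collections import deque

def train_on_policy(env, policy, iterations, collect_trajectories, update_policy):
    """Policy gradient style: each batch is sampled from the current policy,
    used for exactly one update, and then discarded."""
    for _ in range(iterations):
        batch = collect_trajectories(env, policy)  # fresh, on-policy data
        update_policy(policy, batch)               # gradient step, as in (6.4)
        # 'batch' is now stale: it was generated by the previous policy,
        # so it cannot be reused for the next update.

def train_off_policy(env, policy, iterations, collect_step, update_policy):
    """DQN style: a replay buffer keeps old transitions, which remain valid
    training data even after the policy has changed."""
    buffer = deque(maxlen=100_000)
    for _ in range(iterations):
        buffer.append(collect_step(env, policy))
        minibatch = random.sample(list(buffer), min(32, len(buffer)))
        update_policy(policy, minibatch)  # old experience is reused freely
```

The key difference is the lifetime of the data: the on-policy loop throws its batch away after a single update, while the off-policy loop keeps sampling from the same growing buffer, which is why the latter extracts more learning from each interaction with the environment.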