How can we guarantee that every action in every state is chosen? And why is that so important? We will first answer the latter question, and then show how we can (at least in theory) explore the environment to reach every possible state.