Reinforcement learning is often described in an MDP or POMDP framework. By "framework" I mean a set of abstract concepts that can be used to describe a large number of different specific problems or games at once. Frameworks are useful because they let you reason about many different specific things at the same time. In the (PO)MDP framework, these concepts include things like "reward", "state", and "transition".
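To make those concepts concrete, here is the standard tuple formulation (a minimal sketch in the usual textbook notation, not anything specific to your problem):

$$\text{MDP}: \; \langle S, A, T, R, \gamma \rangle, \qquad T(s' \mid s, a), \quad R(s, a, s')$$

$$\text{POMDP}: \; \langle S, A, T, R, \Omega, O, \gamma \rangle, \qquad O(o \mid s', a)$$

where $S$ is the set of states, $A$ the set of actions, $T$ the transition function, $R$ the reward function, $\gamma$ the discount factor, and (for POMDPs) $\Omega$ the set of observations with observation function $O$.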
Driving a car is an example of a task that can be abstracted as a POMDP: the state consists of the relevant state of the world (e.g. the road ahead, nearby cars, pedestrians and other objects, the car itself and its mechanical parts), the transition function is simply the laws of physics, and the reward is a bit subjective, but you can imagine being rewarded for getting to your destination and penalized for crashing into things.
A robot trying to navigate a maze can also be abstracted as a POMDP: the state consists of the location of the robot in the maze, the transition is again governed by the laws of physics that determine how the robot can physically move, and the reward is presumably positive if the robot solves the maze.
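As a rough illustration of how those three pieces map onto code, here is a toy gridworld version of the maze (the maze layout, goal cell, and reward values are all made up for the example):

```python
# Toy maze: 0 = free cell, 1 = wall. The state is the robot's (row, col).
MAZE = [
    [0, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
]
GOAL = (2, 2)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def transition(state, action):
    """Deterministic 'laws of physics': move one cell unless a wall or the
    edge of the maze blocks the way, in which case the robot stays put."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0]) \
            and MAZE[new_row][new_col] == 0:
        return (new_row, new_col)
    return state

def reward(state, action, next_state):
    """+1 for reaching the goal cell, 0 otherwise."""
    return 1.0 if next_state == GOAL else 0.0
```

As written this toy version is fully observable, so it is really an MDP; to make it a POMDP you would add an observation function, e.g. the robot only sees which of its four neighbouring cells are walls rather than its exact coordinates.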
So returning to your questions:
how to generate the next state?
The next state comes from the transition function of your (PO)MDP. Exactly what that transition function is depends on what your (PO)MDP is modeling; it may be physical laws, or the rules of a board game, etc. If it's a board game, you can just use the rules of the game to determine what happens next.
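For example, if the transition probabilities of a small (PO)MDP are known explicitly, generating the next state is just sampling from that distribution (the states, action, and probabilities below are hypothetical):

```python
import random

# Hypothetical transition table: P(s' | s, a) stored as nested dicts.
P = {
    ("s0", "a"): {"s0": 0.2, "s1": 0.8},
    ("s1", "a"): {"s1": 1.0},
}

def next_state(state, action):
    """Sample s' from P(. | s, a)."""
    dist = P[(state, action)]
    states, probs = zip(*dist.items())
    return random.choices(states, weights=probs, k=1)[0]

# next_state("s0", "a") returns "s1" roughly 80% of the time.
```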
And for the reward r(s,a,s′), in the algorithms, why isn't it an input function
In order for the (PO)MDP framework to be able to model a large number of different games and problems, the abstract reward function is often formulated as a random variable. Maybe you're playing a game where you roll a die and receive the resulting number of dollars (aka reward). If MDPs could only have deterministic rewards, it would be difficult to fit this type of game into the framework. So, in an effort to make the framework as general as possible, rewards are allowed to be stochastic.
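Continuing the dice example, a stochastic reward function just returns a sample; a deterministic reward is then simply the special case where the same (s, a, s′) always produces the same number (the state and action arguments here are placeholders):

```python
import random

def reward(state, action, next_state):
    """Roll a six-sided die and receive that many dollars as reward."""
    return random.randint(1, 6)
```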