The point wouldn't be Face ID (other than identifying the people you meet); it would be 3-D sensing for AR. The better the depth perception, the more convincingly computer-generated objects can be integrated into a 3-D environment.
The thing is, I don't think the combination of dot projector and 2-D camera (which is what we have with Face ID) would be as effective on the back camera, for the same reason that a camera flash becomes useless when the subject is too far from the camera - the light spreads out and dims over distance. I think old-fashioned stereoscopic cameras will be more useful - it's likely easier to extract depth/position info by comparing the images from two lenses (the principle behind human depth perception) than it is to infer that depth from a single 2-D image.
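
To put rough numbers on both halves of that argument, here's a quick Python sketch. It assumes the projected light falls off with the inverse square of distance (an idealized point-source model, not a measurement of any real dot projector) and uses the textbook pinhole stereo relation, depth = focal length × baseline / disparity. The function names and the example focal length, baseline, and disparity values are made up for illustration and are not actual iPhone specs.

```python
def relative_brightness(distance_m, reference_m=0.3):
    """Brightness at distance_m relative to reference_m.

    Assumes idealized inverse-square falloff from a point source.
    """
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return (reference_m / distance_m) ** 2


def stereo_depth_m(focal_px, baseline_m, disparity_px):
    """Depth from two lenses using the pinhole stereo model.

    focal_px:     focal length in pixels (hypothetical value below)
    baseline_m:   distance between the two lenses in meters
    disparity_px: horizontal shift of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Flash-style falloff: arm's length (0.3 m) vs. across the room (3 m).
    print(relative_brightness(3.0))

    # Hypothetical dual-lens setup: 2800 px focal length, 12 mm baseline.
    for disparity in (112.0, 11.2):
        print(disparity, stereo_depth_m(2800.0, 0.012, disparity))
```

Under these assumptions, a subject 3 m away gets about 1% of the light that a subject at 0.3 m does, while the two-lens setup only needs to measure how far a point shifts between the two images and doesn't depend on any projected light reaching the subject.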
So, my prediction is that we'll see dual-lens back cameras (or more) on every new iPhone model going forward.