A couple of similar podcast/videocast shows have discussed how they do it. I think they run the show over a Skype call (or iMessage or Zoom or whatever) with whatever crappy mic/webcam they have, while each participant also records locally with a studio-quality mic and camera and saves those files.
Only after the show do they have each participant send the video and audio files to someone who puts it all together in post.
In other words, I don't think the original call has this kind of quality. What you end up seeing is just a bunch of high-quality local recordings stitched together.
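
For what it's worth, here's a rough sketch of what the "stitching together" step could look like for the audio. I don't know what tools those shows actually use, so treat everything here as an assumption: the function names `best_lag` and `mix` are made up for illustration, tracks are plain lists of samples at the same sample rate, and the low-quality call recording is used only as a timing reference to line up each person's local track.

```python
def best_lag(reference, track, max_lag):
    """Return the shift (in samples) at which `track` best matches `reference`.

    A positive result means `track` starts that many samples into `reference`.
    """
    def score(lag):
        # Cross-correlation at this lag: multiply overlapping samples and sum.
        total = 0.0
        for i, sample in enumerate(track):
            j = i + lag
            if 0 <= j < len(reference):
                total += reference[j] * sample
        return total

    # Try every shift in [-max_lag, max_lag] and keep the best-scoring one.
    return max(range(-max_lag, max_lag + 1), key=score)


def mix(tracks_with_lags, length):
    """Sum the tracks onto one timeline, placing each at its own lag."""
    out = [0.0] * length
    for track, lag in tracks_with_lags:
        for i, sample in enumerate(track):
            j = i + lag
            if 0 <= j < length:
                out[j] += sample
    return out


# Toy data: the call recording (reference) and one participant's local track.
call_audio = [0, 0, 0, 1, -1, 2, 0, 0]
local_track = [1, -1, 2]

lag = best_lag(call_audio, local_track, max_lag=5)
final = mix([(local_track, lag)], length=len(call_audio))
```

In practice an editor would do this in an audio/video editing tool rather than by hand, and a brute-force search like this would be far too slow for real recordings; the point is only that the local files get lined up against a shared timeline and then combined.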