In the examples you've given, it mainly comes down to four things: depth-of-field, lighting, dynamic range and framing.
Depth-of-field is how much of the scene, front to back, is in sharp focus. In the TV show the background behind the characters is quite blurred (shallow DoF). This is down to the size of the image sensor or frame of film, and lens selection. A frame of 35mm film has ten times or more the surface area of your average consumer camcorder CCD, and to put it too simply, this allows greater control over depth-of-field. The kind of film cameras used to shoot TV shows also let the operator adjust lots of things (like aperture).
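To put rough numbers on the sensor-size point, here is a minimal sketch using the standard thin-lens depth-of-field formulas. The frame widths (24.9mm for Super 35 film, 4.8mm for a 1/3-inch CCD), the 50mm lens at f/2.8, the 0.025mm circle of confusion and the 3m subject distance are all assumptions picked for illustration, not measurements from any particular camera. The camcorder lens is scaled down so both cameras frame the same shot.

```python
def depth_of_field_mm(focal_mm, f_number, coc_mm, subject_mm):
    """Return the (near, far) limits of acceptable focus in mm (thin-lens model)."""
    spread = f_number * coc_mm * (subject_mm - focal_mm)
    near = subject_mm * focal_mm**2 / (focal_mm**2 + spread)
    if spread >= focal_mm**2:
        far = float("inf")  # focused at or beyond the hyperfocal distance
    else:
        far = subject_mm * focal_mm**2 / (focal_mm**2 - spread)
    return near, far

# Assumed frame widths, for illustration only.
FILM_WIDTH_MM = 24.9  # Super 35 film frame
CCD_WIDTH_MM = 4.8    # 1/3-inch consumer CCD
scale = CCD_WIDTH_MM / FILM_WIDTH_MM

# Same framing, same f-number, subject 3 m away. The focal length and the
# circle of confusion are both scaled with the frame width.
film_near, film_far = depth_of_field_mm(50.0, 2.8, 0.025, 3000.0)
ccd_near, ccd_far = depth_of_field_mm(50.0 * scale, 2.8, 0.025 * scale, 3000.0)

film_dof = film_far - film_near
ccd_dof = ccd_far - ccd_near

print(f"35mm film: sharp from {film_near / 1000:.2f} m to {film_far / 1000:.2f} m")
print(f"camcorder: sharp from {ccd_near / 1000:.2f} m to {ccd_far / 1000:.2f} m")
```

Under these assumptions the camcorder keeps several times more of the scene in focus, which is why its backgrounds look sharp and "busy" where the film camera's look soft.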
Lighting can create definition in faces and objects, and can be used to control colours. You'd (probably) be surprised by how many lights and how much power are used on film and TV sets.
With better (wider) dynamic range you can get more detail in shadows, where a consumer camcorder would render them as a flat colour. It also means the sky will remain blue and not just become a big mass of pure white. In the real world, dynamic range tends to correlate with the price of the equipment.
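Dynamic range is usually counted in stops, where each stop is a doubling of light. The sketch below is only an illustration: the scene values and the two capture ranges are made up, and real sensors roll off more gradually than a hard clip. It shows how a narrow range flattens both ends of a scene into the same tone.

```python
import math

def dynamic_range_stops(brightest, darkest):
    """Dynamic range in stops: each stop is a doubling of light."""
    return math.log2(brightest / darkest)

def capture(scene, black_level, white_level):
    """Clamp scene values to what the camera can record (hard clip, simplified)."""
    return [min(max(value, black_level), white_level) for value in scene]

# Hypothetical relative brightness values, from deep shadow up to bright sky.
scene = [1, 4, 16, 64, 256, 1024, 4096]

narrow = capture(scene, black_level=4, white_level=256)   # a 6-stop camera
wide = capture(scene, black_level=1, white_level=4096)    # a 12-stop camera

print("scene range:", dynamic_range_stops(max(scene), min(scene)), "stops")
print("narrow capture:", narrow)
print("wide capture:  ", wide)
```

In the narrow capture the three brightest values all land on the same white level (the blown-out sky) and the two darkest share one black level (the flat shadows), while the wide capture keeps every tone distinct.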
One thing that is often missed in answers to questions like this is the positioning of the camera and actors. If you're recording someone with a handheld camcorder as they walk down the street, their facial features will be distorted because they're so close to the camera. The other part of this is framing, which is evident in any picture. Good framing is often the difference between a holiday photo you've taken looking good and looking "amateur" or just plain crap.
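The distortion from being close to the camera can be estimated with a simple pinhole model, where the size something appears is proportional to one over its distance. This is a rough sketch, and the 10cm nose-to-ear depth is an assumed figure rather than a measured one.

```python
def nose_vs_ear_magnification(camera_to_nose_m, nose_to_ear_m=0.10):
    """How much larger the nose is drawn relative to the ears (pinhole model)."""
    return (camera_to_nose_m + nose_to_ear_m) / camera_to_nose_m

for distance_m in (0.5, 1.0, 3.0, 6.0):
    ratio = nose_vs_ear_magnification(distance_m)
    print(f"{distance_m:.1f} m away: nose drawn {100 * (ratio - 1):.1f}% larger than the ears")
```

At arm's length the nose is exaggerated noticeably, while from a few metres back (with a longer lens to keep the same framing) the face flattens out to more natural proportions.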
To attempt to explain why these things are more appealing to look at would be like trying to explain why one woman is more aesthetically appealing than another. (Though "nice tits" is a pretty standard and concise answer.)