One can get much smaller than 4nm by changing the geometry of the device. Years ago I worked on HBT devices, where the effective length is the laminar thickness of the base, not the lateral width. Quantum effects stay minimal so long as the base area is large enough. The problem, of course, is that these devices are much bigger than FETs, even though they have a much higher fT.
I am not an expert here (and did not stay in a Holiday Inn last night), but the literature on this topic almost universally talks about 4nm being the ceiling (floor?) for this race. It does not offer changing geometry as a solution to this problem. As far as my limited knowledge of this particular topic goes, below about 4nm there is "bleed" where binary states are lost and/or flowing electrons jump lanes (quantum tunneling, as I understand it), either of which results in states that will make accurate computing fail.
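To put a very rough number on that lane-jumping, here is a minimal back-of-the-envelope sketch of tunneling through a barrier. It assumes an idealized rectangular barrier and a hypothetical 1 eV barrier height (a round number I picked for illustration, not the spec of any real process node); the only point is that transmission grows exponentially as the barrier thins:

```python
import math

# Textbook estimate of electron transmission through a rectangular
# potential barrier: T ~ exp(-2 * kappa * d), with decay constant
# kappa = sqrt(2 * m * phi) / hbar. The 1 eV barrier height below is
# a hypothetical round number chosen purely for illustration.
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # one electronvolt in joules

def tunneling_probability(width_nm: float, height_ev: float = 1.0) -> float:
    """Approximate probability an electron tunnels through the barrier."""
    kappa = math.sqrt(2.0 * M_E * height_ev * EV) / HBAR  # decay constant, 1/m
    return math.exp(-2.0 * kappa * width_nm * 1e-9)

for d in (10, 4, 3, 1):
    print(f"{d:>2} nm barrier: T ~ {tunneling_probability(d):.1e}")
```

With those made-up numbers, thinning the barrier from 4nm to 1nm raises the tunneling probability by roughly thirteen orders of magnitude, which is the kind of exponential "bleed" I read the literature as describing.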
My (perhaps poor) analogy would be worrying about the size of on/off light switches, so companies keep shrinking and shrinking them. What used to work with a fingertip flipping a plastic "handle" becomes too small for fingers. At some point, we're bending paper clips to insert a point into a hole, like we might do to pop a SIM card tray out of a phone. And then it gets smaller than that, and we need something thinner than a paper clip to toggle a light on and off. And then smaller. And then smaller still. Eventually, the size is reduced so far that you no longer have a reliable way to turn that light on (binary 1) or off (0) without accidentally turning other lights on and off.
Again, perhaps a very poor analogy, but the concept remains the same (as perceived by me). The literature on the topic says that size shrinks to this maximum (minimum) such that the reliability of state is in jeopardy. Anyone who knows anything about programming knows that if some rogue program flips some 0s to 1s in another program's code, that other program is probably going to crash/fail/exhibit wonky behavior. I read the literature on this topic as saying that below about 4nm this kind of thing is expected... though I thought I saw something about 3nm possibly being the max (min) instead of 4nm. I've seen nothing in support of anything below 3nm being possible.
Based on all that I've seen on this topic, I have zero expectations for 2nm, 1nm, and then fractions of 1nm being possible. For example, I don't foresee rumors of the A18 chip in the iPhone XVs being spun as using a new 0.032nm process. Instead, from what I think I read on this topic, the peak at about 3-4nm is followed by "more cores" as the one way forward. However, just as with desktops/laptops, there quickly comes a point where adding more cores peaks out too (in short, I doubt the iPhone XVs rolls out with a new 24-core A18). As with desktop/laptop cores, eventually the proposition becomes more cores vs. most of them sitting around with nothing to do; that's why, long after dual cores became common, we're still not seeing 50-core or 500-core PCs (see the sketch below).
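The standard back-of-the-envelope for that diminishing-returns effect is Amdahl's law: if only a fraction p of a workload can run in parallel, n cores give a speedup of 1 / ((1 - p) + p / n). A minimal sketch, assuming a hypothetical workload that is 90% parallelizable (a figure I'm choosing purely for illustration):

```python
# Amdahl's law: total speedup is capped by the serial fraction of the
# work, no matter how many cores you throw at the parallel part.
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for cores in (2, 4, 8, 24, 50, 500):
    s = amdahl_speedup(cores, parallel_fraction=0.90)
    print(f"{cores:>3} cores: {s:5.2f}x speedup")
```

With that made-up 90% figure, 500 cores deliver less than a 10x speedup over a single core, and going from 50 cores to 500 barely moves the needle, which is why consumer core counts flatten out.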
But one more time: I'm no expert on this topic, just trying to share what I think I make out from reading up on it. I could be entirely misunderstanding the collective sources, and/or maybe Pym particles are around the corner so that 4nm can be shrunk to 0.4nm, to be shrunk to 0.04nm, and so on (with full support of electrons perhaps shrunk to some kind of on/off quarks or similar, to be shrunk to some kind of undiscovered fractional quark (or magic) or similar).