First, if you're trying to get up to speed with it, I suggest you start with the original 802.11-1999 spec. Everything else is built on top of this, and the current 1200 pages or so of the (incomplete!) rollup spec are due to the numerous amendments, which generally just add complexity and confusion when you don't know what the core protocol is about.
Second, you have to remember that 802.11 is a *radio* protocol, not a wired protocol, and consequently the fundamental Physical (PHY) layer is completely different from a wired one. That said, your two basic questions are actually due to the 802.2/3 heritage that 802.11 builds on. But first, your bit about antennae.
Radio waves aren't constrained to nice neat cables. They radiate outwards in all directions. They bounce off of things, and these bounced signals may make it to the receiver too. Since these bounced signals have to travel further, they arrive later than the original signal, causing the ghosts some of you may remember from the analog TV days. This phenomenon is called multipath interference, and there's a ton of complexity in the receivers to detect and deal with it.
802.11n gets its speedups over 802.11g from three things: better modulation and coding (65Mbps vs 54Mbps); wider bandwidth (40MHz channels vs 20MHz, roughly doubling throughput to 135Mbps); and finally support for multiple simultaneous streams (which takes us up to 600Mbps with four streams and the short guard interval, but nothing I've seen supports more than 300Mbps with two streams). The problem is that you can't transmit multiple streams from the same antenna; they'll interfere with each other. That same multipath mess that caused problems before is instead deliberately harnessed -- but to do that, you need multiple antennae, one for each spatial stream. Similarly, you'll need multiple antennae on the receiver, at least one per stream.
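If you want to see where those headline numbers come from, here's a rough back-of-the-envelope sketch. It assumes the usual OFDM parameters (48 data subcarriers for 802.11g; 52 and 108 for 802.11n at 20MHz and 40MHz; 64-QAM at 6 bits per subcarrier; a 4000ns symbol, or 3600ns with the short guard interval). The function name and argument names are made up for illustration. Note that a 40MHz channel at the normal guard interval comes out at 135Mbps, and that 300Mbps and 600Mbps both need the short guard interval.

```python
from fractions import Fraction

def phy_rate_mbps(data_subcarriers, bits_per_subcarrier, coding_rate, streams, symbol_ns):
    """Peak PHY rate in Mbps (hypothetical helper, not from any library).

    Data bits carried by one OFDM symbol, divided by the symbol duration.
    Bits per nanosecond times 1000 gives bits per microsecond, i.e. Mbps.
    """
    bits_per_symbol = data_subcarriers * bits_per_subcarrier * coding_rate * streams
    return float(Fraction(bits_per_symbol) * 1000 / symbol_ns)

QAM64 = 6  # bits per subcarrier with 64-QAM

rates = {
    "11g, 20MHz":                  phy_rate_mbps(48,  QAM64, Fraction(3, 4), 1, 4000),  # 54.0
    "11n, 20MHz, 1 stream":        phy_rate_mbps(52,  QAM64, Fraction(5, 6), 1, 4000),  # 65.0
    "11n, 40MHz, 1 stream":        phy_rate_mbps(108, QAM64, Fraction(5, 6), 1, 4000),  # 135.0
    "11n, 40MHz, 1 stream, short": phy_rate_mbps(108, QAM64, Fraction(5, 6), 1, 3600),  # 150.0
    "11n, 40MHz, 2 streams, short": phy_rate_mbps(108, QAM64, Fraction(5, 6), 2, 3600), # 300.0
    "11n, 40MHz, 4 streams, short": phy_rate_mbps(108, QAM64, Fraction(5, 6), 4, 3600), # 600.0
}

for label, mbps in rates.items():
    print(f"{label:30s} {mbps:6.1f} Mbps")
```

These are peak PHY rates only; real throughput is a good deal lower once you pay for preambles, ACKs, and contention.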
Meanwhile, wired Ethernet is not considered "reliable", but compared to wireless it is bulletproof. Wired Ethernet can detect collisions as they happen because every transmitter shares the same wire (CSMA/CD), but wireless transmitters have no way of knowing whether the receiver was being locally interfered with. This is the reason for adding positive acknowledgement and retransmission at such a low level.
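The idea boils down to "send, wait for an ACK, resend if none shows up". Here's a minimal sketch of that loop. It is a toy, not the real MAC: there's no backoff or timing, the retry limit of 7 is just a value picked for this example, and `send_with_retries` and `lossy_channel` are hypothetical names.

```python
def send_with_retries(frame, channel, retry_limit=7):
    """Transmit `frame` until it is ACKed or `retry_limit` attempts are used up.

    `channel(frame, attempt)` returns True if an ACK came back for that attempt.
    Returns the number of attempts it took, or None if the sender gave up.
    """
    for attempt in range(1, retry_limit + 1):
        if channel(frame, attempt):
            return attempt  # positive acknowledgement received
        # No ACK: the sender can't tell whether the frame or the ACK was lost,
        # so all it can do is try again.
    return None

def lossy_channel(lost_attempts):
    """Hypothetical channel that loses the frame (or its ACK) on the given attempts."""
    def channel(frame, attempt):
        return attempt not in lost_attempts
    return channel

print(send_with_retries(b"hello", lossy_channel(set())))        # clean channel
print(send_with_retries(b"hello", lossy_channel({1, 2})))       # first two attempts lost
print(send_with_retries(b"hello", lossy_channel({1, 2, 3}), 3)) # gives up
```

The point is that the sender never learns *why* the ACK didn't arrive; the missing ACK is the only signal it gets.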
Similarly, stations may be (and often are) highly mobile, and may drop off of a network at any time. If the station connects to a different access point, how is the rest of the network to know that it's moved? By making association an explicit action, the AP knows to send a notification to the rest of the network to update their MAC address tables. Which brings us to "why can't we join networks simultaneously?" Fundamentally, it's because each radio only has one MAC address. If the station supported using multiple MAC addresses, then it could join multiple networks. There are other factors in play (mainly synchronization/timing; a STA is slaved to the AP's clocks), but that's one of the big ones. Oh, and the 802.11 spec can't just assume everyone's using IP, and there are 802.11 chipsets that support multiple MAC addresses.
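To picture why explicit association helps, here's a toy model of a wired switch with two APs hanging off it. Everything in it (the `Bridge` and `AccessPoint` classes, the port numbers, the MAC string) is invented for illustration, and it doesn't model how the old AP finds out the station has left.

```python
class Bridge:
    """Toy wired switch: remembers which port each MAC address was last seen on."""

    def __init__(self):
        self.mac_table = {}

    def learn(self, mac, port):
        self.mac_table[mac] = port

    def port_for(self, mac):
        # None means "unknown address", which a real switch would flood.
        return self.mac_table.get(mac)

class AccessPoint:
    """Toy AP attached to one port of the bridge."""

    def __init__(self, bridge, port):
        self.bridge = bridge
        self.port = port
        self.stations = set()

    def associate(self, mac):
        self.stations.add(mac)
        # Association is explicit, so the AP knows exactly when to tell the
        # wired side that this MAC address now lives behind its port.
        self.bridge.learn(mac, self.port)

bridge = Bridge()
ap1 = AccessPoint(bridge, port=1)
ap2 = AccessPoint(bridge, port=2)

ap1.associate("00:11:22:33:44:55")
print(bridge.port_for("00:11:22:33:44:55"))  # station is behind ap1's port

ap2.associate("00:11:22:33:44:55")           # station roams to ap2
print(bridge.port_for("00:11:22:33:44:55"))  # table now points at ap2's port
```

Without the explicit step, the switch would keep forwarding to the old port until the station happened to send something through the new AP.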
Disassociate messages are there to explicitly tell the AP to free up the resources that the STA is using. It's not strictly necessary, but it is a highly useful optimization when you consider the bigger picture of multiple APs servicing the same logical network, and that a single AP can only handle so many STAs before they interfere themselves into oblivion or simply run out of resources. (Anyone who's been to tradeshows with public wifi has seen this for themselves.) Also keep in mind that the AP can send out disassociations too, either to force the STA to hand off to a different AP or because the AP has to go away for some reason (such as switching channels due to radar interference). Explicit notifications are always preferable to implicit ones, especially on a highly unreliable medium.
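Here's a small sketch of the resource angle, assuming a hypothetical AP with a fixed number of station slots. The `TinyAP` class, its methods, and the two-slot limit are all made up for this example; real APs have their own limits and timeouts.

```python
class TinyAP:
    """Toy AP with a fixed number of association slots."""

    def __init__(self, max_stations):
        self.max_stations = max_stations
        self.stations = set()

    def associate(self, mac):
        """Return True if the station is (now) associated, False if the AP is full."""
        if mac in self.stations:
            return True
        if len(self.stations) >= self.max_stations:
            return False
        self.stations.add(mac)
        return True

    def disassociate(self, mac):
        """STA-initiated: frees the slot immediately instead of waiting for a timeout."""
        self.stations.discard(mac)

    def shut_down(self):
        """AP-initiated: returns the stations that were told to go elsewhere."""
        kicked = sorted(self.stations)
        self.stations.clear()
        return kicked

ap = TinyAP(max_stations=2)
print(ap.associate("sta-a"), ap.associate("sta-b"))  # both fit
print(ap.associate("sta-c"))                         # AP is full
ap.disassociate("sta-a")                             # explicit goodbye frees a slot
print(ap.associate("sta-c"))                         # now there is room
print(ap.shut_down())                                # AP tells everyone to move on
```

If "sta-a" had simply wandered off, the slot would stay occupied until the AP decided on its own that the station was gone.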
QoS stuff is (unfortunately) here to stay, and provides tangible throughput improvements by adding mechanisms to reduce collisions and minimize round trips (and their latencies, the real throughput killer), something that "over-provisioning" simply can't deal with -- remember, you can always add more wires bonded together ad nauseam, but you can't just add more RF spectrum to achieve the same thing. Again, radio, being a shared medium, is completely different. The nodes all have to be smart enough not to step on each other's toes, or the whole house of cards collapses.
802.11 as it stands now is actually pretty well designed; it's complicated because it is trying to solve some very complicated problems. Trying to grok the whole thing at once is migraine-inducing, but if you start from the original 802.11-1999 spec, work your way through the amendments chronologically, and keep in mind its Ethernet heritage (and the fundamental differences RF brings over a hardline connection), it'll make more sense more quickly.
Anyway, I'll shut back up now..