A couple of times recently I've had interviewers ask me to quickly sketch out the design for a web scale architecture. Of course, being a scientist by training the first thing I did was to work out what sort of system requirements we're looking at, to see what sort of scale we might really need.
In both cases even my initial extreme estimate, just rounding everything up, didn't indicate a scaling problem. Most sites aren't Facebook or Google, they see limited use by a fraction of the population. The point here is that while web scale sites exist, they are the exception rather than the rule, so why does everyone think they have to go to the complexity and expense of architecting a "web scale" solution?
To set this into perspective, assume you want to track everyone in the UK's viewing habits. If everyone watches 10 programmes per day and channel-hops 3 times for each programme, and there are 30 million viewers, then that's 1 billion data points per day or 10,000 per second. Each is small, so at 100 bytes each that's 100GB/day or ~10megabits/s. So, you're still talking single server. You can hold a week's data in RAM, a year on disk.
And most businesses don't need anything like that level of traffic.
Part of the problem is that most implementations are horrifically inefficient. The site itself may be inefficient - you know the ones that have hundreds of assets on each page, multi-megabyte page weight, widgets all over, that take forever to load and look awful - if their customers bother waiting long enough. The software implementation behind the site is almost certainly inefficient (and is probably trying to do lots of stupid things it shouldn't as well).
Another trend fueling this is the "army of squirrels" approach to architecture. Rather than design an architecture that is correctly sized to do what you need, it seems all too common to simply splatter everything across a lot of little boxes. (Perhaps because the application is so badly designed it has cripplingly limited scalability so you need to run many instances.) Of course, all you've done here is simply multiplied your scaling problem, not solved it.
As an example, see this article Scalability! But at what COST? I especially like the quote that Big data systems may scale well, but this can often be just because they introduce a lot of overhead.
Don't underestimate the psychological driver, either. A lot of people want to be seen as operating at "web scale" or with "Big Data", either to make themselves or their company look good, to pad their own CV, or to appeal to unsophisticated potential employees.
There are problems that truly require web scale, but for the rest it's an ego trip combined with inefficient applications on badly designed architectures.