The problems with the format centre on a few basic principles: generality, overhead, latency, random access, and timestamps.
The problem with generality is, well, that there isn't any. Ogg is supposed to be a general-purpose container format, meaning that "encoded data of any type can be encapsulated with a minimum of effort". According to Mans, Ogg doesn't fit this description.
"For every format one wishes to use with Ogg, a complex mapping must first be defined," Mans explains, "This mapping defines how to identify a codec, how to extract setup data, and even how timestamps are to be interpreted. All this is done differently for every codec. To correctly parse an Ogg stream, every such mapping ever defined must be known."
Sadly, there is no single place to get these codec mappings from. "It is simply impossible to obtain an exhaustive list of defined mappings, which makes the task of creating a complete implementation somewhat daunting," Mans explains. He also presents Matroska as a much better example of a general-purpose container format.
Ogg also has overhead problems, Mans argues. After providing a detailed explanation of the concept, he concludes: "We thus see that in an Ogg file, the packet size fields alone contribute an overhead of 1/255 or approximately 0.4%. This is a hard lower bound on the overhead, not attainable even in theory. In reality the overhead tends to be closer to 1%. Contrast this with the ISO MP4 file format, which can easily achieve an overhead of less than 0.05% with a 1 Mbps elementary stream."
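The 1/255 figure can be checked with a little arithmetic. What follows is only a minimal sketch, assuming Ogg's lacing scheme, in which a packet's size is stored as a run of one-byte values: each is 255 except the last, which is below 255. The function names are made up for illustration and do not come from Mans or from any Ogg library.

```python
def lacing_bytes(packet_size: int) -> int:
    """Number of one-byte lacing values needed to record a packet's size."""
    # A run of 255s, then one terminating value below 255
    # (that value is 0 when the size is an exact multiple of 255).
    return packet_size // 255 + 1


def lacing_overhead(packet_size: int) -> float:
    """Size-field bytes as a fraction of the payload bytes."""
    return lacing_bytes(packet_size) / packet_size


# Illustrative packet sizes only; none of these are measurements.
for size in (400, 5000, 1_000_000):
    print(f"{size:>9} bytes: {lacing_overhead(size):.3%}")
```

Because the terminating value is always present, the ratio stays strictly above 1/255 and only creeps towards it as packets grow, which is the sense in which the bound is "not attainable even in theory".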
Latency-wise, it all gets very technical again:
"On the receiving side, playback cannot commence until packets from all elementary streams are available. Hence, with two streams (audio and video) interleaved at the page level, playback is delayed by at least one page duration (two if checksums are verified)."
This problem could technically be solved, but in doing so, you'd hit the overhead problem again. "Minimum latency is clearly achieved by minimising the page duration, which in turn implies sending only one packet per page," Mans explains, "This is where the size of the page header becomes important. The header for a single-packet page is 27 + packet_size/255 bytes in size. For a 1 Mbps video stream at 25 fps this gives an overhead of approximately 1%. With a typical audio packet size of 400 bytes, the overhead becomes a staggering 7%. The average overhead for a multiplex of these two streams is 1.4%."
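The video and audio percentages in that quote can be reproduced in a few lines. This is a rough sketch, not Mans's own calculation: it assumes one segment-table byte per lacing value (packet_size // 255 + 1 of them) on top of the 27 fixed header bytes, and that each packet fits in a single page. All names are hypothetical.

```python
PAGE_HEADER_FIXED = 27   # fixed part of an Ogg page header, in bytes
MAX_SEGMENTS = 255       # a page's segment table holds at most 255 entries


def single_packet_page_header(packet_size: int) -> int:
    """Header bytes for a page carrying exactly one packet."""
    segments = packet_size // 255 + 1  # one table byte per lacing value
    if segments > MAX_SEGMENTS:
        raise ValueError("packet does not fit in a single page")
    return PAGE_HEADER_FIXED + segments


def page_overhead(packet_size: int) -> float:
    """Header bytes as a fraction of payload bytes."""
    return single_packet_page_header(packet_size) / packet_size


video_packet = 1_000_000 // 8 // 25  # 1 Mbps at 25 fps: 5000 bytes per frame
audio_packet = 400                   # the "typical" audio packet from the quote

print(f"video: {page_overhead(video_packet):.2%}")
print(f"audio: {page_overhead(audio_packet):.2%}")
```

Under these assumptions the video stream comes out at about 0.94% and the audio stream at 7.25%, in line with the quoted "approximately 1%" and "7%". The 1.4% multiplex average depends on the audio bitrate, which the quote does not give, so it is not reproduced here.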
Other formats, like ISO MPEG-PS and Microsoft's ASF, have far less overhead, and Mans suggests bringing Ogg in line with those.
Random access is also a problem with Ogg. Random access means being able to seek directly to any position in the file, which is quite important for a general-purpose container format. According to Mans, the method Ogg uses to provide random access is crude, and the problem gets worse on slower media such as optical discs or network connections.
Another important aspect of video files is that the audio and video streams must run in sync, or else it starts to get annoying really fast. "By the time Ogg was invented, solutions to this problem were long since explored and well-understood," Mans writes, "The key to proper synchronisation lies in tagging elementary stream packets with timestamps, packets carrying the same timestamp intended for simultaneous presentation. The concept is as simple as it seems, so it is astonishing to see the amount of complexity with which the Ogg designers managed to imbue it."
Mans concludes that Ogg is simply a bad format, patent-free or not. "[Being patent-free] still does not alter the fact that Ogg is a bad format," Mans claims, "Being free from patents does not magically make Ogg a good choice as file format. If all the standard formats are indeed covered by patents, the only proper solution is to design a new, good format which is not, this time hopefully avoiding the old mistakes."