I've always found it difficult to get things clear with XHTML; the specs are not written for humans at all...
However, it says here (msg 3090864) that XHTML does not allow block-level content inside <p> tags either.
Diving into XHTML 1.1, I reached this. Note that it says:
Code:
<xs:group name="xhtml.p.content">
  <xs:sequence>
    <xs:group ref="xhtml.Inline.mix" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:group>
which I take to mean that a <p> element can contain any number of
inline elements (while a <div> can contain "flow", which is described
here as block or inline elements).
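
For what it's worth, here is a rough Python sketch of that reading: it parses a fragment and reports any block-level element sitting directly inside a <p>. The function name block_children_of_p and the BLOCK_ELEMENTS set are my own, and the set is a hand-picked, partial list rather than something taken from the schema, so treat it as an illustration and not as a validator.

```python
import xml.etree.ElementTree as ET

# Assumption: a partial, hand-picked list of block-level XHTML element names.
# The authoritative content models live in the XHTML 1.1 schema modules.
BLOCK_ELEMENTS = {
    "div", "p", "ul", "ol", "dl", "table", "blockquote", "pre",
    "h1", "h2", "h3", "h4", "h5", "h6", "form", "hr", "address",
}


def block_children_of_p(xhtml_fragment):
    """Return the tag names of block-level elements found directly inside any <p>."""
    root = ET.fromstring(xhtml_fragment)
    problems = []
    for p in root.iter("p"):
        # Only direct children are checked; the fragments here use no namespace,
        # so tag names compare as plain strings.
        for child in p:
            if child.tag in BLOCK_ELEMENTS:
                problems.append(child.tag)
    return problems


ok = "<div><p>Some <em>inline</em> text.</p></div>"
bad = "<div><p>Text <div>a block</div> more text.</p></div>"

print(block_children_of_p(ok))
print(block_children_of_p(bad))
```

The first fragment should come back clean, while the second should flag the <div> nested in the <p>. A real check would validate against the schema itself; this only mirrors the "inline only inside <p>" rule.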