Linked by Thom Holwerda on Fri 8th Dec 2006 20:54 UTC
Microsoft has hit back at critics, including IBM, which voted against approving the company's Office OpenXML format as an Ecma standard, claiming it is nothing more than a vendor-dictated specification that documents proprietary products via XML. Ecma International announced the approval of the new standard Dec. 6 following a meeting of its general assembly and said it will begin the fast track process for adoption of the Office OpenXML formats as an ISO international standard in January 2007.
Thread beginning with comment 189942
dylansmrjones Member since:
2005-10-02

A mail is text too, but it's still capable of embedding images. Same technique for documents. The question is whether or not the entire document is embedded as a binary blob. For ODF the answer is no. OpenXML is nothing but a wrapper around the old binary formats, as can be seen when opening .docx files on other platforms. It's not at all XML.

Somehow you think one cannot embed images in human-readable documents using ASCII (well, UTF-8 or another encoding in reality), but that merely proves you know next to nothing.

Score: 5

n4cer Member since:
2005-07-06

A mail is text too, but it's still capable of embedding images. Same technique for documents. The question is whether or not the entire document is embedded as a binary blob. For ODF the answer is no. OpenXML is nothing but a wrapper around the old binary formats, as can be seen when opening .docx files on other platforms. It's not at all XML.

With that statement it's clear that you've never actually looked at a .docx file or any other file format covered under the Open XML standard. Neither images nor code is embedded as a blob in Open XML. XML element(s) are created and contain a reference to the binary object which is stored as-is in the package, allowing you to traverse the package manually or in an automated fashion to remove or otherwise manipulate the object(s) separately from the document. It is the Office 2003 XML formats which Base64 encoded binary objects inline with the document. This method was dropped for the flexibility mentioned above. Open XML is totally different from the 2003 formats. Neither is a wrapper around the binary formats. Look at the spec, syntax, and document samples freely available online instead of continuing to make false statements.
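
For anyone who wants to try the "traverse the package" part, here is a rough Python sketch. It does not use a real Office file: `build_toy_package` makes a hypothetical stand-in ZIP with one XML part, one relationships part, and one fake image, and the element names inside `document.xml` are invented for illustration. The relationships namespace and the `word/_rels/document.xml.rels` location are assumed to match the published Open Packaging Conventions; check the spec before relying on them.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace for package relationships (verify against the spec).
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def build_toy_package() -> bytes:
    """Build a tiny, hypothetical .docx-like ZIP package in memory."""
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        '<Relationship Id="rId1" Type="image" Target="media/image1.png"/>'
        "</Relationships>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The document part only *refers* to the image by relationship id.
        zf.writestr("word/document.xml", "<document><pic ref='rId1'/></document>")
        zf.writestr("word/_rels/document.xml.rels", rels)
        # The binary object is stored as-is, as its own part.
        zf.writestr("word/media/image1.png", b"\x89PNG fake image bytes")
    return buf.getvalue()


def referenced_targets(package: bytes) -> dict[str, str]:
    """Map relationship Id -> part name for word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("word/_rels/document.xml.rels"))
    return {
        rel.get("Id"): "word/" + rel.get("Target")
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
    }


def strip_parts(package: bytes, names: set[str]) -> bytes:
    """Copy the package, leaving out the named parts."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            if info.filename not in names:
                dst.writestr(info, src.read(info.filename))
    return out.getvalue()


package = build_toy_package()
targets = referenced_targets(package)
stripped = strip_parts(package, set(targets.values()))
```

Note that this sketch leaves the relationship entry dangling after the image is dropped; a real tool would presumably rewrite the `.rels` part and the referring element as well.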

Edited 2006-12-09 05:54

Score: 3

rajj Member since:
2005-07-06

You mean the trivial samples with little formatting and a jpeg?

OOXML has lots of places in the spec where it is little more than fields from a C struct with XML tags around them. Oh gee whiz, that XML around those hex values sure is helpful! This is even more so since the tags themselves are unreadable. OOXML is a poster child for premature optimization. Leave it to MS to miss the point of XML.

Score: 3

dylansmrjones Member since:
2005-10-02

Try opening a .docx file. It isn't exactly XML in human-readable form. And that's the problem.

EDIT:

It is the Office 2003 XML formats which Base64 encoded binary objects inline with the document. <-- That would make it a wrapper around the binary objects. You are contradicting yourself. Either the data isn't represented in binary form in any way, or the binary data will be wrapped one way or the other. Make a decision.
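
To make the "inline Base64" case in this exchange concrete, a minimal Python sketch follows. The element name `binData` and the payload are placeholders chosen for illustration, not taken from any Office schema. It only shows the general technique: binary bytes are turned into Base64 text and carried inside an XML element, so they can be decoded back unchanged, at the cost of four text characters for every three bytes.

```python
import base64
import xml.etree.ElementTree as ET


def embed_inline(data: bytes) -> str:
    """Wrap binary data as Base64 text inside a (hypothetical) XML element."""
    elem = ET.Element("binData")  # placeholder element name
    elem.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def extract_inline(xml_text: str) -> bytes:
    """Recover the original bytes from the element's Base64 text."""
    return base64.b64decode(ET.fromstring(xml_text).text)


# 300 arbitrary bytes standing in for an image or other binary object.
payload = bytes(range(256)) + bytes(44)
xml_text = embed_inline(payload)
restored = extract_inline(xml_text)
```

Either way the bytes are still binary data; the difference discussed above is only whether they travel inside the XML text or sit next to it as a separate part of the package.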

Edited 2006-12-09 07:27

Score: 3

NotParker Member since:
2006-06-01

A mail is text too, but it's still capable of embedding images. Same technique for documents. The question is whether or not the entire document is embedded as a binary blob. For ODF the answer is no. OpenXML is nothing but a wrapper around the old binary formats, as can be seen when opening .docx files on other platforms. It's not at all XML.

That's totally false.

What are you talking about?

I've got Office 2007. I created a word doc called smit.docx.

I rename smit.docx as smit.zip.

I unzip it.

Inside are several folders. One of those is word.

Inside the word folder are 5 .xml files representing the document all in plain text.


document.xml
fonttable.xml
settings.xml
styles.xml
websettings.xml

And there is a folder for the theme.

Everything is xml.

Didn't you know that a .docx file is a zipped collection of xml files?
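
The rename-and-unzip steps above can also be done in a few lines of Python, which makes the "is everything XML?" question checkable. This is a sketch under assumptions: `classify_parts` is a hypothetical helper, and the demo runs on a stand-in ZIP that reuses the five file names listed above with placeholder content, not on a real Office 2007 file. Pass the path of an actual .docx to check one for real.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def classify_parts(docx) -> dict[str, str]:
    """Return {part name: "xml" or "other"} for every file in the ZIP package.

    `docx` may be a file path or a file-like object; no renaming to .zip
    is needed because zipfile does not care about the extension.
    """
    result = {}
    with zipfile.ZipFile(docx) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            try:
                ET.fromstring(zf.read(name))  # parses only well-formed XML
                result[name] = "xml"
            except ET.ParseError:
                result[name] = "other"
    return result


# Stand-in package using the file names from the post (contents are fake).
names = ["document.xml", "fonttable.xml", "settings.xml",
         "styles.xml", "websettings.xml"]
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for n in names:
        zf.writestr("word/" + n, "<placeholder/>")
buf.seek(0)

report = classify_parts(buf)
for part, kind in sorted(report.items()):
    print(f"{kind:5}  {part}")
```

Well-formedness is all this checks. It says nothing about how readable the element names or attribute values inside each part are, which is the other half of the argument in this thread.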

Edited 2006-12-09 16:47

Score: 1

dylansmrjones Member since:
2005-10-02

Didn't you know that a .docx file is a zipped collection of xml files?

Of course I know that. It's a common approach. But it still doesn't help much when the XML files themselves contain proprietary binary blobs. And that's the problem.

Score: 2