Microsoft adding XML files to Office 12
Eugenia Loli 2005-06-02 Office 43 Comments

Microsoft said Thursday that it will introduce new XML-based file formats for its Excel, PowerPoint and Word applications when the company launches its Office 12 software package next year.

About The Author: Eugenia Loli. Ex-programmer, ex-editor in chief at OSNews.com, now a visual artist/filmmaker.

2005-06-02 5:53 am
Hopefully this will help OpenOffice on the way to near-perfect compatibility (or perfect).

2005-06-02 6:04 am
It won’t do a thing. Office already has an XML file format. For example, in Office 2003, you can save a document as XML, WordML to be exact. WordML is not compatible with OO.org. But I guess you could make it compatible if you wrote some kind of super complicated XSLT.

2005-06-02 7:00 am
So how come MS did not adopt the recently standardized OASIS file format of OpenOffice 2.0?

2005-06-02 7:23 am
From Oct 2002: http://www.pcworld.com/news/article/0,aid,106005,00.asp

2005-06-02 7:24 am
Does OASIS have collaborative functions in it?

2005-06-02 7:26 am
I wonder if MS is going to adopt Open Documents standards… The article only says they are going to move to XML, but it doesn’t say anything about open standards! So the migration and compatibility issues between open source office tools and MS Office will remain as they are.

2005-06-02 7:40 am
Reverse engineering XML is somewhat easier than reverse engineering a binary format, wouldn’t you say? Of course, that is unless the XML author is determined to obfuscate the nature of the data.

2005-06-02 8:04 am
OASIS is a standards body, not a program.

2005-06-02 8:18 am
> I wonder if MS is going to adopt Open Documents standards
No they are not. They’ve already patented their XML format.

2005-06-02 8:21 am
http://channel9.msdn.com/ShowPost.aspx?PostID=73329
The XML formats will be *open*. You people really need to do your research first.

2005-06-02 9:08 am
If they did, it’d be only to create OASIS-like documents that don’t fully follow the standard and that OpenOffice can’t render correctly, as always. EEE.

2005-06-02 9:31 am
http://oasis.sourceforge.net/
That’s a “program”. It’s a bit bad that they’ve got the same name, but hey, that’s life.

2005-06-02 9:50 am
Although I like the idea of an open file format for MS Office 12… something isn’t right here.
1. Those new formats are open.
2. Therefore OpenOffice will easily adopt them (read and write).
3. Everyone will use this new format because it’s now compatible with MS Office and OOo.
4. OASIS will be gone…

2005-06-02 10:00 am
5. MS starts inserting more and more binary files that are referenced by the XML but keeps those file formats under lock and key.
Basically it sounds like MS has cloned the OO files to make sure that they can pull the old embrace and extend…

2005-06-02 10:09 am
If these new file formats will be open, how will MS implement DRM on them? Will the specs for the DRM thing also be open to everyone so that, for example, you can still get OpenOffice or some other compatible program to open DRM-ed MS Office files?

2005-06-02 10:52 am
They said it’ll be royalty-free and also explained why they don’t use OASIS (they called it Sun’s format)… they said it’s because of all the legacy they need to support. (The news could mention it… besides being useful information, it would mean less trolling…) More about it probably in the next week…

2005-06-02 11:01 am
A closer look: Microsoft said Thursday that it will introduce new “XML”-based file formats… Interoperability not being Microsoft’s strongest attribute, I expect the usual serialisation of Word/Excel/PowerPoint structures, along with Steve Ballmer dancing around in a monkey suit to prove some kind of point, a very superficial point indeed.

2005-06-02 11:44 am
@jayson knight It is NOT open. You need to do some research!
http://www.microsoft.com/mscorp/ip/format/xmlpatentlicense.asp

2005-06-02 11:47 am
Edit: well, they are open for reading, but only licensed apps can legally toy with the files. So bye bye OOo, and bye bye converters.

2005-06-02 11:55 am
Except that document files are my property, so I can (at least in Europe) transform or modify them in any way I like.

2005-06-02 11:56 am
@Brian Good link, but it seems to me that this licence is free (as in money) for anyone to take up, for reading and writing Office XML docs, provided that the following notice is included: “This product may incorporate intellectual property owned by Microsoft Corporation. The terms and conditions upon which Microsoft is licensing such intellectual property may be found at http://msdn.microsoft.com/library/en-us/odcXMLRef/html/odcXMLRefLeg…

2005-06-02 1:00 pm
MSFT XML is not pure XML. It’s a binary/XML hybrid. MSFT has a patent on it. To appease governments like Massachusetts it has made it royalty-free. Don’t expect OO.o to be able to read it easily though.
Now my second thought. Office 12? 12 revisions of MS Office. Damn, at $400 a pop for at least 6 of those revisions, if you have been using Office since 1995 you have spent the equivalent of a brand new high-end PC on just Office versions. Of course, most of the companies I know still use Office 97 or 2000.

2005-06-02 1:31 pm
So if it’s not open, why bother with a new proprietary format? I thought what’s so good about XML in the first place was interoperability.

2005-06-02 2:07 pm
If I were a programmer of a free MS-XML filter for OpenOffice.org, I would definitely not sign this license contract. Why? Because I would give something up for nothing. I would get a license to use patents MS may or may not have, which may or may not be needed in order to implement said filter. This license is pure FUD, and I am quite certain that any patent regarding XML schemas can easily be overturned in court.
I would insist on clearer wording like: MS has patents number xxxxx, yyyyy and zzzzz, for which a license is needed in order to manipulate the MS XML schema files. MS also hereby declares that all other patents MS has or will have are not applicable to MS XML manipulation programs. Because that would be a contract I can get out of once the patents are declared invalid by a court. The foggy MS license as it stands would apply even if the patents were declared invalid or were never granted.

2005-06-02 2:35 pm
DRM is separate from file data!!!! All MS needs to do is to encrypt the file. Bam!!! DRMed.

2005-06-02 2:38 pm
> if these new file formats will be open, how will MS implement DRM on them? will the specs for the DRM thing also be open to everyone so that, for example, you can still get OpenOffice or some other compatible program to open DRM-ed MS Office files?
I think Microsoft could encrypt the XML even with weak encryption and then invoke the DMCA. I think the way OpenOffice.org password-protects documents is that it password-protects the ZIP file that holds all the XML contents.

2005-06-02 3:09 pm
<!DOCTYPE MS-WORD>
<msoffice:word-doc>
A34F23D231B98AF783 [ for 4 megabytes … ]
</msoffice:word-doc>

2005-06-02 3:11 pm
Well, it was funnier when the HTML entities were rendered correctly <bold>like they did in the preview window</bold>. Grrrr.
(And of course, previewing the above text does not show the bold attribute…)

2005-06-02 4:15 pm
“Microsoft may have patents and/or patent applications that are necessary for you to license in order to make, sell, or distribute software programs that read or write files that comply with the Microsoft specifications for the Office Schemas.”
…
“Microsoft reserves the right to terminate this license grant if you sue Microsoft or any of Microsoft’s affiliates for patent infringement over claims relating to reading or writing of files that comply with the Office Schemas.”
LOL

2005-06-02 4:17 pm
Not talking about OOo compatibility: they [MS] are breaking their own between-version compatibility again. Currently at least Office 2K and 2K3 documents are freely interchangeable (maybe not 100%, but I’ve not seen any problems yet). Because Office 12 will use the XML format by default, users of old versions will not be able to open these docs. I’m slowly getting tired.

2005-06-02 4:31 pm
Well, those of us who have been watching the rise of XML and other common format scripting languages are rejoicing. At long last Microsoft will FINALLY adopt a standard the rest of the world is already embracing. Now all of the applications can share information and I don’t need to buy Word to view a Word file. Linux, Unix, Windows, Mac, let us rejoice, for this is great news.

2005-06-02 4:45 pm
MSFT stated they are going to release patches to the previous 3 generations of Office to enable them to read this new format. I think a major reason MSFT is doing this is that large corporations use document management software: generating documents automatically, etc. Much of this software is not written by MSFT, and MSFT doesn’t have this type of software on the roadmap. That means they need a readable format that they control. So, how do you do that?
You get some form of legal protection for your format, be it copyright or patent, then you tell everyone what it is and make people license the format from you. That means you can deny OpenOffice, or StarOffice, etc., but still allow those companies who augment your business to use the format.
– Kelson

2005-06-02 4:48 pm
Because it is MS, they hate standards. Plus they want to make it where Office can read them & nothing else.

2005-06-02 4:51 pm
Plus they want to make it where Office can read them & nothing else can.

2005-06-02 5:04 pm
I can see that many folks here are less than convinced of the openness of the new file formats for Office 12. I’m a Program Manager on the Access team. I’ve been a dev in the past and worked in Q/A as well. I’m also a long time OSNews reader (circa 1998), and a paid member of this site. I run a WinXP laptop, a Windows Media Center PC, a Mac Mini with Tiger, and a SUSE 9.2 file server at home. I’m not here to sell anyone on anything. I just want to get the facts straight.
– The new Office XML document formats are not the same as the Office 2003 WordML and SpreadsheetXML formats.
– The new file formats will support DRM, though I’m not at liberty to discuss the technical details of how this works at this time.
– The new formats will be container based, similar to the way OpenOffice works now, with a single compressed container (.zip format) that houses multiple XML files per document.
One reason we’re doing this: Our customers, specifically large corporations, have explicitly requested the ability to shred information out of Office documents for processing in their back-end document storage systems. These systems are typically pre-existing and can be based on any number of different platforms. It is critical to these customers that they be able to retrieve business information out of their documents to enable document workflow, search, and management scenarios.
Additionally: By converting to a non-binary file format we are able to provide increased security and recovery features.
– It is easier to fix up damaged XML than it is to fix up a corrupted binary persistence format.
– Because of the container architecture, macro code can be more fully segmented away from the content of the document.
Zac

2005-06-02 5:50 pm
“Our customers, specifically large corporations, have explicitly requested the ability to shred information out of Office documents for processing in their back-end document storage systems.”
Er… now I’m confused by this part:
“Microsoft may have patents and/or patent applications that are necessary for you to license in order to make, sell, or distribute software programs that read or write files that comply with the Microsoft specifications for the Office Schemas.”
Doesn’t this mean that the customers will have to pay/infringe the Patent License?

2005-06-02 6:22 pm
Yup, that’s true, you’ll have to get a licence… that’s exactly what the next sentence tells us:
“Except as provided below, Microsoft hereby grants you a royalty-free license under Microsoft’s Necessary Claims to make, use, sell, offer to sell, import, and otherwise distribute Licensed Implementations solely for the purpose of reading and writing files that comply with the Microsoft specifications for the Office Schemas.”
In short: we (may) have patents on stuff, you’re not getting rights to our IP… but you have the right to infringe any patent you need to in order to read/write Office XML format files as long as: you cannot do it differently, you’re respecting the schema… (and other stuff, read it!)

2005-06-02 6:35 pm
Can you say whether the new formats will basically be “Metro” documents or are they separate from that? IIRC, there seem to be some similarities between your brief description and the material released during WinHEC (not that I’d expect a big difference considering common needs).

2005-06-02 6:46 pm
Found the format docs on MS’ website (sidebar of the following page):
http://www.microsoft.com/presspass/press/2005/jun05/06-01OfficeXMLF…

2005-06-02 8:23 pm
Good question. No, these file formats will not be Metro. Metro is a container-based XML format specifically designed for presentation (on screen, print, etc…) in Longhorn. The new Office and Metro file formats do utilize a very similar container concept (XML files in a .zip package), but they are different technologies. Office’s document formats will be specific to the individual Office application, whereas Metro is a generic presentation persistence format that can be used to display virtually anything.

2005-06-02 10:21 pm
Can you say if Microsoft will begin to offer Microsoft Office for other platforms like Linux, Solaris, MacOS X, SCO Unixware (I am kidding about this last platform 🙂)?
If an OpenOffice developer makes an import/export filter for these new file formats, would he be infringing the MS license? I didn’t see any specific prohibition for GPL programs like MS .Net…

2005-06-02 10:37 pm
Office is already offered on OSX. As for the other platforms, don’t push your luck :-).

2005-06-02 11:54 pm
As long as including an MS license w/ the filter isn’t incompatible w/ the GPL, I don’t think there’d be a problem. Here’s the FAQ for the current Office 2003 XML Reference Schemas:
http://www.microsoft.com/Office/xml/faq.mspx
It covers questions about Open Source projects. Licensing info here:
http://www.microsoft.com/mscorp/ip/format/xmlpatentlicense.asp
----------
@Mystic TaCo Thanks for the info.