Updated: 24.11.2002; 12:11:15.
disLEXia
lies, laws, legal research, crime and the internet
        

Tuesday, May 29, 2001

Re: Word file turns into two disjoint texts (Page, RISKS-21.40)

[This item is an out-of-band response to Clive that is included here with the permission of Jeanne Sheldon. It provides an interesting case history, especially with the Unicode wrinkle, and seems RISKS-worthy even if it may seem like an old problem. PGN]

Here's a summary of what I've been able to determine about the document.

The document was created in Word 97.

Word was set to allow "Fast Saves", a non-default setting that performs incremental rather than complete saves. The feature is intended to speed up the save operation. More information on fast saves can be found in these Microsoft Knowledge Base articles:

* Q71999 WD97: "How to Disable the FastSave Option in Word for Windows"
* Q190733 WD97: "Opening Word Document in Text Editor Displays Deleted Text" (first documented in Q113052, created 23-MAR-1994)
* Q192480 WD97: "Frequently Asked Questions About 'Allow Fast Saves'"

The document was saved three times; the second save was to a different filename. Because that second save initiated a second pass over the document, Word was able to compress the Unicode so that it was readable as ASCII characters, and all incremental changes that had been Fast Saved were collapsed. The first letter was then deleted and the letter to Dr. Page was composed. A single save was then performed to a local (non-network) drive using the same filename. Because "Fast Save" was enabled, the deleted text stream was marked as deleted but not actually removed from the file. Because a single save is a single pass and Unicode compression requires a second pass, the new text remained as uncompressed Unicode. On Unicode compression, see Q168967, "File Size Twice as Big When Compared to Earlier Version." While a tool that is not Unicode-aware would be unable to read the second set of text (the letter to you), that text is actually quite readable in a Unicode-enabled text reader.
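The difference between the two kinds of reader can be sketched in a few lines of Python. This is only an illustration, assuming the uncompressed text is stored as UTF-16LE (each ASCII character followed by a zero byte); the function name `extract_strings` and the sample bytes are hypothetical and do not reproduce Word's actual file layout.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return (ascii_runs, utf16_runs) of printable text found in raw bytes.

    Assumption: "Unicode" text is UTF-16LE limited to printable ASCII
    code points. Real documents may contain other characters.
    """
    # Runs of printable single-byte characters: all that a tool which is
    # not Unicode-aware will recognise as text.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable characters each followed by a zero byte: what a
    # Unicode-enabled reader can additionally recover.
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    ascii_runs = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    utf16_runs = [m.group().decode("utf-16-le") for m in utf16_pat.finditer(data)]
    return ascii_runs, utf16_runs

# Hypothetical sample: one compressed (ASCII) stream and one uncompressed
# (UTF-16LE) stream, separated by arbitrary binary bytes.
sample = (b"\x01\x02" + b"Old letter text" + b"\x00\xff\x03"
          + "Dear Dr. Page".encode("utf-16-le") + b"\x00\x04")
ascii_runs, utf16_runs = extract_strings(sample)
```

In this sample the ASCII scan finds only the first stream, while the UTF-16LE scan finds only the second, which mirrors why the two texts appeared disjoint depending on the tool used.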

Extra notes: The document contains a unique identifier, indicating that the version it was authored on did not include the fix which removes that identifier. See Q222180, "Unique Identifiers and Microsoft Office 97 Documents."

The document title, under properties, is generated automatically from the first line of the document on the first save. It is not subsequently updated, so it may contain text that is no longer in the document.

Comprehensive information on the topic: Q223790, "How to Minimize Metadata in Word Documents."

From Word 97 online documentation:

The difference between a fast save and a full save

If you select the Allow fast saves check box on the Save tab in the Options dialog box (Tools menu), Word saves only the changes to a document. This takes less time than a full save, in which Word saves the complete, revised document. Select the Allow fast saves check box when you are working on a very large document. However, a full save requires less disk space than a fast save. If you are working on a document over a network, clear the Allow fast saves check box. Fast saves cannot be performed over a network.

You should do a full save in the following situations:

* Before you share a document with other people
* When you finish working on a document and save it for the last time
* Before you begin a task that uses a lot of memory, such as searching for text or compiling an index
* Before you transfer the document text to another program
* Before you convert the document to a different file format

Note: If you select the Always create backup copy check box on the Save tab in the Options dialog box (Tools menu), Word clears the Allow fast saves check box, because backup copies can be created only with full saves.
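The practical consequence of an incremental save can be shown with a toy model. The sketch below is not Word's file format; `ToyDocument`, its `live` pointer, and the sample strings are all hypothetical, and it assumes only that a fast save appends the new content while a full save rewrites the file from the visible text.

```python
class ToyDocument:
    """Toy model of fast save versus full save (NOT Word's real format)."""

    def __init__(self):
        self.file = b""     # bytes "on disk"
        self.live = (0, 0)  # (offset, length) of the text the user sees

    def full_save(self, text: str) -> None:
        # Rewrite the whole file so it holds only the current text.
        data = text.encode("utf-8")
        self.file = data
        self.live = (0, len(data))

    def fast_save(self, text: str) -> None:
        # Append the new text and move the pointer; the old bytes stay.
        data = text.encode("utf-8")
        self.live = (len(self.file), len(data))
        self.file += data

    def visible_text(self) -> str:
        start, length = self.live
        return self.file[start:start + length].decode("utf-8")

doc = ToyDocument()
doc.full_save("first letter")
doc.fast_save("letter to Dr. Page")      # the first letter was "deleted"
leaked = b"first letter" in doc.file     # old text is still in the file
size_after_fast = len(doc.file)

doc.full_save(doc.visible_text())        # a full save drops the old bytes
cleaned = b"first letter" not in doc.file
```

In this model the file after the fast save is larger than after the full save, and the "deleted" text remains readable to anyone who inspects the raw bytes until a full save is performed.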

... Clive, thank you very much for the time and effort that you have put into this. Although the Word setting that caused the document to be created in such a manner goes back to a time when electronic document exchange was not the norm (and, over the past 7 years, much effort has gone into attempting to ensure that private information is not accidentally included), it is humbling and daunting to realize once again how difficult it is to correct the mistakes of past versions with software patches, bulletins and product documentation.

Jeanne Sheldon, Microsoft Corporation ["Jeanne Sheldon" via risks-digest Volume 21, Issue 43]

Re: 37% of programs used in business are pirated (RISKS-21.42)

> tops the list in terms of dollars (an estimated $4 billion) lost to piracy.

This sounds like one of those inflammatory and inflationary statements the RIAA has become fond of recently. To my mind there is a big difference between this statement (which describes something that I can't imagine a means of estimating) and a statement like "tops the list in terms of dollars (an estimated $4 billion) retail value of pirated software". Many users would not be using the software they are using if they were forced to buy it rather than pirate it - they would be using a cheaper alternative. ["Merlyn Kline" via risks-digest Volume 21, Issue 44]


Maximillian Dornseif, 2002.
 