Tips and Tricks
A place to store useful info I don't want to lose




Cookie cookery

This week, an old Internet favorite - a technology that is pervasive on the Web, old as the hills (at least by Internet standards) and fairly misunderstood. The technology in question is cookies, those little nuggets of data that are ignored by most and considered to be "the end of civilization as we know it" by some.

Cookies are tiny chunks of data that Web sites hand to and receive from your Web browser, whether to track your travels, to tag your site-hopping so that you become statistically significant, or to make your preferences available on subsequent visits.

The way cookies are created is simple: When your browser makes a request to a Web server the server replies and a special field in the response header instructs your browser to store the cookie data supplied by the server. Here's what the header of a server response looks like when it includes a cookie setting request:

HTTP/1.1 200 OK
Date: Wed, 04 Sep 2002 20:20:13 GMT
Server: Apache/1.3.12 (Unix) mod_ssl/2.6.6 OpenSSL/0.9.6 mod_fastcgi/2.2.10
Set-Cookie: Apache=63.207.158.10.289511031170813230; path=/; expires=Fri, 03-Sep-04 20:20:13 GMT
Connection: close
Content-Type: text/html

This header is from a request for the root page of Network World's Web server, http://www.nwfusion.com, and the tool we used was Ipswitch's WS_Ping ProPak ($37.50), which has a feature that lets you retrieve Web pages as plain text (among other things).

By now you've probably figured out that the header line that is relevant to us is the one starting with "Set-Cookie:". This is a request that tells a cookie-compliant browser to take the data following the request and create a file to store it in. The name of the cookie file is up to the browser implementer - Microsoft Internet Explorer under Windows names cookie files by appending the second-level domain name from the server's URL to the current user's name. Thus, our cookie for Network World Fusion is named gearhead@www.nwfusion.txt and is stored in the folder "C:\Documents and Settings\gearhead\Cookies". Under the Netscape browser, cookies are stored in a file named cookies that can be found in "c:\netscape\users\default".

The cookie data is defined by six parameters. These are the cookie name, its value, the expiration date, the path for which the cookie is valid, the domain the cookie is valid for and whether a secure connection must be available when the browser returns cookie data to a server.

The name in our example is "Apache," and the value is "63.207.158.10.289511031170813230." The name is only significant to a server that sets the cookie, and you'll often see default values such as "Apache" and "SITESERVER" where a coding library has been used to handle cookies.
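To make the six parameters concrete, here is a minimal Python sketch that splits the Set-Cookie value from our example into its parts. The names ParsedCookie and parse_set_cookie are our own inventions for illustration, and the parsing is deliberately naive; a real browser copes with many more edge cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedCookie:
    """The six cookie parameters described above (hypothetical container)."""
    name: str
    value: str
    expires: Optional[str] = None  # None means a session cookie
    path: Optional[str] = None
    domain: Optional[str] = None
    secure: bool = False


def parse_set_cookie(header_value: str) -> ParsedCookie:
    """Naively split the text after 'Set-Cookie:' into its parameters."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    # The first part is always name=value.
    name, _, value = parts[0].partition("=")
    cookie = ParsedCookie(name=name.strip(), value=value.strip())
    # The remaining parts are optional attributes.
    for part in parts[1:]:
        key, _, val = part.partition("=")
        key = key.strip().lower()
        if key == "secure":
            cookie.secure = True
        elif key in ("expires", "path", "domain"):
            setattr(cookie, key, val.strip())
    return cookie


example = parse_set_cookie(
    "Apache=63.207.158.10.289511031170813230; path=/; "
    "expires=Fri, 03-Sep-04 20:20:13 GMT"
)
print(example)
```

In our example the server supplied no domain and no secure flag, so those two fields keep their defaults.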

The domain is a critical part of this system because it defines the domain or subdomain to which the cookie data will be sent with each browser request. The path, in turn, defines the start of a subtree under the domain's Web root to which the cookie applies; thus info and users under myserver.com could have different cookies. If a path is not set, it defaults to the path of the document creating the cookie.
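Here is a rough Python sketch of how a browser might decide which cookies belong to a requested path. We assume a plain prefix test (stricter matching on path-segment boundaries is also possible), and the paths "/info" and "/users", the cookie values and the function names are all made up for illustration.

```python
from typing import List


def path_matches(cookie_path: str, request_path: str) -> bool:
    """Return True if the requested path falls inside the cookie's subtree.

    Assumption: a simple prefix comparison, nothing more.
    """
    return request_path.startswith(cookie_path)


# Hypothetical cookies on myserver.com, keyed by the path they were set with.
cookies_by_path = {
    "/info": "info_prefs=compact",
    "/users": "users_prefs=detailed",
    "/": "site_visitor=12345",
}


def cookies_for(request_path: str) -> List[str]:
    """Collect every cookie whose path covers the requested document."""
    return [
        cookie
        for cookie_path, cookie in cookies_by_path.items()
        if path_matches(cookie_path, request_path)
    ]


print(cookies_for("/info/index.html"))
print(cookies_for("/users/list.html"))
```

A cookie set with the path "/" matches every request, while the other two are only returned for documents in their own subtree.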

Of course, you could just as easily set the path of every cookie to "/" and have all of them returned with every request. We've looked at quite a few cookies, and we suspect that the per-subtree path feature is rarely, if ever, used. The reason is obvious - the overhead of sending the extra cookies is not significant, and a single site-wide path involves less work when Web site changes are required.

The expiration date is what you might guess, the date after which the cookie data is no longer valid. If a value isn't set, then the cookie - called a "session cookie" - is stored in memory only and deleted when the browser exits.

The designers of the cookie system never considered that someone might not want cookies to expire, so you'll often see cookies with expiration dates such as some date and time in 2038. 2038 is often the maximum year used for a really dumb reason: The Unix clock started on 1/1/70. The variable time_t, which records the number of seconds elapsed, is considered to have been 0 at 00:00:00 GMT on 01/01/70. On 32-bit systems this variable is a long "int" - an integer represented by four bytes, divided into a value of 31 bits with one sign bit. This means the highest value time_t can reach is 2,147,483,647 seconds (2^31 - 1), or 24,855.1348 days, or roughly something over 68 years. This equates to 3:14:07 AM GMT on Jan 19, 2038; one second later the value overflows and wraps around to a large negative number rather than returning to 0.
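The arithmetic is easy to check. This short Python sketch takes the largest value a signed 32-bit integer can hold, 2^31 - 1, and adds that many seconds to the start of the Unix clock. The variable names are our own, and the wraparound is simulated by hand because Python integers do not overflow.

```python
from datetime import datetime, timedelta, timezone

# The Unix clock starts here.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value of a signed 32-bit integer: 31 value bits plus one sign bit.
MAX_TIME_T = 2**31 - 1

days = MAX_TIME_T / 86400  # 86,400 seconds in a day
last_moment = EPOCH + timedelta(seconds=MAX_TIME_T)

# Simulate what a signed 32-bit counter does one second later.
wrapped = (MAX_TIME_T + 1 + 2**31) % 2**32 - 2**31
after_wrap = EPOCH + timedelta(seconds=wrapped)

print(MAX_TIME_T)               # 2147483647
print(round(days, 4))           # 24855.1348
print(last_moment.isoformat())  # 2038-01-19T03:14:07+00:00
print(wrapped)                  # -2147483648
print(after_wrap.isoformat())   # 1901-12-13T20:45:52+00:00
```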

Once a cookie has been created, each time the browser makes a request for the cookie's domain and path, the browser will pass that cookie's data to the server automatically as part of the request header.

Note that the cookie is only handed to the server if the domain matches exactly as far as it is specified. That is to say, if the given domain of a cookie was gibbs.com then the hosts gibbs.com, www.gibbs.com and clients.gibbs.com would be passed the cookie. On the other hand, if the given domain was www.gibbs.com then gibbs.com and clients.gibbs.com wouldn't be passed the cookie, while www.gibbs.com and m1.www.gibbs.com would.
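The domain rule boils down to matching the tail of the host name. Here is a minimal Python sketch of that test, checked against the gibbs.com examples; the function name domain_matches is ours, and real browsers apply further restrictions that we ignore here.

```python
def domain_matches(cookie_domain: str, request_host: str) -> bool:
    """Return True if a cookie set for cookie_domain goes to request_host.

    The host must either be the cookie's domain itself or end with
    a dot followed by that domain.
    """
    cookie_domain = cookie_domain.lower().lstrip(".")
    request_host = request_host.lower()
    return (
        request_host == cookie_domain
        or request_host.endswith("." + cookie_domain)
    )


hosts = ["gibbs.com", "www.gibbs.com", "clients.gibbs.com", "m1.www.gibbs.com"]
for cookie_domain in ("gibbs.com", "www.gibbs.com"):
    for host in hosts:
        print(cookie_domain, "->", host, domain_matches(cookie_domain, host))
```

Requiring the leading dot in the tail comparison keeps a look-alike host such as notgibbs.com from receiving the gibbs.com cookie.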

Another interesting feature of cookies managed by Internet Explorer is that the date and time of creation and last access, along with the number of times the cookie has been accessed, are recorded in the Explorer cookie file (none of these extras are included when the data is sent to a server).

One parameter we didn't go into in any depth was the secure flag. As we said, if this is set (secure=TRUE) then the cookie should not be passed to the server unless the connection used is secure (using SSL, for example). The default is secure=FALSE.

Of course all that this security requirement does is keep the cookie data private while crossing the Internet. Cookies have no mechanisms to ensure that cookie data on the client doesn't get manipulated. This has resulted in an exploit that falls into the category of parameter manipulation.

In a parameter manipulation attack using a cookie, a malicious user changes the data that will be sent from a Web browser to the Web server (and any other components, such as middleware and back-end databases). The effect of the changes will depend on what the cookie is used for and how well-coded the server and other downstream components are.

Manipulation of cookie data can be prevented by encrypting the cookie data sent to the client or by sending an ID string as the cookie contents that is essentially a pointer to a database entry. To prevent the ID string from being secretly modified, the string needs to have a validation mechanism such as check digits.
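As a sketch of the ID-string approach, the Python below attaches a validation tag to a session ID and rejects any cookie whose tag doesn't match. Note that we have swapped in a keyed hash (HMAC-SHA256) in place of simple check digits, since anyone who knows the check-digit scheme can recompute the digits. The secret key, the "id.tag" layout and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side secret; it must never be sent to the client.
SECRET_KEY = b"replace-with-a-real-server-side-secret"


def _tag(session_id: str, key: bytes) -> str:
    """Compute the keyed validation tag for a session ID."""
    return hmac.new(key, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_session_id(session_id: str, key: bytes = SECRET_KEY) -> str:
    """Build the cookie value: the ID, a dot, then its validation tag."""
    return f"{session_id}.{_tag(session_id, key)}"


def verify_cookie_value(cookie_value: str, key: bytes = SECRET_KEY) -> Optional[str]:
    """Return the session ID if the tag is valid, otherwise None."""
    session_id, separator, tag = cookie_value.rpartition(".")
    if not separator:
        return None  # no tag present at all
    expected = _tag(session_id, key)
    # Constant-time comparison of the two tags as bytes.
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return session_id
    return None


cookie_value = sign_session_id("user-42")
tampered = cookie_value.replace("user-42", "user-43")
print(verify_cookie_value(cookie_value))  # user-42
print(verify_cookie_value(tampered))      # None
```

The session ID itself is still just a pointer to a database entry; the tag only proves that the server issued it.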

If you want to keep an eye on what cookies you have, you're going to need some tools. We've got tools.

First up is a freeware utility called IECookiesView from Nirsoft (Nirsoft is an alias for someone named Nir Sofer who appears to be committed to making stuff for free). IECookiesView (ICV from now on) is a small utility (92K bytes) that lets you explore all the Internet Explorer cookies on your computer. Explorer stores all the cookies for a given site and path in a single file (remember that a cookie is the combination of a key and an associated value). You can find cookies by Web site name, sort the cookies list by any attribute displayed, delete unwanted cookies, save cookies to readable text files, copy cookie information into the clipboard, automatically refresh the cookies list when a Web site sends you a cookie, and display cookies of other users on your machine.

ICV will show cookies that are active, expired and what the author refers to as "duplicated cookies" (we have yet to see these or actually understand what they are). You can set ICV to automatically scan for new cookies or manually refresh the cookie list. ICV shows you the contents of each cookie file and can save the cookie keys and values in a nice, readable format. An interesting feature is that it examines the cookie's URL and attempts to find cookies placed by advertising sites. Cookies are marked as "Yes" for known advertisers, "Suspect" if the URL is related to that of a known advertiser or "Unknown."

This is a pretty useful tool, scoring a Gearhead rating of B- with a commendation for being free.

Today's focus: Cookies get weird

A reader enquired: "Concerning 'Cookie crumbs', I have a question. What more information would one need that cannot be found in the 'Local Settings\Temporary Internet Files'?"

More information? The answer depends on what you want more information about. If we're talking about cookies then, for example, under IE on Windows XP cookies are kept in the folder C:\Documents and Settings\USERNAME\Cookies as we discussed in the previous Gearhead cookie column. If you examine these cookies with the tool we discussed last week (IECookiesView from Nirsoft) you can see where PC users have been browsing or, for that matter, where you have been browsing. If you want to keep your browsing habits private, then you might think a bout of deleting is called for. But while removing cookies gets rid of one log of your browsing trail, there are many more sources. We'll come back to that topic later.

And note that if you delete any or all cookies, you may well be disabling automatic logon to registered sites that use a cookie to re-authenticate you. Of course a more paranoid viewpoint might argue that this is a good thing because if a miscreant makes off with your PC or simply copies your cookies, they could access any server you were set up to automatically log on to.

Now as a Web site owner/manager/lackey you should think about how you would support autologon using cookies. If significant financial or privacy risk is involved and you absolutely must have a cookie-based autologon, then you might think about checking the domain or IP subnet location of the user presenting the cookie. If that location data changes, then you might ask for the user to re-authenticate. You should also check the browser identity: A valid user is unlikely to suddenly change from XP to 98 or from OSX to W2K without authenticating themselves at least once in the normal manner.
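Here is one way those checks might look in Python. This is only a sketch: we assume IPv4 addresses, a /24 subnet, and that the server stored the IP address and browser identification string seen at the last full logon. The function names and the sample values are hypothetical.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int = 24) -> bool:
    """Return True if both addresses fall in the same subnet.

    The /24 default is an assumption; pick whatever suits your users.
    """
    network = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(ip_b) in network


def needs_reauthentication(
    stored_ip: str,
    current_ip: str,
    stored_user_agent: str,
    current_user_agent: str,
) -> bool:
    """Decide whether a cookie-based autologon should be refused."""
    if not same_subnet(stored_ip, current_ip):
        return True  # the user's location has changed
    if stored_user_agent != current_user_agent:
        return True  # the browser or operating system has changed
    return False


xp_browser = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
w98_browser = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"

print(needs_reauthentication("63.207.158.10", "63.207.158.77", xp_browser, xp_browser))
print(needs_reauthentication("63.207.158.10", "10.1.2.3", xp_browser, xp_browser))
print(needs_reauthentication("63.207.158.10", "63.207.158.77", xp_browser, w98_browser))
```

Comparing the whole browser string is stricter than strictly necessary, since a routine browser upgrade will also trigger a fresh logon; comparing only the operating system part would be a gentler alternative.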

Anyway, back to the question. The directory C:\Documents and Settings\USERNAME\Local Settings\Temporary Internet Files that the reader referred to is used for temporary storage of Web content. In here you find HTML documents, Flash animations, GIF and JPEG images - in fact a copy of everything that the browser has used to build every Web page it has retrieved. You will also find that all the cookies you can see in the folder C:\Documents and Settings\USERNAME\Cookies are here, too!

This is more nefarious Microsoft systems programming sleight-of-hand. What you see in these directories are not all real files - some are links that are derived from the contents of index files and presented as if they are files!

And you'll notice that if you use Windows Explorer and double click on, say, a JPEG image in the Temporary Internet Files subdirectory, a warning will be displayed saying "Running a system command on this item might be unsafe. Do you wish to continue?" At this point you may be wondering what the heck is going on. "What system command?" Go to the header bar of the right-hand pane and right click. In the context menu select "Internet Address" and voila! Each item is associated with the URL it was downloaded from. Double clicking won't launch the program for the associated file type; it will load Internet Explorer! This is not your average subdirectory!

NW Digital Grease Monkey
By Mark Gibbs
September 9, 16, 23, 2002
Copyright Network World, Inc., 2002



© Copyright 2002 Eric Hartwell.
Last update: 04/10/2002; 10:50:02 AM.



"Data! data! data!" he cried impatiently. "I can't make bricks without clay."
— Sherlock Holmes to Dr. Watson in "The Adventure of the Copper Beeches" by Arthur Conan Doyle. 


"I like deadlines," cartoonist Scott Adams once said. "I especially like the whooshing sound they make as they fly by."


"There is nothing like that feeling of spending days and days banging your head against a wall trying to solve a programming problem then suddenly finding that one tiny obscure and seemingly unrelated piece of the puzzle that unlocks the solution. Oh yeah!"

- Chris Maunder, CodeProject Newsletter 28 Jan 2002


"Management at eSnipe, which is me, is also feeling the pain of the 2002 bear market. So rather than pout about it, I bought some stuff on eBay that I really didn’t need, but made me feel better."

- Tom Campbell, president of eSnipe