How fast is a Power Mac G3? In our feature article, Rick Holzgrafe muses about speed and shows us a simple program he's run on different computers for 20 years. See how a G3 compares to a Cray Y-MP! Also this week, Adam examines the new email list headers you see at the top of TidBITS and TidBITS Talk. In the news, Apple releases Mac OS X Server and a public beta of QuickTime for Java, and announces that it is jumping on the open source bandwagon with Darwin.
This issue of TidBITS sponsored in part by:
Northwest Nexus -- 1 888-NWNEXUS -- <http://www.nwnexus.com/>
Internet business solutions throughout the Pacific Northwest.
Small Dog Electronics -- Special Deal for TidBITS Readers!
Farallon 8-port 100Base-T hub w/10Base-T switch:$99!
Reunion 6.0 genealogy software from Leister Productions: $89!
For Details: <http://www.smalldog.com/> -- 802/496-7171
Outpost.com - the Cool Place to Shop! --> TidBITS Exclusive!
Buy Connectix Virtual Game Station for $49.95 and receive
a $5 gift certificate toward PlayStation games or any other
Outpost.com product! <http://www.tidbits.com/tbp/VGS.html>!
Norton Utilities 4.0: The #1 Problem Solver for Today's Mac!
Protect your data and eliminate problems while your Macintosh
runs faster and better than ever. ONLY $39.95! (after rebate)
Click here! -----------> <http://www.digitalriver.com/TidBITS/>
MOVING AT A SNAIL'S PACE? MacAcademy Quick Start CD-ROM Training
Series gets you up and running with Quick Start training on
Photoshop, FileMaker Pro, Mac OS 8.5, Word 98, and much more.
<http://www.macacademy.com/tidbits.html> or call 800/527-1914
OPTIMIZED for Mac OS 8.5 - Farallon's new 10/100 drivers are
35% FASTER! Download the FREE 2.0 driver for Farallon's
Fast EtherTX-10/100 PCI and CommSlot II cards and give it a
test drive. <http://www.farallon.com/tidbits/driver.html>
REMEMBER LAST WEEK? What if your hard disk doesn't? You need
Retrospect Express 4.1 from Dantz Development for quick backups
to removable drives, CD-R, and the Internet for ONLY $49.95.
More info at <http://www.dantz.com/dantz_products/express.html>
Apple Announces Darwin Open Source Project -- Last week, Apple announced it plans to make the source code for the foundation layers of Mac OS X Server available via an open source initiative called Darwin. Developers who agree with the Apple Public Source License can register with Apple to gain access to the source code, which will include Apple's enhancements to the Mach 2.5 microkernel in Mac OS X Server, plus several Apple technologies such as AppleTalk, the HFS Plus file system, and the new NetInfo distributed database. Apple says it plans to include additional software in its open source offerings, but don't expect to see source code for Apple's bread-and-butter technologies (like the current Mac OS, QuickTime, WebObjects, or the NeXT application layer) released as open source. Source code for Darwin should be available to developers in early April. It remains to be seen whether Darwin will be genuinely useful to developers, or whether Apple is merely surfing the open source wave. See TidBITS Talk for debate on the topic. [GD]
Mac OS X Server Ships -- Apple last week shipped Mac OS X Server, a new Unix-based operating system for high-end server use. Formerly codenamed Rhapsody, Mac OS X Server features the popular Apache Web server, Apple's WebObjects, the capability to boot newer Macintosh models remotely via NetBoot, a high-performance Java virtual machine, network services such as DNS and Apple Filing Protocol, Web-based administration, and a consistent Mac-like user interface. (See "New iMacs, New G3s, and Mac OS X Server" in TidBITS-462 for more information.) Mac OS X Server runs BSD Unix 4.4 on top of the Mach 2.5 microkernel (which together offer preemptive multitasking and protected memory), and features application technologies originally acquired from NeXT. Mac OS X Server reportedly includes the Blue Box application layer, enabling Mac OS X Server to run standard Mac OS applications. By all reports, the Blue Box isn't intended to allow Mac OS X Server to act as a workstation or to run current Mac OS server software. Developer support for Mac OS X Server is growing; several companies have already announced plans for Mac OS X Server, and more are sure to follow. Apple has priced Mac OS X Server aggressively at $499, with an unlimited client license; Apple is also selling 400 MHz G3-based servers with Mac OS X Server pre-installed starting at $4,999 (which Apple says is the fastest Apache server platform available for under $5,000). [GD]
QuickTime Gets a Caffeine Boost -- Apple today announced the public beta release of QuickTime for Java, further extending the reach of QuickTime to any application written in Java on either the Mac OS or Windows. For the moment, QuickTime for Java will interest only developers, but Java support is an important step for Apple's goal of QuickTime ubiquity. QuickTime for Java requires that you first install MRJ 2.1 (or JRE 1.1.x or 1.2 if you're using the Windows Java Runtime Environment) and QuickTime 3.0.2 - see the QuickTime for Java installation page for links. [ACE]
by Adam C. Engst <firstname.lastname@example.org>
Whenever I explain how email works to novices, I call email headers "the glop at the top," since they aren't easy for us humans to digest. Headers are lines of text that precede any Internet email message; they carry descriptive information about the message rather than the message itself. You're probably familiar with the primary human-readable headers, such as Date, From, Subject, and To, but you probably wish you'd never seen some of the more indecipherable headers such as Message-Id, Content-Type, or Received, which together can form an impenetrable snarl. These esoteric headers are meant to be understood by email programs, not humans, so email programs often hide headers that rank high on the gobbledygook scale.
However, you may have noticed that TidBITS and TidBITS Talk have sprouted new and unusual email headers since the beginning of 1999. Most of these headers, which I collectively call the "list headers," are up-and-coming Internet standards. These list headers are human-readable and provide information useful to mailing list subscribers; however, they're also meant to be understood by email programs so they can help email users better manage their mailing list subscriptions.
Heads Up -- The list headers matter for two reasons. First, because the list headers are standardized, email programs can start to pay attention to them. Initial support from lazy developers would be to hide the list headers, since they occupy a number of lines in each mailing list message. A better solution might be an Unsubscribe menu item when looking at a message containing list headers. Even more useful would be an interface that would help you track your mailing list subscriptions, let you unsubscribe from one or all lists with a click of a button, and automatically filter mailing list messages. Once that level of functionality was available, the list headers could be hidden, since the email program would have subsumed their utility.
Second, until email programs support list headers directly, the list headers can make it easier for normal people to manage their mailing list subscriptions. I describe each of the headers that we use in TidBITS below; since the list headers include URLs, clicking the appropriate list header URL could unsubscribe you from a list, get help, or send email to the list owner. Plus, if you wanted to send someone instructions for joining a list, you could send that person the List-Subscribe header's URL.
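As a rough sketch of what "paying attention" to the list headers could look like, the following Python fragment collects every header whose name starts with List- and pulls out the URLs between angle brackets. The sample message, the example.org addresses, and the function name list_actions are placeholders made up for illustration; they are not taken from any real email program.

```python
import re
from email import message_from_string

# Hypothetical sample message; the example.org addresses are placeholders.
RAW = """\
From: list@example.org
Subject: Sample issue
List-Help: <http://www.tidbits.com/about/list.html>
List-Archive: <http://www.tidbits.com/search/>
List-Owner: <mailto:owner@example.org> (TidBITS Editors)

Body text.
"""

def list_actions(raw_message):
    """Map each List-* header name to the URLs found between angle brackets."""
    msg = message_from_string(raw_message)
    actions = {}
    for name, value in msg.items():
        if name.lower().startswith("list-"):
            # Parenthetical comments such as (TidBITS Editors) are ignored.
            actions[name] = re.findall(r"<([^<>]+)>", value)
    return actions

actions = list_actions(RAW)
for name, urls in actions.items():
    print(name, urls)
```

An email program could hang a Help or Unsubscribe menu item off a table like this one, rather than asking the reader to decode the raw headers.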
Listing the Headers -- Let's look now at each of the list headers we're using in TidBITS and why other lists may or may not want to use them. Note that the order is arbitrary - the order we chose is based purely on line length for an easily understandable visual display.
List-URL: <http://www.tidbits.com/>
List-Archive: <http://www.tidbits.com/search/>
List-Subscribe: <mailto:email@example.com>
List-Unsubscribe: <mailto:firstname.lastname@example.org>
List-Help: <http://www.tidbits.com/about/list.html>
List-Owner: <mailto:email@example.com> (TidBITS Editors)
List-Software: "ListSTAR v1.2 by StarNine Technologies, Inc."
List-Id: "TidBITS Setext Distribution List" <setext.tidbits.tidbits.com>
List-Post: <mailto:firstname.lastname@example.org> (Discussions on TidBITS Talk)
The following header also appears in TidBITS Talk:
List-URL: The List-URL header is not part of the Internet standards. We use it because we want to keep our list headers consistent and to make sure that people can easily find our Web site's home page, from which they should be able to find detailed information about TidBITS and our mailing lists.
List-Archive: The List-Archive header points to our searchable article database of every TidBITS article ever published. Many lists may not have archives and thus wouldn't need this header.
List-Subscribe: List-Subscribe is one of the most important headers, because it contains the information necessary to subscribe to TidBITS. List-Subscribe might seem silly - after all, if someone's received a message with this header, they've probably already subscribed to the mailing list. However, List-Subscribe is useful if someone forwards you a message from a mailing list and you decide you'd like to sign up, or if you've unsubscribed from a list and later decide you want to sign up again. If an email program were to parse the list headers and provide an interface to their functionality, offering a Subscribe menu item would be helpful. Lists may not use the simple -on and -off addresses we've helped popularize, but mailto URLs can include additional information, including Subject-based commands like this: <mailto:email@example.com?subject=subscribe>. List-Subscribe could also point to a Web page with subscription forms or other options.
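To illustrate how a Subject-based command rides along in a mailto URL, here is a minimal Python sketch that splits such a URL into the address and any extra fields. The function name parse_mailto is hypothetical, and the address is the placeholder from the example URL above.

```python
from urllib.parse import parse_qs, unquote, urlsplit

def parse_mailto(url):
    """Split a mailto URL into (address, extra fields such as subject)."""
    parts = urlsplit(url)
    if parts.scheme != "mailto":
        raise ValueError("not a mailto URL: " + url)
    # parse_qs returns a list per key; keep the first value of each.
    fields = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return unquote(parts.path), fields

address, fields = parse_mailto("mailto:email@example.com?subject=subscribe")
print(address, fields)
```

A Subscribe menu item would then need only to address a new message to the first value and fill in the Subject line from the second.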
List-Unsubscribe: Like List-Subscribe, List-Unsubscribe is a perfect candidate for instantiating in an interface. Considering how many bounces we get from people who couldn't be bothered to figure out how to unsubscribe from TidBITS (not to mention the number of unsubscribe requests sent to seemingly random addresses), we'd love it if people who don't want to receive TidBITS anymore could reliably unsubscribe without contacting us individually.
List-Help: The RFC that describes the list headers considers List-Help the most important, since it could contain a pointer to the information stored in all the other headers.
List-Owner: The List-Owner header points to the human contact for the list. For TidBITS, that's our general <firstname.lastname@example.org> address, although it could be more specific. On TidBITS Talk, for instance, I list myself as the List-Owner.
List-Software: Like List-URL, List-Software isn't part of the list header standard. It provides a spot to identify the software that runs the mailing list. Although that's not necessary, it can be helpful to know precisely what program is handling distribution. For instance, the header above ("ListSTAR v1.2 by StarNine Technologies, Inc.") shows that ListSTAR sends out TidBITS, whereas LetterRip Pro handles TidBITS Talk, as evidenced by the TidBITS Talk List-Software header's contents: "LetterRip Pro 3.0.4 by Fog City Software, Inc."
List-Id: The List-Id header comes not from the RFC describing the other list headers, but from the separate Internet draft linked above. Its purpose is to provide a unique identifier for every mailing list. Automated tools that would provide mailing list management interfaces in email programs need such markers to identify lists reliably. Also, people setting up filters need some way to tell when a message comes from a list. Other headers usually work, but we've all experienced problems with lists that change their headers when they switch to a different distribution program or host. Theoretically, the List-Id header could remain the same through such changes, preventing filters and other automated tools from breaking. List administrators make up the List-Ids for each list, though the Internet draft provides format recommendations along the lines of domain names. So <setext.tidbits.tidbits.com> identifies the setext distribution of TidBITS emanating from tidbits.com. TidBITS Talk uses a slightly shorter List-Id - <tbtalk.tidbits.com> - since it doesn't have to differentiate between setext and other possible formats.
List-Post: We use the List-Post header differently than most mailing lists would, since TidBITS is not itself a discussion group. We want to redirect discussion to TidBITS Talk, so we use that address in the mailto URL. The parenthetical comment (Discussions on TidBITS Talk) explains why we're sending people to another mailing list. Of course, TidBITS Talk itself uses the same mailto URL, but uses a different parenthetical comment (TidBITS Talk Moderator) to indicate that the list is moderated. The fact that we have two mailing lists that share the same List-Post header shows why we need the List-Id header. If someone decided to filter on the contents of the List-Post header, they would hit both TidBITS and TidBITS Talk.
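The filtering problem can be sketched in a few lines of Python. The two sample messages below share a List-Post address (a made-up example.org placeholder) but carry the List-Ids quoted above; the folder names and the functions list_id and folder_for are assumptions for illustration only.

```python
import re
from email import message_from_string

# Hypothetical messages: same List-Post address, different List-Id.
ISSUE = """\
List-Id: "TidBITS Setext Distribution List" <setext.tidbits.tidbits.com>
List-Post: <mailto:talk@example.org> (Discussions on TidBITS Talk)
Subject: Weekly issue

Issue text.
"""
TALK = """\
List-Id: <tbtalk.tidbits.com>
List-Post: <mailto:talk@example.org> (TidBITS Talk Moderator)
Subject: A discussion thread

Discussion text.
"""

# Assumed folder names, keyed on the identifier inside the angle brackets.
FOLDERS = {
    "setext.tidbits.tidbits.com": "TidBITS",
    "tbtalk.tidbits.com": "TidBITS Talk",
}

def list_id(raw_message):
    """Return the identifier between angle brackets in List-Id, or None."""
    value = message_from_string(raw_message).get("List-Id", "")
    match = re.search(r"<([^<>]+)>", value)
    return match.group(1) if match else None

def folder_for(raw_message):
    """Pick a folder from List-Id; anything unrecognized stays in Inbox."""
    return FOLDERS.get(list_id(raw_message), "Inbox")

print(folder_for(ISSUE), "/", folder_for(TALK))
```

A filter keyed on List-Post would lump both messages together, while the List-Id lookup keeps them apart.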
List-Digest: Finally, we come to the List-Digest header, used only with TidBITS Talk. It's not part of the list header standard, but it provides a necessary URL to tell people how to subscribe to the digest version of TidBITS Talk. It's a prime candidate for instantiation in an interface, along with List-Subscribe and List-Unsubscribe.
Who Should Use List Headers? I'll be honest: a major reason we adopted the list headers for TidBITS and TidBITS Talk is that we know some of the people involved in the standards process. These folks evangelized us to support the list headers, plus answered questions when I was trying to produce the most appropriate set of headers for TidBITS and TidBITS Talk. But aside from our specific situation, there are four groups of people who should pay attention to the list headers: those who run mailing lists, people who write email programs, developers of mailing list programs, and finally, individuals who subscribe to mailing lists.
I encourage everyone who runs a mailing list to add appropriate list headers. They aren't hard to create and most mailing list programs let you add custom headers. My hope is that the time and effort I put into creating the list headers will be repaid by less time helping TidBITS and TidBITS Talk subscribers manage their subscriptions.
Developers of email programs should start thinking about the best ways to support the list headers internally. I've seen an early version of a tool that provides an interface for managing your mailing list subscriptions based on these list headers and it's a verifiable Good Thing. Think of it this way: until support for list headers is ubiquitous, any email program that supports them can add it to the feature checklist.
Programmers who create mailing list management programs may not need to do much, since custom header features are already common. However, these programs should make the process of creating the list headers easier, which would encourage list header adoption.
From the standpoint of an individual user, I suggest merely that you take a look at the list headers and remember that they exist. Then, when you want to search for information in a message that you've deleted, look for the List-Archive header, or if you want to unsubscribe from a list, try using the URL in the List-Unsubscribe header.
As we've all seen over the years, users have trouble interacting with mailing list programs, and anything that improves that process serves the entire Internet community.
by Rick Holzgrafe <email@example.com>
The Cannonball Express was the fabled train that was so fast it took three men to say "Here she comes," "Here she is," and "There she goes." Computers are fast too, although unlike trains, most aren't self-propelled. What makes a computer fast, and how much effect does software design have? How much faster are today's computers than yesterday's? Recently I revisited some of these questions, beginning with a trip down memory lane.
Back in the Stone Age -- Twenty years ago, I was teaching myself programming and had access to a DEC PDP-11/60 minicomputer on evenings and weekends. This beast was bigger than a washing machine, and during workdays I shared it with two dozen other technicians and engineers. I found a word puzzle in a magazine and thought it would be fun to program the PDP to solve it. The puzzle was as follows.
Given a phrase and a sheet of graph paper, write the phrase on the graph paper according to these rules:
The goal is to write the phrase inside a rectangle of the smallest possible area. (A subtle point: you are not trying to write in a minimal number of squares.) To score your solution, draw the smallest enclosing rectangle you can and take its area. The rectangle may enclose some blank squares; count them, too.
Got it? Tongue-twisters are the most fun because they have lots of opportunities to reuse whole snaky strings of squares. The 37 letters in "Peter Piper picked a peck of pickled peppers" can be packed into a 3 by 5 rectangle, like this (view this in a monospaced font):
OFIPT
KCPER
LEDAS
In those days I knew computers were "fast" but had no idea how fast. The answer turned out to be "not very." I wrote a program to solve these puzzles and called it Piper after the tongue-twister. I set Piper running on a medium-length phrase on a Friday evening, and came back on Monday to find it still running. It had found several less-than-best solutions but hadn't finished. Way too slow - I found a better solution myself on paper in about half an hour.
Why did it take so long? Piper was a "brute force" program. It tried every possible solution to the problem, one after another. The trouble is that there are too many possible solutions. Exactly how many depends on the phrase, but for any non-trivial phrase the number is astronomical. I realized for the first time that "fast" sometimes isn't "fast enough." This point may be obvious today, when we all use computers and are weary of waiting for them. But in 1979, that PDP-11 was only the second computer I had ever seen!
What Part of Fast Don't You Understand? I saw that I would have to make Piper faster. There are two basic ways to speed up a program. Plan A is to find a better way of solving the problem, but after twenty years I still haven't thought of a better solution. That leaves plan B, the classic efficiency expert's solution: eliminate unnecessary steps. For example, Piper created every possible solution, then calculated the area of each. It built each solution one letter at a time, so instead of taking the area only for completed solutions, I changed Piper to check the area after placing each letter. If placing a letter made the solution-in-progress take up more space than the smallest complete solution found so far, Piper could skip the rest of that solution (and all other solutions that started the same way) and move right on to the next one. This eliminated a huge amount of work and greatly improved Piper's speed. Finding clever ways to track the area of a growing solution helped too, because it was faster than calculating the area from scratch after each letter. I also found a way to calculate a minimum size for the final solution quickly: I couldn't guarantee that the best solution would be that small, but I could guarantee that it wouldn't be smaller. If Piper got lucky and found a solution as small as that calculated minimum, it could stop immediately. Otherwise it would continue on after finding the best solution, vainly seeking a still better one.
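The pruning idea can be sketched in Python. This is not Piper's code, and it rests on assumed rules: each letter goes in a square adjacent to the previous letter's square (diagonals included), and a square may be reused only by the same letter. The lower bound used here, the number of distinct letters, is likewise an assumption chosen for simplicity, and the names smallest_area and NEIGHBORS are invented for the example.

```python
from itertools import product

# Assumed rule: the next letter goes in any of the 8 surrounding squares.
NEIGHBORS = [(dx, dy) for dx, dy in product((-1, 0, 1), repeat=2)
             if (dx, dy) != (0, 0)]

def smallest_area(phrase):
    """Area of the smallest enclosing rectangle, under the assumed rules."""
    letters = [c for c in phrase.upper() if c.isalpha()]
    if not letters:
        return 0
    lower_bound = len(set(letters))  # each distinct letter needs a square
    best = len(letters)              # a straight line always fits in 1 x n
    grid = {(0, 0): letters[0]}

    def search(i, x, y, x0, x1, y0, y1):
        """Place letters[i:]; the previous letter sits at (x, y)."""
        nonlocal best
        if i == len(letters):
            # Reached only when the box is smaller than the old best.
            best = (x1 - x0 + 1) * (y1 - y0 + 1)
            return
        for dx, dy in NEIGHBORS:
            if best == lower_bound:
                return  # cannot do better, so stop early
            nx, ny = x + dx, y + dy
            held = grid.get((nx, ny))
            if held is not None and held != letters[i]:
                continue  # square already holds a different letter
            # Grow the bounding box incrementally instead of rescanning.
            bx0, bx1 = min(x0, nx), max(x1, nx)
            by0, by1 = min(y0, ny), max(y1, ny)
            if (bx1 - bx0 + 1) * (by1 - by0 + 1) >= best:
                continue  # prune: this partial layout is already too big
            if held is None:
                grid[(nx, ny)] = letters[i]
                search(i + 1, nx, ny, bx0, bx1, by0, by1)
                del grid[(nx, ny)]
            else:
                search(i + 1, nx, ny, bx0, bx1, by0, by1)

    search(1, 0, 0, 0, 0, 0, 0)
    return best

print(smallest_area("abab"))
```

Because the bounding box can only grow as letters are added, a partial layout that already matches or exceeds the best complete area can be abandoned along with everything that would have followed from it. Even so, this sketch is only practical for short phrases.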
Eventually Piper became clever enough to finish that original phrase in a reasonably short period of time. But the holy grail continued to elude me: I wanted a solution for "How much wood would a woodchuck chuck if a woodchuck could chuck wood?" That PDP (and, perhaps, my cleverness) were not up to the task. I had run out of ideas for speeding up Piper, and runs still took longer than a weekend. But if I couldn't improve Piper, I could at least hope to run it on a faster computer.
Big Iron -- People tend to think of processor speed as the speed of a computer, but many factors affect overall performance. Virtual memory lets you work on bigger data sets or on more problems at a time, but it's slow, so adding more physical RAM helps by reducing your reliance on virtual memory. Faster disks and I/O buses load and save data more quickly. RAM disks and disk caching replace slow disk operations with lightning-quick RAM access. Instruction and data caches in special super-fast RAM offer big improvements for some programs. Well-written operating systems and toolboxes can outrun poorly written ones.
But in the end, little of this matters to Piper. Piper has always used only a small amount of data, doesn't read or write the disk after it gets going, and does little I/O of any kind. With its small code and data set Piper can take good advantage of data and instruction caching, but what it mostly needs is "faster hamsters" - a faster processor to make the wheels turn more quickly.
As the years rolled on, I ran versions of Piper on my first Macs, but in the middle 1980's I worked for Apple Computer, and had access to a programmer's dream: Apple's $15 million Cray Y-MP supercomputer, one of only two dozen in the world and arguably the fastest computer in existence at the time. I figured the Cray would make short work of Piper. But the Cray was not well suited to the problem. It could barrel through parallel-processing floating-point matrix calculations like the Cannonball Express, but Piper was a highly linear, non-mathematical problem. Piper used only one of the Cray's four processors and didn't do the kind of operations at which the Cray excelled. Piper wasn't a fair test of the Cray's power, but the Cray was still the fastest machine I'd ever used. The Cray succeeded where all previous machines (that PDP, my Mac Plus, my Mac II) had failed. It solved "woodchuck" in less than a day, taking only about 20 hours to finish its run. I was awestruck - 20 hours?! I'd no idea that "woodchuck" was that big a problem!
Young Whippersnappers -- I set Piper aside for many years, but recently I began to wonder how a modern desktop box compares to those old minicomputers and mainframes. I rewrote Piper from memory and ran it on my new 400 MHz ice-blue Power Macintosh G3 with "woodchuck." The output is below. Piper first reprints the phrase, then prints solutions and elapsed times as it finds them. Each solution is the best found so far, culminating in the best of all. The times are in seconds from the beginning of the run; the final time is the total run time. (Unfortunately, the best solution for "woodchuck" is larger than Piper's calculated minimum, so Piper continued to run for a bit after finding the best solution.)
Here are the results. Some intermediate solutions have been left out for brevity, but you can see Piper finding ever smaller solutions. In the end, the 57 letters in "woodchuck" are packed into a 4 by 4 rectangle. Have a look at Piper's total run time, and the time needed to find its best solution:
How much wood would a woodchuck chuck if a woodchuck could chuck wood?
0 seconds: ULD ADLU HDOAIUCOHWUHOD UCOWFKHDOMCWOW K WO
1 seconds: DLIFADLU UCKOHWUHOD OHDWOMCWOW
2 seconds: ULCHC HWUHODKUK OMCWOWAFI
7 seconds: HWUHOW OMCWOD IKAUCW FLDHK
9 seconds: HWUH OMCW LUOK HDOI CWAF
65 seconds: HWM OUC IKH FWC ADO LUO
67 seconds: HWMU OOCH UDWK LAFI
Total run time: 107 seconds
There you have it: a shade over a minute to find the best solution, under two minutes to finish its run. Two minutes! So much for the big iron of the 1980's. My new G3 Mac finished "woodchuck" over 600 times faster (and 5,000 times cheaper) than that 15 megabuck Cray. (For a more realistic comparison, see this description of UCLA's Project Appleseed.)
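The arithmetic behind those ratios is easy to check. The run times and the Cray's price come from the text above; the G3 price is not stated, so the figure computed here is only what "5,000 times cheaper" implies.

```python
cray_seconds = 20 * 60 * 60   # "about 20 hours" on the Cray
g3_seconds = 107              # total run time on the G3

speedup = cray_seconds / g3_seconds

cray_price = 15_000_000       # "15 megabuck" Cray
implied_g3_price = cray_price / 5_000   # implied, not quoted in the article

print(f"speedup: about {speedup:.0f} times")
print(f"implied G3 price: about ${implied_g3_price:,.0f}")
```

That works out to roughly 673 times faster, consistent with "over 600 times," and an implied price of about $3,000.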
If you want to check Piper's speed on your Mac, I've placed the code in the public domain; it's a 40K package.
The Future -- What's yet to come? 400 MHz already looks a little pokey. It's the best Apple offers today, but I've seen claims of 550 MHz or so from third party accelerators and over-clocking tricks. People are predicting 1 GHz (1,000 MHz) chips for the near future. Buses are getting faster, and caches hold more data in less space and are moving onto the processor chip for still more speed. (Small is fast. Did you know that the speed of light is a serious limiting factor in modern computer design? The closer together the components are, the faster they can signal each other.)
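To put a number on the speed-of-light remark, this short Python calculation shows how far light travels in a vacuum during one clock cycle. Signals in real circuits travel more slowly than this, so treat the results as upper limits; the function name is made up for the example.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the meter

def light_cm_per_cycle(clock_mhz):
    """Distance in centimeters that light covers in one clock cycle."""
    cycles_per_second = clock_mhz * 1_000_000
    return SPEED_OF_LIGHT_M_PER_S / cycles_per_second * 100

for mhz in (400, 550, 1000):
    print(f"{mhz} MHz: {light_cm_per_cycle(mhz):.1f} cm per cycle")
```

At 400 MHz that is about 75 centimeters per cycle; at 1 GHz it shrinks to about 30 centimeters.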
And like the old Cray, multi-processor desktop systems are starting to appear. They gang up on a problem by having separate processors work on different parts of the problem simultaneously. Although I didn't try to use the Cray's extra processors, I've done a little thinking lately. Piper doesn't have to be completely linear. On an eight-processor system, I bet I could come close to making Piper run in one-eighth of the time of a single processor.
Are you ready?
Here she comes -
Here she is -
There she GOES!
Non-profit, non-commercial publications and Web sites may reprint or link to articles if full credit is given. Others please contact us. We do not guarantee accuracy of articles. Caveat lector. Publication, product, and company names may be registered trademarks of their companies. TidBITS ISSN 1090-7017.