Re: Excel Plug-in?

From George Buchanan
Date Wed, 06 Feb 2002 12:50:55 +0000
Subject Re: Excel Plug-in?
In-Reply-To (C08FFFF817C9D2118C7C00A0C9ED1155019855AD-EXCHANGE)
Pam Osborne wrote:
> Has anyone come up with a plug-in that will handle Excel files? For the ones
> that are static, we know we can put them into a PDF document. But for the
> ones that are forms (i.e. templates) and contain formulas, we want those to
> be dynamic when someone downloads them from the collection. (We know we can
> also embed those in a PDF file, but it will only be dynamic when you open it
> if you have Adobe Writer on your machine.) Any solutions?
Two part answer:

Design Bit

A plugin for Excel files could be done (I've written Excel conversion
code before), but Excel is used in many different ways, each of which
would require a different "intelligent" behaviour from the plugin.

There are two factors to consider: firstly, how the content of the
Excel files is best delivered to the user; secondly, what the user would
be searching for, and by what means, when those files ought to come
back as results of interest.

In the case of predominantly tabular files of data, the standard
Greenstone method may not provide what the user requires. HTML isn't
the best delivery method (and we can't do much about that), and a
search is more likely to be on the subject of the data than on text
found in the document itself. Furthermore, in an extensive collection
of numeric data, a user is unlikely to search for an individual numeric
value; they are more likely to be looking for whatever value a
particular piece of information has. Similarly for text data.

On the other hand, if the Excel files have been used to hold a
considerable amount of extended text in a tabular format (folks who
use spreadsheets often do this instead of using a wordprocessor),
then the usual Greenstone mechanisms would work well.

Technical Bit

Just to be useful, Excel file formats have changed rather often, and
MS don't like documenting such details. Many formats up to Office
97 (Windows) or 98 (Mac) can be converted using any number of
command-line tools to simple formats like CSV (Comma Separated Values),
which could be coarsely used to produce text files (though this can
cause problems if cell text includes commas). What should count as a
new paragraph is a problem best resolved by looking at what sort of
layout the XLS files are in (that hoary old chestnut returns). I can't
give more specific advice without knowing either the general layout of
the files or which platform you're on, but either way some degree of
solution should be achievable with a relatively small amount of work
in Perl to write a plugin...
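
To make the CSV-to-text step concrete, here is a rough sketch. It is in
Python purely for illustration (a real Greenstone plugin would be
Perl), the function name csv_to_paragraphs is made up, and the rule
"one non-empty row becomes one paragraph" is only an assumption; the
right rule depends on how your sheets are laid out. It does show that a
proper CSV parser avoids the comma problem, provided the converter
quotes cells that contain commas.

```python
import csv
import io


def csv_to_paragraphs(csv_text):
    """Turn CSV text into plain text, one paragraph per non-empty row."""
    # csv.reader honours quoted fields, so a comma inside a quoted cell
    # does not split that cell in two.
    reader = csv.reader(io.StringIO(csv_text))
    paragraphs = []
    for row in reader:
        # Drop empty cells and surrounding whitespace.
        cells = [cell.strip() for cell in row if cell.strip()]
        if cells:  # assumption: a row of empty cells carries no content
            paragraphs.append(" ".join(cells))
    # Separate paragraphs with a blank line.
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    sample = (
        "Title,Notes\n"
        '"Budget, 2002","Draft figures, not final"\n'
        ",,\n"
        "Total,1500\n"
    )
    print(csv_to_paragraphs(sample))
```

Splitting each line on "," by hand would break "Budget, 2002" into two
cells; letting the parser deal with the quoting keeps it whole.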

George Buchanan
Research Fellow, Digital Libraries
Middlesex University, London, UK