Thanks for this valuable information; I think you have saved me a lot of
time. Now that I know how to start, I have to admit that I am only a
beginner at programming in Perl. If you could send me a snippet, or the
whole Perl script, that evaluates the CGI arguments of the niupepa
collection for demonstration purposes (or tell me where to get it), that
would help me tremendously. I could then probably figure out how you do
the information extraction with Perl and start writing an appropriate
script for my purposes...

Thanks again,


Katherine Don wrote:

>Hi Axel
>We have done analysis like this on our niupepa (Maori newspaper)
>collection. We use the standard Greenstone usage.txt file, and do the
>session analysis using Perl scripts.
>We track sessions using cookies. If the user doesn't have cookies turned
>on, then it's hard to get session info for them.
>The usage.txt entry looks like
>/cgi-bin/library [Mon Jul 11 08:34:25 +1200 2005] (a=d, b=0, b1=0, b2=0,
>bc1aboutdesc=, bc1cfgchanged=0, bc1clone=0, bc1clonechanged=0,
>ol=, bc1contactemail=, bc1dirname=, bc1dodelete=0, bc1econf=0,
>bc1esrce=0, bc1fromsrce=0, bc1fullname=, bc1infochanged=0, bc1input=,
>bc1inputnum=3,
>=, bc1tmp=, bcp=, beu=, bft=, bnu=, bp=, bt=0, c=kjdon-pooh, cc=, ccp=0,
>cfgfile=, cl=CL2, cm=, cq2=, ct=0, d=, de=, debc=0, dm=, ds=, dsbc=0,
>0031-001-0-0utfZz-8-00, el=prompt, er=, f=0, fc=1, fqa=0, fqc=, fqf=,
>fqk=, fqn=3, fqs=, fqv=, g=, gc=0, gt=0, h=dtx, h2=, hd=0, hl=1, hp=,
>hs=0, ifl=, il=l, j=, j2=, k=1, ky=, l=en, m=50, n=, n2=, nl=, o=20,
>p=about, pc=, pfd=0, pfe=0, pfl=0, pld=10, ple=10, pll=10, ppnum=0,
>pptext=, pw=, pxml=0, q=the, q2=, qb=0, qf=0, qt=0, qto=3, r=1, rd=0,
>s=0, st=1, t=0, u=0, ua=, uan=, ug=, uma=listusers, umc=, umnpw1=,
>umnpw2=, umpw=, umug=, umun=, umus=, un=, us=invalid, v=0,
>, x=0, xx=0, z= "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8)
>Gecko/20050516"
>The z argument is the user id, and we use that to combine hits into
>sessions - we assume that two hits with the same z arg are in the same
>session if they are less than 30 minutes apart.
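
For illustration only, here is a rough sketch of the session rule described
above. It is not the niupepa Perl script; it is written in Python, and the
regular expression and loop should carry over to Perl fairly directly. The
function names parse_entry and group_sessions are made up for this example.
It assumes each entry has the shape [timestamp] (key=value, ...) "user agent",
and that argument values contain no commas, which may not hold for real
queries.

```python
import re
from datetime import datetime, timedelta

# Assumed entry layout: ... [timestamp] (key=value, key=value, ...) "user agent"
ENTRY_RE = re.compile(
    r'\[(?P<when>[^\]]+)\]\s*\((?P<args>.*)\)\s*"(?P<agent>[^"]*)"', re.S
)


def parse_entry(entry):
    """Return (timestamp, args dict) for one usage.txt entry, or None."""
    m = ENTRY_RE.search(entry)
    if m is None:
        return None
    # Timestamp format as in the sample entry: Mon Jul 11 08:34:25 +1200 2005
    when = datetime.strptime(m.group("when"), "%a %b %d %H:%M:%S %z %Y")
    args = {}
    # Assumption: values contain no commas, so a plain split is enough.
    for pair in m.group("args").split(","):
        key, sep, value = pair.strip().partition("=")
        if sep and key:
            args[key] = value
    return when, args


def group_sessions(hits, gap=timedelta(minutes=30)):
    """Group hits by their z argument; a new session starts whenever two
    consecutive hits of the same z are not less than `gap` apart."""
    by_user = {}
    valid = sorted((h for h in hits if h is not None), key=lambda h: h[0])
    for when, args in valid:
        z = args.get("z", "")
        if not z:
            continue  # no cookie, so no user id to build a session from
        user_sessions = by_user.setdefault(z, [])
        if user_sessions and when - user_sessions[-1][-1][0] < gap:
            user_sessions[-1].append((when, args))
        else:
            user_sessions.append([(when, args)])
    return by_user
```

Calling group_sessions on the parsed entries of a whole usage.txt gives, for
each z value, a list of sessions, each of which is a time-ordered list of hits.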
>You can work out what the user is doing by looking at a few basic arguments:
>a: action (p=page, q=search, d=document or classifier)
>p: if page action, what type of page (home page, about page etc)
>d: document id
>q: query string
>etc. If you go to the admin page of your Greenstone installation
>(.../library?a=status&p=frameset) and click on the arguments link, you
>can see a list of all the arguments and their names (which may or may
>not be helpful).
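
As a small illustration of reading those basic arguments, the sketch below
(Python again, not the original Perl) turns the argument dictionary of one
hit into a short description. The name describe_hit is made up, and treating
an empty d with a=d as a classifier visit identified by cl is an assumption
based on the sample entry above (a=d, d=, cl=CL2).

```python
def describe_hit(args):
    """Summarise one hit from its a, p, d and q arguments."""
    action = args.get("a", "")
    if action == "p":
        return "page: " + (args.get("p") or "unknown")
    if action == "q":
        return "search: " + args.get("q", "")
    if action == "d":
        if args.get("d"):
            return "document: " + args["d"]
        # Assumption: no document id means a classifier is being browsed.
        return "classifier: " + args.get("cl", "")
    return "other: " + action
```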
>I don't think you can track unsuccessful search events this way, unless
>the definition of unsuccessful is that the user never accessed any
>documents following the search.
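
Under that definition, one possible heuristic (a sketch in Python, not
anything taken from the niupepa scripts) is to walk through a single session
in time order and report every query that was not followed by a document
access before the next different query or the end of the session. The name
unsuccessful_searches is made up, and counting only a=d hits with a non-empty
d as document accesses is an assumption.

```python
def unsuccessful_searches(session):
    """session: argument dicts of one session, in time order.
    Returns the queries that were never followed by a document access."""
    failed = []
    pending = None  # latest query that has not yet led to a document
    for args in session:
        action = args.get("a")
        if action == "q":
            query = args.get("q", "")
            # A repeated identical query (e.g. paging through results)
            # is treated as the same search, not a new one.
            if pending is not None and pending != query:
                failed.append(pending)
            pending = query
        elif action == "d" and args.get("d"):
            pending = None  # a document was opened after the search
    if pending is not None:
        failed.append(pending)
    return failed
```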
>Hope this helps,
