IRC log for #koha, 2006-07-02

All times shown according to UTC.

Time S Nick Message
12:35 slef z39.50 over port 80?
12:35 slef I thought it was usually over UDP and ports like 7090
12:52 kados slef: every z3950 server uses a different port it seems
12:52 kados slef: 210 and 9999 are common ports
12:58 slef aye, but 80?
12:58 kados you could run it on 80
12:59 kados I've not seen one on 80 before
12:59 slef I doubt this troublesome firewall allows 80/udp anyway
12:59 slef 22/tcp was slow enough to get
12:59 kados lots of firewalls block udp altogether :-)
12:59 slef deep joy
12:59 kados :-)
13:00 kados you could do it through a proxy
13:00 slef best get a working koha HEAD install first ;-D
13:00 slef does z3950-daemon support proxies?
13:00 kados it might
13:00 kados if not, it could very easily
13:01 kados bbiab
13:57 thd slef: disable all firewalls for testing :)
13:57 thd kados: are you back from lunch yet?
14:00 thd you would think that I was running MS Widows :)
14:18 thd kados?
14:18 dewey i think kados is becoming a true Perl Monger...
14:18 thd dewey: where is kados?
14:18 dewey rumour has it kados is becoming a true Perl Monger
14:19 thd dewey: what else does rumour have
14:19 dewey thd: wish i knew
15:02 thd kados: are you back from lunch
15:02 thd ?
15:39 thd dewey: seen kados?
15:39 dewey kados was last seen on #koha 2 hours, 38 minutes and 32 seconds ago, saying: bbiab [Sat Jul  1 06:01:02 2006]
15:40 thd dewey: how long is ab?
15:40 dewey no idea, thd
15:40 thd :0
17:52 thd kados?
17:52 dewey kados is becoming a true Perl Monger...
17:53 thd dewey: define monger
17:53 dewey thd: huh?
17:54 thd dewey: define true?
17:54 dewey i don't know, thd
17:59 thd dewey: there is a rumour that kados will be bbiab
17:59 dewey ...but there is a minor diff in <div>s, that I missed...
18:00 thd dewey: what diff was that?
18:00 dewey thd: i haven't a clue
18:00 thd :)
19:49 thd dewey?
19:49 dewey yes, thd?
19:49 thd dewey: seen kados?
19:49 dewey kados was last seen on #koha 6 hours, 48 minutes and 27 seconds ago, saying: bbiab [Sat Jul  1 06:01:02 2006]
19:50 thd dewey: how long is ab now?
19:50 dewey bugger all, i dunno, thd
19:50 thd dewey: ab is at least 6 hours and 49 minutes
19:50 dewey OK, thd.
19:51 thd dewey: ab?
19:51 dewey i guess ab is at least 6 hours and 49 minutes
22:03 kados dewey: seen kados?
22:03 dewey kados was last seen on #koha 0 seconds ago, saying: dewey: seen kados? [Sat Jul  1 15:03:30 2006]
22:03 kados hehe
22:03 kados ab seems to be more hours than I expected
22:03 kados thd: you around?
22:04 kados thd: I've been playing around with faceted searching
22:04 kados thd:
22:04 thd yes kados
22:04 kados thd: mainly with subjects, though I could easily add authors, etc.
22:04 kados thd: what I could use some advice on
22:04 kados thd: is how to construct 'groups' of subjects in the faceted result set
22:05 kados thd: (I assume you know what I mean by faceted )
22:05 kados thd: (it's the list of subjects on the left-hand side of the screen after a search)
22:06 thd kados: are you proffering that as a definition for faceted subjects?
22:06 kados hehe
22:06 kados yes
22:06 kados it's not as faceted as I'd like
22:06 kados I think I'd like to have groupings with expandable lists
22:06 kados so group all 'Fiction' into a 'folder'
22:07 kados so you can expand it and reveal
22:07 thd kados: I would like to see a proper definition of faceted one day
22:07 kados the sub subjects
22:07 kados thd: what do you think?
22:07 kados thd: is the demo interesting at least?
22:07 kados :-)
22:07 thd kados: Even The subject approach to information does not seem to have a definition
22:08 kados I think I could group things into folders based on what is in 650$x
22:08 kados ie, 650$x is the folder
22:08 kados the other subfields go into that folder
22:08 thd kados: where is the faceted search?
22:08 kados maybe
22:09 kados do any search
22:09 kados you'll see the subjects on the left-hand side
22:09 kados of the results screen
22:09 thd kados: do I merely search for a subject heading?
22:09 kados just search for anything
22:09 kados search for 'harry potter'
22:10 kados thd: do you see what I've done?
22:12 thd very nice
22:12 thd kados: what happens with subject subdivisions?
22:12 kados everything is mashed together
22:12 kados in the order they appear in the record
22:12 kados if you can think of a more clever way to do it
22:12 kados I'm all ears :-)
22:13 thd kados: that is not faceted, not that LCSH is really faceted
22:13 kados it seems like sometimes you could group things by 650$a
22:14 kados for instance
22:14 kados a keyword search on harry potter
22:14 kados returns the following:
22:14 kados # Wizards Fiction. (7)
22:14 kados # Schools Fiction. (7)
22:14 kados # Magic Fiction. (5)
22:14 kados in the faceted results
22:14 kados it would look better as:
22:14 thd kados: if LCSH were faceted, there would be separate access to each subdivision
22:14 kados Fiction
22:14 kados  -Wizards
22:14 kados  -Schools
22:14 kados  -Magic
22:14 kados right?
22:15 thd kados: you have a faceted result set not faceted subjects :)
22:15 kados right :-)
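
The regrouping kados describes above can be sketched roughly as follows. This is only an illustration in Python, not the code behind the demo: the (heading, subdivision, count) tuple layout and the function names are assumptions, and the sample data is just the three 'harry potter' facets quoted in the log.

```python
from collections import defaultdict

# Hypothetical input: (heading, subdivision, hit count) triples taken from the
# 'harry potter' example in the log. The tuple layout is an assumption.
facets = [
    ("Wizards", "Fiction", 7),
    ("Schools", "Fiction", 7),
    ("Magic", "Fiction", 5),
]


def group_into_folders(facets):
    """Group headings under their subdivision, which acts as the 'folder'."""
    folders = defaultdict(list)
    for heading, subdivision, count in facets:
        folders[subdivision].append((heading, count))
    return dict(folders)


def render(folders):
    """Return the folders as indented plain-text lines."""
    lines = []
    for folder, entries in folders.items():
        lines.append(folder)
        for heading, count in entries:
            lines.append(f" -{heading} ({count})")
    return lines


print("\n".join(render(group_into_folders(facets))))
```

Keeping the hit counts next to each entry means the folder view loses nothing that the flat list showed.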
22:16 thd kados: yes that is in a paper which I just copied a few days ago and have not read yet
22:16 kados is there a consistent way we could group the subjects?
22:16 kados I need a 'simple' formula
22:17 kados if you could explain it to me, I could finish this up tonight
22:17 kados :-)
22:20 thd kados: yes but we should read Gregory Wool. Filing and precoordination : how subject headings are displayed in online catalogs and why it matters. 2000. In Cataloging & classification quarterly. v.29. no. 1/2. 2000.
22:20 kados is it available online?
22:20 kados google has it :-)
22:21 thd kados: I would have an electronic copy I could send you but the publisher was flooded along with the rest of Binghamton, NY and there is no access to the server.
22:23 thd kados: do really find the text in Google?
22:23 kados yes, but the pdf hasn't downloaded yet
22:23 thd s/do/do you/
22:23 kados so maybe the publisher has blocked access :(
22:23 kados and there's no google cache unfortunately
22:24 kados there is this:
22:24 thd kados: no, their offices have been closed for the past few days because the whole of Binghamton flooded in heavy rains a few days ago.
22:25 kados ahh
22:26 kados bummer
22:26 kados (which runs zebra btw)
22:27 kados well ...
22:27 thd kados: you will only find a page that says pay your money here for access
22:28 kados are subjects in marc21 an arbitrary hierarchy?
22:28 kados ie, potentially infinitely deep?
22:28 thd kados: I went to a library that has electronic access a day or so too late.
22:29 kados maybe I can find a review of it
22:29 thd kados: most people do not even know that LCSH is hierarchical
22:30 thd kados: the depth is arbitrary but the strictly LCSH hierarchy is somewhat shallow
22:31 thd kados: there is a hidden hierarchy but that is a secret until next year
22:31 kados here is an abstract of that article:
22:31 kados Library of Congress Subject Headings retrieved as the results of a search in an online catalog are likely to be filed in straight alphabetical, word-by-word order, ignoring the semantic structures of these headings and scattering headings of a similar type. This practice makes LC headings unnecessarily difficult to use and negates much of their indexing power. Enthusiasm for filing simplicity and postcoordinate indexing are likely contributing factors to this phenome
22:32 kados thd: you can't share with me? :-)
22:32 thd kados: you cannot determine any of the hierarchies to display without LCSH authority records
22:32 kados ahh
22:32 kados that's a bummer
22:32 kados so NPL doesn't have authorities
22:33 kados I suppose that means no hierarchy in the faceted results for them :(
22:34 thd kados: the authority records have designations for linking to broader, narrower, and parallel terms
22:35 thd kados: I think what you mean for display of the subject headings in the bibliographic record is likely to be different
22:37 thd kados: the use of $x $z $y $v is subsidiary to $a within the individual heading.
22:37 kados so I could at the very least, group by $a for libraries with no authorities?
22:38 thd kados: that is the only hierarchy that most people understand in LCSH
22:38 thd kados: yes, and it may be the only meaningful thing for the record itself
22:39 kados sometimes, 'Fiction' is both a $a and a $v
22:39 kados in different 650s
22:39 kados[…]
22:39 kados for instance
22:40 thd kados: $x could of course be a $a in a different subject heading
22:41 thd kados: there is a mistake in that record
22:42 kados heh
22:42 kados what's that?
22:42 thd kados: there is a use of the form subdivision as a $x instead of $v as in other instances
22:43 kados $x and $v should not be used together?
22:44 thd $a Wizards $x Fiction. should be $a Wizards $v Fiction.
22:44 kados writing an ILS is like writing a web browser
22:44 kados you want to support the standard, but you also have to support what's out there in the real world :-)
22:46 thd kados: MARC 21 is a standard with vestigial organs from its evolutionary origin
22:46 kados thd: can all the subfields in a 650 field be expressed as a sentence?
22:46 kados thd: and can you tell me what that sentence is? :-)
22:47 kados ie ...
22:47 thd that is only a theory that does not work in practise.
22:47 kados this record is $x of $a
22:48 kados this record is about $x in $a
22:48 kados or something
22:48 dewey something is not recognising the value actually in the leader.
22:48 kados dewey: no something is something
22:48 dewey OK, kados.
22:48 kados thd: what is the theory?
22:49 thd kados: there is a theory that subject headings should be readable in natural language as if they were a sentence
22:51 thd kados: that breaks immediately with postcoordinate word order to have the important word appear first.
22:54 thd kados: maybe the post coordinate word order has been eliminated
22:54 kados well ...
22:54 kados the bottom line is
22:54 thd kados: there was an effort to do that
22:54 kados is there a way to group the subjects?
22:54 kados as they exist in the bib record?
22:55 kados if so, how can it be done?
22:55 kados should it always be:
22:56 kados 650
22:56 kados   $a
22:56 kados      $x
22:56 kados      $y
22:56 kados etc
22:56 kados ?
22:56 thd kados: make a tree with branches representing $a with $z $x $y and $v as twigs. yes
22:56 kados ok
22:56 kados that I can do quite easily
22:57 kados thd: for all 6XX fields? or only 650?
22:57 thd kados: then have the twigs able to act as subordinate links
22:57 kados what's a subordinate link?
22:58 kados do you mean it searches on $a as well as $y when you click on $y?
22:59 thd kados: for all 649 - 651 at least.
23:01 thd kados: well actually for all cases of $z $x $y $v
23:02 kados ok
23:05 thd kados: you could have checkboxes to search by removing any of $z $x $y $v
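
A minimal sketch of the tree thd suggests, with $a as the branch and $v $x $y $z as twigs. It is written in Python for illustration only; the representation of a 6XX field as a list of (code, value) pairs and the name build_subject_tree are assumptions, not anything from Koha or Zebra.

```python
from collections import defaultdict

# Subdivision subfields named in the log: form, general, chronological, geographic.
SUBDIVISION_CODES = {"v", "x", "y", "z"}


def build_subject_tree(fields):
    """Map each $a value to the sorted (code, value) subdivisions seen under it.

    `fields` is a hypothetical representation of 6XX fields: each field is a
    list of (subfield code, value) pairs in record order.
    """
    tree = defaultdict(set)
    for subfields in fields:
        main = next((value for code, value in subfields if code == "a"), None)
        if main is None:
            continue  # no $a, so there is no branch to hang twigs on
        twigs = tree[main]  # touch the key so bare headings still get a branch
        for code, value in subfields:
            if code in SUBDIVISION_CODES:
                twigs.add((code, value))
    return {main: sorted(twigs) for main, twigs in tree.items()}


fields = [
    [("a", "Wizards"), ("v", "Fiction")],
    [("a", "Wizards"), ("x", "Fiction")],  # the miscoded form subdivision noted in the log
    [("a", "Magic"), ("v", "Fiction")],
    [("a", "Schools")],
]
tree = build_subject_tree(fields)
for main, twigs in tree.items():
    print(main, twigs)
```

Keeping the subfield code with each twig leaves real-world miscodings such as $x Fiction visible instead of silently merging them with $v Fiction.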
23:15 thd kados: now that I have looked at most of the substance of the article that I had cited it appears to be more a complaint about the problems than a recommendation for what should be done.
23:16 thd kados: there is reference to some research work which is only partly described
23:19 thd kados: Mia Massicotte. Improved browsable displays for online subject access. 1986. In Information technology and libraries. 7:373-80, 1986.
23:44 thd kados:[…]ructuresFinal.doc
23:47 thd kados: The link above is to ALCTS Cataloging and Classification Section Subject Analysis Committee (SAC). Recommendations for providing access to, display of, navigation within and among, and modifications of existing practice regarding subject reference structures in automated systems.  December 1, 2003.
23:48 thd kados: that is what you want
23:48 kados ok, thanks
23:48 thd kados: diagrams and everything
23:49 kados nice
23:50 thd kados: IFLA will never make such nice recommendations for online catalogues officially
23:51 kados heh
00:10 thd kados: Unfortunately, there is nothing especially innovative in those recommendations even if they go beyond what IFLA would do for subjects presently.
00:11 thd kados: most of the recommendations depend upon structure in authority records.
00:14 thd kados: Improved browsable displays for online subject access seems much more promising for what can be done for sorting by geographic, topical, chronological, or form subdivisions.
00:14 kados yes, I would like to experiment with that
00:15 kados even with rough data like NPL's there should be some interesting ways to organize the data on the results page
00:15 kados but now I must get some sleep
00:16 thd kados: unfortunately that article is probably in off site storage on very dirty microfilm at the New York Public library
00:17 thd kados: are you going to be around tomorrow?  I had not asked you about what tumer and I may have discovered
00:17 kados yes, I will be programming all day
00:17 thd ?
00:17 kados working mainly on zebra stuff
00:18 kados if you manage to write up a brief description via email I will read it first thing in the morning while I am fresh
00:18 thd kados: it would not take more than a few minutes to confirm what we saw
00:18 kados ok ... we'll do it tomorrow (or rather, later today :-))
00:19 thd good night kados
00:19 kados night
09:33 kados owen: morning
09:34 kados owen: last night I played around with 'faceted results'
09:34 kados owen: do a search on zoomopac to see what I mean
09:34 kados owen: the possibilities are limitless ... we could create facets, subfacets, etc. in search results
09:52 owen kados: that's really neat
09:53 owen How does it work?
10:14 kados owen: it basically just nabs subjects from the result set
10:14 kados owen: and counts how many of each one there is
10:14 kados owen: then orders it by the highest to lowest
10:14 kados owen: it's really rudimentary, we could do lots more with it
10:15 kados owen: getting the interface to look nice might be a challenge
10:15 kados owen: but that's your department :-)
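
The counting kados outlines (nab the subjects, count each one, order from highest to lowest) might look something like this. It is a Python sketch under stated assumptions, not the Perl in the demo: the record layout (a dict with a "subjects" list), the function name, the once-per-record counting and the alphabetical tie-break are all made up for illustration.

```python
from collections import Counter


def count_subject_facets(page_of_records):
    """Count subject headings across one page of results, highest count first.

    Each record is assumed to be a dict with a "subjects" list of heading
    strings; that layout is hypothetical.
    """
    counts = Counter()
    for record in page_of_records:
        # Assumption: count a heading once per record, even if it repeats.
        counts.update(set(record["subjects"]))
    # Highest count first; ties broken alphabetically so the order is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


page = [
    {"subjects": ["Wizards Fiction.", "Schools Fiction."]},
    {"subjects": ["Wizards Fiction.", "Magic Fiction."]},
    {"subjects": ["Wizards Fiction."]},
]
for heading, count in count_subject_facets(page):
    print(f"# {heading} ({count})")
```

Because only the records on the current page are scanned, the work is bounded by the page size, not by the size of the whole result set.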
10:26 owen kados: that sounds resource-intensive. Does it perform well?
10:32 kados very
10:32 kados there are two ways to do it
10:32 kados the first is in perl, you basically just nab the subjects as they are returned from zebra
10:33 kados the second, which I spoke at great length with ID about, would include modifications to Zebra to enable the indexes to retrieve data from the entire result set
10:33 kados (right now mine only retrieves it from the current page)
10:33 owen From the current page of search results?
10:33 kados yep
10:33 kados it's not ideal, but it works fairly well
10:33 kados and is fast :-)
10:34 kados once we get zebra stabilized for production systems
10:34 kados there are three things I'm planning to sponsor/develop for it:
10:34 kados 1. faceted search results
10:34 kados 2. phonetic indexes
10:34 kados 3. stem indexes
10:35 kados in the meantime, I can fake all of the above in perl
10:35 kados but eventually, we'll want all of that stuff handled by the search engine in the background
10:35 owen A couple of ideas, off the top of my head (without knowing if they're possible, of course): Initially display only results with more than one hit, with the option to expand, and sort the 1-hit results alphabetically.
10:36 kados yea, that would definitely be possible
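
One way owen's suggestion could be sketched: split the facets into those with more than one hit (shown first, by count) and the 1-hit ones (alphabetical, to be revealed on expansion). This is Python for illustration only; split_facets is a hypothetical name, and the two 1-hit headings in the sample data are invented.

```python
def split_facets(facet_counts):
    """Split a {heading: count} mapping into (multi-hit, single-hit) lists.

    Multi-hit facets are ordered by count, highest first (ties alphabetical);
    single-hit facets are sorted alphabetically.
    """
    multi = sorted(
        ((heading, count) for heading, count in facet_counts.items() if count > 1),
        key=lambda item: (-item[1], item[0]),
    )
    single = sorted(heading for heading, count in facet_counts.items() if count == 1)
    return multi, single


facet_counts = {
    "Wizards Fiction.": 7,
    "Schools Fiction.": 7,
    "Magic Fiction.": 5,
    "Owls.": 1,              # invented 1-hit heading
    "England Fiction.": 1,   # invented 1-hit heading
}
multi, single = split_facets(facet_counts)
print(multi)
print(single)
```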
10:36 kados what I really need
10:36 kados is some good ideas for how the interface should look
10:36 kados because I can imagine having several of these faceted results
10:36 kados one for
10:36 kados well ...
10:36 kados subject terms
10:37 kados places
10:37 kados people (about people)
10:37 kados then ...
10:37 kados authors
10:37 kados series titles
10:37 kados popularity
10:37 kados it's really simple to build these
10:37 kados but displaying them is a challenge for me :-)
10:37 kados (notice I removed your navbar :-))
10:38 kados
10:38 kados they've got something similar
10:38 kados and I think I like their display
10:38 kados so that would be one way to approach it
10:38 kados monologue
10:40 kados owen: something else to keep in mind
10:40 kados owen: just like the gapines opac, we could make each of the subjects expandable
10:40 kados owen: ie, group them into categories
