IRC log for #koha, 2006-06-11

All times shown according to UTC.

Time S Nick Message
12:01 thd owen: so the search that kados has looks like he has modelled it on what is essentially a single word spelling check against an index scan for OCLC.  That search has a JavaScript button named javascript:scanindexadv\d .
12:09 owen Yeah, I don't like the model of choosing from a keyword list, populating your search form, and then searching. I think that's too convoluted for patrons.
12:10 thd owen: that is for the advanced patrons :)
12:10 owen Sure... but I'd still like to have a browse search for the regular patrons
12:11 thd owen: how would you imagine such a browse search for regular patrons functioning
12:11 thd ?
12:13 kados owen-away: what you want is coming, and being sponsored by SMFPL
12:13 kados but there is something else, called 'scan'
12:13 kados it's not the same as authorities searching (what owen wants) or browsing (what thd wants)
12:13 thd kados: what does owen_+away want?
12:14 kados he wants a title or subject search to automatically be a title or authorities search
12:14 kados s/or authorities/or subject authorities/
12:15 thd kados: so that the default searches against the authority file if there is no exact match?
12:15 kados right
12:15 kados you can see this in action at the Ohio University catalog
12:15 kados
12:17 thd kados: I want that too but paul thinks that you should not trick the user into getting more than he expects so there needs to be an option to change such a default.
12:17 kados of course
12:17 kados quite simple to turn it on/off
12:18 kados thd: how do you like the rest of the zoomopac site?
12:18 thd kados: the OCLC index scan search would be much more useful if it processed more than just the first word in the query even if it processed them independently
12:18 kados thd: (set aside our discussion of boolean options since we haven't reached any conclusion on how to display them yet :-))
12:18 kados thd: it does if you do a phrase search
12:19 kados thd: do a 'title phrase' on 'harry potter'
12:21 thd kados: how do I do a phrase search in the advanced search, with double quotes?
12:21 kados I've also been looking at google's advanced search page:
12:21 kados
12:21 thd kados: Google is especially terrible at fielded searching.  Too bad it is everyone's first model now.
12:21 kados thd: are you on worldcat in the 'browse index' search?
12:22 thd kados: no I was at liblime :)
12:22 kados ahh ...
12:22 kados no, I haven't done that yet
12:22 kados to see how it would work take a look at oclc
12:22 kados the 'browse index' search
12:23 kados (get to it by clicking on the icon next to each boolean row on the far-right)
12:23 kados scanindexadv(1) is the javascript command
12:24 thd kados: I actually see noting significantly different with phrase quoting than I had before without it.
12:24 thd s/noting/nothing/
12:25 kados at liblime you won't
12:25 kados neither at OCLC
12:25 kados you have to select 'phrase title'
12:25 kados it's last on the list
12:26 thd kados: oh yes I forgot that they treat it as a separate index :)
12:26 kados zebra does too
12:26 kados zebra has a word index and a phrase index
12:27 thd kados: yet you do not expect to select them separately from the field use drop down.
12:28 thd kados: field is a distinct concept from word or phrase.
12:29 kados yep
12:29 kados in zebra you can specify whether you want the scan to only find complete fields
12:30 kados which would be useful as a rudimentary authorities system of sorts
12:30 kados at least for titles perhaps
12:31 thd kados: complete fields means complete fields does it not such that if you missed some extra date subfield for a particular record in your search words you would miss the record
12:32 kados correct
12:32 kados here's an example
12:33 kados scan @attr 6=3 "harry potter"
12:33 kados * harry potter (2)
12:33 kados  harry potter and the chamber of secrets (11)
12:33 kados  harry potter and the goblet of fire (8)
12:33 kados  harry potter and the half blood prince (4)
12:33 kados  harry potter and the order of the phoenix (5)
12:33 kados  harry potter and the philosopher s stone (1)
12:33 kados etc.
12:34 thd kados: 245 is not an authority controlled field in any case
12:34 kados ok ...
12:35 kados I realize that :-)
12:35 kados here's an authorities-relevant scan:
12:35 kados scan @attr 6=3 @attr 1=21 "cat"
12:35 kados * cat (1)
12:35 kados  cat breeds (7)
12:35 kados  cat family mammals (3)
12:35 kados  cat in the hat fictitious character (6)
12:35 kados  cat owners (20)
12:35 kados  catalog (1)
12:35 kados  cataloging (1)
12:35 kados  cataloging of children s literature (3)
12:35 kados  cataloging of compu
12:35 thd kados: which is good that there is an alternative technique because an actual authorities search will not provide much for title
12:35 kados etc
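The scan output above can be pictured as a walk through a sorted phrase index, starting at the position where the scan term would sort. What follows is a minimal sketch under stated assumptions, not Zebra's API: `PHRASE_INDEX` and `scan` are hypothetical names, and the entries and counts are copied from the paste above.

```python
import bisect

# Hypothetical in-memory phrase index: (normalized phrase, record count),
# kept sorted by phrase. Entries are copied from the scan output above.
PHRASE_INDEX = [
    ("cat", 1),
    ("cat breeds", 7),
    ("cat family mammals", 3),
    ("cat in the hat fictitious character", 6),
    ("cat owners", 20),
    ("catalog", 1),
    ("cataloging", 1),
    ("cataloging of children s literature", 3),
]


def scan(index, term, count=5):
    """Return up to `count` index entries starting where `term` would sort."""
    keys = [phrase for phrase, _ in index]
    start = bisect.bisect_left(keys, term.lower())
    return index[start:start + count]


for phrase, hits in scan(PHRASE_INDEX, "cat b", count=3):
    print(f"{phrase} ({hits})")
```

A scan never "fails" the way a search can: a term that is not in the index still lands between two neighbours, which is why it can put a patron's phrase in context with nearby entries.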
12:36 thd kados: subjects are authority controlled
12:37 thd kados: where is the advanced search at ?
12:38 kados dunno
12:38 kados it's a black box to me :-)
12:40 kados I think the default for a scan should be 'phrase', at least on the OPAC
12:40 kados and by default it should return full fields, not just phrases
12:40 thd kados: It looks like they were afraid of confusing the user and providing more than one field in a search form or perhaps the system does not have the ability to manage more than one index
12:41 kados right
12:41 kados owen-away: I'm gonna edit the template quickly
12:42 thd the post search browse function works like the one in Voyager and many other systems
12:44 kados thd: on the liblime power search, you can now select 'Scan Indexes'
12:45 kados thd: the full bib1 attribute set should be available as well
12:46 kados hmmm ...
12:46 kados not sure it's working perfectly yet
12:46 kados ahh, yes it is
12:47 kados obviously authorities are better, but this might do the trick
12:48 kados for libraries who don't want to buy authorities, etc.
12:48 thd kados: how do you change the size or rather I mean height of selection boxes?
12:48 kados thd: you'd have to ask owen
12:48 kados thd: it's in CSS I think
12:49 kados owen-away: IMO 'Options' should be changed to 'As:'
12:49 thd kados: I always found selection box style to be outside my control but obviously I missed something.
12:50 kados I'd like to see more clearly defined columns in the power search
12:50 kados so that all the 'search point' selection boxes lined up
12:50 kados and all the 'Options' or 'As' lined up too
12:51 thd kados: have you looked at my Z39.50 client again?
12:51 kados thd: what's the url again?
12:54 thd kados: I was adding some options that are now hard coded but I kept falling asleep while sitting at the keyboard
12:56 thd kados: I spent a day and a half rethinking term set grouping as some hybrid of selection box parentheses and group marking but that is a very difficult problem.
12:58 thd kados: my conclusion two years ago was that term set marking without visual parenthesis was difficult to track in the mind with very deep nesting.
13:00 thd kados: I suspect that there is no term set grouping solution that satisfies my flexibility and speed of expert use goals without refreshing the page so that part of it can be rewritten.
13:03 thd kados: do you have a Z39.50 target I can add that supports numeric comparison reliably?
13:03 kados zebra should :-)
13:04 kados it's not a live target though
13:04 kados I'll need to enable it, maybe this weekend I'll have time
13:04 thd kados: you only have a dead target?? :)
13:05 kados hehe
13:05 thd kados: did you see the current form of the interface?
13:05 kados yes
13:05 kados I think the group thing is one way to do it
13:05 kados I need to think about it a bit too
13:06 thd I tried to make explicit the relationship of relation and other options to the term and the field.
13:08 thd after placing that backwards in the same way that you had originally
13:08 thd kados: do you have a little time to consider what I meant by multi-MARC Koha?
13:09 kados not at the moment
13:09 kados right now I'm:
13:09 kados thinking about browsing
13:10 thd kados: real browsing?
13:10 kados sorry ... no, scanning :-)
13:10 kados we'll get to browsing
13:13 thd kados: I do not remember if I said properly but I had thought that multiple words should have an option for independent evaluation without phrase browsing so that a phrase search is not needed.
13:14 thd kados: phrase searches are unreliable and very prone to user error by the cataloguer or the patron
13:14 thd kados: unfortunately there are a few important targets that do not support word list searching properly
13:17 thd kados: word list searches need to be rewritten as word searches for those problem targets
13:17 kados thd: give the zebra one a shot on the liblime site
13:17 thd s/word searches/separate word searches/
13:18 kados check the difference between word and word list
13:26 owen <thd>kados: what does owen_+away want?
13:26 owen <kados>he wants a title or subject search to automatically be a title or authorities search
13:26 owen Except that it doesn't have to be an Authorities search with a capital A.
13:27 owen I just want the result to be displayed to the patron in context with other nearby results
13:27 owen I don't need an authority record to tell me what title in our index comes before and after the phrase 'Harry Potter.'
13:29 kados that's true
13:30 kados owen: so try the new Power Search checkbox with the 'complete field' Truncation attribute
13:30 thd owen: yes , some systems such as Voyager have this standard feature and the issue is really independent of authorities because one could still misspell something and miss every authorised form as well as all tracings and cross references.
13:30 kados owen: and see if the results are meaningful to you
13:31 kados owen: sorry ... it's the 'Completeness' attribute
13:31 kados owen: ('complete subfield' and 'complete field' are identical in Zebra, so it doesn't matter which one you select)
13:32 thd kados: Proper use of that option is supposed to cause the match to fail completely if the match is imprecise if I understand Z39.50 correctly.
13:32 owen so that search requires that I get the full title exactly right?
13:32 thd owen: that is my understanding
13:32 kados owen: you just have to get what you type right
13:32 kados owen: try it out
13:33 thd kados: that would be incomplete field
13:33 owen Doesn't seem very useful.
13:33 owen In what situation would you want that kind of search?
13:33 kados in power search
13:33 kados select 'Subject'
13:33 kados type in 'cat'
13:33 kados then select 'Completeness' and choose 'Complete Field'
13:34 kados then check the box
13:34 kados and do the search
13:34 owen I get one result :)
13:34 kados ?
13:35 kados you sure you don't have any other limits set on the power search?
13:35 kados only option that should be set is the completeness one
13:35 thd s/incomplete field/incomplete subfield, there is no incomplete field option./
13:35 kados i get 10 results
13:35 owen I get 10 results for a keyword search, but 1 for a subject search
13:36 kados I get 10 for subject
13:37 kados thd: it's not specified in bib1, is it?
13:37 owen I did a shift-reload to make sure all the form fields were reset
13:37 kados me too
13:37 kados I still get 10 results :-)
13:38 thd kados: What is not specified in Bib-1?
13:38 kados […]&query4=&op5=%40a
13:38 kados thd: there is no 'incomplete field' in bib1
13:38 owen kados: that link takes me to 1 result!
13:38 kados thd:[…]html#completeness
13:38 kados owen: !!
13:39 kados so something else is going on
13:39 kados I get:
13:39 thd kados: yes, only incomplete subfield which is what one would usually want
13:39 kados cat (1)
13:39 kados cat breeds (7)
13:39 kados cat family mammals (3)
13:39 kados cat in the hat fictitious character (6)
13:39 kados cat owners (20)
13:39 kados catalog (1)
13:39 kados cataloging (1)
13:39 kados cataloging of children s literature (3)
13:39 kados cataloging of computer network resources (1)
13:39 kados catalogs (292)
13:40 owen hunh? I'm not getting a result set like that, I'm getting the usual search results list
13:41 thd kados: However, it is unclear to me what should happen for the case of searches where the term crosses subfield boundaries
13:41 kados owen: strange
13:43 kados thd: there is no handling of that currently IIUC
13:43 kados thd: it shouldn't be treated with a completeness attribute
13:44 thd kados: that is the completeness attribute option for incomplete field: no use of the completeness attribute.
13:45 thd kados: however, I know targets actually supply completeness as incomplete subfield by default.
13:47 kados thd: with the above link I posted, do you see a list of items on a blank page?
13:47 kados owen: this is really bugging me ... what is going on?
13:47 thd kados: look at how the National Library of Canada at least claims things work on AMICUS http://www.collectionscanada.c[…]06002-420-e.html7
13:49 thd kados: I have one cat title from that link but maybe the URL was truncated by the limits of this IRC service
13:49 kados that's what owen is getting too
13:50 owen I'm trying it again and again, but still just the one result for subject:cat, completeness:complete field
13:51 kados ok
13:51 thd kados: is your URL supposed to end in &query4=&op5=%40a ?
13:51 kados hmmm
13:52 thd kados: Is it over 471 characters?
13:52 kados no no, that doesn't make sense
13:52 kados ahh, it must be
13:53 thd kados: I had the same problem sending a URL to hdl some time ago
13:54 thd kados: months ago, I discovered that if I type too many sentences in a single post no one but me can see them
13:55 kados hehe
13:55 thd logbot has missed my best posts ;(
13:56 owen dewey, what are thd's best posts?
13:56 dewey owen: bugger all, i dunno
13:56 owen dewey: thd's best posts are the ones that exceed 471 characters
13:56 dewey OK, owen.
13:56 kados hehe
13:58 kados owen: i think i might know the prob
13:58 kados owen: if you change all the forms to 'post' rather than 'get' it might work
13:59 kados owen: I bet your browser can't send more than X number of gets
13:59 kados owen: or maybe the OS
13:59 kados owen: but post is supposed to be unlimited
13:59 thd kados: how do you copy the URL then?
13:59 thd get has no limits either
13:59 kados thd: my browser obviously doesn't have that limitation :-)
14:00 thd the problem is the limits for this IRC channel
14:00 kados ahh, maybe
14:00 kados bbiab
14:00 thd kados: this is not a browser issue the limit is that the IRC channel is truncating your post
14:01 thd kados: Am I mistaken?
14:01 thd s/post/IRC post/
14:02 owen "the amount of form data that can be handled by the get method is limited by the maximum length of the URL that the server and browser can process"
14:02 owen […]0/forms/form.html
14:02 kados yep
14:02 thd kados: some very old browsers had unreasonable limits on the length of get strings but none of us have those browsers anymore
14:06 thd kados owen: in the real world just about every very large site with complex get queries uses get strings much longer than 100 characters.
14:07 thd kados owen: the only thing that I would not put in a get request is an actual MARC record.
14:11 thd owen: how do I change the height of selection boxes?
14:14 owen size="" I think
14:15 thd owen: is that not for the width?  But i suppose it scales the whole thing.
14:15 owen for a <select> ?
14:16 thd owen: Is there a mans to change the style of the selection box text such as making it bold?
14:16 thd owen: yes I mean the select tag
14:17 owen The width is determined by the longest <option> unless specified in the stylesheet
14:17 thd s/mans/means/
14:17 owen You can also use styles to make the text bold
14:18 thd owen: these styles cannot be implemented without specification in the style sheet?
14:19 thd owen: i have usually modelled with direct code and then implemented the change on the style sheet.
14:19 owen I'm not sure how it works with <select>s you could try <select style="font-weight: bold">
14:20 owen But the size attribute is definitely part of the HTML.
14:28 thd owen: the size attribute for select specifies the size of the selection window for the number of options to present at one time so height must be a font element.
14:30 owen Height of the text in the selection box?
14:31 thd owen: yes height of the text and of the box itself just around that text
14:31 owen Then yes, use styles. <select style="font-size: 110%">
14:32 thd owen: in the case where the selection window is closed and showing only one selection option
14:34 thd owen: thanks, i never explored CSS options well enough historically because i avoided any options that did not function in the oldest of browsers.  Which was most CSS options.
14:42 kados owen: ok ...
14:42 kados owen: now ... try the checkbox
14:43 kados owen: cat as a subject using 'completeness' as 'complete subfield'
14:45 kados owen: and you can create a new way to display the 'scan' results
14:45 kados owen: you've got a new template param called 'scan' to work with
14:45 kados owen: everything comes in as a title
14:47 owen checkbox?
14:48 kados go to the power search
14:48 kados pick 'subject'
14:48 kados put 'cat' in the field
14:49 kados under 'completeness' pick complete field
14:49 owen Was that the problem before?!
14:49 kados maybe
14:49 kados I didn't change that at all
14:49 kados I just made the results show up in the context of normal search results just now
14:51 owen I think before I wasn't getting that I needed to check a checkbox
14:52 kados ahh, yea, that makes sense
14:52 kados how's it work?
14:52 owen I see 10 results!! :D
14:52 kados heh
14:52 kados does it work as you'd expect ?
14:52 thd kados: This provides no distinction between complete subfield or incomplete subfield.  Complete subfield matches do not necessarily even appear at the top and are certainly not clearly differentiated.
14:53 kados thd: what?
14:53 kados thd: what doesn't provide a distinction?
14:53 thd kados: what you are doing is supporting a function other than intended by the attribute type
14:55 thd kados: a search for cat in title as a complete subfield should only match where cat is the whole main, the whole subtitle, or the whole statement of responsibility
14:56 kados thd: not for a scan
14:56 kados thd: scan works differently than search
14:56 kados thd: apparently :-)
14:56 kados thd: because you're absolutely right when it comes to searching
14:56 kados thd: as is evidenced by not checking the box
14:57 thd kados: oh, i thought you had changed some configuration to achieve this that broke the intended functionality
14:58 kados no
14:58 kados it's how it worked by default
14:58 kados take it up with Index Data or check the Z3950 Spec about how Scan should work
14:58 thd kados: but I did not check any box in power search so you did change something for the default
14:58 kados um
14:59 kados thd: I don't understand
15:00 thd kados: I used power search which has no user option for scan or other search types.
15:02 kados thd: you need to shift-refresh your page
15:02 thd kados: oh
15:06 thd kados: I have the checkbox now and see what you had done but I still contend that either Zebra is misconfigured or this is not implemented as the standard intended
15:09 thd kados: what I would expect from the actual records that I can see and therefore know are in the database is that a title search on cat as a complete subfield should only match where the record contains cat and no other word in $a which is the only case matching a complete subfield in existing data.
15:09 thd kados: I do also understand and desire what Owen wants
15:10 kados thd: if you can point to where in the standard that is specified for the Scan type, i'd be interested
15:11 thd kados: scan merely returns the index reference rather than complete records
15:11 thd kados: I see the same problem whether the scan checkbox is checked or not
15:12 thd kados: the result set includes more than what it ought to for a complete subfield match
15:13 thd kados: that is fine if the application is running some additional less restrictive search to show more records
15:14 thd kados: my point is that the additional less restrictive search should be clearly distinguished in the result set from the search where only $a is matched completely
15:14 kados thd: no ...
15:15 thd kados: explain
15:15 kados thd: try a complete field search on 'it'
15:15 kados sorry ... 'complete subfield'
15:15 kados actually, they both are the same in zebra
15:15 kados so you are commenting on a limitation of zebra I think
15:16 kados if I understand correctly
15:16 thd kados: I had done and if they are the same something is not right
15:16 kados correct
15:16 kados so like all Z3950 servers, zebra implements a subset of the standard :-)
15:16 thd :)
15:17 thd kados: Z39.50 services are the primary business of Index Data.  I would have thought that they would have done this better.
15:18 thd kados: Not that it is not good but that it is not better.
15:19 thd kados: do you understand at least what I was expecting?
15:27 kados thd: of course :-)
15:27 kados thd: but ... I'm not convinced that you are right
15:27 kados thd: I'll need to re-read the scan service in the Z3950 spec to be sure
15:28 thd kados: scan does not change how the attributes function
15:29 kados thd: are you sure?
15:29 kados how would you indicate that you wanted to return entire subfields in an index scan?
15:29 kados according to the spec?
15:30 thd kados: I would not stake my life on it and I have not rechecked the documentation but I am fairly confident unless some attributes are not meant to be supported with scan at all.
15:31 thd kados: let me explain briefly the general sense I have of what I believe that owen and I are expecting.
15:32 thd kados: user conducts a search which either produces no matches or very few matches.
15:32 thd kados: the system notices, aha, no matches or very few matches
15:35 thd kados: the system application checks whether the user had specified a system preference or checked an option for do not run a secondary search automatically.
15:35 thd kados: no such option had been checked or selected.
15:36 kados thd: some users would expect this option when there were 'too many' results, not too few
15:36 kados thd: the assumption being that you didn't find what you were looking for :-)
15:37 thd kados: the system runs some other search or searches automatically and then presents the results as your original search and each different automatic search with the search specified
15:38 thd kados: yes I agree, the system could also convert a search into a phrase or beginning of field search etc. for fewer matches.
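The flow thd describes (run the user's search, notice too few hits, run other searches unless the user opted out, and label each result set) can be sketched as follows. This is an illustration under stated assumptions, not Koha code: `search_with_fallbacks`, the `few` threshold, the strategy functions and the toy `TITLES` data are all hypothetical.

```python
from typing import Callable, List, Tuple

# A search strategy maps a query string to a list of matching record ids.
Strategy = Callable[[str], List[int]]


def search_with_fallbacks(
    query: str,
    primary: Tuple[str, Strategy],
    fallbacks: List[Tuple[str, Strategy]],
    few: int = 2,
    auto_fallback: bool = True,
) -> List[Tuple[str, List[int]]]:
    """Run the primary search; if it yields `few` or fewer hits and the
    user has not disabled it, run each fallback as well. Every result set
    is returned with the label of the search that produced it, so the
    display can show explicitly what differed from the user's own search."""
    label, run = primary
    results = [(label, run(query))]
    if auto_fallback and len(results[0][1]) <= few:
        for fb_label, fb_run in fallbacks:
            results.append((fb_label, fb_run(query)))
    return results


# Toy data standing in for a catalogue (hypothetical).
TITLES = {
    1: "harry potter and the goblet of fire",
    2: "the cat in the hat",
    3: "cat breeds",
}


def exact_title(q: str) -> List[int]:
    return [i for i, t in TITLES.items() if t == q.lower()]


def any_word(q: str) -> List[int]:
    words = q.lower().split()
    return [i for i, t in TITLES.items() if any(w in t.split() for w in words)]


for label, ids in search_with_fallbacks(
    "cat", ("exact title", exact_title), [("any word in title", any_word)]
):
    print(label, ids)
```

Keeping the label next to each result set is the point thd stresses: the patron sees which results came from the search as typed and which came from an automatic variation. The same shape would serve kados's "too many results" case by comparing against an upper threshold and adding narrower strategies.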
15:39 kados thd: I'm expecting to release 2.3.0 early next week as a 'beta' Zebra version of Koha
15:39 kados thd: I'm hoping I can get libraries to comment on how the system is performing in terms of how the default searches are working
15:39 kados thd: doing what you suggest wouldn't take more than a day or so of coding
15:39 thd kados: the point is to have the search results show explicitly when something was different from the user search.
15:40 kados thd: we could call it 'smart searching'
15:40 kados thd: and it could be a checkbox
15:40 thd kados: if the completeness option does nothing then that should be indicated.
15:40 kados thd: completeness does do something!
15:41 thd s/nothing/nothing meaningfully related to its intended function/
15:43 thd kados: from what i can see, I would suspect that the Index Data completeness option has a bug in implementation unless you have misconfigured the server somehow.
15:45 thd kados: before you go too far down the path of inflexible numbered query rows let me send you some code so you can see how looping through an array works exactly to achieve the same function.
15:45 kados thd: i understand it and have already updated my code :-)
15:45 thd kados: I think that looping through an array will prove much more extensible.
15:48 thd kados: I see that your search form is still numbering the name attribute rows.
15:49 thd kados: I expected that you understood the concept but it does not appear to me that you have actually implemented it yet.
15:51 thd kados: Or at least not in a way that would allow the user to arbitrarily extend any field row type by an unknown number to whatever degree the user desires.
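The difference between numbered query rows and looping through an array can be sketched as below. This is only an illustration, assuming the form repeats the same three field names (`op`, `field`, `term`, all hypothetical) once per row instead of numbering them; it is not the actual Koha form handling.

```python
from urllib.parse import parse_qs


def query_rows(query_string):
    """Pair up repeated form fields into search rows, however many there are."""
    # keep_blank_values=True keeps empty inputs so the three lists stay aligned.
    params = parse_qs(query_string, keep_blank_values=True)
    rows = zip(params.get("op", []), params.get("field", []), params.get("term", []))
    return [
        {"op": op, "field": field, "term": term}
        for op, field, term in rows
        if term  # skip rows the user left empty
    ]


# Three rows submitted, the middle one left blank (hypothetical parameter names).
qs = (
    "op=and&field=title&term=harry+potter"
    "&op=and&field=subject&term="
    "&op=not&field=author&term=rowling"
)
for row in query_rows(qs):
    print(row)
```

Because nothing in the handler depends on a row number, the form can add as many rows as the user wants without any change on the server side.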
15:55 kados thd: it's not fully implemented in the template
15:55 kados thd: maybe this weekend :-)
15:56 thd :)
15:57 thd kados: this weekend would be starting now would it not :)
15:57 kados hehe
15:57 kados I'm working on the scan currently
15:58 thd kados: I want to get your sense of what I mean by multi-MARC Koha.  When would that be possible.
15:58 thd ?
15:58 kados now's not bad
15:58 kados go ahead ...
15:59 thd kados: OK what I had asked paul about from many months ago was the extent of his research into mapping one MARC system onto another.
16:01 thd kados: That had been an element of my original thinking for mere efficiency of taking advantage of some record content that only existed in one flavour.
16:01 thd kados: He never presumed that such a thing was desirable in the original design of MARC Koha.
16:03 thd kados: Zebra would seem to give a much better opportunity to have the system making use of records in multiple flavours.
16:05 thd kados: Very big libraries which acquire material from many languages have this problem and I of course want too search multiple libraries across flavours.
16:05 thd s/too/to/
16:09 kados right
16:09 thd kados: the usual way big libraries with multilingual acquisitions do this is to either expect their vendor to supply records, or some national libraries such as DDB have conversion subscriptions for supplying MARC 21 converted records over Z39.50 or FTP.
16:10 kados thd: there is nothing stopping us from retrieving other flavors of MARC from other catalogs
16:11 thd kados: I see no obstacle except a bit of code for allowing Koha to be much more flexible in storing multiple databases of records.
16:11 thd kados: different MARC flavours cannot be reliably distinguished by the records themselves
16:12 thd kados: therefore, they would need to be stored in a different database.
16:13 thd kados: there are a couple of simple field presence tests which would allow distinguishing UNIMARC from MARC 21
16:14 thd kados: However, such a simple test relies on the fact that they happen to contain major fields such as main entry in a field not used in the other format presently.
16:16 thd kados: such usage might one day change and that is insufficient to distinguish very closely related formats such as MARC 21 and IBERMARC.
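A field presence test of the kind thd mentions might look like the sketch below. The choice of tags is an assumption made for illustration (thd does not name the fields): 200 carries the title in UNIMARC and is not defined in MARC 21, while 245 carries the title in MARC 21 and is not defined in UNIMARC. `guess_flavour` is a hypothetical helper.

```python
def guess_flavour(tags):
    """Guess a record's MARC flavour from the field tags it contains.

    Returns "UNIMARC", "MARC21", or None when the test is inconclusive.
    """
    tags = set(tags)
    has_unimarc_title = "200" in tags  # UNIMARC title and statement of responsibility
    has_marc21_title = "245" in tags   # MARC 21 title statement
    if has_unimarc_title and not has_marc21_title:
        return "UNIMARC"
    if has_marc21_title and not has_unimarc_title:
        return "MARC21"
    return None  # both or neither present: do not guess


print(guess_flavour(["001", "100", "200", "700"]))
print(guess_flavour(["001", "100", "245", "650"]))
```

As thd notes, a test like this says nothing about closely related formats: a record it labels "MARC21" could just as well be IBERMARC, so the flavour would still need to be recorded alongside the record or implied by the database it is stored in.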
16:17 thd kados: the major obstacle to multi-MARC Koha is that the flavour attribute and framework is an all or nothing thing.
16:19 thd kados: It ought to be possible to choose amongst multiple framework flavours depending on the flavour of the record itself and not on the flavour of the one and only one setting in the system preference.
16:19 thd or the one and only one framework flavour.
16:22 thd kados: the problem here is that there are no existing Koha libraries that would care about such a feature except maybe the work Tumer is doing where, as I understood it, his institution was being gifted with large numbers of books from all over the EU.
16:23 kados right
16:23 kados that's my problem with it too
16:23 kados I still maintain that before 3.0 is out
16:24 kados Koha will not delete data that exists in a record just because it's not in the framework
16:24 kados when using the editor
16:24 kados so that's a partial fix
16:24 thd kados: my thought had been that you wanted to have 3.0 out as quickly as could be managed, and that having to make and debug all the code changes required would be too much.
16:25 thd kados: yes, fixing that in the editor might be a fairly simple matter.
16:27 thd kados: yet changing how MARC flavour is called everywhere throughout Koha is liable to be much more complex.
16:28 thd kados: You had asserted that you did not think multi-MARC Koha would be too many changes for 3.0.  Do you still think so now that I have explained.
16:28 thd ?
16:29 kados it's an interesting proposition
16:30 kados I'll have to give it some more thought
16:33 thd kados: If the system can store multiple formats then it can have progressively better conversion between formats but it should not be necessary to even convert for some libraries.  The framework can do all the work at query and record editing time.
16:37 thd kados: Conversion in advance would save CPU time over querying multiple databases but if they are all local databases that is really little different than where many libraries have separate databases for books, serials, etc. rather than storing all records in one database and distinguishing them by the leader when searching.
16:39 thd kados: Authority systems, except for language differences in personal names, are much less divergent than you might suppose.
16:40 thd kados: However, there are some secrets about MARC authority files that I probably have to keep through about the middle of next year.
16:43 thd kados: The actual differences are often minor so I should correct the assumption presented by mere differences in language.
16:48 thd kados: MARC and the library systems that underlie it had been badly broken for the purposes of record exchange.  However, there is much more that has been done to correct that than the efforts at format unification for MARC 21.
16:50 kados thd: secrets you need to keep?
16:51 thd kados: I am just dying to tell you but I fear that if I reveal all I will lose the ability to obtain funding for a business myself.
16:51 kados heh
16:52 rach morning gentlemen
16:52 kados thd: while we've been talking I have implemented a scan search as good as WorldCat's :-)
16:52 thd good morning rach
16:52 kados morning rach
16:52 kados thd: give it a shot
16:52 thd kados: worldcat does not have a good one :)
16:53 kados thd:[…]query1=cat&scan=1
16:53 kados thd: :-)
16:53 kados thd: show me a good one and I'll make Koha's as good :-)
16:53 rach are you recovered from your international jaunt?
16:54 kados rach: yea, things are finally slowing down
16:54 thd not me rach
16:54 kados rach: it's always hard to catch up after being out of the country
16:54 kados chris: you around?
16:54 kados chris: check out the scan search:
16:54 kados […]r+1%3D1016&scan=1
16:54 kados chris: you can also do one from the power search
16:55 kados chris: use 'completeness' to return complete subfields
16:55 rach what's the index it's scanning?
16:55 kados rach: the Koha bibliographic index
16:55 kados NPL's data
16:56 thd kados: you are missing the default value in the scan index for textbox
16:57 rach is the index a separate file, rather than searching the records directly?
16:57 kados rach: the index is the zebra index
16:58 kados thd: what should the default be?
16:58 thd kados: whatever was the search term if you are modelling it on WorldCat
16:58 kados thd: you mean the value the user entered?
16:58 kados ahh
16:59 thd kados: also the selected option for Indexed in should be context sensitive in a similar manner.
17:00 kados thd: ok ... sec
17:02 kados thd: actually, that's not practical at the moment
17:02 kados thd: it at least preserves the original query USE attribute
17:03 kados thd: which is enough context sensitive for me now :-)
17:03 kados thd: now it's time to implement the 'smart search' feature :-)
17:06 thd kados: WorldCat also fills the search form with the selected scan result.  I know that owen thinks that is convoluted for the ordinary user but it should be available as an option because it is certainly valuable if the result set is too large.
17:06 kados thd: actually, that name is trademarked, do you have another idea for a name?
17:07 thd kados: clever search?
17:07 chris the best search in the world
17:07 chris tm
17:08 thd kados: I want a system where simple search is called simpleton search to get more users to investigate fielded searching
17:11 thd I wish more people would think: oh no, I don't want to use simpleton search that is liable to give me poor results.
17:11 chris the problem is, they dont know what poor results are
17:12 chris they just think, the library doesnt have what i want
17:12 chris thd, are you on the next gen opac mailing list?
17:13 thd chris: poor results are often too many results so that the most helpful results are buried several pages down
17:13 rach with websites (rather than libraries) surveys have shown that the presence of an advanced search almost guarantees people won't find what they are looking for
17:13 rach if they can't find it using the simple search, they definitely can't using an advanced search
17:14 thd rach: I know this is a world where most people use no more than two word queries which usually guarantees poor results
17:15 rach so I think the scan that you've done there kados looks great
17:15 thd rach the job of smart search [name taken already] is to assist the user in forming better queries.
17:15 rach thd: yep there is an inherent problem that if you know what you're doing well enough to use the advanced search you probably can get good enough results out of a simple one
17:16 rach I guess that's why librarians probably won't be out of a job too soon
17:17 thd rach: except that the public perception is now that everyone knows how to search because everyone uses Google or something like it.
17:18 thd rach: I think that librarianship in the public sector actually has a problem of being perceived as valuable by the public
17:18 rach well people aren't going to get any smarter so the programmes have to :-)
17:20 rach by people who use libraries? or public at large?
17:20 rach erg
17:20 rach sorry gotta scoot, nearly feed tim
17:20 thd rach: actually some of the worst searchers I ever met were librarians but it is true of any profession that participation and skill do not necessarily go together
17:20 thd chris: what is that mailing list?
17:21 chris 2 secs ill find the url
17:23 chris[…]ng-lists/ngc4lib/
17:23 chris its just new, just started last week
17:24 chris as usual it's pretty much arguing for the sake of arguing, but some interesting things have been said
17:24 thd see what happens when I stop paying attention to the world
17:24 chris hehe
17:31 kados thd: simpleton++ :-)
17:39 thd what does this post mean[…]/200606/0087.html
17:40 thd does it imply that ILS system vendors are not profitable?
17:41 chris yeah
17:42 chris thats how i read it
17:42 chris i think someone is smoking crack :)
17:48 thd chris: I imagine most ILS vendors beat Amazon by a large degree in terms of profit as a proportion of total revenue since Amazon is paying to run the library too with all the warehousing, materials handling, and individual user support that they do.
17:49 thd s/they do/Amazon does/
17:49 chris yeah exactly
17:49 chris they dont get audi's to drive around in by losing money
17:57 thd chris Google is different because they are not paying to run the library.  Amazon is barely profitable but barely for a huge quantity of revenue which means they do just fine.   However, Amazon's original business model was sustained by a stock scam when they were losing money every time someone bought something.
18:00 thd actually, the first Amazon model was sound but then they started opening expensive warehouses and needed every customer in the world for those to function efficiently enough for profit.
19:34 thd kados: are you there?
19:38 kados thd: hi
19:39 kados I have to go in about 15-20 minutes
19:40 thd kados: i thought of how many existing Koha libraries want multi-standard Koha even if they have no interest in multi-MARC Koha.
19:42 kados right
19:43 thd kados: Although, I may be mistaken about what existing Koha libraries actually have an interest in.
19:44 thd kados: Very many libraries in the world at least have an interest in using non-MARC data.
19:44 thd kados: structured non-MARC data needs frameworks to interpret it.
19:45 thd kados: Do any of your libraries have an interest in non-MARC data?
19:45 kados yes
19:46 chris almost all of ours do
19:46 thd kados: I mean structured non-MARC data.  Even web pages are structured since they at least have title etc.
19:47 thd for every different structured data type we should have a framework for how to manage it for Koha.
19:48 kados agreed :-)
19:48 kados i would like to implement the frameworks in XSLT
19:48 kados or something like it
19:48 thd kados: Do you see how this is related to the issue of what is needed for multi-MARC Koha as I had described it?
19:49 kados yes
19:50 thd kados: Although, you could probably have more non-MARC frameworks even while limiting MARC framework code to one flavour with some less work on code changing.
19:51 kados I think the easiest way to have non-MARC is to adopt Dublin Core or MODS as the non-MARC
19:51 kados since there already exist crosswalks between them
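[Editor's sketch] A crosswalk of the kind kados mentions can be pictured as a lookup table from MARC tag and subfield to a Dublin Core element. The few mappings below follow common MARC 21 to Dublin Core practice (245$a to title, 100$a to creator, and so on), but the table is deliberately incomplete, and the function name `marc_to_dc` and the flat `(tag, subfield, value)` input shape are assumptions made for illustration only.

```python
# Partial, illustrative mapping of MARC 21 (tag, subfield) to Dublin Core elements.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("650", "a"): "subject",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("020", "a"): "identifier",
}


def marc_to_dc(fields):
    """Convert (tag, subfield, value) triples into a dict of DC element lists.

    Subfields with no entry in MARC_TO_DC are silently dropped.
    """
    dc = {}
    for tag, code, value in fields:
        element = MARC_TO_DC.get((tag, code))
        if element is not None:
            dc.setdefault(element, []).append(value)
    return dc


if __name__ == "__main__":
    record = [
        ("245", "a", "Cataloguing for beginners"),
        ("100", "a", "Example, Author"),
        ("650", "a", "Cataloging"),
        ("650", "a", "Libraries"),
        ("300", "a", "200 p."),  # no mapping in this sketch, so it is dropped
    ]
    print(marc_to_dc(record))
```

The dropped 300 field shows why such a crosswalk is lossy in this direction: several MARC fields collapse into one Dublin Core element or into none at all.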
19:51 thd kados: So now this is not completely a minority feature but a feature that most every library wants :)
19:52 chris marc is the minority
19:52 chris if you count special and corporate libraries
19:52 chris far more libraries dont use it than do use it
19:52 thd exactly
19:53 chris and i think koha has a unique place in being one of the few ils to support both
19:53 chris or at least attempting to :)
19:54 thd kados: obviously mapping things onto a particular MARC flavour is one way to the extent that it can be done but other data will necessarily be less differentiated than MARC.
19:55 thd chris: and does he or she have a name?
19:56 chris we wont know if its a he or she until 20 weeks
19:56 thd what is the status of storing, adding, and deleting XML records in Zebra as XML and not as MARC?
19:56 chris thats what we originally started doing
19:56 chris it works
19:57 chris storing marc as marc is faster to index
19:57 chris so i think thats why we are doing that
19:58 thd chris: Oh, I had forgotten.  Was it a speed issue for searching or for conversion of the result set?
19:58 chris i think it was an importing speed issue .. searching was the same, but much faster to import marc as marc
19:58 kados chris: congrats!
19:59 chris thanks kados
19:59 thd chris: why is importing speed even a significant issue if that can always be run as a background batch process?
20:00 chris i think its faster to update/delete too, but kados would know that for sure
20:00 thd kados: what is the for sure answer if you are still here?
20:02 thd kados: Is MARC as MARC faster to update and delete in Zebra than MARC as XML or is it only importation that is faster?
20:07 thd chris: did you see what I had posted earlier today about true multi-MARC Koha?
20:11 chris hmm nope, how long ago? ill look in the logs
20:15 thd chris: 16.58 to 17.51 Eastern US time.
20:16 chris k
20:18 thd chris: that is the hour just before rach said "morning gentlemen"
20:18 chris ah yep i see it
20:19 thd chris: most users even in academic libraries with millions of MARC records are looking for content which has no MARC records
20:19 chris yeah
20:21 thd chris: I still think that the OPAC should access that content in a library systems way rather than merely shunt the user over to another system such as Google or a Journal indexing database
20:22 chris so grabbing the results and reformatting them?
20:22 thd chris: which would mean that the OPAC should have a standard way of accessing, indexing, and formatting data in various formats.
20:23 chris yep
20:26 thd chris: obviously no library is going to replace Google by putting web content in Zebra but they could put metadata for even remote web sites which have very great importance to a class of users in Zebra.
20:26 chris yep that would be useful
20:28 thd chris: regardless of where the data is actually stored and indexed there could be a consistent user interface for navigating between different structured record formats once they have been retrieved in a federated or meta-search.
20:30 chris yep interface is the key
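[Editor's sketch] One way to picture the consistent interface discussed above is a registry of per-format "frameworks", each turning a raw hit from a federated search into the same display structure. Everything here is hypothetical: the `DisplayRecord` fields, the format labels `"dc"` and `"html"`, and the shape of the raw hits are assumptions, not anything Koha defines.

```python
from dataclasses import dataclass


@dataclass
class DisplayRecord:
    """The common shape the OPAC would render, whatever the source format."""
    title: str
    source_format: str
    link: str = ""


# Registry mapping a format label to the function that interprets it.
FRAMEWORKS = {}


def framework(fmt):
    """Decorator that registers an interpreter for one record format."""
    def register(fn):
        FRAMEWORKS[fmt] = fn
        return fn
    return register


@framework("dc")
def from_dublin_core(raw):
    return DisplayRecord(raw["title"], "dc", raw.get("identifier", ""))


@framework("html")
def from_web_page(raw):
    return DisplayRecord(raw["title"], "html", raw.get("url", ""))


def normalise(hits):
    """Convert (format, raw_record) pairs from any source into DisplayRecords."""
    return [FRAMEWORKS[fmt](raw) for fmt, raw in hits]


if __name__ == "__main__":
    hits = [
        ("dc", {"title": "A journal article", "identifier": "urn:example:1"}),
        ("html", {"title": "A subject guide", "url": "http://example.org/guide"}),
    ]
    for rec in normalise(hits):
        print(rec.source_format, rec.title, rec.link)
```

Adding support for another structured format would then mean registering one more function, leaving the display code untouched.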
20:33 thd chris: printed information is usually more valuable but if the user cannot be bothered to go to the shelf because he has been spoilt by full text whether it is of high or low quality the user can at least be enticed by a system that uses the metadata available for whatever resource he prefers to use.
20:36 thd chris: yes, metadata extraction and a consistent user interface to support even the instant gratification of the user who has forgotten that he has legs :)
20:36 chris :)
