IRC log for #koha, 2006-08-18

All times shown according to UTC.

Time S Nick Message
13:05 owen Hi shedges
13:07 owen http://vielmetti.typepad.com/s[…]zon_library_.html
13:07 kados hey shedges
13:07 dewey rumour has it shedges is with us for a few minutes
13:08 kados hehe
13:08 kados wow
13:08 kados that's big news
13:20 shedges hey, folks
13:25 thd For a limited time, we're offering all processing for no additional cost. To take advantage of this offer, simply use the coupon code, AMZTRU733321, when placing your order. (There's no obligation when you sign up, and you control which orders receive processing.)
13:26 thd owen: did you see that part of Amazon processing
13:26 thd ?
13:28 thd owen: they are still missing library searching
13:29 thd owen: Amazon can supply MARC records but they do not know how to search them
13:29 owen to be honest, thd, I didn't look at it in detail
13:30 owen But I'm not surprised...They've built an empire based on real-world data searching. Why bring MARC into it? ;)
13:30 thd owen: I know Amazon has been buying MARC records for years.  I am surprised they have only just now started distributing them with purchases to libraries
13:31 thd owen: approval slips and their search system in general uses book trade data not library records for finding what you want to order
13:31 thd on Amazon
13:33 thd owen: they have no scheme for approval slips or searching by LCSH, LCC, or DDC
03:40 OrangeWindies hello, anyone online?
05:21 thd hdl: are you there?
08:28 hdl hi thd
08:29 hdl thd: is that pines
08:29 hdl oops.
10:35 thd hello hdl
10:36 thd hdl: are you still there?
10:36 hdl yes
10:37 thd hdl: yesterday you seemed to have left with the thought that meta-records would be edited directly by the librarian
10:37 thd hdl: meta-records as tumer had conceived them are mere containers for standard records
10:38 thd hdl: the user or librarian only ever sees the real individual records
10:39 thd hdl: the system silently fills the meta-records in the background using information contained in the standard records
10:40 thd hdl: so no additional frameworks would be required for displaying and editing records
10:42 thd hdl: additional coding effort would need to take advantage of meta-records so that they were filled with the correct standard records and indexed properly with XPath.
10:42 hdl thd: It was not that, but the fact that instead of modifying one MARC record at a time, we would have to modify four, in four databases.
10:43 hdl Obviously, this will increase speed and security...
10:43 hdl Provided that things are SURE and well coded.
10:43 hdl and also VERY strictly structured.
10:44 thd hdl: well only one record would be modified and then the modifications would propagate at various rates to accommodate either real time or batch performance
10:45 thd hdl: the system should be able to function fairly much as it does now even if the meta-records are mostly unpopulated.
10:46 thd hdl: meta-records, when populated would provide additional indexation
10:46 thd hdl: the only need for them is to overcome indexation limitations
10:48 thd hdl: so the system should be fully functional without whatever advantages meta-records may provide beyond the basic functionality that Koha already has.
10:49 thd hdl: The idea initially would be to have an added platform for experimentation to get the scripts for filling meta-records right but as part of a working system.
10:50 thd hdl: much effort would be required for filling some possible meta-record relationships.
10:51 thd hdl: The structure would not be really absolute because it can be altered by changing the XPaths used in indexation.
10:52 hdl thd : so then, you will double store information if I understand clearly.
10:52 thd hdl: I expect not to have the best meta-record design at the first attempt.
10:53 hdl When updating biblio level, you will update collections.
10:54 thd hdl: I am afraid that is unavoidable with Zebra indexing limits once you want to index authorities as well as bibliographic records.
10:54 hdl Am I wrong ?
10:54 thd hdl: I am not perfectly clear how tumer intends to use collections.
10:55 thd hdl: I thought I understood that he would be building a Turkish union catalogue but I may have been mistaken
10:56 thd hdl: collection is actually used as the top level tag for a single MARCXML record in the MARCXML schema.
10:57 thd hdl: the collection provides the possibility of containing more than one record.
10:59 thd hdl: what tumer has working already is indexing on a bibliographic record and multiple attached holdings records.
11:00 thd hdl: a holdings record could be a bibliographic record with a linking field and field 995.
11:01 thd hdl: using separate records makes the indexation of multiple holdings easier with XPath than merely repeating the holdings field in a bibliographic record.
11:03 thd as separate records have more easily describable separate XPaths.
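The point about separate records having separate XPaths can be illustrated with a minimal sketch. This is not tumer's actual schema or Zebra's indexing configuration: the sample collection, the use of 995 subfield b for the holding branch, and the function names are assumptions made only for illustration. It uses the limited XPath support in Python's standard library.

```python
import xml.etree.ElementTree as ET

# MARCXML namespace; the prefix "m" is just a local alias.
NS = {"m": "http://www.loc.gov/MARC21/slim"}

# Hypothetical collection: one bibliographic record plus two holdings
# records, each holdings record carrying its own 995 field.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>00000nam a2200000 a 4500</leader>
    <controlfield tag="001">1</controlfield>
    <datafield tag="245" ind1="0" ind2="0">
      <subfield code="a">Example title</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">h1</controlfield>
    <datafield tag="995" ind1=" " ind2=" ">
      <subfield code="b">MAIN</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">h2</controlfield>
    <datafield tag="995" ind1=" " ind2=" ">
      <subfield code="b">BRANCH</subfield>
    </datafield>
  </record>
</collection>"""


def subfield_values(xml_text, tag, code):
    """Return the text of every subfield `code` in field `tag`, in document order."""
    root = ET.fromstring(xml_text)
    path = f"m:record/m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    return [el.text for el in root.findall(path, NS)]


print(subfield_values(SAMPLE, "245", "a"))
print(subfield_values(SAMPLE, "995", "b"))
```

Because each holding sits in its own record element, one path expression addresses every holding without needing to tell repeated fields inside a single record apart.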
11:05 thd hdl: double storing would only start to come in for the case of what tumer has already done if one combined the holdings of multiple institutions which had multiple MARC records for the same manifestation of a work.
11:06 thd hdl: then both duplicate MARC records for the same material that were not necessarily identical records could be stored in the same meta-record biblio.
11:07 thd hdl: if matching bibliographic records for the same material were actually perfectly identical only one would be needed so still no duplication.
11:08 thd hdl: I saw duplication starting with better indexing for authority records.
11:09 thd hdl: if you store authorities with the biblios to which they are related you can have much more efficient retrieval and better search possibilities using authorities.
11:11 thd hdl: authorities could still be stored separately as the primary non-duplicated authority repository which functions as it does now.
11:14 thd hdl: presently, in SQL we could simply join indexes to search using the non-authorised forms for authorities held at a given branch or for two non-authorised forms without choosing them individually before the query is run.
11:14 thd hdl: not that we actually do that; we build the query in advance by finding the needed authorised forms in separate searches to use in the query.
11:15 thd hdl: the lazy user who did not know any of the authorised forms could have a faster search.
11:15 hdl Are you telling me that if you search for unused form, you will find them in biblios database ?
11:15 thd hdl: yes
11:16 thd if you add related authorities to a biblio meta-record
11:17 hdl How can you manage to have both heading forms and rejected forms in biblios ?
11:17 hdl Is it needed ?
11:18 hdl Is it through the frameworks ?
11:18 hdl Will we have USMARC and USMARC-A in the same database ?
11:19 thd hdl: meta-records could have a place for storing authority records related to the biblios.
11:21 thd hdl: yes we could already put them in the same database and distinguish them by 000/09
11:23 thd hdl: however, since 000/09 has multiple values for one basic type (bibliographic, authorities, holdings, etc. all have subtypes with different letters), using 000/06 is not efficient.
11:24 thd s/000\/09/000\/06/
11:24 thd leader position 06 distinguishes record type
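A minimal sketch of what distinguishing records by leader position 06 could look like, assuming MARC 21 only (UNIMARC uses different codes). The function name and the grouping into three kinds are illustrative assumptions. The several letters per kind are the subtypes mentioned above.

```python
# MARC 21 leader/06 (type of record) codes, grouped by basic record kind.
BIBLIOGRAPHIC = set("acdefgijkmoprt")
AUTHORITY = {"z"}
HOLDINGS = set("uvxy")


def record_kind(leader):
    """Classify a MARC 21 record by the character at leader position 06."""
    if len(leader) < 7:
        raise ValueError("leader too short to contain position 06")
    code = leader[6]
    if code in BIBLIOGRAPHIC:
        return "bibliographic"
    if code in AUTHORITY:
        return "authority"
    if code in HOLDINGS:
        return "holdings"
    return "other"


print(record_kind("00000nam a2200000 a 4500"))
print(record_kind("00000nz  a2200000n  4500"))
```

A test against a single value is not enough for bibliographic or holdings records, which is why each kind is held as a set of codes here.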
11:26 thd hdl: there is the further problem that the same field number of a different type has a different meaning
11:26 thd hdl: overlap of field numbers with competing meanings is small for a given syntax, MARC 21, UNIMARC, etc.
11:27 thd yet it is real
11:27 hdl Anyway, zebra is already distinguishing them through the framework you get them from.
11:27 thd hdl: how does the framework identify which are which?
11:28 hdl thd : you're right for the latter problem.
11:28 hdl thd: I don't know precisely how it works.
11:28 thd hdl: that is an indexation problem for example UNIMARC has different positions for 100 depending upon the type of record.
11:29 thd hdl: UNIMARC does not even store encoding in a consistent place in 100.
11:29 hdl But it seems that when you import things in zebra using a peculiar structure, it recognizes the structure and stores it somewhere.
11:30 thd hdl: yet you have to provide indexing files which describe the structure
11:31 thd hdl: in Zebra 2.0 indexing files can themselves be XSL for indexing XML.
11:33 thd hdl: XSL can contain indexing data that would otherwise have been in *.abs and other related files.
11:34 thd hdl: XPath statements describe how indexing is done.
11:35 thd hdl: so if you add some record type to a distinctive XPath you can have distinctive indexing.
11:36 thd hdl: the one thing that Zebra will not do, without giving too much money to Index Data, is join indexes for different records on a common key of linking number and record number.
11:37 thd hdl: therefore if all the information needed for indexing is in one meta-record the lack of index joining in Zebra is overcome.
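The idea of avoiding an index join by keeping everything in one meta-record can be sketched roughly as follows. The element names (metarecord, biblio, holdings, authorities, rejected) and both functions are hypothetical and are not taken from tumer's koharecord.xsd. The sketch also works on simplified values rather than full MARCXML records.

```python
import xml.etree.ElementTree as ET


def build_meta_record(biblio, holdings, authorities):
    """Assemble one container element from a biblio, its holdings and its authorities."""
    meta = ET.Element("metarecord")
    ET.SubElement(meta, "biblio", id=biblio["id"]).text = biblio["title"]
    holdings_node = ET.SubElement(meta, "holdings")
    for item in holdings:
        ET.SubElement(holdings_node, "item", branch=item["branch"])
    authorities_node = ET.SubElement(meta, "authorities")
    for auth in authorities:
        node = ET.SubElement(authorities_node, "authority")
        ET.SubElement(node, "heading").text = auth["heading"]
        for form in auth["rejected"]:
            ET.SubElement(node, "rejected").text = form
    return meta


def matches(meta, rejected_form, branch):
    """True if the meta-record carries the rejected form AND a holding at the branch."""
    has_form = any(
        el.text == rejected_form
        for el in meta.iterfind("authorities/authority/rejected")
    )
    has_branch = meta.find(f"holdings/item[@branch='{branch}']") is not None
    return has_form and has_branch


meta = build_meta_record(
    {"id": "1", "title": "Adventures of Huckleberry Finn"},
    [{"branch": "MAIN"}],
    [{"heading": "Twain, Mark", "rejected": ["Clemens, Samuel Langhorne"]}],
)
print(matches(meta, "Clemens, Samuel Langhorne", "MAIN"))
```

Both conditions are answered from the same document, so nothing has to be joined on a linking number at query time. The cost is the duplication of authority data discussed earlier.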
11:38 hdl ok.
11:38 hdl I just installed zebra2.0
11:39 hdl It crashes my old zebra base.
11:39 thd hdl: Also searching is much faster than joining indexes would be in SQL.
11:40 thd hdl: it is considered beta.
11:42 thd hdl: if you use Debian you can have both Zebra 1.3 and 2.0 simultaneously on the same system with a separately versioned name space.
11:43 thd hdl: Do you usually use Mandriva?
11:44 hdl still.
11:45 hdl But considering moving to ubuntu for some machines some day
11:46 thd hdl: tumer has it working but he has promised everyone not to commit his changes for Koha 3.2 to HEAD until you have synchronised HEAD with Koha 2.4 and 3.0.
11:52 thd hdl: there is no new Zebra 2.0 manual in PDF format, but the HTML documentation has been updated for the new features.
11:53 thd hdl: as usual, the Zebra documentation is less detailed than might be helpful.
11:55 thd hdl: however tumer's xsl index files are easy to follow.
11:55 thd hdl: http://library.neu.edu.tr/koha[…]ce/koha2index.xsl
11:57 thd hdl: his XML schema enclosing MARCXML is also easy to follow.
11:58 thd hdl: http://library.neu.edu.tr/koha[…]ce/koharecord.xsd
