IRC log for #koha, 2006-08-18

All times shown according to UTC.

Time	Nick	Message
13:05	owen	Hi shedges
13:07	owen	http://vielmetti.typepad.com/s[…]zon_library_.html
13:07	kados	hey shedges
13:07	dewey	rumour has it shedges is with us for a few minutes
13:08	kados	hehe
13:08	kados	wow
13:08	kados	that's big news
13:20	shedges	hey, folks
13:25	thd	For a limited time, we're offering all processing for no additional cost. To take advantage of this offer, simply use the coupon code, AMZTRU733321, when placing your order. (There's no obligation when you sign up, and you control which orders receive processing.)
13:26	thd	owen: did you see that part of Amazon processing
13:26	thd	?
13:28	thd	owen: they are still missing library searching
13:29	thd	owen: Amazon can supply MARC records but they do not know how to search them
13:29	owen	to be honest, thd, I didn't look at it in detail
13:30	owen	But I'm not surprised...They've built an empire based on real-world data searching. Why bring MARC into it? ;)
13:30	thd	owen: I know Amazon has been buying MARC records for years. I am surprised they have only just now started distributing them with purchases to libraries
13:31	thd	owen: approval slips and their search system in general use book trade data, not library records, for finding what you want to order
13:31	thd	on Amazon
13:33	thd	owen: they have no scheme for approval slips or searching by LCSH, LCC, or DDC
03:40	OrangeWindies	hello, anyone online?
05:21	thd	hdl: are you there?
08:28	hdl	hi thd
08:29	hdl	thd: is that pines
08:29	hdl	oops.
10:35	thd	hello hdl
10:36	thd	hdl: are you still there?
10:36	hdl	yes
10:37	thd	hdl: yesterday you seemed to have left with the thought that meta-records would be edited directly by the librarian
10:37	thd	hdl: meta-records as tumer had conceived them are mere containers for standard records
10:38	thd	hdl: the user or librarian only ever sees the real individual records
10:39	thd	hdl: the system silently fills the meta-records in the background using information contained in the standard records
10:40	thd	hdl: so no additional frameworks would be required for displaying and editing records
10:42	thd	hdl: additional coding effort would need to take advantage of meta-records so that they were filled with the correct standard records and indexed properly with XPath.
10:42	hdl	thd: It was not that, but the fact that instead of modifying one MARC record at a time, we would have to modify four records, in four databases.
10:43	hdl	Obviously, this will increase speed and security...
10:43	hdl	Provided that things are SURE and well coded.
10:43	hdl	and also VERY strictly structured.
10:44	thd	hdl: well only one record would be modified and then the modifications would propagate at various rates to accommodate either real time or batch performance
10:45	thd	hdl: the system should be able to function fairly much as it does now even if the meta-records are mostly unpopulated.
10:46	thd	hdl: meta-records, when populated, would provide additional indexation
10:46	thd	hdl: the only need for them is to overcome indexation limitations
10:48	thd	hdl: so the system should be fully functional without whatever advantages meta-records may provide beyond the basic functionality that Koha already has.
10:49	thd	hdl: The idea initially would be to have an added platform for experimentation to get the scripts for filling meta-records right but as part of a working system.
10:50	thd	hdl: much effort would be required for filling some possible meta-record relationships.
10:51	thd	hdl: The structure would not be really absolute because it can be altered by changing the XPaths used in indexation.
10:52	hdl	thd : so then, you will double store information if I understand clearly.
10:52	thd	hdl: I do not expect to have the best meta-record design at the first attempt.
10:53	hdl	When updating biblio level, you will update collections.
10:54	thd	hdl: I am afraid that is unavoidable with Zebra indexing limits once you want to index authorities as well as bibliographic records.
10:54	hdl	Am I wrong ?
10:54	thd	hdl: I am not perfectly clear how tumer intends to use collections.
10:55	thd	hdl: I thought I understood that he would be building a Turkish union catalogue but I may have been mistaken
10:56	thd	hdl: collection is actually used as the top level tag for a single MARCXML record in the MARCXML schema.
10:57	thd	hdl: the collection provides the possibility of containing more than one record.
10:59	thd	hdl: what tumer has working already is indexing on a bibliographic record and multiple attached holdings records.
11:00	thd	hdl: a holdings record could be a bibliographic record with a linking field and field 995.
11:01	thd	hdl: using separate records makes the indexation of multiple holdings easier with XPath than merely repeating the holdings field in a bibliographic record.
11:03	thd	as separate records have more easily describable separate XPaths.
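A minimal sketch of the point above, assuming a MARCXML `collection` that holds one bibliographic record followed by separate holdings records. The sample data, the use of control field 004 as the linking field, and the 995 subfield codes (`c` for branch, `f` for barcode) are illustrative assumptions, not tumer's actual schema. Because each holding is its own `record`, a single path expression addresses its 995 subfields without having to pair up repeated fields inside one record.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical, trimmed sample (leaders omitted): one bibliographic record
# and two holdings records inside a single <collection>.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">bib-1</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Example title</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">hold-1</controlfield>
    <controlfield tag="004">bib-1</controlfield>
    <datafield tag="995" ind1=" " ind2=" ">
      <subfield code="c">MAIN</subfield>
      <subfield code="f">0001</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">hold-2</controlfield>
    <controlfield tag="004">bib-1</controlfield>
    <datafield tag="995" ind1=" " ind2=" ">
      <subfield code="c">EAST</subfield>
      <subfield code="f">0002</subfield>
    </datafield>
  </record>
</collection>"""


def holdings_for(root, bib_id):
    """Return (branch, barcode) pairs for holdings records linked to bib_id."""
    pairs = []
    for rec in root.findall("marc:record", NS):
        # Assumed linking field: control field 004 carries the bib record id.
        link = rec.findtext("marc:controlfield[@tag='004']", namespaces=NS)
        if link != bib_id:
            continue
        branch = rec.findtext(
            "marc:datafield[@tag='995']/marc:subfield[@code='c']", namespaces=NS
        )
        barcode = rec.findtext(
            "marc:datafield[@tag='995']/marc:subfield[@code='f']", namespaces=NS
        )
        pairs.append((branch, barcode))
    return pairs


root = ET.fromstring(SAMPLE)
linked = holdings_for(root, "bib-1")
print(linked)
```

With all holdings repeated as 995 fields inside the bibliographic record instead, the same lookup would need extra logic to keep each branch paired with its own barcode.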
11:05	thd	hdl: double storing would only start to come in for the case of what tumer has already done if one combined the holdings of multiple institutions which had multiple MARC records for the same manifestation of a work.
11:06	thd	hdl: then both duplicate MARC records for the same material that were not necessarily identical records could be stored in the same meta-record biblio.
11:07	thd	hdl: if matching bibliographic records for the same material were actually perfectly identical only one would be needed so still no duplication.
11:08	thd	hdl: I saw duplication starting with better indexing for authority records.
11:09	thd	hdl: if you store authorities with the biblios to which they are related you can have much more efficient retrieval and better search possibilities using authorities.
11:11	thd	hdl: authorities could still be stored separately as the primary non-duplicated authority repository which functions as it does now.
11:14	thd	hdl: presently, in SQL we could simply join indexes to search using the non-authorised forms for authorities held at a given branch or for two non-authorised forms without choosing them individually before the query is run.
11:14	thd	hdl: not that we actually do that; we build the query in advance by finding the needed authorised forms in separate searches to use in the query.
11:15	thd	hdl: the lazy user who did not know any of the authorised forms could have a faster search.
11:15	hdl	Are you telling me that if you search for unused form, you will find them in biblios database ?
11:15	thd	hdl: yes
11:16	thd	if you add related authorities to a biblio meta-record
11:17	hdl	How can you manage to keep heading forms and rejected forms together in biblios ?
11:17	hdl	Is it needed ?
11:18	hdl	Is it through the frameworks ?
11:18	hdl	Will we have USMARC and USMARC-A in the same database ?
11:19	thd	hdl: meta-records could have a place for storing authority records related to the biblios.
11:21	thd	hdl: yes we could already put them in the same database and distinguish them by 000/09
11:23	thd	hdl: however, since 000/09 has multiple values for one basic type (bibliographic, authorities, holdings, etc. all have subtypes with different letters), using 000/06 is not efficient.
11:24	thd	s/000\/09/000\/06/
11:24	thd	leader position 06 distinguishes record type
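As a rough illustration of why leader position 06 is an awkward filter, here is a sketch that maps MARC 21 Leader/06 codes to a record family. It assumes MARC 21 only (UNIMARC uses different conventions), and the function name `record_family` is hypothetical. Several letters map to each family, so a query has to test a set of values rather than a single one.

```python
# MARC 21 Leader/06 (type of record) codes, grouped by format.
BIBLIOGRAPHIC = set("acdefgijkmoprt")
HOLDINGS = set("uvxy")
AUTHORITY = {"z"}


def record_family(leader: str) -> str:
    """Classify a MARC 21 record by the character at leader position 06."""
    if len(leader) < 7:
        raise ValueError("leader too short to contain position 06")
    code = leader[6]
    if code in BIBLIOGRAPHIC:
        return "bibliographic"
    if code in HOLDINGS:
        return "holdings"
    if code in AUTHORITY:
        return "authority"
    return "other"


print(record_family("00000nam a2200000 a 4500"))
```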
11:26	thd	hdl: there is the further problem that the same field number of a different type has a different meaning
11:26	thd	hdl: overlap of field numbers with competing meanings is small for a given syntax, MARC 21, UNIMARC, etc.
11:27	thd	yet it is real
11:27	hdl	Anyway, zebra is already distinguishing them through the framework you get them from.
11:27	thd	hdl: how does the framework identify which are which?
11:28	hdl	thd : you're right for the latter problem.
11:28	hdl	thd: I don't know precisely how it works.
11:28	thd	hdl: that is an indexation problem; for example, UNIMARC has different positions for 100 depending upon the type of record.
11:29	thd	hdl: UNIMARC does not even store encoding in a consistent place in 100.
11:29	hdl	But it seems that when you import things in zebra using a particular structure, it recognizes the structure and stores it somewhere.
11:30	thd	hdl: yet you have to provide indexing files which describe the structure
11:31	thd	hdl: in Zebra 2.0 indexing files can themselves be XSL for indexing XML.
11:33	thd	hdl: XSL can contain indexing data that would otherwise have been in *.abs and other related files.
11:34	thd	hdl: XPath statements describe how indexing is done.
11:35	thd	hdl: so if you add some record type to a distinctive XPath you can have distinctive indexing.
11:36	thd	hdl: the one thing that Zebra will not do without giving too much money to Index Data is join indexes for different records on a common key of linking number and record number.
11:37	thd	hdl: therefore if all the information needed for indexing is in one meta-record the lack of index joining in Zebra is overcome.
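The idea above can be sketched in plain Python, as a stand-in rather than anything Zebra-specific: copy the related authority records into each biblio's meta-record at index time, so a single index lookup replaces a join between an authority index and a biblio index. All names (`build_meta_record`, `index_meta_records`, `search`) and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: the primary authority store and one biblio that
# links to an authority by id.
authorities = {
    "auth-1": {
        "heading": "Twain, Mark",
        "see_from": ["Clemens, Samuel Langhorne"],
    },
}
biblios = {
    "bib-1": {"title": "Adventures of Huckleberry Finn", "authority_ids": ["auth-1"]},
}


def build_meta_record(bib_id):
    """Wrap a biblio together with copies of its related authority records."""
    bib = biblios[bib_id]
    return {
        "id": bib_id,
        "biblio": bib,
        "authorities": [authorities[a] for a in bib["authority_ids"]],
    }


def tokens(text):
    """Lower-case the text and split it into words, ignoring commas."""
    return text.lower().replace(",", " ").split()


def index_meta_records(meta_records):
    """Index authorised and non-authorised forms under the meta-record id."""
    index = defaultdict(set)
    for meta in meta_records:
        for auth in meta["authorities"]:
            for form in [auth["heading"], *auth["see_from"]]:
                for word in tokens(form):
                    index[word].add(meta["id"])
    return index


def search(index, query):
    """Return ids of meta-records that match every word of the query."""
    words = tokens(query)
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))


index = index_meta_records([build_meta_record(b) for b in biblios])
print(search(index, "clemens samuel"))
```

The cost is the duplication discussed earlier: each authority is stored once in the primary repository and again in every meta-record that refers to it, so changes to an authority have to be propagated.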
11:38	hdl	ok.
11:38	hdl	I just installed zebra2.0
11:39	hdl	It crashes my old zebra base.
11:39	thd	hdl: Also searching is much faster than joining indexes would be in SQL.
11:40	thd	hdl: it is considered beta.
11:42	thd	hdl: if you use Debian you can have both Zebra 1.3 and 2.0 simultaneously on the same system with a separately versioned name space.
11:43	thd	hdl: Do you usually use Mandriva?
11:44	hdl	still.
11:45	hdl	But considering moving to ubuntu for some machines some day
11:46	thd	hdl: tumer has it working but he has promised everyone not to commit his changes for Koha 3.2 to HEAD until you have synchronised HEAD with Koha 2.4 and 3.0.
11:52	thd	hdl: there is no new Zebra 2.0 manual in PDF format, but the HTML documentation has been updated for the new features.
11:53	thd	hdl: as usual, the Zebra documentation is less detailed than might be helpful.
11:55	thd	hdl: however tumer's xsl index files are easy to follow.
11:55	thd	hdl: http://library.neu.edu.tr/koha[…]ce/koha2index.xsl
11:57	thd	hdl: his XML schema enclosing MARCXML is also easy to follow.
11:58	thd	hdl: http://library.neu.edu.tr/koha[…]ce/koharecord.xsd