IRC log for #koha, 2006-02-19

All times shown according to UTC.

Time S Nick Message
11:00 kados one, connection management like dbh
11:00 kados two, a single ZOOM::Package method on Koha that we can pass options too for handling all extended services in zebra
11:00 kados (because there are not very many of them)
11:02 paul not a bad idea, but if mike did not write it, maybe it means it's not that useful ?
11:03 kados sub zebra_package {
11:03 kados        my ($Zconn,$options,$operation,$biblionumber,$record)
11:03 kados ;
11:04 kados $Zconn contains the connection object
11:04 kados $options contains a hash like:
11:04 kados action => "specialUpdate"
11:04 kados record => $xmlrecord,
11:04 kados etc.
11:05 kados $operation is what we put in the $p->send($operation)
11:05 kados (create, commit, createdb, etc.)
11:05 kados $biblionumber and $record are obvious :-)
11:05 kados paul: does that make sense?
11:05 paul yes, that does.
11:06 paul & i think it's a good suggestion.
11:06 kados I'm still gun shy with my programming :-)
11:06 kados OK ... great, I will add a new sub zebra_package
11:19 kados paul: in fact, could 'biblionumber' and $record' be contained within $options?
11:20 paul I'm not strongly for putting everything in a hash in parameters.
11:20 paul it usually makes the API simpler at 1st glance, but it's less clear.
11:20 paul thus, I would put in a hash all what you can't be sure you'll have at every call.
11:20 paul and only this.
11:21 kados I'm not sure every call has a biblionumber and a record
11:21 kados for instance, a 'delete' would only have a biblionumber
11:22 kados createdb would have neither
11:23 paul thus i'm not sure having only 1 sub is a good solution.
11:24 paul maybe a package with some subs, each specialized, could be better.
11:25 kados I see
11:25 kados hmmm
11:26 paul kados : nothing to answer to my mail to koha-devel subject : "testing tool ?" and "yahoo tool" ?
11:26 kados I looked at them
11:27 kados I think they could be very useful
11:27 kados we need to have a meeting soon about template design
11:27 kados to discuss katipo's plans for the new default templates
11:33 paul OK. will be here next week (monday 9PM for me)
12:05 thd paul: It is good that Yahoo is sharing so as not to be left behind by Google.
12:07 thd paul: are you still there?
12:07 paul yep
12:07 paul (4pm in france)
12:08 thd kados: Had been curious about whether any large vendors known in the US market library systems in France.
12:09 thd s/kados:/paul: kados/
12:10 paul ?
12:11 thd paul: Do Sirsi/Dynix , Innovative Interfaces etc. have systems marketed in France?
12:11 paul of course
12:11 paul with the same name as in US I think
12:13 thd paul: All of them and do they all use Recommendation 995 for the software that they market there?
12:13 paul in fact, their market don't need reco 995. I explain
12:13 paul in France, all major libraries (including ALL universities) MUST use SUDOC
12:13 paul (
12:14 paul they use sudoc tool (a proprietary sofware)
12:14 paul to catalogate theirs biblios in the sudoc centralized DB.
12:14 paul sudoc unimarc don't use 995 for items.
12:14 paul it uses 906/907/908 iirc.
12:15 paul then, every night, the sudoc ftp the new biblios on the library server
12:15 paul for integration into local ILS.
12:15 paul NO cataloguing is done in local ILS
12:15 paul (except for books outside the sudoc, but there should be none)
12:15 paul thus, every vendor has a "moulinette" to do SUDOC => local ILS
12:16 paul no need to support 995 or even sudoc unimarc.
12:16 paul .
12:16 thd paul: Do you have a link to the 906/907/908 standard that SUDOC uses?
12:18 kados paul: wow ... that is very efficient!
12:18 paul kados joking today ?
12:18 kados paul: France only catalogs items ONCE
12:19 paul except that the system has many critics :
12:19 kados ahh
12:19 paul * the libraries are paid when they create a biblio, but paid when they just localize an item in an existing biblio. Everybody pays at the end.
12:19 paul * the tool to catalogate is quite outdated.
12:20 paul * you must wait at least 1 day to see an item in your catalogue
12:20 kados ahh
12:20 kados paul: I have a quick question
12:20 paul in fact, ppl want now a tool to upload their biblios from their local ILS to sudoc.
12:20 paul but sudoc refuses for instance.
12:20 paul thd : no, no link
12:20 kados zebra allows updates to be performed with:
12:20 kados record, recordIdNumber
12:20 kados (sysno) and/or recordIdOpaque(user supplied record Id). If both
12:21 kados IDs are omitted internal record ID match is assumed
12:21 kados right now, we use internal record ID to match
12:21 kados do you anticipate us ever using recordIdOpaque or recordIdNumber for future kohas?
12:21 kados (maybe when we are using more than one record syntax?)
12:24 thd kados: do I understand correctly that the system is not now using a preset explicit record ID?
12:24 kados hmmm
12:24 kados I think we are currently just using the 090$c field in MARC21 $a in UNIMARC
12:25 kados paul: correct me if I'm wrong
12:25 thd paul: should we not use 001 now finally?
12:25 kados thd: is that the standard ?
12:25 thd yes
12:25 kados thd: (for MARC identifiers)
12:25 thd yes
12:25 kados thd: for UNIMARC and USMARC?
12:25 thd all MARC
12:26 kados that sounds reasonable
12:26 paul kados & thd : since Koha 2.2.3 (iirc), you can put biblionumber in 001 without harm
12:26 thd kados: Furthermore it works very well with authorities
12:26 paul in fact that's what I did for IPT
12:27 paul what had to be done was removing biblionumber/biblioitemnumber in same MARC field constraint that existed previously
12:27 paul I did it, so you can put biblionumber in 001 if you want.
12:27 paul but if you import biblios from an external source, you may want to let original 001 and put biblionumber anywhere else
12:28 thd paul there is a place already for the original system number.
12:29 kados is that why zebra distinguishes between recordIdOpaque and recordIdNumber?
12:29 paul yes, in unimarc also
12:29 kados I'm not sure when these are different
12:32 thd paul: MARC 21 035 is a repeatable field for Original System Number
12:34 thd paul: It is identical in UNIMARC
12:34 thd hello dce
12:34 dce htd: I have an answer to your question.  Follett book pricing is pretty consistent with most distributors I'm told
12:35 thd dce: I would imagine that it is.  What does that mean in relation to the publisher's list price?
12:37 dce No clue.  If you really want to know I can ask though.
12:40 thd dce: I have wanted to know the answer to both the question you answered before and that one for a long time but had never asked the right person.
12:42 thd dce: I assume your answer was US$ for 0.26, was it not?
12:44 thd kados paul: The SUDOC approach is similar to how OCLC works except that OCLC is supposed to be a cooperative owned by the member libraries.
12:44 paul (on phone)
12:46 kados thd: right
12:47 thd paul: SUDOC is totally private not a cooperative member organisation in theory?
12:50 thd paul: OCLC behaves as though it were a greedy private company even though it is actually a non-profit or not for profit membership cooperative.
12:52 dce thd: nod price was US$.  I'll ask about pricing compared to list and let you know
12:54 thd dce: Thank you.  Whenever I ask industry people who know the market as a whole they have told me that they are not allowed to say.
12:58 thd paul: Are you still on phone?
12:59 paul (yes)
12:59 thd paul: Let me know when you are back?
13:15 paul back
13:16 paul thd : you're right : SUDOC is something strange, private in fact.
13:16 thd paul: I am interested in behaviour where a record can added to the system, removed from the system for external manipulation if ever needed, and then added back to the system all the while preserving the same number that the system uses to track the record.  The system would not then increment a new number if it was an old record.
13:16 paul this is impossible in Koha for instance.
13:17 thd paul: Impossible now but what about in 3.0
13:17 thd ?
13:18 paul nothing planned on this subject.
13:18 kados thd: so currently, everytime a record is edited its id changes?
13:19 kados thd: or are you talking about something other than editing?
13:20 thd kados paul: Most library systems do this.
13:20 kados thd: i don't understand the functionality ...
13:21 kados thd: are you implying that when a record is edited or its status changed the record's ID increments?
13:21 kados thd: what is a practical use case for your desired behaviour?
13:23 thd kados: You can export a set of records and send them to OCLC or wherever for enhancement character set changes etc and then re-import them using the same record number.  It would be better if all systems could do this internally but they do not presently.
13:23 kados thd: it should be possible with ZOOM
13:23 kados you can specify both a user/client-defined ID and a system ID
13:23 kados as well as just pull the ID out of the MARC record
13:24 thd kados: Not if Koha does not support it.
13:24 kados thd: the subroutine I'm building to handle extended services will support this feature
13:24 kados as well as zebra's ILL :-)
13:25 kados I can't find a case scenerio for the client-defined ID or the system ID
13:25 kados thd: do you know when these would be used?
13:25 kados
13:25 kados search for 'recordIdOpaque' on that page
13:25 thd kados: Also, item identification numbers do not change when importing and exporting the same record on standard systems.
13:26 kados it's the only reference that explains those IDs that I can find
13:26 kados thd: right ... and they don't with Zebra either
13:26 kados thd: won't in Koha 3.0
13:27 kados thd: do you have any ideas for when we would use recordIdOpaque and when we would use recordIdNumber?
13:29 thd kados: user supplied IDs would be useful to preserve the 001 for its intended use between systems when migrating to Koha.
13:30 kados thd: right, I wonder if that's what they are intended for
13:30 thd kados: As well as the scenarios that I described above.
13:30 kados perhaps I should ask the Index Data guys
13:31 thd kados: Better to ask than to find out later there is another preferred purpose or means for what you need.
13:31 kados yep
13:34 thd paul: Do any of your customers catalogue with SUDOC now?
13:34 paul nope
13:35 thd paul: Would you have the potential to acquire those customers if you had a conversion for their 906/907/908?
13:35 paul me probably not. But ineo, yes, for sure.
13:36 thd paul: you are not ambitious? :)
13:36 paul and pierrick begins it's work in Ineo in 2 weeks. and it's 1st goal will be to write a "moulinette" to do things like this.
13:36 paul it's just that :
13:37 paul * I know the limit of my 2 men large company
13:37 paul * I don't want to become larger (even with pills ;-) )
13:37 paul * Ineo want to work on this market, and also want me as a leader with them.
13:37 paul so, no reason to be more ambitious than this ;-)
13:38 thd paul: You have an intern as well do you not?
13:38 paul an intern ???
13:46 kados now ... some breakfast :-)
13:49 paul now some week end ;-)
13:56 kados thd: I wrote a mail to koha-zebra
13:56 kados thd: asking some questions I had about ZOOM::Package
13:56 thd paul_away: sorry I had timed out without realising
13:57 thd paul_away: maybe I misread a #koha log.  An intern is a student or recent graduate usually who works on different circumstance than a regular employee and usually for a limited time.
13:59 thd paul_away: I will find you next week.  Have a pleasant weekend.
14:01 paul still here in fact.
14:01 paul right thd, but the student will be here only for 2 months.
14:01 paul maybe 2+2 in the summer.
14:01 paul so I don't count him.
14:02 thd paul: Google's scheme for scanning the pages of library books has interns doing most of the labour.
14:03 thd paul: You could have an army of interns :)
14:04 thd paul: I have met French interns in the US who fulfil their national service requirement by working at a foreign branch of a French company.
14:04 paul yes, but that was before our president decided to go for professionnal army => no national service (since 1997 iirc)
14:05 thd paul: In Anglophone countries interns are low paid or unpaid.  They are there to obtain the experience.
14:06 paul same in France.
14:06 paul (they CAN be paid up to 30% of SMIC -the minimum income for anyone in france-)
14:06 thd paul: Does SUDOC maintain a proprietary interest in their UNIMARC records?
14:06 paul if I don't mind, yes.
14:07 paul ok, this time, i really must leave ;-)
14:07 kados bye paul
14:07 kados have a nice weekend
14:07 kados thd:[…]ZOOM%3A%3APackage
14:07 thd see you next week paul
14:09 kados thd: check out the options for updating there
14:10 kados thd: I think that will answer your questions earlier
14:10 thd kados: One of my questions for Mike was about "Extended services packages are not currently described in the ZOOM Abstract API at They will be added in a forthcoming version, and will function much as those implemented in this module."
14:11 kados right
14:11 kados Mike is on the committee
14:11 kados currently, ZOOM is read-only
14:11 kados :-)
14:11 kados 'official ZOOM that is'
14:12 kados but since Mike is on the committee ... and Seb is influential in Z3950 as well
14:12 kados I'm sure they will adopt it
14:12 thd kados: This option is nice "xmlupdate
14:12 thd    I have no idea what this does.
14:12 thd "
14:12 kados yea, one of my questions to koha-zebra :-)
14:13 kados this is what I thought you would like:
14:13 kados "the action option may be set to any of recordInsert (add a new record, failing if that record already exists), recordDelete (delete a record, failing if it is not in the database). recordReplace (replace a record, failing if an old version is not already present) or specialUpdate (add a record, replacing any existing version that may be present)."
14:15 thd kados: That is exactly what I was hoping.  Now Koha needs to support that as well.
14:17 kados I"m looking at right now :-)
14:17 thd kados: Koha also needs to manage items in such a way that item ID is persistent.
14:17 kados right
14:17 kados right not it's not?
14:17 kados what is a case scenerio where item ID needs to be persistent?
14:18 kados (ie, why can't we just handle that with different statuses?)
14:18 thd s/persistent/persistent across imports and exports/
14:19 kados imports/exports ... hmmm
14:20 thd kados: you need a standard means of linking MARC holdings records with whatever the Koha DB is doing with holdings.
14:20 kados I see
14:20 kados will barcode work?
14:21 thd kados: Are you asking if barcodes would work as the persistent ID?
14:21 kados yes
14:23 thd kados: Maybe but an item needs an ID before a barcode has been assigned or even in the absence of a barcode for material that is not tracked with barcodes.
14:23 kados hmmm
14:23 kados itemnumber then
14:23 kados that will work right?
14:25 thd kados: some sort of itemnumber must work.  There needs to be a means to protect against its automatic reassignment when exporting and re-importing records.
14:26 thd kados: Also, it would seem advantageous to preserve item numbers between systems when migrating.
14:27 kados right
14:27 kados so I don't think zebra's equipped to handle ids at the item level
14:28 kados so what you're saying is:
14:28 kados we need to be able to export and import records and items
14:28 kados without affecting our management of them in Koha
14:28 kados ie, we don't want to delete a record every time we export it or import a new version
14:28 kados same with items
14:28 kados right?
14:28 thd kados: MARC uses a copy number concept but there can be items at the sub-copy level.  A serial title may have only one copy but many constituent items.
14:29 thd exactly
14:30 kados interesting
14:30 kados so we now have
14:30 kados record
14:30 kados copy
14:30 kados item
14:30 kados looks a lot like
14:30 kados biblio
14:30 kados biblioitem
14:30 kados item
14:30 kados :-)
14:32 kados I think zebra can handle everything we need to do at the record level
14:32 kados ie, that paragraph I posted above
14:32 kados so now all we need to do
14:33 kados is make sure copy-level and item-level import/export doesn't remove the item's id
14:33 kados can't we simply store the id somewhere in the record?
14:33 thd kados: yet if you export a record with fewer items and re-import with more items Koha should preserve the old item numbers and add item numbers for what needs them.
14:33 kados ie, can't we base our 9XX local use fields based on that model?
14:33 kados with record-level data, copy-level data, and item-level data?
14:34 kados or do the frameworks in koha not support this?
14:34 kados so Koha just needs to pay attention to certain item-level and copy-level fields before assigning item numbers
14:34 kados that shouhld be literally a 3 line change!
14:35 thd kados: you can do anything with MARC and the frameworks allow you to create whatever you need except that they need better support for indicators, fixed fields, etc.
14:36 thd kados: Remember the problem about breaking the MARC size limit when there are too many items in one record.
14:38 thd kados: Libraries that need to track every issue of a serial forever need multiple interlinked holdings records in MARC.
14:39 kados how was that acomplished?
14:39 thd kados: Koha could integrate everything in your grand non-MARC design for a holdings record but then you need to be able to export into MARC.
14:40 kados which wouldn't be hard once we had the functionality built in
14:40 kados it's the first step that's hard :-)
14:41 kados I don't understand copy-level vs record-level
14:41 thd kados: There are various linking points in MARC holdings records and various means for identifying the part/whole relationships.
14:41 kados thd: should I read the cataloger's reference shelf?
14:41 thd kados: Do you mean copy level vs. item level?
14:42 kados well ... I'm not really sure what the difference between the three are in terms of MARC holdings
14:42 kados I understand koha's heirarchy, just not the MARC holdings one
14:44 thd kados: Records have an arbitrary level that is distinguished by information referring to other records which may be textual information that is difficult for a machine to parse or maybe more machine readable and difficult for humans to parse and program.
14:45 thd kados: Machine readable can be difficult to program because human programmers have to parse it correctly first :)
14:46 kados right :-)
14:46 kados I'm reading the introduction to holdings now
14:46 kados OK ... right off the bat I can tell we're going to need leader management
14:47 thd kados: copies can be at whatever level the cataloguer wants to distinguish a copy.
14:47 kados the leader determines the type of record we're dealing with
14:47 kados whether single-part
14:47 kados multi-part
14:47 kados or serial
14:48 kados yikes, they're putting addresses in the 852!
14:48 kados that's really bad practice
14:49 kados so every time a library changes its address you need to update all the records?
14:49 kados or if an item is transfered to another address you need to update the address for that item?
14:49 kados that's absurd
14:49 thd kados: yes I lapsed a day or so ago I think and wrote 006 for 000/06 000/07.
14:50 kados
14:50 thd kados: remember this was designed in the mid sixties.
14:50 kados so 004 is used for linking records
14:51 thd kados: Do not expect libraries to start with MARC holdings records.  Mostly they will have bibliographic records with holdings fields included.
14:52 thd kados: The largest libraries with the best up to date library systems will have MARC holdings records.
14:53 kados An organization receiving a separate holdings record may move the control number for the related bibliographic record of the distributing system from field 004 to one of the following 0XX control number fields and place its own related bibliographic record control number in field 004:
14:53 kados what is the normal form of that control number?
14:53 kados ie, say we have a parent record
14:54 kados and three children
14:54 kados the parent record doesn't have a 004
14:54 kados and we need to 'order' the three children, right?
14:54 kados ie, first, second, third
14:54 kados does their 004 field contain whatever is in the parent 001?
14:55 thd kados: 001 from the related record is used.
14:56 kados so there is no way to order them
14:56 kados ok ... next question
14:56 kados so we get a record from oclc
14:56 kados we grab their 001 and put it in 014
14:56 kados then, say we want to check to see if that record is up to date
14:56 thd kados:  why is there no way to order them?
14:57 kados we query oclc for the number in 014 (in their 001) to find if there is an updated record?
14:57 kados is that the reason we preserve the incoming 001?
14:58 thd kados: I should immediately send you my draft email from months ago.
14:59 thd kados: It is all instantly relevant.
14:59 kados please do
15:02 thd kados: This weekend certainly, but to answer you want to preserve numbers from a foreign system, usually in 035 for bibliographic records to be able to refer back to the original record for updating if the original foreign record is changed or enhanced in some manner.
15:03 thd kados: If you have the original number, then perhaps the system can automatically check for an update on the foreign system.
15:05 thd kados: Existing systems do not actually do this to my knowledge but a human can query the original system with the unique record number from that system.
15:06 thd kados: Cataloguers have no time for this now but that is the purpose and time would not be a factor in an automated system.
15:07 kados thd: on the holdings page it says to preserve it in 014
15:07 thd or 035.  There options.
15:09 thd kados: sorry, you are right for the linkage number.  001 goes in 035.
15:11 thd kados: Although, I suspect 014 may be more recently specified than 035.
15:12 thd kados: Both are repeatable and could be filled with numbers from multiple systems.
15:15 kados right
15:16 thd kados: The utility of preserving holdings numbers from foreign systems would be primarily for libraries that catalogued on OCLC or some other remote system.  Or even added their own records to OCLC which SUDOC disallows.
15:21 thd kados: to continue about records : copies : items
15:22 thd kados: a record may have one or many copies associated.
15:23 thd kados: copies may be distinguished on any level at which the record is specified.
15:24 kados what does that last point mean?
15:24 thd kados: Within a copy level there may be subsidiary items.
15:25 thd kados: Provided the items are in the same record as the copy number.
15:26 kados I usually need examples rather than abstract descriptions :-)
15:27 thd kados: Copies are distinguished by number.  Yet a copy may be for a serial title and individual items may be for issues or bound years etc.
15:28 thd kados: Imagine a serial title where there are 3 copies.
15:29 kados meaning there have been three issues?
15:29 kados Vol. 1 No. 1, No. 2, No. 3 ?
15:29 dce thd: Here is the reply I got about pricing: "I think it depends.  Some vendors are cheaper for certain things.  Sara says she likes to order from Baker & Taylor because they have a better price.  Most of the school librarians I know do business with Follett--most of the Public Librarians I know do business with Baker & Taylor."
15:29 thd kados: 3 copies of the whole title for several years.
15:29 kados if a copy isn't an issue, what is it?
15:33 thd dce: Does Sara have info on the degree of discount Baker and Taylor gives to libraries and how much for Follett in relation to list price?  What things are liable to have better prices than others and what is the degree of variance?  This is something I have wondered about for years and I have worked for over 15 years in the retail book trade so I could tell you all about those discounts.
15:34 kados thd: in order to replicate MARC holdings support in Koha we need two things:
15:34 kados 1. a list of elements of MARC holdings
15:34 thd kados: a copy can be at any level including the entire run of a serial for many years as a single copy.
15:34 kados 2. a list of behaviours of those elements
15:35 kados and these lists should be short laundry lists
15:35 kados bare minimum needed to explain what they are and what they do
15:35 kados the third thing we will need
15:35 kados is a mapping between the list and MARC fields
15:35 kados thd: follow me?
15:35 thd yes
15:36 kados thd: so, I'd be happy to chat with you for the next few hours if we could compile such lists
15:36 kados I fear we're going to get lost in the forest without any clear objective :-)
15:36 kados because the standard is quite large :-)
15:37 thd kados: that list would be as long as the specification when it comes to the tricky parts.
15:37 kados well ... can we abstract out a subset that will cover 99% of cases?
15:38 kados thd: how do you recommend specing out full support for MARC holdings in Koha?
15:38 thd kados: you need to have a bidirectional mapping for standards compliance
15:38 kados thd: of course
15:39 kados and fortunately for us, zebra includes support for .map files so that shouldn't be too hard to do
15:39 thd kados: and encompassing a one to one mapping to the extent that nothing should be lost in conversion.
15:39 kados we just need a
15:39 kados right
15:39 kados it's definitely possible
15:40 kados we just need to invest the time into writing up the lists and how they interact
15:40 kados and write that into our framework
15:40 kados as well as our import/export routines
15:40 thd kados: certainly possible and certainly you can devise a better flexible structure for managing the data than MARC.
15:40 kados yep
15:42 thd kados: but if there was a high degree of translation need between MARC and Koha in real time the system would become bogged down under a heavy load with a large number of records.
15:43 thd kados: The presumption must be that real time translation will be only for the few records currently being worked on.
15:45 kados thd: exactly
15:45 thd kados: Translation is CPU intensive.  Your better record system needs a performance gain exceeding the overhead of translation.
15:47 thd kados: MARC was designed in the days when almost everything ran as a batch process.  People had very modest real time expectations.
15:49 thd kados: shall I continue with the record : copy : Item distinction?  If you miss that you miss the most fundamental element.
15:50 kados yes please do
15:50 kados but lets start making those lists
15:50 kados as soon as you're done
15:50 kados so we can then put things into practical terms
15:52 thd ok
15:53 thd kados: a single copy can be for a whole run of many years of a serial title containing many issues within just one copy.
15:56 kados ok ... strange choice of terms but I think I get it
15:56 thd kados: A copy can be assigned at an arbitrary point.
15:57 kados I see
15:57 thd kados: So imagine 3 copies in a single record for the whole run of a serial
15:58 kados would there be 3 because the MARC would run out of room for one?
15:58 thd kados: copy 1 is a  printed copy of loose issues in boxes
15:58 kados or for comletely arbitrary reasons?
15:59 kados ok
15:59 thd kados: We will presume everything fits for my example but that is practical problem that needs to be addressed for creating MARC records when the threshold is reached
16:00 kados ok
16:01 thd kados: copy 2 has each year bound separately.
16:03 thd kados: copy 3 is an a variety of full text databases providing coverage but they might have been linked to the hard copy.
16:04 thd kados: copy 1 designates individual issues as items but without a separate copy number.
16:05 thd kados: copy 3 designates individual years as items but again without a separate copy number.
16:06 thd kados: copy 3 is a jumbled mess because that is the world of agreements that often cover electronic database rights :)
16:07 thd kados: Individual items may or may not have separate barcodes.
16:07 thd kados:  Just for fun we may imagine that all items do.
16:07 kados hmmm
16:07 kados copies 1-3 cover the same years?
16:08 thd kados: wait the fun has only just begun
16:09 thd kados: the electronic database coverage is unlikely to be identical for full text unless the hard copies are relatively recent and even then gaps should be expected.
16:09 kados question:
16:09 kados copy 1 designates individual issues as items but without a
16:09 kados             separate copy number
16:09 thd kados: If you are lucky your vendor will tell you about all the gaps.
16:09 kados what does that mean?
16:11 kados what do you mean that it doesn't have a separate copy number?
16:11 kados why would it?
16:11 thd kados: In our record example MARC using $6 distinguishes copy numbers but can distinguish items at a lower level using $8 if I remember but we can check later.
16:12 kados I literally don't understand what it would mean to have a _separate_ copy number
16:12 kados it is in copy 1 right?
16:12 kados so 1 is the copy number
16:13 thd kados: yes all in copy 1
16:13 thd that is the copy number
16:13 kados so this sounds quite a lot to me like biblio,biblioitems,and items
16:14 kados at least where holdings is concerned
16:14 thd kados: except that it is arbitrary
16:14 kados or do we sometimes need more than three levels?
16:14 kados in our tree?
16:14 thd kados: Serials can be very complex and may use many levels
16:15 kados ok
16:15 kados so we need a nested set then
16:15 kados to handle holdings
16:15 thd kados: yes
16:16 kados do we need to do more than just map child-parent relationships?
16:16 thd kados: MARC has theoretical limits within one record.  Koha can do better for other types of data.
16:16 kados right
16:16 thd kados: siblings
16:16 kados relationships between siblings? more complex than ordering?
16:18 kados so we have:
16:18 kados one record-level ID which cooresponds to MARC 001
16:18 kados an arbitrary number of copy-level IDs which coorespond to which MARC field?
16:19 kados and an arbitrary number of item-level IDs which map to owhich MARC field?
16:19 kados welcome joshn_454
16:20 kados thd: then, we have relationships ... child-parent in particular
16:20 thd kados: a publication with a bound volume and a CD-ROM could be all described in one record or could have separate sibling records linked together.
16:20 kados thd: would they be linked to each other?
16:20 kados ie, each one of them would refer to the other?
16:20 kados or would they both refer to the parent?
16:20 thd kados: I have seen reference to sibling linking.
16:21 kados thd: sibling linking could become quite complex
16:21 kados thd: because they we have exceeded the abilities of a nested set
16:22 thd kados: Plan on finding cataloguing records in existing systems that require system interpretation to uncover the true relationships that you would want to map in Koha.
16:22 kados thd: I can track an arbitrary hierarchy with a nested set and even do sibling ordering
16:22 kados so for example
16:22 kados grandfather
16:22 kados father uncle
16:22 kados child1 child2 child3
16:22 kados where child1 is older than child2, etc.
16:23 kados and it would be easy to query to find who your siblings are
16:23 kados but I'm assuming we dont' want to do that every time we pull up a record :-)
16:23 kados (and father is older than uncle in the above case)
16:24 kados thd: your thoughts?
16:24 thd kados: They can always be mapped but you want to put them in a scheme that does a better job of making the relationships explicit than the cataloguer may have done.
16:25 kados I think that approach could be dangerous
16:25 kados ie if the relationships aren't pre-defined
16:25 kados I don't want to interpret what they should be
16:25 kados that's the realm of FRBR
16:25 thd kados: existing cataloguing records in the real world are often a mess of inconsistent practises.
16:26 kados thd: it's not the job of the ILS to fix cataloging inconsistancies
16:26 kados thd: my goal is to allow complex cataloging
16:26 kados thd: not to intuit it :-)
16:27 thd kados: Perl is for making easy jobs easy and hard jobs possible
16:27 kados thd: so let's stick to the here and now
16:27 kados thd: that's outside the scope of 3.0
16:27 kados thd: I have to draw the line somewhere :-_
16:27 kados thd: :-)
16:27 thd back to my example where the fun is just starting
16:27 kados thd: so far, it's just a simple hierarchy
16:28 kados thd: nothing particularly complex about it
16:28 akn hi guys, I'm back for more help; we had the z3950 daemon running about 2 months ago, now we can't seem to get it.  The processz3950queue doesn't give any output...seems to be waiting like tail -f (?)
16:28 kados thd: you have a table with a key on recordID
16:28 kados akn: there is a config file in that directory, did you edit it?
16:28 thd kados: this is largely a question of nomenclature so the documentation does not confuse the terms.
16:29 kados akn: and you should not be starting processqueue from the command line
16:29 akn kados: no
16:29 akn kados: how do you start it?
16:29 kados akn: use
16:30 akn kados: we did, with no visible results
16:30 thd kados: so we started with one record describing 3 copies
16:30 kados akn: if you're running it on the command line
16:30 kados akn: use
16:31 kados akn: if from system startup use
16:31 kados akn: and you need to edit the config file and add your local settings
16:32 kados thd: right, and it seems like a simple hierarchy to me
16:32 kados thd: with child-parent relationships
16:32 kados thd: nothing else
16:32 kados thd: sibling relationships can easily be derived
16:32 thd akn: also it needs to be started as root with the Koha environment variables set and exported.
16:34 thd kados: yes for nomenclature distinction though consider where we started with one record and then add subsidiary records.
16:34 kados thd: so here's our table:
16:34 kados holdings_hierarchy
16:34 kados columns:
16:34 kados itemId
16:35 kados (a unique ID for every element within the hierarchy)
16:35 kados recordId
16:35 kados (the 001 field in MARC, or another identifier in another record type)
16:35 kados lft
16:35 kados rgt
16:35 thd kados: copy2 in the parent record is still copy2 in that record but each volume covering one year can have its own copy number in subsidiary records.
16:35 kados actually, that might get hard to manage with such a large set
16:37 kados thd: that's still just a simple tree
16:37 kados thd: the MARC holdings folks have just managed to hide that fact
16:37 kados thd: by making it needlessly wordy :-)
16:38 thd kados: the level at which copy numbering operates depends upon whatever bibliographic level it is contained within or linked to in the case of a separate holdings record.
16:38 thd kados: one further problem for my example
16:38 kados thd: it is still and yet nothing more than a simple tree :-)
16:39 akn thd: variables are set/exported; we did have it running before
16:39 thd kados: someone has the clever idea of rebinding copy1 with loose issues in boxes.
16:40 kados thd: no problem, you just restructure the parent-relationship for those records
16:40 kados thd: it's not nearly as complex as it sounds
16:41 joshn_454 thd: what evironment vars need to be exported?
16:41 joshn_454 for the z3950 daemon
16:42 kados joshn_454: KOHA_CONF
16:42 kados joshn_454: PERL5LIB
16:42 thd kados: The structure itself is merely branches in a hierarchy the problem is reading in and out of MARC for correct interpretation, particularly when either there is only human readable text or the machine readable elements are so difficult for the programmer to interpret.
16:43 joshn_454 kados: how does KOHA_CONF differ from the KohaConf setting in the options file
16:43 joshn_454 ?
16:43 kados thd:  the first thing we need to do is create a framework for the hierarchy that works for managing holdings data
16:43 kados thd: ie a table :-)
16:44 kados joshn_454: KOHA_CONF is wherever your koha.conf file lives
16:44 joshn_454 okay
16:44 kados joshn_454: PERL5LIB should include the path to your C4 directory
16:45 thd joshn_454: maybe /etc/koha.conf for KohaConf
16:46 joshn_454 thd: right, that's what it's set to in the options file
16:47 thd joshn_454: Did you have it working previously?
16:47 kados joshn_454: notice I didn't mention KOHA_CONF :-)
16:47 kados thd: here's a table design that should work:
16:47 kados CREATE TABLE holdings_hierarchy (
16:47 kados        itemID, --unique
16:47 kados        recordID, -- this is the 001 in MARC
16:47 kados        parentID, -- the parent node for this node
16:47 kados        level,  -- level in the tree that this element is at
16:48 kados        status, -- we need to decide how to handle this element in Koha
16:48 kados status is a bit limited
16:48 joshn_454 kados: yes, it was working before
16:49 kados joshn_454: did you upgrade or something? what has changed?
16:49 thd joshn_454: had you changed any config files?
16:49 joshn_454 I'm not aware that anything's changed :-/
16:51 thd joshn_454: Test that root has the koha environment variables set.
16:51 kados joshn_454: you sure it's not already running?
16:52 thd joshn_454: echo $KOHA_CONF
16:52 kados ps aux |grep z3950
16:52 thd joshn_454: echo $PERL5LIB
16:53 kados thd: what do you think about that hierarchy chart
16:53 thd joshn_454: both commands as root
16:53 kados thd: the status would be used to determin how Koha manages that node
16:53 joshn_454 I copied the C4 directory into perl's vendor directory; do I still need the $PER5LIB?
16:54 kados joshn_454: no
16:54 joshn_454 k
16:54 kados joshn_454: but why you would do that is beyond me
16:54 joshn_454 I doubt $KOHA_CONF is set
16:54 joshn_454 bc I didn't know what I was doing
16:54 kados joshn_454: and will lead to confusing issues when you upgrade
16:54 kados joshn_454: delete it
16:54 kados joshn_454: then just do:
16:54 joshn_454 alright, I'll nuke it
16:54 kados export KOHA_CONF=/path/to/koha.conf
16:55 kados export PERL5LIB=$PERL5LIB:/path/to/real/C4
16:55 kados where the /path/to/real/C4 is the parent directory for C4
16:55 thd kados: The table needs more elements for tacking to MARC
16:55 kados thd: no, that's done with a mapping file
16:56 thd ok
16:56 kados thd: keep MARC out of my Koha :-)
16:57 kados now ... here's the real trick
16:57 thd : - )
16:57 kados what if we could come up with an ecoded scheme for representing the hierarchy
16:58 kados and put it somewhere in the MARC record
16:58 kados I need to do some more thinking about this
16:59 thd kados: Do you mean an encoding generated by a script?  Not something a cataloguer would be meant to edit?
16:59 joshn_454 okay, did that.  Now z3950-daemon-shell runs, but the search still doesn't work
16:59 kados thd: exactly
17:00 thd joshn_454: was search working previously?
17:00 kados joshn_454: what sources are you searching, are they set up correctly, and are you searching on something that is contained in them?
17:01 thd joshn_454: search for new titles each time but known to be contaned in the target database
17:04 joshn_454 didn't know about trying a new search every time
17:06 joshn_454 ah.  That works!  Tx!
17:10 thd joshn_454: if you search for the same ISBN Koha may have code that assumes you found that already.
17:11 thd joshn_454: Koha 3 will take the pain out of Z39,50 searching.
17:12 joshn_454 eta for koha 3?
17:12 thd Line 103 should not be as follows:
17:12 thd refresh => ($numberpending eq 0 ? 0
17:12 thd :"$bibid&random=$random")
17:12 thd If you find that,with an '0' after the '?' change it to the following:
17:12 thd refresh => ($numberpending eq 0 ? ""
17:12 thd :"$bibid&random=$random")
17:12 thd Now you should have empty quotes '""' instead of '0'.
17:15 thd joshn_454: See above for fix for line 103 for
17:15 thd /usr/local/koha/intranet/cgi-bin/z3950/ or wherever you put
17:15 thd z3950/
17:16 thd joshn_454: maybe may but it will not be production stable for a few months later.
17:17 thd joshn_454: I expect to commit at least an external to Koha Z39.50 client for 2.X prior to that time.
17:23 thd kados: are you still with us?
17:31 thd kados: your holdings table needs both a holdings and a bibliographic record key in case they are not identical as with a separate MARC holdings record.
17:33 thd kados: you need holdings_recordID and bibliographic_recordID.
17:35 thd kados: If that is not too much MARC in your Koha ( :-D
17:40 thd kados: without more MARC in your Koha, where do the semantic identifiers go to signify what the item represents as distinct from who its parent is?
18:01 kados thd: I'm here now
18:02 thd kados: MARC has 3 types of representations for holdings
18:02 kados yep, and all three can easily be done in our forest model
18:02 thd kados: the first and oldest being textual data
18:03 thd kados: The other two being allied with machine reability but not necessarily
18:05 thd kados: There are captions such as 1999, June, July, etc. and 1999, issue no. 1, no. 2, etc.
18:06 kados at this point, I'm not really worried at all about import/export issues
18:06 kados that's something to do when we have a client that needs this complexity
18:06 thd kados: Then there are publication patterns that show the relationships without having common names like captions.
18:06 kados what I want to focus on is the framework for supporting the complexity
18:06 kados so that I can sell the system to one of the libraries that can afford me to build the import script :-)
18:07 kados thd: does that make sense to you? :-)
18:08 thd kados: you asked about whether you can look in one place in MARC consistently for the hierarchy did you not?
18:08 kados yes
18:08 kados i do have another question too
18:09 kados internally, what holdings data do we need indexes on?
18:09 kados I can think of two:
18:09 kados status
18:09 kados scope
18:09 kados status could be a simple binary 1/0 to determine visibility
18:10 kados scope refers to the physical location of the items
18:11 thd kados: If holdings are all contained within one bibliographic record and not separate records then you have no nice parent child 001 and 004 values to consult.
18:11 kados it's an integer that's relevant to a given library system's branch hierarchy
18:11 kados thd: it's not going to be posisble to relate holdings to one another in zebra
18:11 kados thd: unless there's something in zebra I don't know about
18:12 kados thd: we'll have to do queries to relate the data
18:12 kados thd: (ie, give me all the records where the 004 is X)
18:12 thd kados: paul has a more complex status value than just 0/1 for UNIMARC Koha although the extra values are currently only for human reading.
18:12 kados I see what you mean though
18:13 kados it would be nice to do some kind of subselect in zebra
18:13 thd kados: What do you mean by subselect exactly?
18:14 kados so you go: find me all the records with the phrase 'harry potter', and check all the related holdings records and tell me which ones are available
18:14 kados I assume that's what you mean right?
18:15 thd kados: That is what every system has to do some how.
18:15 kados but I think I see one of the problems
18:15 kados suppose we import a bunch of records from one of these big libraries
18:15 kados say 20% of the MARC records are JUST holdings records
18:15 kados we have a problem :-)
18:16 kados because even if we do pull them into our shiny new forest
18:16 kados they will still be in zebra
18:16 kados ie, they are still just MARC records
18:17 kados so we'll have holdings data in Koha in our forest
18:17 kados and we'll have holdings records in the MARC data in zebra
18:18 kados unless we delete all the holdings data in the MARC on import
18:18 thd kados: You have to pull the data out of the dark MARC forest and put it into  your superior shiny Koha representation forest.
18:18 kados in which case we'll have a bunch of strange-looking MARC records stranded in our zebra index :-)
18:18 kados devoid of any substance :-)
18:18 kados thd: right!
18:19 kados thd: and also be able to put it back into MARC on export
18:19 thd kados: Those records will be fine there Koha, needs to know how to update them as necessary.
18:20 thd and create new ones as needed so that MARC is available for sharing with the world that does not know how to read the shiny forest format.
18:22 kados is it fair to say that we only need to update the status of the root MARC record (bibliographic) if there are no copies or related copies available?
18:22 thd kados: When the shiny forest format becomes the lingua-franca for holdings data then you will have less need to share in MARC.
18:23 kados thd: I can't think of a reason we'd need to store the status of each and every item in the MARC in zebra
18:23 kados thd: I think we could just set a visibility flag
18:24 kados thd: that would be turned off when the system detects that all the items of are lost or deleted
18:24 kados thd: does that make sense?
18:24 thd kados: Why do you need any stausin MARC had we not settled that months ago?
18:25 kados there is a reason
18:26 thd kados: Or do you need one boolean to inform Zebra not to return a record for a search.
18:26 kados some librarians don't want MARC records to show up if their status is lost or deleted
18:26 kados ie, people don't want to find a record that has no items attached to it
18:26 kados (well, that obviously doesn't apply to electronic resources)
18:27 kados you just need to tack on an extra 'and visible=1' to your CQL query
18:27 thd kados: There are also secret collections and material at many libraries.
18:27 kados to only retrieve records for items that the library wants patrons to see
18:29 kados thd: are their any standards for 'status'?
18:29 kados thd: that you know of?
18:30 thd kados: do you not need levels of privileges as to what is searched and not.  Not for the simplest cases but for the special collections that require privileged access to search.
18:30 kados hmmm
18:30 kados good point
18:30 kados some kind of access control would be nice
18:30 thd kados: There are MARC fields for restrictions and visibly already.
18:31 kados how are they structured? anything useful?
18:31 thd kados: I expect actually knowing what is in the collection or not is a big issue at many corporate libraries.
18:32 thd kados: Although the public library in the city where I grew up did not want just anyone even to know about the older material they had because of theft problems.
18:33 kados that is a good point
18:34 thd kados: 506 - RESTRICTIONS ON ACCESS NOTE (R) is one.
18:38 thd kados: Somewhat related are MARC fields designed to be hidden from patrons showing who donated the money or whatever.
18:39 kados thd: 506 is anything but machine readable
18:39 thd kados: Fields that may or may not be public depending upon an indicator.
18:39 kados thd: I can't believe they even bother calling it machine readable cataloging
18:40 thd kados: you can fill it with only authorised values and its repeatable.
18:41 kados here's the example they give:
18:41 kados 506 Closed for 30 years; ‡d Federal government employees with a need to know.
18:41 kados that looks like free-text to me :-)
18:41 thd kados: you can make it machine readable or at least additional repeatable fields to what may already exist.
18:41 kados we could do the authorized values thing, but this is getting absurd
18:42 kados MARC really must die
18:42 kados I'll be back in an hour or so
18:43 thd kados: Where MARC has not defined values you can define your own in conformity with the standard.
18:43 thd kados: See you in an hour or so.
20:41 thd now we are both back kados
20:41 kados yep
20:41 kados I just read through OCLC's descriptions for the 506
20:42 thd kados: does OCLC do something special with them?
20:42 kados
20:42 thd s/them/it/
20:43 kados is OCLC's goal to become the world's largest ILS?
20:43 thd kados: OCLC has sometimes added their own extensions to MARC using unassigned subfields
20:43 thd kados: They are already ;)
20:43 kados hehe
20:44 kados I think we've got a solid idea in mind for how to represent the relationships
20:44 thd kados: at least they have fewer extensions to MARC that are outside the standard now
20:45 kados a table with the structure of an arbitrary forest using a combo of nested sets and adjacency lists will do the trick nicely
20:45 kados now ... I'm interested in figuring out how to represent the "entities"
20:45 kados and by 'entities' I mean all the components (levels) of a MARC record
20:45 thd kados: what is an adjacency list
20:45 thd ?
20:46 kados thd: what I showed earlier
20:46 kados that table was a simple adjacency list
20:46 kados it's a table that has:
20:46 kados nodeId
20:46 kados parentId
20:46 thd for the nearest adjacent node
20:47 kados anyway ... we need to come up with a way to represent all the information we'll need to know about one of the entities
20:47 thd kados: what aspect of the levels do you mean?
20:47 kados if we were designing a hierarhical tree based on the sctucture of a corporation
20:48 thd kados: all the semantic values for volume etc.?
20:48 kados what we have now is a way to represent the relationships (CEO, president, vice president, etc)
20:48 kados what we need is a way to represent the 'entities' that fill those roles
20:49 kados ie, name, address, salary, etc.
20:49 kados alas, we're not designing such a tree :(
20:49 kados so ... what kinds of information do we need to know about our entities?
20:50 kados I suppose things like volume probably weigh in
20:51 thd kados: we can fill all the standard factors but users need to be able to add others that we can never enumerate in advance
20:52 thd s/users/script needs to read from some unknown data/
20:55 kados yep
20:55 thd kados: we have enumeration representing various periods
20:57 kados we need a framework whereby we can build 'groups' of holdings information
20:57 kados and then attach those groups to levels in the holdings relationship hierarchy
20:59 kados we also need to be able to attach the group to specific holdings records
20:59 kados so we need a 'groupId' in our relationship hierarchy
21:01 kados if it's defined it takes precidence over the group assigned by level
21:02 thd kados: how do groups differ from hierarchy levels themselves?
21:02 thd your last sentence confuses me.
21:02 kados ok ...
21:02 kados so you have the representation of the relationship
21:03 kados but that's different than information about each element in the hierarchy (each node)
21:03 kados for instance
21:03 kados you want to be able to move copies from one bib record to another
21:03 kados if the information and relationship are seperate you only need to update one field
21:04 kados to make that transformation
21:04 kados the entity itself will contain things like:
21:04 kados status
21:04 kados barcode
21:05 thd kados: what chris refers to as groups in non-MARC Koha that MARC Koha killed.
21:05 kados item-level call number
21:05 kados maybe, I'm not too familiar with how that works
21:06 kados from what it sounds like the only reason MARC killed the groups
21:06 kados was because the default framework in Koha wasn't set up completely
21:06 kados it sounds like we could have easily defined another level within the record
21:06 thd kados: you could reassign items to groups for various purposes.
21:07 kados yep
21:09 thd groups is not the table name but chris calls it that for understanding how it was envisioned
21:10 thd Unfortunately, a fixed set of bibliographic elements were tied to groups so MARC Koha killed them.
21:11 thd s/killed them/killed their flexible use/
21:12 thd kados: my revised holdings doc provided a way to fix this in 2.X
21:14 thd kados: you saw that suggested change before with virtual libraries.  Very convoluted hack on what was currently in MARC Koha.
21:15 thd kados: continue with your example. I think I interrupted you.
21:17 thd kados: what information would be in the group table?
21:21 thd kados: Is the group table the entity table?
21:24 thd kados: are you there?
21:25 kados yep
21:25 kados I'm unclear
21:26 kados I need to talk to a real database designer :0(
21:26 thd kados: do you mean chris?
21:27 kados yea, just sent him an email
21:28 thd kados: About what are you unclear?
21:29 thd kados: you were the one describing :)
21:31 kados well, I begin to see the problem
21:31 kados and I have a notion that there is a solution :-)
21:31 kados but I'm not sure if I've landed on it
21:31 thd chris fulfilled his contract and is now off enjoying his summer weekend in NZ.
21:33 thd kados: There is certainly a solution much better than what MARC provides.
21:35 thd kados: performance is related to the number of joins required to fulfil a query.
21:39 thd kados: there may be multiple solutions that function.  You need the one that scales best for arbitrarily large data sets where the data is not nice and regular the way it never is in the real world.
21:46 thd kados: I have the answers for you.
21:46 thd kados: Or at least the bibliography that I had compiled.
21:46 thd Celko, Joe. Chapter 28 Adjacency List Model of Trees in SQL ; Chapter 29
21:46 thd Nested Set Model of Trees in SQL : Joe Celko's SQL for smarties : advanced
21:46 thd SQL programming. 2nd ed. San Francisco : Morgan Kaufmann, 2000.
21:46 thd Celko, Joe. Joe Celko's Trees and hierarchies in SQL for smarties. San
21:46 thd Francisco : Morgan Kaufmann, 2004.
21:46 thd Kondreddi, Narayana Vyas. Working with hierarchical data in SQL Server
21:46 thd databases : Narayana Vyas Kondreddi's home page. (Aug. 12, 2002)
21:46 thd[…]ver_databases.htm .
22:03 kados thd: I've got the Celko books
22:04 thd kados: I am too poor to have them now.
22:04 kados :-)
22:06 thd kados: I won big prizes for book buying from O'Reilly and then could not get my employers to pay me after that.
22:08 thd kados: At least you have all the right references.  I had divested myself of SQL books a few years ago when I realised that SQL was the wrong model for bibliographic databases.
22:10 thd divested means I sold them in my bookshop and saw little reason to retain them at that time.  Yet I never had the Celko books.  I had some unstudied good books by Date.
22:11 pez hello
22:11 thd hello pez
22:12 thd hello jOse
22:12 j0se : )
06:30 osmoze hello

| Channels | #koha index | Today | | Search | Google Search | Plain-Text | plain, newest first | summary