11:00 pierrick_ OK, I try your documentation and tell you later where I'm blocked ;-)
11:00 kados sounds good :-)
11:01 kados (of course, you must replace references to sourceforge with savannah)
11:01 pierrick_ (I've already done the correct checkout)
11:02 |hdl| pierrick_: for zebraserver, you have to change the call : zebrasrv localhost:2100/yourbase
11:02 |hdl| rebooting
11:04 kados morning owen
11:04 owen Hi
11:05 kados owen: this afternoon I'll be setting up the new Koha server for NPL
11:45 |hdl| hello again.
11:45 |hdl| héhé
11:45 paul it works!
11:47 |hdl| I'm in Swiss French, but in utf-8 :)
12:00 |hdl| kados :
12:00 |hdl| DBD::mysql::db do failed: Can't create table './kohazebra/#sql-cf8_d.frm' (errno: 150) at ../../updater/updatedatabase line 973.
12:01 kados |hdl|: what is this caused by?
12:02 |hdl| my $sql="alter table $table ADD FOREIGN KEY $row->{key} ($row->{key}) REFERENCES $row->{foreigntable} ($row->{foreignkey})";
12:02 |hdl|                        $sql .= " on update ".$row->{onUpdate} if $row->{onUpdate};
12:02 |hdl|                        $sql .= " on delete ".$row->{onDelete} if $row->{onDelete};
12:02 |hdl|
12:02 kados you have mysql 4.1?
12:02 |hdl| yes
12:02 paul mmm... sounds like a mysql corruption
12:02 paul (or permission)
12:03 kados maybe myisamchk?
12:04 |hdl| it is not a MyISAM DB :/
12:05 |hdl| InnoDB :)
12:06 |hdl| only bibliothesaurus is MyISAM
12:07 paul don't worry :
12:07 paul drop table bibliothesaurus !
12:07 paul it's useless
12:17 |hdl| paul kados : did you know convmv ?
12:17 |hdl| convmv is a Perl utility which converts filenames to a given character set (UTF-8, in my case)
12:43 |hdl| and get it into zebra..
12:47 kados I didn't know of convmv
12:48 kados but hdl, the latest MARC::File::XML will automatically convert your MARC records to utf-8
12:48 kados and since it's the primary way we currently bulkmarcimport this should already happen
12:49 kados I am 90% sure that the utf-8 problems are in fact just the system locale
12:49 kados though I haven't had a spare moment to test this
12:58 |hdl| I mentioned this utility because it is written in Perl and deals with encoding.
13:06 kados |hdl|: try changing your system locale to utf-8
13:06 kados |hdl|: i suspect it will work perfectly after this
13:06 |hdl| I did.
13:06 kados still not working?
13:06 |hdl| I still get my warning: Wide character in print.
13:07 kados why does it work fine on
13:07 |hdl| when using
13:07 kados using rel_2_2
13:07 kados the only difference that I can tell is that head uses utf-8 encoding in mysql
13:07 |hdl| Americans do not have the same fancy characters as the French :)
13:07 kados and I don't know why that is necessary
13:08 kados |hdl|: but it works with chinese or french in
13:08 kados |hdl|: try it yourself
13:08 kados |hdl|: add a new branch with a french name (with fancy characters)
13:09 kados[…]admin/
13:11 |hdl| It is not a problem of base nor characters in base.
13:12 |hdl| It is when writing UTF-8 files on disk for import into zebra :)
13:12 |hdl| In my case, at my point :)
13:16 kados back
13:19 |hdl| open F, ">:encoding(UTF-8)", $filename;
13:26 |hdl| no.
13:28 kados writing from where?
13:28 kados what are you doing exactly?
13:33 paul update_zebra_idx
13:34 paul to generate XML biblios in files, to be able to reindex with zebraidx update biblios/
13:40 kados why would you want to run zebraidx from the command line?
13:41 kados if biblios already exist in Koha, it should just be a matter of exporting as MARC::Record, converting to a MARC::File::XML object and indexing directly in zebra
13:41 kados no need to write the files to disk
13:42 paul kados : |hdl|tried that because the zoom update_zebra was very slow.
13:42 kados right
13:42 paul maybe he should try the new one (and copy your zebra.cfg improvements to unimarc)
13:42 kados i will commit the new
13:42 paul (like shadow register)
13:43 kados should be 5 times faster
13:43 kados but still quite slow in fact
13:43 kados we are still working on this
13:43 kados (shadow registers are probably required in fact)
13:45 kados committed
13:45 kados my load tests indicate that the slowest part of the import
13:46 kados is converting from MARC::Record to MARC::File::XML
13:46 kados especially the conversion from MARC-8 to UTF-8
13:46 paul which xml parser do you use ?
13:46 paul (as the pure perl one is really slow :-( )
13:46 kados I only use MARC::File::XML which uses MARC::Charset
13:47 kados can I change the default xml parser somehow?
13:49 paul I installed another one (search for a mail about this on koha-zebra, or zebra, or perl4lib, from me) and it has been automatically chosen
13:50 kados ok ... I will
13:53 paul holidays / news committed on HEAD
13:55 kados great! I will check it out asap
13:58 paul (wait a minute, some missing files)
14:06 kados thd-away: are you present?
14:11 |hdl| kados : where are your zebra improvements ?
14:13 kados |hdl|: just to
14:13 kados |hdl|: commented out the 'search' when checking for a Zconn
14:14 paul_away bye & see you on monday
14:14 |hdl| bye paul_away
14:14 kados bye paul_away
14:14 osmoze I'm following paul, bye all
14:27 |hdl| Why is there always Record 0 Type XML ? Is this normal ?
14:28 |hdl| in zebrasrv log.
14:31 kados I don't know
14:31 kados |hdl|: I can't remember if I tested with the new subroutines
14:31 kados |hdl|: I will check
14:32 |hdl| seems it works. But I am waiting for the results.
14:32 kados it does not use new routines
14:33 kados in fact, it should be fixed
14:33 kados I will do so and commit immediately
14:33 pierrick_ I've encountered many problems installing ZOOM (because the available version was not compatible with my yaz version) and less than 50% of the tests passed during "make test" :-/ I ran "make install" to finish the installation anyway. I haven't made the symbolic links yet. I'll do them on Monday (or this weekend if I feel like working on Koha ;-) See you on Monday, enjoy your weekend
14:34 kados pierrick_: you will need the most recent versions of yaz and zebra
14:34 kados pierrick_: and the tests will fail
14:34 kados pierrick_: because currently the test server that it runs queries on is down
14:35 kados pierrick_: but don't worry, it works fine
14:35 pierrick_ (I realised that and downloaded the last version of yaz, ZOOM and zebra and compiled them from source)
14:35 kados pierrick_: you MUST upgrade to latest version of yaz and zebra for perl-ZOOM to work
14:35 kados pierrick_: if you run debian this is quite easy
14:35 kados pierrick_: as index data maintains a deb repo
14:36 pierrick_ I don't really run Debian at work (because my laptop graphic chipset was not very well recognized), so I installed Ubuntu
14:37 pierrick_ ... installing from source is not a problem. updating may be one...
14:38 kados pierrick_: I should be around most of the weekend if you need help
14:38 kados pierrick_: Monday I'll be in and out
14:38 kados pierrick_: but here for the Koha meeting
14:38 kados pierrick_: (will you attend it?)
14:38 kados pierrick_: (it's great to have you on board btw!)
14:38 kados pierrick_: (I look forward to working with you!)
14:38 pierrick_ (Paul told me it was at 21h, french hour)
14:39 pierrick_ (I hope Erwann, my 7 months son, will be sleeping)
14:40 pierrick_ (very happy to work on the project, meeting with Paul was great, I learned many things at once)
14:41 pierrick_ bye :-)
15:06 thd kados: I am present again
15:07 kados thd: I'm working on a opac authorities search
15:07 kados[…]
15:07 kados still quite early in the dev :-)
15:08 kados my goal is to get it to be as functional as /
15:08 thd kados: did you understand the problem that I had identified with subjects in MARC 21 that could need multiple $9s ?
15:08 kados not fully
15:09 kados I am currently trying to expand the thesaurus frameworks
15:09 thd kados: Did you read the full MARC 21 authority doc?
15:10 kados not all of it, but much of it
15:10 thd kados: What are you adding to the thesaurus?
15:10 kados leader and fixed fields :-)
15:11 kados (I hope they are supported)
15:11 thd kados: Oh, I had imagined that you were adding columns to the controlling table
15:11 kados do we need more columns?
15:12 thd kados: We do for all frameworks to work very well but that is not a problem for minimal working.
15:13 kados I'd like to see how far we can take the current model
15:13 kados so that 2.2.6 at least supports what Koha is capable of now
15:13 thd kados: It would still be the current model but we should get to minimal working first
15:13 kados of course, once I understand what it can currently do and what the limitations are, that will enable me to decide on a path for 3.0
15:13 thd with the current columns if possible
15:14 kados sounds like a plan
15:14 kados[…]03130510&PID=3574
15:14 kados I'm going to add that record to Koha
15:14 kados and see what I can do with it
15:14 kados but first I need to add the frameworks support
15:15 kados then, maybe tomorrow, I will begin working on
15:17 thd kados: how would you get an external record in without doing something?
15:18 kados i will hand code it :-)
15:18 kados well ... copy/paste
15:19 kados thd: in your opinion, should the tag 003 be marked mandatory? or should the subfield @ be marked mandatory?
15:19 thd kados: I cannot seem to get LC to open it for me.  What is in the 1XX for that record?
15:19 kados 953716
15:20 thd kados: if the field is a control field then both should necessarily be mandatory.
15:21 thd kados: I think that you gave me the 001 but what is the 1XX?
15:21 kados Lewis, C. S.
15:21 kados sorry
15:25 kados thd: is the marc21 leader plugin going to work for authorities too?
15:25 thd kados: well, with modification
15:25 kados what modification will I need to make?
15:26 thd kados: 000/06-07 at the very least are not the same
15:26 kados ok ... I'll create a new plugin right now
15:27 thd kados: you do not need to know the media type of a person or a concept
15:28 thd kados: the starting point should be the bibliographic plugin
15:28 kados yep
15:32 kados I wonder why a tag would ever be given an authorized value
15:41 kados owen-away: when you get back ... I'm still having trouble adding an authority type using the npl templates
15:51 thd kados: the default should be 000: #####nz##a22#####o##4500
15:52 thd where # is blank and ought to be filled by MARC::Record for most cases.
15:52 thd s/cases/positions
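A minimal sketch of the leader template thd quotes above, assuming only what the log states: `#` stands for a blank, and the remaining blanks are normally filled in later by the MARC library. The function name `default_authority_leader` is hypothetical and is not part of Koha.

```python
# Leader template quoted in the log; '#' stands for a blank position.
TEMPLATE = "#####nz##a22#####o##4500"


def default_authority_leader(template: str = TEMPLATE) -> str:
    """Return the template with every '#' replaced by a space."""
    leader = template.replace("#", " ")
    # A MARC leader is always exactly 24 characters long.
    if len(leader) != 24:
        raise ValueError("a MARC leader must be 24 characters")
    return leader


leader = default_authority_leader()
print(repr(leader))
print("000/05-06:", leader[5:7])   # the 'nz' from the template
print("000/20-23:", leader[20:])   # the fixed '4500' entry map
```

The length check is there because a mistyped template is easy to produce by hand and would shift every later position.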
15:52 thd kados:
15:52 kados k
15:53 thd kados: The authorised values that you were most likely looking at were for the indicators
15:53 thd kados: That is something where more columns are needed
15:54 kados why?
15:54 kados[…]
15:54 kados I have cataloged it to the best of my ability given the limitations in the current auth editor
15:55 kados (seems it's in an even worse state than the MARC editor)
15:55 thd kados: More columns to support separation of the indicators, and plugins instead of merely an authorised value list collectively for both independent indicators
15:55 thd kados: No one uses it therefore no fixes have been applied
15:55 kados I will fix it :-)
15:56 kados ok, so now I will find all books by C.S. Lewis in this collection
15:56 kados and change the author to use the authorized value
15:57 thd kados: paul uses building from the bibliographic record and no one has the time to create their own references and tracings, especially on a buggy editor.
16:00 thd kados: remember that the authorised value should be $a Lewis, C. S. $q (Clive Staples), $d 1898-1963  not merely $a Lewis, C. S.
16:00 kados ?
16:01 kados I believe that's what I entered
16:01 kados[…]
16:01 thd kados: I was just reminding you about the $a limitation currently.
16:01 kados or do you mean that the thesaurus plugin does not currently put in the $q and $d?
16:01 kados ahh ... right
16:01 kados so I'll need to fix that
16:02 kados there is so much to fix :/
16:02 kados how can any library use this?
16:02 kados it baffles me :-)
16:03 kados so first, can tag 100 in a bib record have a $q and a $d?
16:03 thd kados: remember, they are using this where one library is not even using 200 which is the UNIMARC equivalent of 100
16:04 kados ok ... $q and $d added to default MARC framework on
16:05 thd kados: that library is only entering the statement of responsibility in the title field, which is not supposed to be an authority controlled field.
16:05 kados thd: it works!
16:06 kados thd:[…]
16:06 kados it filled in $q and $d automatically
16:06 kados !!!
16:06 kados wohooo!
16:06 kados this IS a nice feature :-)
16:06 kados I can't wait to show it to a client :-)
16:06 kados but first we must prettify it :-)
16:07 thd kados: $q and $d are not the only considerations
16:07 kados of course not
16:08 thd kados: you should not necessarily have it fill every subfield, some could be special
16:09 thd kados: sorry I wrote that incorrectly
16:09 kados thd: 100$a should 'search also' 100$q right?
16:10 thd kados: you should have it fill every subfield but you may need protection for the value returned used in searches except that it would find $9 and not matter.
16:11 thd kados: yes it should but it does not matter if $9 is there and populated.
16:11 thd kados: oh except that the OPAC is not checking authorities.
16:12 kados it can now
16:12 kados well ... it will very soon
16:12 thd kados: so until the OPAC always uses authorities for every Koha install those other things are important
16:12 kados[…]
16:13 kados do a search on Lewis
16:13 kados yay, it's working! :-)
16:15 thd kados: you have not adjusted the code enough for the OPAC
16:16 kados of course not
16:16 thd kados: obviously the edit authority record should become view authority record in MARC
16:16 kados yep
16:17 kados don't worry, it will :-)
16:17 thd kados: but the 6 biblio link must still be linked to the intranet file
16:17 thd kados: I get a file not found for
16:18 kados 'Used in' is working now
16:18 kados owen:[…]
16:18 kados owen: search for 'Lewis'
16:19 owen Excellent. Summary and 'used in' both working!
16:20 thd owen: Koha may surpass Sirsi while you blink :)
16:21 kados I chose CS Lewis because that's such a problematic search currently
16:23 thd kados: There are much more problematic ones than him
16:24 owen Funny... the authority search is basically a 'browse' search.
16:24 thd kados: He at least has an English name
16:24 thd kados: Variant name transliterations are where the worst problems happen without authorised names
16:25 kados owen: it introduces something we don't have currently in collections like NPLs: a relationship between different records
16:26 kados owen: once you have an authorities catalog you can start doing all your author searches using the authorities search
16:26 kados owen: same with subjects
16:26 kados owen: so the patron then selects a given author/subject and then they get all the results that have that exact one
16:27 thd kados: Your collections do not seem to be old enough to see what has happened to transliterations of famous Russian names over time.
16:28 owen thd: we have problems enough with names like John Le Carre, with or without the accent on the E
16:34 kados w00t!
16:34 kados 'View Authority Record' now working
16:35 kados[…]
16:42 owen ?
16:55 kados owen: not working for you?
16:55 owen Where is view authority record?
16:56 kados search for Lewis
16:56 kados needs major template work :-)
16:56 owen Oh, okay. I swear it still said 'edit' last time I looked :)
16:57 owen How does this authority stuff fit into the general scheme of the opac?
16:59 kados well it should eventually be on the main search page
16:59 kados as a search type
16:59 kados
16:59 kados that's the goal
17:00 kados also, I think we'll need a syspref for 'OpacSubjectAuthorities' and 'OpacNameAuthorities'
17:00 kados those will let us turn on/off a true authorities search when clicking on a Name in the OPAC
17:00 kados rather than just a normal author search as is done now
17:01 kados (I think in fact that people are surprised when they click on the name now and get results for items without that name)
17:03 kados bbiab
17:04 thd kados: is useful but left anchored searches are a big limitation
18:19 owen Long drive ;)
18:20 kados heh
18:20 kados well I've been here a while actually
18:20 kados having some problems with installing the server
18:24 kados thd: you consorting with the enemy? :-)
18:25 thd kados: yes
18:26 thd kados: I cannot get an answer from LC CDS for two weeks now but I want to solve this well
19:04 thd kados: did you see the problem that I was describing on code4lib?
19:05 thd kados: I mean the subject authority problem
19:09 kados no I didn't
19:10 kados thd: I'll read my log
19:13 thd kados: more authority records at[…]nes-authority.mrc
19:13 kados thd: I don't see any discussion of problems with subject authorities
19:13 kados thd: maybe I didn't read back far enough
19:13 kados thd: yea got that already :-)
19:13 thd kados: busy channel
19:13 kados yep
19:14 kados this:
19:14 kados I am having trouble finding subdivisions for subject authority records from .  I find no 180 fields for example only 150.
19:14 kados edsu: I want to find the records with 180 etc. as opposed to 150.
19:15 thd kados: that and posts before and after
19:15 kados right
19:15 kados I don't quite comprehend the problem
19:15 thd kados: mostly after
19:15 kados but I think I see your goal
19:16 kados currently Koha's authorities system will only allow you to fill values within a single tag
19:16 kados unless I'm wrong
19:16 thd kados: the problem that I did not relate on code4lib is just what you said.
19:17 kados cool
19:17 kados I don't think it will be that hard to fix that
19:17 thd kados: With repeatable $9 ?
19:17 kados the trick will be how to map which tags in the auth record match which tags in the bib record
19:18 kados one way to do it is to hard-code the mapping in the thesaurus plugin
19:18 kados this would probably be the quickest solution
19:18 thd kados: it is multiple authority records for a single bibliographic field with subdivided subjects
19:18 kados but long-term I think we would want a way to easily configure it
19:18 kados I don't really understand subjects in marc
19:21 kados so what possible tags in the bib records should get their values from the authority record for subjects?
19:21 kados ie, what is the mapping?
19:21 thd kados: Subjects can be subdivided. in MARC 21 as in UNIMARC.
19:22 thd kados: There is a difference in the definition for authority records relating to subjects between MARC 21 and UNIMARC.
19:24 thd kados: MARC 21 seems to specify different authority record types for various subdivisions of the main subject while UNIMARC has just one type of subject authority record no matter how a subject may be subdivided.
19:25 thd kados so given 650  #0$aArchitecture$zIllinois$zChicago$xHistory$vPictorial works.
19:27 thd kados: there ought to be a heading topical term authority using $a in 150
19:27 thd kados: that much is trivial
19:28 kados hmm
19:30 thd kados: then the two $z would be repeated geographic subdivisions in a 150 or maybe two separate 150 authority records.
19:31 thd if they were two separate then $z would not be repeated within the 181
19:31 thd s/150/181/ on the line before
19:37 thd kados: the general subdivision in $x would be in a separate 180 authority record that might include the $y as well, otherwise one more authority record is needed for the $y in a 155
19:41 thd kados: although I do see that the 155 examples have it all in one such as 155
19:41 thd ##$aDictionaries$xFrench$y18th century
19:44 thd kados: how do I search a MARC file for the presence of records containing a particular field?
19:48 kados use
19:48 kados it's in misc/ directory
19:48 kados -file marc.mrc |more
19:49 kados then you can use the / to search once it's there
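As an alternative to paging through a dump, the sketch below answers thd's question by reading the record directory directly. It assumes the file is in the standard ISO 2709 exchange layout (24-byte leader, base address at leader positions 12-16, 12-byte directory entries, `0x1E` field terminator, `0x1D` record terminator). The function names are hypothetical; this is not the script from the misc/ directory that kados mentions.

```python
import sys
from typing import List

RECORD_TERMINATOR = b"\x1d"


def tags_in_record(record: bytes) -> List[str]:
    """List the field tags named in one raw record's directory."""
    base = int(record[12:17].decode("ascii"))  # base address of data
    # The directory sits between the leader and the base address,
    # and ends with one field terminator that we drop.
    directory = record[24:base - 1]
    return [directory[i:i + 3].decode("ascii")
            for i in range(0, len(directory), 12)]


def records_with_tag(data: bytes, tag: str) -> List[bytes]:
    """Return the raw records whose directory contains `tag`."""
    found = []
    for raw in data.split(RECORD_TERMINATOR):
        if len(raw) < 24:  # trailing debris after the last record
            continue
        if tag in tags_in_record(raw):
            found.append(raw)
    return found


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python find_tag.py marc.mrc 180
    with open(sys.argv[1], "rb") as handle:
        hits = records_with_tag(handle.read(), sys.argv[2])
    print(len(hits), "record(s) contain field", sys.argv[2])
```

Only the directory is inspected, so field contents never need decoding and the character encoding of the records does not matter for this check.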
19:52 thd ok
19:52 thd kados: I may have been looking in the wrong place
19:53 thd kados: I know MARC bibliographic well but I have not spent many years looking at authority records
19:53 thd kados: the answer may be in 7XX rather than 1XX
20:08 thd kados: The question is how to find the correct authority records to match a subdivided subject like the example and I think I know now.
20:13 thd kados: there is also the search help qualification that I had guessed for  "This release does not include ... Search access to form, genre, and topical subject subdivisions"
21:34 thd kados: are you around?
21:55 kados thd: I am now
21:55 kados thd: just got back from dinner
21:56 thd kados: the pines records do not have the right data for subject subdivisions
21:56 thd kados: Do you have any from your authorities client?
21:57 kados let me see
21:58 kados here is an example:
21:58 kados NUMBER 28 =>
21:58 kados LDR 00102nz   2200037o  4500
21:58 kados 150  0 _aChildren
21:58 kados       _xPreparation for medical care
21:58 kados       _xJuvenile literature
21:59 kados another:
21:59 kados NUMBER 32 =>
21:59 kados LDR 00089nz   2200037o  4500
21:59 kados 150  0 _aWitchcraft
21:59 kados       _zTexas
21:59 kados       _zTexas Hill Country
21:59 kados       _xFiction
21:59 kados another:
21:59 thd kados: records with type 180, 181,182,185
21:59 kados NUMBER 36 =>
21:59 kados LDR 00139nz   2200037o  4500
21:59 kados 151  0 _aWest (U.S.)
21:59 kados       _xSocial life and customs
21:59 kados       _xStudy and teaching
21:59 kados       _xActivity programs
21:59 kados       _vJuvenile literature
21:59 kados ahh
21:59 kados I see no 180s
22:00 thd kados: although I see $x and $v
22:00 kados I still don't fully understand subjects in MARC :-)
22:00 thd kados: and $z
22:00 kados yep
22:01 kados and repeated $x
22:01 kados though I don't really understand the significance of this
22:01 kados or what the expectations are for how an ILS will treat them
22:03 thd kados: I understand LCSH reasonably well for bibliographic records but I have a gap for how that applies perfectly well to authority files.
22:05 thd kados: There is someone from Canada on the autocat list who knows systems.
22:05 thd kados: These questions need to be asked Monday through Wednesday to get a good answer
22:07 thd kados: Very few to know systems actually do what we are intending so knowledge about this is scarce.
22:07 thd s/know/no/
22:09 thd kados: This really could be a great leap forward from how even the most sophisticated systems manage authorities.
22:09 kados cool
22:10 kados I've got about 20 minutes of work to do setting up a new cvs repo for openncip
22:10 kados then i intend to spend the rest of the evening on authorities
22:10 kados so bear with me for a bit
22:10 thd kados:I will fetch food
22:23 kados hey rach
22:23 kados good to hear from you
23:57 thd kados: is your cvs server merely down or are you struggling with how Debian has packaged cvs files?
23:58 thd s/files/related software/
00:07 kados I just can't get the server to bind to the proper port
00:09 thd kados: Are you trying to change from the default port?
00:11 kados nope
00:11 kados s'all good ... I'll just register a new project at SF
00:11 kados i've wasted too much time on it :-)
00:11 kados ok ... so ... authorities
00:12 thd kados: Is it as easy to register at Savannah?
00:13 kados I did ... but that was about two weeks ago
00:13 kados and I haven't heard back :/
00:13 kados ok ... first thing I'm going to do is commit my authorities work thusfar
00:13 thd I guess that means it is much more difficult
00:15 thd kados: Why not just create a test tag in the Koha cvs?
00:15 kados cause it's for openncip
00:19 thd kados: I had confused your authorities work thusfar with openncip
00:21 thd will be back shortly
00:26 kados k
01:07 thd kados: so I found what may be part of the solution for subject authorities
01:08 thd kados: are you still there?
01:11 thd kados: It is necessarily still multiple authorities but I have seen an applicable authority for my example 650
01:35 kados thd: I'm here
01:35 kados thd: been doing some cleaning in rel_2
01:35 kados so what is the solution (and first, in simple language, what is the problem?)
01:38 thd so it requires subdividing the 1XX
01:38 thd kados: chopping it right in the middle
01:42 kados I don't understand what you mean and I don't understand the problem fully
01:44 kados there are apparantly 7 types of authority records:
01:44 kados
01:46 thd kados: now I am off phone
01:46 kados though I do see that LOC has 'Title' authorities
01:46 kados but I don't see that on the MARC Authorities pages of the Cataloger's reference
01:47 thd kados: uniform titles and series
01:47 kados thd: do you know why LOC has only four 'types'?
01:47 kados ahh ... types of headings
01:48 thd kados: by LOC you mean ?
01:48 kados yep
01:48 kados so should Koha support all types of headings?
01:48 thd kados: there is also the search help qualification that I had guessed for  "This release does not include ... Search access to form, genre, and topical subject subdivisions"
01:49 thd Koha should support authorities completely.
01:49 kados we currently don't distinguish between established and unestablished headings
01:49 kados i don't think
01:50 thd kados: by established you mean authority controlled and not authority controlled?
01:51 thd for established and not established?
01:53 thd kados: what are you distinguishing with established and unestablished?
01:54 thd kados: authority control applied, NACO heading used, authorised form used, or something else?
01:55 kados Established heading: A heading that is authorized for use in other MARC records as a main entry (1XX), added entry (700-730), or series added entry (440 or 800-830) field or as the lead element in a subject access (600-655; 654-657) field.
01:55 kados Unestablished heading: A heading that is not authorized for use in other MARC records as the lead element of a main, added, series, or subject access field. An unestablished heading may be a reference to a variant form of the established heading, a form of the heading used only for authority file organizational purposes, or a subject subdivision that is authorized for use with an established heading in an extended subject heading.
01:55 kados
01:59 kados is the unestablished heading where the 'see also' comes from?
02:00 thd kados: I believe that is a distinction between a heading conforming to the cooperative authorities database maintained by NACO for most AACR2 users and a local system heading
02:01 kados but it says that 'An unestablished heading may be a reference to a variant form of the established heading'
02:03 thd kados: well can you find an unestablished heading in records that you have?
02:03 kados no
02:03 thd kados: did you search for one already?
02:06 kados no :-)
02:08 kados I wouldn't know how to look for that
02:15 thd I am trying to construct the regex for the MARC dump but it stops one character past the 008
02:15 thd kados: does vim not use greedy matching?
02:17 thd kados: I already have my 17k copy of pines authorities open but you could use grep
02:18 thd kados: or awk or Perl if you like
02:23 thd kados: in any case you want to find the tenth position past ^008 and then [find whatever the confused documentation claims is correct]
02:23 thd the tenth position is 008/09
02:26 thd kados: well reading the documentation more closely, using the heading as the lead element is an established heading.
02:27 thd kados: searching 008 is no help in that case since about every possibility applies
02:27 thd kados: established headings start with $a
02:28 thd kados: unestablished headings do not start with $a
02:29 thd kados: The lead element used for an authorised heading is always $a
02:32 thd kados: I had forgotten to escape my \+ in vim.  Why cannot every program agree on the one true regex standard?
02:34 thd kados: If you have the following regex in your MARC dump then you have an unestablished heading
02:36 thd kados: ^[14]\d\d....[^a]
02:37 thd kados: that matches nothing in the pines authority file
02:38 thd kados: oops should have been ^[14]\d\d...._[^a]
02:38 thd typo
02:39 thd as I originally typed it also matched nothing
02:40 thd kados: you do however have extended subject headings
02:40 kados is that grep ^[14]\d\d...._[^a] AUTH.mrc ?
02:41 kados if so I have none
02:42 thd kados: yes that regex is grep compatible
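A small sketch of thd's corrected pattern applied to a formatted dump, assuming the layout shown earlier in the log (three-digit tag, four indicator and spacing characters, then `_` and the subfield code). The first sample is record 32 as pasted by kados; the `185` line is a made-up example of a subdivision heading in the same layout.

```python
import re

# thd's corrected pattern: a 1XX or 4XX line whose first subfield is not _a.
UNESTABLISHED = re.compile(r"^[14]\d\d...._[^a]", re.MULTILINE)

# Record 32 from the log: the heading starts with _a, so no match is expected.
established_dump = (
    "LDR 00089nz   2200037o  4500\n"
    "150  0 _aWitchcraft\n"
    "      _zTexas\n"
    "      _zTexas Hill Country\n"
    "      _xFiction\n"
)

# Hypothetical line in the same layout whose lead subfield is _v.
subdivision_line = "185    _vPictorial works\n"

print(bool(UNESTABLISHED.search(established_dump)))
print(bool(UNESTABLISHED.search(established_dump + subdivision_line)))
```

`re.MULTILINE` makes `^` match at the start of every line, which is what a line-oriented tool does by default; the indented continuation lines cannot match because they do not start with a digit.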
02:44 thd kados: although do you not need grep /regex/ AUTH.mrc with the '/' for using a regex instead of a string match?
02:51 thd kados: this will find your extended subject records with a vim regex ^[147]\d\d...._a[^_]\+\n[^_]\+_[vzxy]
02:52 thd kados: grep probably does not need escaping the \+ but I do not know if it can match across the newline
02:54 thd kados: there is an easier search for 008 that should match those in 1XX
03:03 thd kados can you find any matches for ^008..............[^a]   ?
03:04 thd kados: I have no matches for the previous regex they are all for ^008..............[a]
03:05 thd kados: which means that every record is an established heading record
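The same test can be made on the 008 field itself rather than on a dump line. This is a sketch under the reading thd gives above: the tenth character (008/09) is `a` for an established heading record. The function names and the sample 008 values are invented for illustration; `|` is used as filler for positions this check ignores.

```python
def kind_of_record(field_008: str) -> str:
    """Return position 09 (the tenth character) of an authority 008 field."""
    if len(field_008) < 10:
        raise ValueError("008 field is too short to contain position 09")
    return field_008[9]


def is_established(field_008: str) -> bool:
    """True when 008/09 is 'a', the value thd finds in every pines record."""
    return kind_of_record(field_008) == "a"


# Made-up 40-character 008 values.
sample_established = "060317" + "|||" + "a" + "|" * 30
sample_other = "060317" + "|||" + "d" + "|" * 30

print(is_established(sample_established))
print(is_established(sample_other))
```

Indexing the field by position avoids counting dots in a regex, which depends on how many spaces the dump prints after the tag.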
03:07 thd kados: are you still awake?  I can tell you how this would work in my 650 example with architecture in Chicago
03:08 kados I'm here and listening
03:08 thd kados so given 650  #0$aArchitecture$zIllinois$zChicago$xHistory$vPictorial works.
03:09 thd kados: If we want to create that record ...
03:10 thd kados: we search for architecture history
03:10 thd kados: you can succeed in finding that at
03:10 kados lets actually do it in Koha
03:11 kados isn't that 'history of architecture'?
03:11 kados 'in illinois and chicago'?
03:11 thd kados: yet we do not have the authority records there although we could create them
03:11 kados lets create them
03:11 thd ok
03:12 thd kados: what is the test server?
03:12 kados go ahead and use
03:12 kados or
03:12 kados (but if you use it will not be in the live demo
03:13 kados and also, I haven't fully fixed authorities on
03:13 thd kados: well let us see if it will work where you have fixed them
03:13 kados k
03:14 thd kados: this cannot work yet.
03:14 thd kados: this requires 3 authority records
03:15 thd kados: I will create the first one and describe what should happen where the second and later ones should be used
03:15 kados why three?
03:17 thd There is no single authority record for $aArchitecture$zIllinois$zChicago$xHistory$vPictorial works.  except maybe in UNIMARC authorities where there is only one type of subject authority
03:17 thd kados: That is built from information contained in 3 authority records
03:19 thd kados: we can build all the required authority records but a change is needed to manage all 3
03:19 thd within a single 650 for the bibliographic record
03:20 thd kados: shall I build the 3 authority record that would be needed?
03:21 thd kados: or should I describe the process and then build them?
03:22 kados describe the process first
03:22 thd kados: ok
03:23 thd kados: so the first record needed is 150 $aArchitecture$xHistory
03:24 thd kados: Koha can now add both subfields to the 650 in the bibliographic record and link with $9 to the authority record
03:25 thd the framework may need $x for 150 if it is not here yet
03:26 thd kados: so I fill 650 with $aArchitecture$xHistory
03:26 kados ok
03:26 kados why not with $aArchitecture$zIllinois$zChicago$xHistory$vPictorial works.?
03:26 thd kados: now for the fun and confusion
03:26 kados (really, you don't fill 650, you fill 150, right?
03:27 thd kados: there is no such complete authority in the NACO database
03:28 thd kados: I believe that if NACO worked the way UNIMARC authorities must do that would be there in an authority record
03:29 thd s/NACO/MARC 21 authorities
03:30 thd kados: Instead we move our field position location to between the $a and $x in the 650
03:31 kados wait ... I'm confused
03:32 kados where in the auth record are you storing the values?
03:54 thd kados: I will repost what only the ether saw
03:55 thd <thd> kados: so right in the middle of  $aArchitecture  [right here]   $xHistory   we need a link to add more
03:55 thd <thd> kados: actually we should have links before and after every subject subfield if not for many other types of fields as well
03:55 thd <thd> kados: so now we will add the geographic subdivision in the correct place after $a
03:55 thd <thd> kados: so we search for Chicago (Ill.)
03:55 thd <thd> kados: and we find it but instead of adding 151 $aChicago (Ill.)
03:55 thd <thd> kados: in that same 151 geographic authority record is the form when used as a geographic subdivision
03:55 thd <thd> kados: that appears as 781 $zIllinois$zChicago in that same 151 $aChicago (Ill.) authority record
03:55 thd <thd> kados: so the system must know from the context when we are filling for a subdivision
03:55 thd <thd> kados: then it will use the 7XX form
03:55 thd <thd> kados: now we have 650  #0$aArchitecture$zIllinois$zChicago$xHistory with one more subfield to go, and who knows how the authority record is tracked except by one $9 for each subfield applied
03:55 thd <thd> so we go to the end of the subfield and search for picture books or something as a form subdivision
03:55 thd <thd> kados: and it returns the form subdivision that is not searchable at
03:55 thd <thd> kados: that should have been end of the field after the $x, not end of the subfield, one line above
03:55 thd <thd> kados: our search returns the form subdivision authority 185 $vPictorial works
03:55 thd <thd> kados: we append that to the end and we are done
03:55 thd <thd> kados: the system probably has to supply the final full stop to the last subfield
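The process thd describes above can be sketched in a few lines. This is a minimal illustration in Python, not Koha's Perl code: the record ids (`auth-1` and so on), the dictionary layout, and the function names are all hypothetical, and each authority record is reduced to the fields mentioned in the discussion (150, 151 with its 781 subdivision form, and 185).

```python
# Illustrative only: plain Python data, not Koha's schema or API.
AUTHORITIES = {
    "auth-1": {"tag": "150", "heading": [("a", "Architecture"), ("x", "History")]},
    "auth-2": {
        "tag": "151",
        "heading": [("a", "Chicago (Ill.)")],
        # 781 holds the form used when the place is a geographic subdivision.
        "subdivision_form": [("z", "Illinois"), ("z", "Chicago")],
    },
    "auth-3": {"tag": "185", "heading": [("v", "Pictorial works")]},
}


def linked(auth_id, subfields):
    """Pair every subfield with the id of the authority record it came from."""
    return [(code, value, auth_id) for code, value in subfields]


def build_650():
    topical = linked("auth-1", AUTHORITIES["auth-1"]["heading"])
    # In subdivision context, take the 781 form rather than the 151 heading.
    geographic = linked("auth-2", AUTHORITIES["auth-2"]["subdivision_form"])
    form = linked("auth-3", AUTHORITIES["auth-3"]["heading"])
    # Geographic subdivision goes between $a and $x; $v is appended last.
    return topical[:1] + geographic + topical[1:] + form


def render(subfields):
    """Render subfields in stored order; the system supplies the full stop."""
    text = "".join(f"${code}{value}" for code, value, _ in subfields)
    return text if text.endswith(".") else text + "."


field_650 = build_650()
print(render(field_650))
# $aArchitecture$zIllinois$zChicago$xHistory$vPictorial works.
print(len(field_650), "links to", len({a for _, _, a in field_650}), "authority records")
# 5 links to 3 authority records
```

Keeping the source record id next to every subfield is one way to model "one $9 for each subfield"; how Koha would actually store those links is left open in the discussion.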
03:57 kados hmmm
03:58 kados I think what you describe is possible in Koha
03:59 kados but I'm not sure why or how to fully use it
03:59 thd kados: from your question 650 in the bibliographic record was filled from 3 different types of authority records 150 topical, 151 geographic, and 185 for subdivision
03:59 kados ie, why would you _ever_ want three separate subject authority records for a single biblio?
04:00 kados those three are separate records?
04:00 kados (are they separate _types_ of subject auth records?)
04:01 thd kados: You would want the UNIMARC way if you were designing this from nothing but we have NACO with MARC 21 authorities
04:01 thd kados: yes those were three separate authority records
04:02 thd kados: before we say that MARC 21 is all bad consider this problem
04:03 thd kados: UNIMARC authorities would need a system generating all possible authority records in advance, or require the user to build them when they are missing, much the way we would have done to fill our biblio in this example
04:06 thd kados: MARC 21 systems will match against the pre-existing supply of 650 fields in biblio records but there are no authority records for very many common cases.
04:07 thd kados: 3 authority records referenced would be uncommon but 2 would be common.
04:08 thd 3 would not be unusual merely not prevalent
04:12 thd kados: having a $9 for each subfield could work with quite a bit of code change
04:12 thd kados: so you would have 5 $9 linking to 3 authority records for my example
04:13 kados hmmm
04:13 kados right now it is only possible to have a single authority record for a single bib tag ... right?
04:14 kados that's what you're saying?
04:14 thd kados: yes
04:14 kados I think you're also saying that it's only possible to have an authority record add values within a single tag -- whereas it should allow us to add values outside of a given tag
04:14 kados right?
04:16 thd kados you mean outside of a single subfield do you not?
04:17 kados no ... because currently it will already add multiple values within a single tag
04:17 kados ie if I have a subject authority that contains $a and $x in 150
04:17 kados when I add it to the 650
04:17 kados it will populate $a and $x
04:18 kados take a look at the auth record for Lewis, C. S.
04:18 kados to see that in action
04:18 kados (well ... look at the linked bib records off of that auth record)
04:18 kados you can do so from the opac now
04:18 kados
04:19 thd kados: I understand that you made that change earlier and that is all that is needed for the easy non-subject authorities
04:19 kados the first thing I must do
04:20 kados is to fix the authorities editor
04:20 kados so it is at least on par with the bib record editor
04:20 thd kados: that will also work for about half of subject fields maybe even a little more than that in the world of bibliographic records.
04:20 kados so that's good ... but I think we can do better
04:21 kados but like I said, let me see how paul has it set up now
04:22 thd kados: so this is for the subject headings of all the interesting books and all the extremely boring books, unless you are a specialist in whatever and find them extremely interesting
04:23 thd kados: there is another aspect of how paul has it set up
04:24 thd kados: Currently geographic 151 authority records would go with 651 subject headings but we needed to use them in our 650 as well.
04:25 thd kados: The framework design would need extension to accommodate that change
04:26 thd kados: Did you get enough funding for the generalised solution?
04:27 kados I still don't know what the generalized solution is :-)
04:27 thd kados: I just gave it to you in a vague directional outline
04:28 kados I'm still digesting it :-)
04:29 thd kados: the thing that troubles me is dividing $a from $x derived from a single 150 when adding the geographic qualification
04:32 kados I don't quite understand that (having trouble parsing that sentence)
04:32 kados you mean that we currently have to divide $a from $x because of Koha's limitations?
04:32 kados or that we should be able to divide them but Koha can't?
04:34 thd kados: It does not seem much of a great problem in the bibliographic record editor
04:38 thd kados: but when using to match the 150 $aArchitecture$xHistory to a record that had divided those two subfields with a geographic subdivision seems understandable but requires a level of search matching that requires extra thought.
04:40 thd kados: I mean when importing authorities to set up $9 for records that do not have $9 yet and for newly copy catalogued bibliographic records
04:47 thd kados: just to be clear separating $a from $x follows the practise used in existing records.  My description had removed all the limitations from Koha.
04:50 kados thd: let's examine
04:51 kados thd: and make one that works for
04:51 kados BEFORE RUNNING this script, you MUST edit it & adapt the %whattodo hash to fit your needs. It contains :
04:51 kados * as key, the code of the authority to be created. It's the one you've chosen (or will choose) in Koha >> parameters >> thesaurus structure >> add. It can be whatever you want. NP/CO/NG/TI/NC in CVS refers to UNIMARC French RAMEAU category codes.
04:51 kados * in values a sub-hash with the following values :
04:51 kados     taglist : the list of MARC tags using this authority
04:51 kados     key : the list of MARC subfields used as key for authority. 2 entries in biblio having the same key will be considered as the same.
04:51 kados     other : the list of MARC subfields not used as key, but to be copied in authority.
04:51 thd kados: do you mean the poor man's way? :)
04:51 kados     authtag : the field in authority that will be reported in biblio. Remember that all subfields in tag "authtag" will be reported in the same subfield of the biblio (in MARC tags that are in "taglist")
04:51 kados don't forget to define the itemfield. In UNIMARC, it should be 995, in MARC21, probably 852
04:52 kados yea
04:52 kados I just want it to work on the demo
04:52 kados (for now)
04:52 kados so it doesn't seem like a broken feature :-)
04:52 kados so we have two codes right now:
04:52 kados SUBJECT
04:52 kados AUTHOR
04:52 kados should we create others?
04:53 thd kados: that script combined with would be the starting point for as the existing code in is useless.
04:53 kados right
04:54 thd kados: we should also have UNIFORMTITLE
04:54 kados # the list of MARC tags using this authority
04:54 kados                                taglist => "700|701|702",
04:54 kados                                # the list of MARC subfields used as key for authority. 2 entries in biblio having the same key will be considered as the same.
04:54 kados                                key             => "a|b|c|d|f|x|y|z",
04:54 kados                                # the list of MARC subfields not used as key, but to be copied in authority.
04:54 kados                                other   => "j",
04:54 kados                                # the field in authority that will be reported in biblio. Remember that all subfields in tag "authtag" will be reported in the same subfield of the biblio (in MARC tags that are in "taglist")
04:54 kados                                authtag => "200",
04:54 kados do you understand what 'other' is?
04:55 thd kados: and SERIESTITLE
04:55 kados (notice also that multiple tags can be specified in 'taglist')
04:55 thd kados: where is the 'other' ?
04:56 kados # the list of MARC subfields not used as key, but to be copied in authority.
04:56 kados 23:53 < kados>                                 other   => "j",
04:59 kados thd: notice the last one in the hash has comments that I pasted above
04:59 thd kados: what cvs dir is this in?
05:00 kados misc/migration_tools
05:00 kados thd: what bib tags should use a SUBJECT authority?
05:00 kados I know 650 ... but what others?
05:01 kados thd: ?
05:01 thd kados: 6XX, except that we actually have multiple types of subject authorities even for $a
05:02 thd so 150 fills 650 $a for topical headings
05:03 thd 151 fills 651 $a for geographic headings
05:05 thd 100 fills 600 $a for personal name subject headings
05:05 kados so it seems like we need a different authority framework for each of these
05:06 thd 110 fills 610 $a for corporate name subject headings
05:06 kados but the problem is, there is no way to search across multiple authority frameworks is there?
05:06 kados so let's make a quick list of all the types of headings we'll need
05:06 thd kados: In what context are you wanting to search across multiple subject headings?
05:07 kados I don't know yet
05:07 kados let's just get the data in and then we can see what it does :-)
05:07 kados cause frankly I'm still confused by how subjects are supposed to work
05:07 kados should we do a minimal test case?
05:08 thd kados: under current behaviour if you are filling 650 it will or should only search 150 authorities
05:08 thd although it needs to search others for subdivisions
05:08 kados thd: do you know what 'other' is for in the hash?
05:09 kados here is what I have so far:
05:09 kados SUBJECT =>      {       taglist => "650",
05:09 kados                                key             => "a|i|x|k|l|m|n|q|y|z",
05:09 kados                                other   => "",
05:09 kados                                authtag => "150",
05:09 kados                        },
05:09 kados AUTHOR =>       {
05:09 kados                                taglist => "100",
05:09 kados                                key             => "a|b|c|d|f|x|y|z",
05:09 kados                                other   => "j",
05:09 kados                                authtag => "100",
05:09 kados                        },
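The role of `key` in the hash above ("2 entries in biblio having the same key will be considered as the same") can be illustrated with a small sketch. It is written in Python for illustration only and does not reproduce the Perl script in misc/migration_tools: the function names are hypothetical, and exact string comparison of subfield values is an assumption, since the discussion does not say how the script normalises them.

```python
# Hypothetical Python rendering of the settings kados pasted above.
WHATTODO = {
    "SUBJECT": {"taglist": "650", "key": "a|i|x|k|l|m|n|q|y|z",
                "other": "", "authtag": "150"},
    "AUTHOR": {"taglist": "100", "key": "a|b|c|d|f|x|y|z",
               "other": "j", "authtag": "100"},
}


def authority_key(subfields, key_spec):
    """Build a comparison key from only the subfields named in key_spec."""
    wanted = set(key_spec.split("|"))
    return tuple((code, value) for code, value in subfields if code in wanted)


def group_headings(biblio_fields, authcode):
    """Group biblio fields (tag, subfields) that would share one authority."""
    conf = WHATTODO[authcode]
    tags = set(conf["taglist"].split("|"))
    groups = {}
    for tag, subfields in biblio_fields:
        if tag in tags:
            key = authority_key(subfields, conf["key"])
            groups.setdefault(key, []).append(subfields)
    return groups


fields = [
    ("650", [("a", "Architecture"), ("x", "History")]),
    ("650", [("a", "Architecture"), ("x", "History"), ("9", "123")]),  # $9 is not a key subfield
    ("650", [("a", "Architecture"), ("z", "Illinois")]),
    ("100", [("a", "Lewis, C. S.")]),  # not in SUBJECT's taglist
]
groups = group_headings(fields, "SUBJECT")
print(len(groups))  # 2
```

The first two 650 fields collapse into one group because they differ only in a subfield outside `key`; the third stays separate because $z is part of the key.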
05:09 thd kados: I was looking for that when you pinged about the subsequent question for the multiple frameworks needed
05:09 kados thd: what do I put in 'other'
05:09 kados thd: and what subfields should 650 have and what subfields should 100 have?
05:11 thd easy answer first
05:11 thd 650 should have at least $a $z $x $y $v
05:14 thd kados: both are easy after reading
05:15 thd except maybe I will read some more to be sure I am right
05:15 thd kados: did you see an example from UNIMARC?
05:16 thd kados: an example of the key and an example of the other?
05:17 thd kados: I see an example and now check the UNIMARC documentation
05:20 thd kados: having checked the UNIMARC documentation the other used in the example makes no sense
05:21 thd kados: other should be empty or a numeric field
05:21 thd s/field/subfield/
05:21 kados I'm running it right now on just SUBJECT
05:21 kados with other as empty
05:22 thd kados: the example shows a key for an or boolean operating on all letter subfields
05:25 thd kados: I imagine it will be awhile building 650 for your 50k records :)
05:26 kados thd: it's not creating 'summary'
05:26 kados I wonder if that's what 'other' is for
05:26 thd kados: other is for excluded subfields
05:27 thd kados: other is empty in most examples given for UNIMARC
05:28 kados thd:[…]
05:28 thd kados: summary is for the framework not the authority records themselves
05:28 kados thd: do a search on Frontier
05:30 kados summary isn't getting populated for some reason
05:31 thd kados: maybe the templates were not fixed on this system or summary was empty all along for the subject authority framework
05:32 osmoze hello
05:33 kados hi osmoze
05:33 kados osmoze: are you familiar with paul's authorities system?
05:36 osmoze not really
05:36 thd kados: it seems to be working well except for the value of the 1XX from the authority to appear in the template
05:37 kados ?
05:37 kados ahh ... you mean the summary
05:37 kados I have no idea why it's not
05:37 kados since it is set up the same way as NAME
05:38 thd kados: summary actually was from the authority framework originally
05:39 kados ?
05:39 kados what do you mean?
05:39 thd kados: I see that for Lewis, C.S. only 100 $a appears in the summary column
05:39 kados right, but I could change that
05:39 kados what else do you want to show up there?
05:40 kados thd: ?
05:40 thd kados: summary is a framework column, it is not the right name for what ought to be called the authorised heading column
05:41 thd or something like authorised heading
05:41 kados ok I'll change it
05:41 kados what fields should show up for the NAMES authorized heading?
05:41 thd kados: summary as a column may have been meant to show the framework type originally
05:42 kados $a $q $d according to LOC
05:42 kados ok ... they should show up now
05:43 thd kados: more than that although that was there in the case of CS Lewis
05:44 thd kados: there is some code for creating the correct HTML in that can be adapted from 6XX use
05:46 thd kados: that will capture all the subfields in that may be present in the correct order.
05:48 kados thd: got it!
05:48 kados thd: so let's talk about what the authorized heading should look like for SUBJECT
05:48 kados what subfields should it have in what order?
05:54 thd kados: well, that is easy; we do not even need to change the code really
05:55 thd kados: you do not inform the system what order the subfields should be in you read that from the system
05:55 thd s/from the system/from the record/
05:55 kados we don't have that choice unfortunately
05:56 thd kados we do have a choice and the code is already written
06:01 thd sorry not but
06:01 thd kados: getMARCsubjects
06:04 kados ok ...
06:04 kados so the authorized heading should be built using that SQL?
06:05 thd kados: obviously you need only one known 6XX to match one authority framework starting at $a
06:05 kados that will require re-writing paul's use of ISBD for display of the authorized heading
06:06 thd kados: oh yes the whole ISBD system in Koha is backwards
06:07 kados :-)
06:07 kados how so?
06:07 thd kados: everything throughout Koha should follow the model of getMARCsubjects
06:08 kados for display you mean
06:08 thd kados: order should be read from the record not set by the system
06:09 kados before we start that ... what should the subfields be for a NAME authorities record?
06:09 kados so I can restart the batch process
06:10 kados then I will take a look at ISBD
06:10 thd kados: the system should only display which fields and subfields are included not their relative order within a field or repeated set of fields
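thd's point, that the system chooses which subfields to show while their order comes from the record, can be shown by contrasting two renderers. This is a hedged Python sketch with hypothetical function names; it does not reproduce getMARCsubjects or Koha's ISBD code, and joining values with a space is an arbitrary display choice.

```python
def heading_in_record_order(subfields, display_codes):
    """Keep only the chosen subfield codes, in the order the record stores them."""
    return " ".join(value for code, value in subfields if code in display_codes)


def heading_in_fixed_order(subfields, display_codes):
    """Contrast: a system-imposed order, which can scramble a heading."""
    parts = []
    for wanted in display_codes:
        parts.extend(value for code, value in subfields if code == wanted)
    return " ".join(parts)


record_650 = [
    ("a", "Architecture"),
    ("z", "Illinois"),
    ("z", "Chicago"),
    ("x", "History"),
    ("v", "Pictorial works."),
]

print(heading_in_record_order(record_650, "axzv"))
# Architecture Illinois Chicago History Pictorial works.
print(heading_in_fixed_order(record_650, "axzv"))
# Architecture History Illinois Chicago Pictorial works.
```

With the fixed order, $x is pulled ahead of the geographic subdivision, which is exactly the separation of $a and $x that the record was built to express.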
06:14 thd abcdefghjklmnopqrstvxyz
06:14 kados thd: are you getting that from here:
06:15 kados[…]head.html#mrca100
06:15 kados ?
06:15 thd kados: well that is one place but most of those would never be found in a record
06:18 thd kados: abcqd are the most common
06:19 thd with e for good measure
06:19 kados do all of the heading types listed on that page correspond to the tags in bibliographic records?
06:19 kados ie, do they map exactly?
06:20 thd kados: it is a one to many mapping
06:21 kados er?
06:21 kados so you mean that tag 100 in an authority record maps to many bib record tags?
06:22 thd kados: authority 100 maps to bibliographic 100, 600, 700 commonly and maybe others less commonly
06:23 kados but the value in 100 and 600 and 700 is always the same right?
06:23 kados it's that problem of MARC not being normalized?
06:23 kados (ie, the same value is in three places)
06:24 thd kados: not the same value in the same record unless it is an autobiography
06:24 thd kados: and then there would be no 700
06:25 kados so it only goes in one place then
06:25 kados how do we know which place it goes in?
06:26 kados ie, personal names ... do they always go in 100 $a?
06:26 thd kados: authority 100 is for a personal name and goes in at least 100, 600, and 700 if applicable for that bibliographic material being catalogued
06:27 kados so it _does_ put the same value in multiple places
06:27 thd kados: 700 is for an additional author if there is a co-author
06:28 kados
06:28 thd kados: the same value would only be the case where the author 100 and the subject 600 were the same
06:28 kados thd: so should we have a separate authority type for each individual type?
06:28 kados ie for name there are many types:
06:28 kados    * Personal names (X00)
06:28 kados    * Corporate names(X10)
06:28 kados    * Meeting names (X11)
06:28 kados    * Names of jurisdictions (X51)
06:28 kados    * Uniform titles (X30)
06:28 kados    * Name/title combinations
06:28 thd kados: yes
06:29 kados should we try to pack them all into NAME? or should they all be separate authority types?
06:29 thd what is NAME?
06:29 kados
06:29 kados NAME would be a higher-level grouping of all of those types
06:30 kados and SUBJECT would be a higher-level grouping of the types:
06:30 kados    * Chronological terms (X48)
06:30 kados    * Topical terms (X50)
06:30 kados    * Geographic names (X51)
06:30 kados    * Names with subject subdivisions
06:30 kados    * Terms and names used as subject subdivisions
06:31 thd kados: name is an authority concept that is not helpful to OPAC users
06:31 kados or should all of those types have their own auth types?
06:31 thd kados: OPAC users often expect to search all author types or all title types
06:32 thd kados: although it can be useful for searching names as subjects
06:33 kados so we need 'auth group'
06:33 kados so we can group together the individual types for searching
06:33 thd kados: so name authorities can be useful to OPAC users
06:34 thd kados: this grouping that you are describing is not part of Koha now is it?
06:34 kados no
06:35 thd kados: except for branches
06:35 kados how do we tell when to put the values from 100 into 600 and 700?
06:35 thd kados: in what context?
06:36 thd kados: for ?
06:36 kados yes
06:36 kados (so I will delete NAME and SUBJECT and create the many types i listed above ... sound right to you?)
06:38 thd kados: you search each 100, 600, 700 in the bibliographic record for a matching 100 authority record or the other way around
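The one-to-many mapping and the matching step thd describes can be sketched as follows. This is illustrative Python, not Koha code: the mapping lists only the pairs named in this discussion, the authority id `n1` and the function name are hypothetical, and matching on $a alone is a simplifying assumption.

```python
# Authority tag -> bibliographic tags, as named in the discussion (not exhaustive).
AUTH_TO_BIB = {
    "100": ["100", "600", "700"],
    "110": ["610"],
    "150": ["650"],
    "151": ["651"],
}


def find_links(biblio_fields, authorities):
    """Return (bib_tag, auth_id) pairs where $a matches an authority heading."""
    links = []
    for auth_id, (auth_tag, heading_a) in authorities.items():
        for bib_tag, subfields in biblio_fields:
            if bib_tag not in AUTH_TO_BIB.get(auth_tag, []):
                continue  # this authority type does not control this bib tag
            if dict(subfields).get("a") == heading_a:
                links.append((bib_tag, auth_id))
    return links


authorities = {"n1": ("100", "Lewis, C. S.")}
biblio = [
    ("100", [("a", "Lewis, C. S.")]),
    ("600", [("a", "Lewis, C. S."), ("x", "Biography")]),
    ("700", [("a", "Tolkien, J. R. R.")]),
    ("650", [("a", "Lewis, C. S.")]),  # same text, but 650 is not controlled by a 100 authority
]
print(find_links(biblio, authorities))
# [('100', 'n1'), ('600', 'n1')]
```

The same authority links to both the author field and the subject field of one record, which is the autobiography case mentioned earlier.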
06:39 thd kados: yes we need several types in the current flat arrangement that could be hierarchical with a few more columns for the framework
06:42 thd kados: I had thought that there may have been a problem with not showing the subject because the key had an or connector.  Try building where the key is only 'a'.
06:44 kados thd:[…]dmin/
06:44 kados thd: does that look right?
06:44 kados thd: is that what you had in mind?
06:47 kados thd: ?
06:48 thd kados: series is missing
06:48 kados I don't see it in the concise authorities list
06:50 thd kados: no and I do not see jurisdiction name
06:51 thd kados: there is genre/form though
06:51 kados
06:51 kados Names of Jurisdictions
06:51 kados and also very tired
06:54 thd kados: authority uniform title must also be used for series title when applied to the bibliographic record
06:56 kados I will have to continue working on this tomorrow
06:57 kados I still don't understand how our framework even comes close to providing what we need
06:57 kados but we can discuss it tomorrow :-)
06:57 kados good night thd
06:57 thd kados: that name jurisdiction is what the geographic name is mapped to, but I think it only needs the 151 geographic name authority, unless we need one authority type for every controlled bibliographic field
06:58 thd kados: you never had so much fun
06:58 thd kados: the fun will be spoilt if you are tired though :)
06:59 thd kados: I will see you at some mutually awake hour which will probably not still be the morning :)
06:59 thd good night kados

