Time Nick Message
12:01 owen I think my wireless router was built by monkeys.
12:01 paul good luck with your jungle owen.
12:01 paul & bye all, dinner time for me
12:01 kados bye paul
12:03 owen kados: so WIPO reports that recent acquisitions is working from opac-search.pl but not opac-main.pl?
12:04 owen No--the other way around?
12:05 kados working on opac-main
12:05 kados not on opac-search
12:05 kados wipoopac.liblime.com
12:06 kados question about your sysprefs
12:06 kados 1. opacheader, Textarea, 30|10, Enter HTML to be included as a custom header in the OPAC
12:06 kados what's 30|10?
12:06 owen that's the columns/rows numbers for the system preference setup
12:07 owen "variable options" under "Koha internal"
12:07 kados ahh ... never done one of those before
12:07 owen Yeah, I think opaccredits needs to be updated that way too. I think right now it's just a single-line entry
12:07 kados right
12:08 kados owen: do you feel comfortable making the change to updatedatabase?
12:08 kados owen: all you need to do is add one block
12:08 owen I can look at it, but I've never touched it before.
12:08 kados owen: for an example, take a look at the Amazon sysprefs I created
12:08 kados it's not hard at all
12:09 kados bbiab
12:11 ToinS bye all....!
16:01 kados thd: you there?
16:02 kados thd: i thought I remembered you saying that the native Alaskan scripts we had been working on recently were mapped on the LOC website
16:02 kados thd: but I can't find that chart anywhere
16:19 thd kados: Well, at least the Cyrillic scripts are working. The 11th and 12th records had been in Cyrillic and not a native Alaskan language.
16:20 thd kados: I may not have native Alaskan in the sample that I have been testing. I see something where I cannot interpret the characters in vim, and I know it is not English :)
16:24 thd kados: I had been having XML::Parser errors bringing processing to a halt on the first record encountered with Cyrillic previously.
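The "one block" kados asks owen to add to updatedatabase might look something like the sketch below. The variable name, `Textarea` type, `30|10` (columns|rows) options, and explanation text come from the conversation above; the `INSERT IGNORE` form and column names follow Koha's MySQL `systempreferences` table, but the exact block is a hypothetical illustration (the surrounding updatedatabase script provides `$dbh`), not the actual patch:

```perl
# Hypothetical updatedatabase block: register the opacheader system
# preference as a Textarea with 30 columns and 10 rows.
$dbh->do(q{
    INSERT IGNORE INTO systempreferences
        (variable, value, options, explanation, type)
    VALUES
        ('opacheader', '', '30|10',
         'Enter HTML to be included as a custom header in the OPAC',
         'Textarea')
});
```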
16:35 kados thd: aha!
16:35 kados thd: so then, the solution is to first convert to utf-8
16:36 kados thd: before doing anything else
16:36 kados thd: that can most easily be done like this:
16:36 kados my $uxml = $record->as_xml;
16:36 kados my $newrecord = MARC::Record::new_from_xml($uxml, 'UTF-8');
16:37 thd kados: yes, that is the solution, and if the native Alaskan records have problems they can be imported in MARC-8
16:38 thd kados: your XML solution would not work for me when it came to the 11th record.
16:38 kados thd: really?
16:39 kados are you sure it was those instructions failing?
16:39 kados and not something else?
16:39 kados I'll do a test case on my machine
16:40 thd kados: as soon as XML::Parser met those instructions for the 11th record, everything died.
16:40 kados hmmm
16:40 kados so that _does_ indicate that the encoding mapping wasn't working
16:40 kados but I need a test case to prove it to myself
16:40 kados and so I can show the error to Ed Summers
16:41 thd kados: I know that you did not seem to be able to reproduce the XML::Parser error on your system.
16:41 thd kados: At least I worked around it for any system :)
16:45 kados thd: my test fails on record 11 also
16:46 thd kados: I do have one Cyrillic character in the 11th and 12th records for which I have no glyph in UTF-8 using whatever font is default for Koha.
16:46 kados interesting
16:46 thd kados: I suspect that display within Koha is merely a font issue.
16:48 thd kados: if I open the corresponding Z39.50 client pages using fonts set by my own client style sheet and UTF-8 conversion in YAZ, then all looks well.
16:49 kados interesting
16:49 thd kados: Records 11 and 12 correspond to original records 15 and 16 in the full set.
16:49 kados right
16:50 thd kados: look at 15.html and 16.html saved by LWP.
16:52 kados why don't they match up to 11.html and 12.html?
16:53 thd kados: if the automated script had found every one of the original records then they would match.
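The two lines kados pastes can be placed in a complete loop roughly as follows. This is a sketch, not the actual script: it assumes the CPAN modules MARC::Batch and MARC::File::XML (which injects `as_xml()` and `new_from_xml()` into MARC::Record), and a hypothetical input file `records.mrc` of MARC-8 encoded records:

```perl
use strict;
use warnings;
use MARC::Batch;
use MARC::File::XML ( BinaryEncoding => 'marc8' );  # treat binary records as MARC-8

my $batch = MARC::Batch->new( 'USMARC', 'records.mrc' );
while ( my $record = $batch->next ) {
    # Round-trip through MARCXML: as_xml() transcodes the MARC-8
    # record to UTF-8 XML, and new_from_xml() parses that back into
    # a MARC::Record object that can safely be handled as UTF-8.
    my $uxml      = $record->as_xml;
    my $newrecord = MARC::Record::new_from_xml( $uxml, 'UTF-8' );
    print $newrecord->title, "\n" if $newrecord->title;
}
```

This is the round-trip that died for thd inside XML::Parser on the 11th (Cyrillic) record.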
16:54 thd kados: however, the 11th record has an original record ID of 15 recorded in the extra values file.
16:55 kados thd: the error I get is:
16:55 kados utf8 "\xEC" does not map to Unicode at /usr/local/lib/perl/5.8.4/Encode.pm line 167.
16:55 kados when I attempt to do the marc8->utf8 using the new_from_xml routine
16:57 thd kados: I could not run the script far enough to see that error because I had the XML::Parser error stopping everything.
16:58 kados thd: i just sent a message to Mike and Ed with the test case
16:58 kados thd: hopefully they'll have a chance to take a look soon
16:59 thd kados: although I have not removed it yet, does saving the file in UTF-8 by opening it in that mode not seem to be a possible source of difficulty?
17:00 kados is it opened in utf8 mode?
17:00 kados the outfile is, but the infile is non-specific
17:00 thd kados: Perl should just be saving whatever is in the record, converted already or not.
17:00 thd kados: I was referring to the outfile.
17:00 kados I don't think that should matter
17:01 kados I presume that the data is in utf-8 before it is saved to the filehandle
17:01 thd kados: you mean that behaviour would be no different without opening the outfile in UTF-8?
17:02 kados it would be pretty simple to test ;-)
17:02 thd kados: yes, the conversion is done before saving.
17:03 thd kados: Yet what is opening the outfile in a different format supposed to actually do?
17:03 kados thd: just tested it, the behavior is the same
17:03 kados thd: it sets perl's utf8 flag
17:04 thd kados: Do you mean that it stores meta information about encoding in a non-existent file meta-bit?
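The class of error kados quotes can be reproduced with core Perl alone: hand Encode a byte string that claims to be UTF-8 but is not. The isolated byte `\xEC` is the first byte of a three-byte UTF-8 sequence, so on its own it "does not map to Unicode" — which is what happens when MARC-8 bytes are passed to a UTF-8 decoder without being transcoded first. A minimal sketch:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xEC";    # valid in MARC-8 / Latin-1, invalid as complete UTF-8

# FB_CROAK makes decode() die on malformed input instead of
# silently substituting replacement characters.
my $ok = eval { decode( 'UTF-8', $bytes, Encode::FB_CROAK ); 1 };
print $ok ? "decoded\n" : "decode failed: $@";
```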
17:04 kados thd: which ensures the data is written as utf-8 and not mangled by perl's internals
17:05 kados I have no clue how perl stores utf8 internally
17:05 kados but I do know there is a flag that marks data as utf8 or not
17:05 kados to properly write out utf-8 that flag must be set when opening a filehandle
17:06 thd kados: Perl ought to be able to write any arbitrary encoding that I just invented today by writing whatever characters I tell it to write.
17:07 thd s/characters/bytes/
17:08 kados right
17:09 thd kados: Perl should not mangle anything unless I am counting string lengths in bytes and not characters, but we are not counting string lengths of non-ASCII data in this code.
17:10 kados it still fails on the 11th record when removing that specification
17:10 kados so it's a moot point IMO
17:11 thd kados: OK, I was just curious to be sure it was not doing extra encoding or something.
17:12 thd kados: I could imagine it encoding each byte in UTF-8, but I would certainly have expected to see different output from what I have were that the case :)
17:13 kados right
17:13 kados yeah, I think I tried that before, because I was worried it was double-encoding or something
17:18 thd kados: I had felt perfectly awake when we were communicating this morning even though I should have felt tired. Inability to see came over me within a couple of hours and I slept until a short time before you pinged me just now.
17:18 kados ahh sleep :-)
17:19 thd kados: It is strange how I can go from feeling perfect to not being able to function very rapidly, especially if I eat something :)
17:20 kados heh
23:15 rach although you get the carb crash when you eat too many carbohydrates in one go
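The outfile question thd and kados debate above can be sketched in core Perl. Opening a handle with an `:encoding(UTF-8)` layer tells Perl to encode decoded (utf8-flagged) character strings on write; if the data is already UTF-8 bytes, a raw handle writes it through unchanged, which is why kados saw the same behaviour either way. A minimal sketch using an in-memory filehandle:

```perl
use strict;
use warnings;

my $text = "\x{0416}";    # CYRILLIC CAPITAL LETTER ZHE, a decoded character string

# Open an in-memory handle with the same layer the outfile used.
# On write, Perl encodes the one character to its two-byte UTF-8 form.
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer or die $!;
print {$out} $text;
close $out;

printf "wrote %d byte(s): %vX\n", length($buffer), $buffer;   # wrote 2 byte(s): D0.96
```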