Time  Nick      Message
10:30 hdl       Since the charset problem comes up at each connection.
10:29 hdl       The problem now is to make good use of your results and of your remarks on our databases and in Koha.
10:29 pierrick_ thank you :-)
10:28 hdl       But you did smart testing from scratch.
10:28 pierrick_ hello osmoze
10:27 pierrick_ the only thing I made was to start from scratch with full UTF-8 from the beginning to the end
10:27 pierrick_ hdl: no no, I didn't find anything more than what Paul found on usenet
10:13 hdl       pierrick_: Congratulations for set names feature ;)
10:13 osmoze    hello
08:49 pierrick_ sorry, I was not reading IRC, mail was sent
08:31 hdl       can you send a private copy to me... Will be faster : same mail as you, except henridamien
08:18 hdl       I shall tell you if that is the case.
08:17 pierrick_ maybe I miss the point, we'll see
08:17 hdl       "La vie est belle" ;)
08:17 hdl       Anyway, thanks to your forthcoming email, I will be able to work in UTF-8. :)
08:12 pierrick_ I understand
08:11 hdl       This is why DADVSI matters for me.
08:11 hdl       And DADVSI is a matter of culture. French librarians are concerned about it. FSF France is concerned too, since it would push Open-source Software, which CANNOT have DRMs by construction, to the fringe (unless Sun makes its Open-Source DRM a success and a standard).
08:08 hdl       but they are lurking around.
08:06 pierrick_ and I don't really feel implicated since I only listen to radio... my concern is about free software, and software patents were rejected months ago
08:05 pierrick_ not at all, too much confusing for me
08:05 hdl       Have you followed DADVSI ?
08:04 pierrick_ maybe I didn't finally understand what was not working
08:04 pierrick_ (it works so nicely that I don't understand why Paul has spent so much time on this issue)
08:04 hdl       (that is : did you get the problem)
08:03 hdl       Are you clear ?
08:03 hdl       Does it work better ?
08:03 hdl       Still on zebra import.
08:03 pierrick_ I'm writing an email to koha-dev telling what tests I've made about Perl/MySQL/UTF-8
08:03 hdl       quite good.
08:02 pierrick_ Hi hdl, I'm fine, how do you do ?
07:59 hdl       How are you ?
07:59 hdl       pierrick_: hi
06:38 hdl       chris : In fact it is not a kind of DMCA, it is WORSE :(
05:41 chris     (babelfish doesnt do a very good job of translation, and the french i learnt when i was 12, i have all forgotten :-))
05:41 chris     is it kind of like the DMCA in the US?
05:39 chris     from what I can tell, it removes any rights to make a private copy ?
05:39 hdl       They withdrew a law article at the beginning of the vote. Then reintroduced it only to vote against it and to vote for amendments which promote THEIR vision. AWFUL !
05:37 chris     ohh, that is bad news
05:37 hdl       The national chamber is voting this through without objections and without listening to the other side.
05:36 hdl       yeah.
05:35 chris     ohh DRM stuff
05:34 chris     hmmm, ill have to babelfish that
05:28 hdl       http://www.ratiatum.com/news2931_DADVSI_la_defaite_socialiste_dans_le_calme.html
05:26 hdl       hi
02:47 thd       good night
02:47 thd       you're welcome kados
02:47 thd       kados: you need documentation to write a non-blocking asynchronous client
02:46 kados     thanks for your help thd
02:46 kados     ok ... so I need to rewrite the z3950 client tomorrow :-)
02:46 thd       :)
02:46 kados     which doesn't yet support the old syntax
02:46 kados     just Net::Z3950::ZOOM
02:46 kados     Net::Z3950 isn't installed :-)
02:45 kados     ahh ... that's my problem
02:44 kados     Execution of ./processz3950queue aborted due to compilation errors.
02:44 kados     Bareword "Net::Z3950::RecordSyntax::UNIMARC" not allowed while "strict subs" in use at ./processz3950queue line 262.
02:44 kados     Bareword "Net::Z3950::RecordSyntax::USMARC" not allowed while "strict subs" in use at ./processz3950queue line 261.
02:44 kados     # ./processz3950queue
02:44 kados     interestingly:
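A minimal, synchronous sketch of the ZOOM-style calls that replace the old Net::Z3950::RecordSyntax barewords, for reference only; a real rewrite of the daemon would still need the non-blocking asynchronous API thd mentions, and $host, $port, $db and the query are placeholders:

    use ZOOM;   # installed as part of Net::Z3950::ZOOM

    # ask for the record syntax by name instead of using the old constants
    my $conn = ZOOM::Connection->new( $host, $port, databaseName => $db );
    $conn->option( preferredRecordSyntax => 'usmarc' );   # or 'unimarc'
    my $rs  = $conn->search_pq('@attr 1=4 cezanne');      # illustrative query
    my $raw = $rs->size() ? $rs->record(0)->raw() : undef;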
02:43 kados     looks like it just died about a week ago and no one noticed
02:43 thd       kados: what is different about the 2 servers?
02:43 kados     and this server was working too
02:42 kados     strangely, I have it running fine on another server
02:41 thd       kados: If they had continued then I would actually know what to put in a proper FAQ
02:41 thd       kados: about 5 messages later most users would give up.
02:40 kados     yea
02:40 thd       kados: It was mostly the starting point for identifying the problem
02:39 kados     yep
02:39 thd       kados: did you receive my message?
02:39 kados     hmmm ... I still can't get it going
02:35 thd       kados: you should have it now
02:34 kados     thd: got that faq handy?
02:31 kados     what am I forgetting?
02:30 kados     No directory, logging in with HOME=/
02:30 kados     Koha directory is /home/nbbc/koha/intranet/scripts/z3950daemon
02:30 kados     sam:/home/nbbc/koha/intranet/scripts/z3950daemon# ./z3950-daemon-launch.sh
02:28 kados     well if _I_ can't get it going ... :-)
02:28 kados     heh
02:28 thd       s/server/client/
02:28 thd       kados: No I was going to make it into a FAQ but very few users who had trouble had the patience to get the server running
02:27 kados     is it on kohadocs?
02:26 thd       s/server/client/
02:26 kados     maybe
02:26 thd       kados: do you need my Koha Z39.50 server hints message?
02:25 thd       kados: I wish mutt or SquirrelMail would allow reading mail in any encoding and sending in UTF-8 but that is an either or choice for the present.
02:25 kados     very strange
02:25 kados     can't get it going
02:24 kados     I run it so rarely that I forget if I'm doing it correctly
02:23 kados     in fact, I'm troubleshooting getting the z3950 daemon running on one of my servers
02:23 thd       kados: no not tonight :)
02:23 kados     but I don't think I can fix that tonight :-)
02:23 kados     good points
02:22 thd       kados: Almost no Western European language user wants to send UTF-8 email because it will look like junk to most recipients.
02:19 thd       kados: Users of western European languages do not have UTF-8 locales on their home systems nor do many of your potential customers on their office systems.
02:16 thd       kados: If I type Cézanne, I should find something from my ISO-8859 locale with query normalisation.  I should also find something if I type Cezanne, even though the authority controlled values will always have Cézanne in UTF-8.
02:02 thd       s/quarry/query/
02:02 thd       kados: and we also need query normalisation and index normalisation
02:01 thd       kados: we need a routine for detecting and converting the home user locale on quarry submission
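A rough sketch of one way to do the normalisation being asked for here, assuming the query string has already been decoded to Perl characters; strip_diacritics() is a hypothetical helper, not existing Koha code:

    use Unicode::Normalize qw(NFD);

    sub strip_diacritics {
        my $text = NFD(shift);   # decompose: é becomes e + combining acute
        $text =~ s/\pM//g;       # drop the combining marks
        return $text;            # "Cézanne" and "Cezanne" now compare equal
    }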
02:00 thd       kados: She could create records fine but could not import them.
01:59 thd       s/UTF/MARC/
01:59 thd       kados: Carol had suspected her fonts and did not understand about UTF-8.  I guess she was at least partly right.
01:58 thd       kados: I forced a font change in Firefox itself
01:57 thd       kados: that cured it for LibLime but not my system.  I will try my own system again later
01:54 kados     what's next
01:54 kados     lets move on
01:54 kados     but yes, that should work too
01:54 kados     seems easier to just change the default font in Koha
01:53 thd       kados: so if I force firefox to display a different font it will be cured?
01:53 kados     soon as i switched to sans-serif it worked fine
01:53 kados     that seemed to be the problem
01:53 kados     no we have to fix the Verdana font :-)
01:52 kados     weird
01:52 thd       kados: we have to fix Firefox now :)
01:52 thd       kados: that was only before you fixed it
01:50 kados     does it still work on your system?
01:50 kados     dunno ... I never tried it
01:49 thd       ?
01:49 thd       kados: when you searched on LibLime with Cezanne (no accents), did you not find records, or was that just my system?
01:49 kados     you mean the search?
01:49 kados     what?
01:48 thd       kados: maybe but was it not working previously with no accents?
01:47 kados     thd: are you convinced?
01:46 kados     that's what I wanted you to search with in the first place but I couldn't paste it in correctly :-)
01:46 thd       kados: now I will try searching with UTF-8
01:46 thd       kados: I have no results from my search
01:44 kados     can we close this topic once and for all? :-)
01:44 kados     like I said before, it's a font / browser issue
01:44 kados     do a search on Cézanne
01:44 kados     http://opac.liblime.com/cgi-bin/koha/opac-search.pl
01:43 kados     thd: i changed the font on the results screen
01:42 thd       kados: your locale is utf-8?
01:42 kados     properly accented characters
01:41 thd       kados: what do you see in the source?
01:39 kados     thd: view source on the opac-detail page
01:37 thd       kados: I will see if I can produce a link for you from my system
01:37 kados     hmmm
01:37 thd       not on Google
01:37 kados     I don't see any corruption
01:34 kados     where?
01:33 thd       kados: code is corrupting the values as in the complaint about a Portuguese letter
01:33 kados     and that's a font or browser issue ... let's move on :-)
01:33 kados     it's right everywhere but the links
01:33 kados     and it's right on the normal view for the heading and it's right in the marc view
01:32 kados     it's right in the marc tables and it's right in the koha tables
01:32 thd       kados: that was my first thought but my eyes are better than that or do I need glasses for my perfect vision now?
01:32 kados     the font only makes it look that way when it's a link
01:32 kados     (yea, this channel is iso-8859)
01:32 thd       kados: I do not see the accent on #koha
01:32 kados     for some reason, the font we're using makes it look like the accents are on the 'z'
01:31 kados     I think it's just a trick of the eyes
01:31 thd       kados: then the links are bad from a Firefox bug for you?
01:31 kados     accent is on the e
01:31 kados     |        23783 | NULL   | Cézanne & Poussin : | NULL     | NULL  |   NULL | NULL        |          1993 | 20060309192428 | NULL     |
01:30 kados     mysql> select * from biblio where biblionumber='23783';
01:30 kados     the koha tables look fine to me:
01:29 thd       Kados:  marc_subfields_table looks fine
01:28 kados     what about the marc tables?
01:28 kados     er?
01:28 thd       kados: I can see the data in the original Koha tables and it is wrong
01:27 kados     thd: dunno
01:27 thd       kados: that is a Firefox bug?
01:27 kados     thd: when there are accented chars in them
01:27 kados     thd: it's just that firefox is formatting links strangely
01:27 kados     thd: they aren't
01:26 thd       kados: yes except we only need to track down the problem where the original Koha tables must now be getting improperly converted MARC-8
01:25 kados     we don't need MARC-8 to work now that we convert everything to UTF-8 right?
01:25 thd       kados: a huge number of bug fixes for show-stopping bugs if only MARC-8 worked correctly
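For reference, a minimal sketch of the conversion path being relied on here, assuming a MARC::Charset recent enough to export the functional interface:

    use MARC::Charset qw(marc8_to_utf8);

    # MARC-8 bytes in, UTF-8 out -- e.g. applied to a single subfield value
    my $utf8_value = marc8_to_utf8($marc8_value);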
01:24 kados     thd: what's your impression of the progress we've made between 2.2.5 and 2.2.6?
01:24 kados     good news
01:24 kados     ahh
01:24 thd       kados: My claim about ghost data even after the delete option was applied came from an old Koha MARC export that I had in the import directory but have subsequently deleted so it is no more.
01:23 kados     none of us can import utf-8 correctly using utf-8 encoded tables
01:23 kados     good, so we are all on the same page
01:22 thd       kados: yes, I reimported with the delete option afterwards
01:22 kados     yes
01:22 thd       kados: Do you mean bulkmarcimport.pl that we had fixed by your suggestion?
01:21 kados     (ie, are the bad values from previous imports, or are they from current imports?)
01:21 thd       kados: They do not even look like UTF-8 values
01:20 kados     thd: could you tell me whether the new bulkmarcimport.pl correctly inserts data into those tables?
01:20 thd       kados: my Koha tables are UTF-8 but they have bad values
01:18 kados     so the last test I'd like to try tonight
01:17 thd       for MARC 21
01:16 thd       kados: I have seen Z39.50 servers claiming to have records in ISO-8859
01:16 kados     anyway ... we digress
01:16 thd       kados: well I suggested that they did earlier
01:15 kados     hehe
01:15 kados     yea but they exist right?
01:15 thd       kados: what kind of criminal do you think I am :)
01:15 thd       kados: those are illegal
01:14 kados     thd: if you have any MARC records in iso8859 (MARC21) I'd be interested in seeing what happens when they are imported under the new scheme
01:13 thd       kados: not knowing Chinese I could not tell
01:13 thd       kados: it was working fine for Carol Ku except for problems that I had supposed were related to the previous lack of MARC-8 support
01:12 kados     i don't understand why utf-8 works fine on my mysql since I haven't changed the tables to handle utf-8 :-)
01:10 thd       kados: so I could not see expected problems because MARC-8 data was still MARC-8 inside Koha until I fixed the CVS update path this morning
01:09 thd       kados: then I dropped the original ISO-8859 database and had been very happy except that I had confused the CVS update path for a few weeks
01:08 kados     (how did you convert them?)
01:08 kados     are all your tables utf-8?
01:08 kados     so what you're telling me is that mysql utf-8 works fine for you right?
01:08 kados     hmmm
01:07 thd       kados: then I imported that into a database built with UTF-8 defaults
01:06 thd       kados: i had taken my original rel_2_2 dump and changed all the encodings from ISO-8859 to UTF-8
01:06 thd       kados: certainly marc_subfields_table with the MARC data had been fine
01:05 thd       kados: actually everything should have been UTF-8 except for update changes from CVS
01:04 thd       kados: In fact I rebuilt the current Koha DB with UTF-8 default
01:03 kados     do they work properly?
01:03 thd       kados: Some are on my system
01:03 kados     the marc tables aren't in utf-8 currently
01:03 thd       then reimport
01:03 kados     no table in koha 2.2 is in utf-8
01:03 thd       ALTER TABLE whatever to UTF-8
01:02 kados     convert to what?
01:02 kados     convert?
01:02 thd       kados: If we convert the original Koha tables all will be fine and happy.
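A sketch of the conversion thd is suggesting, assuming MySQL 4.1+ and an existing DBI handle ($dbh); the table list is only illustrative and running against a backup first is prudent:

    # CONVERT TO re-encodes the column data as well as the table defaults
    for my $table (qw(biblio biblioitems marc_subfield_table)) {
        $dbh->do("ALTER TABLE $table CONVERT TO CHARACTER SET utf8");
    }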
01:01 kados     except the strange fact that links shift the accents (which I bet is a browser problem)
01:01 kados     thd: so have we solved all your problems?
01:00 thd       s/9/8/
01:00 thd       MySQL does not know the difference between UTF-9 and ISO-8859 except in search indexing
00:58 thd       kados: that is where the MARC record data live
00:57 thd       s/subfield/subfields/
00:57 thd       s/subfields/
00:57 thd       kados: yes
00:57 kados     thd: seem right to you?
00:57 thd       kados: I know that the values were correct previously before your fixes in marc_subfield_table
00:56 kados     the accent only shifts if the text is a link
00:56 kados     new theory
00:55 thd       s/9/8/
00:55 thd       kados: my Koha MARC tables have been UTF-9 for months
00:55 kados     none of the tables in koha 2.2 are utf-8
00:54 thd       s/tables/original Koha tables/
00:54 kados     I
00:54 kados     :-)
00:54 kados     none of the tables are utf-8
00:54 thd       Oh and the tables are not UTF-8?
00:54 kados     the marc view from the marc* tables
00:53 kados     the 'normal' view is pulling the records from the koha tables
00:53 kados     I bet I know why
00:53 thd       yes correct
00:53 thd       s/record/letter/
00:53 kados     correct even?
00:53 kados     ?
00:53 thd       kados: I did not pay close attention as it was just one record in Portuguese
00:53 kados     thd: do you see that in the MARC view the accents are correct?
00:52 kados     thd: let's focus on our issue
00:52 kados     I missed it
00:52 thd       kados: it was a few weeks ago
00:52 kados     recent?
00:51 kados     (and I'm about to fix the leader 'length' setting too)
00:51 thd       kados: did you see the mailing list post maybe on koha-devel about Koha UTF-8 code causing problem in Portuguese
00:51 kados     the leaders for records going in to koha should be automatically fixed now from bulkmarcimport.pl and addbiblio.pl
00:49 thd       kados: was your addBiblio.pm fix for the leader?
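A sketch of what that leader fix amounts to with MARC::Record for a MARC 21 record already in hand as $record; the actual changes in bulkmarcimport.pl and addbiblio.pl may differ:

    # MARC 21 keeps its character coding scheme in leader position 9
    my $leader = $record->leader();
    substr( $leader, 9, 1, 'a' );        # 'a' = UCS/Unicode (UTF-8)
    $record->leader($leader);
    my $iso2709 = $record->as_usmarc();  # serialising recomputes the length in 00-04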
00:47 kados     wtf
00:47 kados     also, in the MARC view the accent is in the right place! :-)
00:46 kados     looks right to me
00:46 kados     02131cam a2200409 a 4500
00:46 kados     http://opac.liblime.com/cgi-bin/koha/opac-MARCdetail.pl?bib=23783
00:46 kados     thd: the accent's in the right place
00:46 thd       kados: leaders were not correct before I will check now
00:46 kados     thd: check the MARC view
00:46 kados     leaders are correct
00:45 thd       kados: yes UTF-8 strangeness
00:45 kados     are the leaders correct?
00:45 kados     and now they are in utf-8
00:45 kados     ok ... so they were converted to utf-8
00:45 thd       kados: although I did not check each one
00:45 thd       kados: the records were in MARC-8 before import into Koha
00:44 kados     thd: what's an example?
00:44 kados     thd: really?
00:44 kados     thd: or don't you know?
00:44 thd       kados: now I have wandering accents which are clearly UTF-8 but Koha keeps changing the characters depending on the type of view
00:44 kados     thd: are these records MARC-8 or ISO-8859?
00:44 kados     thd: I need some clarity
00:43 thd       kados: It would have looked fine if the header had been ISO-8859
00:43 kados     now what do you have?
00:42 kados     it's also really weird that we can search for it :-)
00:42 thd       kados: before your fix for bulkmarcimport.pl I had consistent ISO-8859 content in the Koha XHTML page with a UTF-8 header
00:42 kados     it's really weird
00:42 kados     no idea
00:40 thd       kados: what is with the wandering accent though
00:39 thd       kados: It saves the records in raw encoding which is MARC-8 for those 5
00:39 thd       and was for the past few months
00:38 thd       kados: my YAZ/PHP client page is in UTF-8 now
00:37 thd       hehe spell checker corruption again
00:37 thd       kados: Ce\x{0301}zanne
00:27 kados     I can search on Cézanne as well
00:27 kados     http://opac.liblime.com/cgi-bin/koha/opac-detail.pl?bib=23783
00:25 kados     thd: these are all marc-8 ... or at least claim to be
00:23 kados     thd: ?
00:20 kados     can I look at it?
00:20 kados     so what'd it do?
00:20 kados     heh
00:20 thd       kados: My spell checker corrupted my post
00:19 thd       kados: I think that did not go through right but it looks right with the acute accent except on the wrong character
00:18 thd       kados: I have the accent one character over now: Cézanne is now Ce\x{017a}anne
00:16 thd       s/wired/weired/
00:16 thd       kados: that did something wired
00:13 thd       reimporting now
00:12 kados     see if that fixes it
00:11 kados     and re-import
00:11 kados     save the file
00:11 kados         $record = MARC::Record::new_from_xml($uxml, 'UTF-8');
00:11 kados         my $uxml = $record->as_xml;
00:11 kados     while ( my $record = $batch->next() ) {
00:11 thd       that is line 79
00:11 kados     yep ... right after that line
00:11 thd       while ( my $record = $batch->next() ) {
00:10 kados     try that
00:10 kados         $record = MARC::Record::new_from_xml($uxml, 'UTF-8');
00:10 kados     my $uxml = $record->as_xml;
00:10 kados     add the following after it:
00:10 kados     thd: line 79
00:10 thd       open
00:08 thd       kados: ok one moment
00:08 kados     open bulkmarcimport.pl
00:08 kados     thd: I have a fix for you
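A consolidated sketch of the fix pasted above; the surrounding loop is bulkmarcimport.pl's own, and $filename stands in for the script's file argument:

    use MARC::Batch;
    use MARC::File::XML;   # provides as_xml() and new_from_xml() on MARC::Record

    my $batch = MARC::Batch->new( 'USMARC', $filename );
    while ( my $record = $batch->next() ) {
        # round-trip through MARC-XML so the record (leader included)
        # comes back re-encoded and flagged as UTF-8
        my $uxml = $record->as_xml;
        $record  = MARC::Record::new_from_xml( $uxml, 'UTF-8' );
        # ... rest of the import loop continues unchanged
    }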
00:07 thd       nothing has been fixed after re importing the records.
00:06 thd       kados: I will be fast
00:06 kados     thd: but I don't have much time tonight, so you better make it quick
00:06 thd       kados: Let me get a better copy of the records; they are full of redundancies from multiple targets, making it difficult to search them for one
00:05 kados     thd: I'll fix it :-)
00:05 thd       kados: the correct diacritics failed even when I supplied the UTF-8 string
00:05 kados     thd: send me the records
00:04 thd       kados: no that would be a benefit if I could find records by searching with Cézanne as well
00:03 kados     is that your only problem?
00:03 kados     that's mysql being smart
00:03 thd       kados: I found before records that I should not have found when searching Cezanne instead of Cézanne
00:02 kados     thd: ?
00:02 kados     so i can fix it?
00:02 kados     could you send me the records you're importing
00:02 kados     well ... encoding that is?
00:02 kados     so you're saying that rel_2_2 doesn't currently handle imports from bulkmarcimport ?
00:01 kados     normalizing?
00:01 kados     ahh
00:01 thd       kados: who has been partially normalising the searches?
00:00 thd       kados: not with existing data; I am using bulkmarcimport.pl -d right now
23:59 kados     thd: did my fix work for you?
23:59 thd       kados: I tried one of those methods in PHP just for fun but I was not feeding it good data
23:57 kados     I certainly wouldn't want to have that be default behaviour
23:57 kados     for special cases
23:57 kados     well ... it might be a good customization
23:56 thd       kados: I assume those methods are not foolproof
23:56 kados     you're nuts :-)
23:56 kados     hehehe
23:55 thd       kados: I have seen routines that essentially search for  question marks where they should not appear.
23:55 kados     heh
23:55 thd       kados: I posted before about guessing the encoding and then checking to see if an error is produced after temporary parsing
23:54 kados     we can't be expected to detect the record encoding
23:54 thd       kados: yes in fact there may be a check
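A minimal sketch of that guess-and-check idea, assuming the only fallback needed is ISO-8859-1 (MARC-8 would need MARC::Charset instead); guess_decode() is a hypothetical helper:

    use Encode ();

    sub guess_decode {
        my $bytes = shift;
        # try a strict UTF-8 decode first; fall back to Latin-1 if it croaks
        my $text = eval { Encode::decode( 'UTF-8', $bytes, Encode::FB_CROAK ) };
        return $@ ? Encode::decode( 'iso-8859-1', $bytes ) : $text;
    }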
23:54 kados     there's really nothing that can be done about that
23:53 thd       kados: They can hardly set it to an undefined value
23:53 kados     yowser
23:53 thd       kados: Except that BNF does not change that when sending records in ISO-8859-1
23:53 kados     hmmm ... it can't go to utf-8
23:52 kados     to fix the encoding?
23:52 kados     thd: so I check character positions 26-27 and 28-29 in UNIMARC, if 26-27 are set to '01' I should use Encode::MAB2?
23:51 thd       http://search.cpan.org/~andk/MAB2-0.06/lib/Encode/MAB2.pm
23:50 kados     ahh ... Encode::MAB
23:50 thd       not that I really looked
23:49 thd       kados: It is described as Alpha but I have never seen actual problem reports
23:49 kados     don't see it
23:49 thd       kados: CPAN
23:49 kados     thd: where's MAB::Encode?
23:49 thd       ASCII is ISO-646
23:48 kados     right
23:48 kados     thd: well ... I guess ASCII is a subset of 8859
23:48 thd       kados: ISO 8859 has Latin characters past 128 which makes it more than ASCII
23:47 thd       kados ASCII is in there as an ISO standard
23:46 kados     thd: ascii _is_ iso-8859
23:46 kados     thd: not ascii according to the list you posted
23:46 thd       kados: there is MAB::Encode or whatever it is called for ISO-5426
23:45 kados     so UNIMARC's in trouble :-)
23:45 kados     MARC::Charset only knows how to deal with MARC-8 and UTF-8
23:45 kados     at least in terms of encoding
23:45 kados     well, so Koha's far from supporting UNIMARC
23:45 thd       kados: French users only have to worry about ASCII and ISO-5426
23:44 kados     I note that 8859's not on that list
23:44 kados     shit ... that's a lot of encodings
23:44 thd       kados: I know but I have to put it in the right place so this is faster
23:44 kados     cvs update addbiblio.pl
23:43 thd       50 = ISO 10646 Level 3 (Unicode)
23:43 thd       11 = ISO 5426-2 (Latin characters used in minor European languages and obsolete typography)
23:43 thd       10 = [Reserved]
23:43 thd       09 = ISO 8957 (Hebrew set) Table 2
23:43 thd       08 = ISO 8957 (Hebrew set) Table 1
23:43 thd       07 = ISO 10586 (Georgian set)
23:43 thd       06 = ISO 6438 (African coded character set)
23:43 thd       05 = ISO 5428 (Greek set)
23:43 thd       04 = ISO DIS 5427 (extended Cyrillic set)
23:43 thd       03 = ISO 5426 (extended Latin set)
23:43 thd       02 = ISO Registration # 37 (basic Cyrillic set)
23:43 thd       01 = ISO 646, IRV version (basic Latin set)
23:43 thd       Two two-character codes designating the principal graphic character sets used in communication of the record. Positions 26-27 designate the G0 set and positions 28-29 designate the G1 set. If a G1 set is not needed, positions 28-29 contain blanks. For further explanation of character coding see Appendix J. The following two-character codes are to be used. They will be augmented as required.
23:43 kados     just update the one file
23:43 thd        $a/26-29 Character Sets (Mandatory)
23:43 thd       UNIMARC 100 $a a fixed field defining the character sets in a manner similar to 000/09 in MARC 21
23:42 thd       kados: I interrupted the update to bring you this
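A rough sketch of reading those positions with MARC::Record, only to illustrate the check discussed above; unimarc_charset() is a hypothetical helper, not Koha code:

    use MARC::Record;

    sub unimarc_charset {
        my $record = shift;
        my $f100   = $record->field('100') or return;
        my $a      = $f100->subfield('a')  or return;
        my $g0     = substr( $a, 26, 2 );   # principal (G0) set, positions 26-27
        my $g1     = substr( $a, 28, 2 );   # G1 set, positions 28-29 (may be blank)
        return ( $g0, $g1 );                # e.g. '01' ISO 646, '03' ISO 5426, '50' Unicode
    }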
23:40 kados     anyway, did the fix work for you?
23:40 kados     that's really tricky
23:39 thd       kados: paul's users as I had said have been obtaining records encoded in ISO-8859 that should have been ISO-5426
23:37 kados     just utf-8 then?
23:37 thd       kados: no MARC-8 in UNIMARC
23:37 kados     or uploaded into the reservoir as
23:36 kados     but I mean: what character sets (other than utf-8 and MARC-8) are UNIMARC records likely to be downloaded in
23:36 thd       kados: There are many, but at least full Unicode is defined
23:35 thd       ok updating now
23:35 kados     thd: also, if you're around, could you explain to me the different character sets that UNIMARC uses?
23:35 kados     thd: I just committed a fix for your issue (I think)
23:35 kados     thd: update your addbiblio.pl
22:43 thd       kados: the data in the XHTML is ISO-8859 but the data in MySQL is UTF-8.  Apache cannot be responsible.  Apache is in fact using UTF-8 encoding as directed but the data is ISO-8859.
22:35 thd       kados: Koha is responsible for sending the characters in conformance to my locale encoding using some feature of Perl most likely.  This is what I had proposed to develop as part of a configurable page serving feature.  MARC::Charset would not be required for that.
19:18 thd       kados: I strongly suspect Perl generally, because there is a design issue, deriving from Perl's origin without any thought to multi-byte character sets, that prevents it from working with Unicode as flexibly as more recently introduced or recently modified languages. That is one thing Perl 6 is intended to remedy.
19:06 thd       kados: The problem could be specific to Koha, I will have to test later if I create a UTF-8 page in Perl outside of Koha to see whether that works correctly.
19:04 thd       kados: I am probably not describing it correctly, but PHP web applications work fine with UTF-8 on my system, so Perl would seem to be the problem.
19:01 kados     thd: I don't see how you can interact directly with perl on a browser
19:01 kados     thd: you mean apache is telling perl that you're sending iso-8859
18:51 thd       kados: Yet PHP does not care what my locale is for sending the data to Apache correctly
18:50 thd       kados: Everything should at least look fine except that Perl is telling Apache that I am sending ISO-8859
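A minimal sketch of the CGI-side counterpart to what thd is describing: emit UTF-8 on STDOUT and declare it in the header instead of letting the output follow the locale; $template stands in for Koha's HTML::Template object:

    binmode( STDOUT, ':utf8' );   # Perl 5.8 output layer: write characters as UTF-8
    print "Content-Type: text/html; charset=UTF-8\r\n\r\n";
    print $template->output;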
18:49 thd       kados: No I used bulkmarcimport.pl
18:48 thd       kados: I can confirm that the contents of the data in MySql is correctly encoded in UTF-8
18:48 kados     but you're manually copy/pasting them into the Koha editor?
18:47 thd       kados: No these are records captured with my test YAZ/PHP Z39.50 client
18:47 kados     upgrade to 0.82 and test again
18:47 kados     ahh ... what version of MARC::File::XML are you using?
18:47 kados     it didn't change the leader to UTF-8?
18:46 kados     so you're doing original cataloging?
18:46 thd       kados: Koha translated them
18:46 thd       kados: However my problem is probably that Perl refuses to send them to Apache as UTF-8 without changing my locale
18:46 kados     who translated them?
18:46 kados     interesting
18:45 thd       kados: I should not have looked now but the source records were MARC-8 and they have been translated into UTF-8, although nothing changed the leader encoding for 000/09.
17:56 thd       kados: I have seen code that tests for question marks and then is able to try to guess the encoding.
17:56 thd       kados: The solution is not especially difficult even when the record encoding value does not match the actual encoding
17:54 thd       kados: However, it is an additional problem for migration to UTF-8
17:54 thd       kados: that is a great help for systems that are UTF-8 challenged
17:53 kados     if you can't rely on the leader there's no way I can think of to auto-sense what charset you're working with
17:53 kados     strange
17:53 thd       kados: BNF can export records in the illegal ISO-8859-1 character set while the encoding still shows ISO-5426
17:51 thd       kados: In fact all of paul's customers may have that problem
17:49 kados     as to create MARC in iso-8859 :)
17:49 kados     foolish :-)
17:49 kados     I didn't think that anyone would be so foolist
17:48 thd       s/another/a legal/
17:46 thd       kados: That will be a little tricky because the leader will always claim the encoding is in another character set
17:45 thd       kados: now that  I identified that problem I know there will be a need for a routine to translate the illegal ISO-8859 records people have into UTF-8
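A rough sketch of what such a routine might look like, assuming the records really are ISO-8859-1 byte strings despite what the leader or 100 $a claims; writing the converted record back out (and updating the encoding marker) is left out, and $file is a placeholder:

    use Encode ();
    use MARC::Batch;
    use MARC::Field;

    my $batch = MARC::Batch->new( 'USMARC', $file );
    while ( my $record = $batch->next ) {
        for my $field ( $record->fields ) {
            next if $field->is_control_field;
            # decode every subfield value from Latin-1 to Perl characters
            my @subs = map { ( $_->[0], Encode::decode( 'iso-8859-1', $_->[1] ) ) }
                       $field->subfields;
            $field->replace_with( MARC::Field->new( $field->tag,
                $field->indicator(1), $field->indicator(2), @subs ) );
        }
    }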
17:30 kados     thd: kinda
17:30 thd       kados: are you there?
16:56 kados     yea
16:56 owen      i.e. "this page has been disabled by your administrator"
16:55 owen      So I can hide the link to the reading record page. Should I disable the display of reading history within the reading record page itself?
16:55 kados     but might make more sense to have a separate one for the intranet
16:55 kados     yep it can ... needs support in the templates, that's all
16:54 owen      I guess so
16:47 owen      Can that be used in the intranet too?
16:47 owen      Oh... opacreadinghistory.
16:46 owen      Did you create a new syspref to show/hide the reading record?
16:46 kados     I'd leave it in for now ... hopefully we can fix it
16:45 kados     it's a major flaw in the current design
16:45 owen      No problem. I'm just trying to clean up the templates where possible (answer: not here)
16:45 thd       owen: there is a workaround that requires a lot of work to setup but unfortunately I do not have time to explain it to you at the moment
16:43 thd       kados: that is not supported on any MARC version of Koha
16:43 owen      I'm just trying to figure out whether we still need the option to choose an item type when making a reserve (it's one of the things hidden in NPL's production template). So it's just us that can't use it.
16:42 kados     so ... thanks for reminding me :-)
16:42 kados     it's one of those things on my list to check out
16:42 kados     I'm not sure why we got jipped
16:42 kados     it is in the non-MARC and the UNIMARC
16:42 kados     however, this behavior isn't supported in the MARC21 version of Koha
16:41 kados     yes, strictly speaking, you can have more than one itemtype attached to a biblio
16:41 kados     this is the tricky bit
16:40 owen      Is it still possible to have more than one itemtype attached to one biblio?
16:40 owen      Is it even possible with Koha now to have more than one checkbox in that list of items?
16:40 owen      http://koha.liblime.com/cgi-bin/koha/request.pl?bib=18398
16:39 kados     what's up?
16:39 kados     owen: yea ... kinda
16:32 owen      kados: you around?
15:10 thd       thank you owen
15:10 thd       owen: I assumed the changes were there but not working as I had expected.
15:09 thd       owen: I had a mistaken pathname in my update script recently and had been missing changes which I have only just seen today
15:08 thd       I see them
15:08 owen      and colors.css for opaccolorstylesheet
15:08 owen      for the default, use opac.css for opaclayoutstylesheet
15:08 thd       owen: what are the preferences called?
15:07 owen      We need to handle this better somehow in the case of new installations, I think
15:07 thd       owen: oh, yes I looked for that but did not see them
15:07 owen      There are new system preferences for defining those stylesheets
15:07 thd       </style>
15:07 thd       	@import url(/opac-tmpl/npl/en/includes/);
15:06 thd       <style type="text/css">
15:06 thd       <link rel="stylesheet" type="text/css" href="/opac-tmpl/npl/en/includes/" />
15:05 owen      I'm not sure what you mean
15:05 thd       owen: why does stylesheet have no file name in rel_2_2 ?
14:38 pierrick_ I'm going back home now, diner outside
14:38 pierrick_ I have to read IRC log in details to understand what remains problematic, but I'll do it tomorrow morning
14:37 pierrick_ Paul has already done a deep investigation
14:37 pierrick_ hdl: I've finished reading the mails you forwarded to me and associated web links
14:29 pierrick_ thanks
14:29 kados     journey is a bit archaic
14:29 kados     trip I think
14:28 pierrick_ kados: should I say "journey" or "trip"?
14:28 kados     bye paul_away
14:28 pierrick_ oh OK, enjoy your 50kms trip :-)
14:27 paul_away (Ouest Provence to say everything...)
14:27 paul_away (i'm with a customer tomorrow. not WE !)
14:26 pierrick_ have a good long WE paul :-)
14:26 paul_away see you on monday, for a new week of Koha hack !
14:22 Sylvain   as far as I remember this change removed the problem with null values
14:22 Sylvain   but maybe the second one creates another problem
14:21 Sylvain   line 69 only
14:21 kados     Sylvain: did you replace both?
14:20 kados     Sylvain: I see two instances of this
14:20 Sylvain   but it's not tested a lot :)
14:20 kados     ahh ... nevermind
14:20 kados     Sylvain: what line?
14:19 kados     Sylvain: I'll try this
14:18 Sylvain   and for me it works
14:18 Sylvain                           {
14:18 Sylvain                           if (($issuelength ne '') and ($maxissueqty ne ''))
14:18 Sylvain   I've replaced it with:
14:18 Sylvain   in admin/issuingrules.pl there's a line with a # which is if ($maxissueqty > 0)
14:16 hdl       And of course, if it is needed.
14:16 hdl       s/worl/work
14:16 hdl       Not yet. But If Sylvain sends his patch, I could get it worl.
14:15 kados     hdl: is there a syspref for this?
14:15 hdl       Yes, but the default value could also be a syspref, so that people could set it once; 21.5 is only an example.
14:14 kados     Sylvain: that'd be great!
14:13 kados     it makes Koha appear buggy
14:13 kados     it is seemingly small problems like this that give us a bad name
14:13 Sylvain   I think I had done a patch, I have to search for it
14:13 kados     and issuingrules have been broken for several versions :-)
14:13 Sylvain   I agree that it doesn't work :)
14:13 kados     so in my view, something is broken if it doesn't work the way it says it works :-)
14:12 hdl       except fees.
14:12 Sylvain   generating null values in issuingrules table
14:12 hdl       They fill in all the cells ;)
14:12 Sylvain   kados I've got problems with issuing rules and empty cells
14:12 kados     hdl: do your clients not experience this behavior?
14:11 kados     issuing will just fail
14:11 kados     also, if values are left out, there are no hardcoded defaults
14:10 kados     The biggest bug is that filling in values in the * column doesn't work -- it should set default values for all patron types, but it doesn't.
14:10 hdl       Not for us.
14:08 kados     paul: it's been broken for several versions now
14:08 paul      (on phone)
14:08 kados     paul: do your clients use the 'issuingrules' aspect of Koha?
14:08 kados     paul: are you still here?
14:01 pierrick_ (just as an aside: how easy it was to convert and use UTF8 with Java/Oracle, that's what we did in my previous job... but making C talk to Oracle UTF8 was hard... and I hate Oracle anyway)
13:26 pierrick_ thank you hdl :-)
13:24 hdl       (filter on utf in devel list.)
13:24 hdl       pierrick_: (I sent you them via email)
13:21 pierrick_ s{from where}{since when}g
13:20 kados     koha.org/irc is the irc log
13:20 kados     pierrick_: list archives are on savannah ... but google will find them better for you
13:15 pierrick_ from where do I re-read IRC and koha-devel to summarize the "MySQL, Perl and UTF-8 issue", I will summarize it on koha-devel
13:13 pierrick_ hehe
13:13 paul      (it was to see if you were following us ;-) )
13:13 paul      sorry
13:12 paul      you're right.
13:11 pierrick_ Auth.pm means authorities ? Why not in Context.pm where the database connection is made ?
13:09 paul      that's why i added it to Auth.pm
13:09 paul      I thought too.
13:08 pierrick_ (once database correctly converted to utf8)
13:08 pierrick_ I thought "set names 'UTF8';" was the solution
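A minimal sketch of that approach, issued right where the connection is made (the Context.pm vs Auth.pm placement is exactly what is being debated here); $db, $host, $user and $pass are placeholders:

    use DBI;

    my $dbh = DBI->connect( "DBI:mysql:database=$db;host=$host", $user, $pass );
    # tell MySQL the connection talks UTF-8, once, at connection time
    $dbh->do("SET NAMES 'utf8'");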
13:08 paul      no, there is a collection of mails.
13:07 pierrick_ is there a mail on the mailing list explaining clearly the initial problem ?
13:07 paul      exactly : "ca tombe en marche"
13:06 paul      me too. that's why I tried to understand.
13:06 pierrick_ I hate not understanding :-/
13:06 pierrick_ "ça tombe en marche" (sorry kados, don't know how to translate)
13:05 paul      it's dangerous to get something working through 2 things not working, but that's the only solution I see atm
13:05 paul      I don't know. I just see that it works under 2.2 and for Tümer in Turkey !
13:04 paul      but hides 2 problems.
13:04 paul      the result is 0, as expected.
13:04 pierrick_ when do you convert from iso to utf8 for display ?
13:04 paul      that's what I call the 2x1 million $ problem :
13:04 paul      * we do nothing in Perl
13:04 pierrick_ if your data are not stored in UTF-8, you'll have display problems
13:04 paul      * we keep mysql in iso
13:04 paul      it seems that utf8 works better if :
13:03 paul      that's what we want.
13:03 paul      yep.
13:03 pierrick_ paul: in PROG template, I read charset=utf-8 hardcoded
13:03 kados     brb
13:01 paul      (in PROG templates)
13:01 paul      (that's already done in head pierrick_)
13:01 pierrick_ and change the HTML headers...
13:00 kados     right
13:00 paul      * updatedatabase & see if it work
13:00 paul      * comment the utf8 move in updatedatabase
13:00 paul      * get a 2.2 working
13:00 kados     but I think eventually we will need to fix the underlying problem
13:00 paul      to test :
12:59 paul      as in 2.2 !
12:59 paul      * keep mysql collate NOT in utf8
12:59 pierrick_ because PostgreSQL charset is based on system locale... and under Windows, you only have foo1252
12:59 paul      the last possibility being to stay with our 2x1 000 000 problem.
12:59 kados     this is a true dilemma then :-)
12:58 kados     interesting
12:57 pierrick_ my front co-worker tells me PostgreSQL in UTF8 is not working very well under Windows
12:56 kados     pierrick_: (i understood)
12:56 kados     I think we need to proceed carefully though
12:56 pierrick_ sorry, I wanted to say "we can't make"
12:55 kados     the more I think about it the more I like the idea
12:55 kados     right
12:55 pierrick_ but I can't believe we can make Koha work in full UTF-8 using the same technologies (Perl and MySQL) as in 2.2
12:49 pierrick_ and if I remember well, it was quite easy in fact
12:48 pierrick_ my internship was about making the CMS talk to MySQL or Oracle or PostgreSQL, in Unicode, because the customer was Asian
12:47 pierrick_ my PostgreSQL experience is quite old in fact, I was working on it in 2002 on a Java CMS
12:46 pierrick_ If Koha uses MySQL InnoDB as table engine and utf8 as charset, I would say that it's worth switching to PostgreSQL
12:45 pierrick_ So, my opinion about PostgreSQL ?
12:43 owen      kados: I think your liblime color stylesheet is missing some CSS relating to the patron image. That might be why it's getting overlaid by the patron details
12:42 hdl       When I tried to install Koha on a Windows box, the data had to be utf-8.
12:40 kados     almost :-)
12:40 pierrick_ (I'm back... sorry kados, you asked a question, I'm going to answer)
12:40 paul      you're like a business man then now ?
12:40 kados     paul: right before LibLime's first conference :-)
12:40 kados     paul: but the long hair was cut some months ago ... about 6 months in fact
12:39 paul      wow !
12:39 paul      (same for me -konqueror-)
12:39 kados     paul: I shaved my head with a razor about a week ago :-)
12:39 hdl       Some information is not displayed well (to the right of the picture)
12:39 paul      bald ?
12:39 kados     paul: no ... quite bald now :-)
12:39 kados     owen: hehe
12:39 paul      (still long ?)
12:39 kados     hehe ... yes ... with long hair even :-)
12:39 paul      lol
12:38 paul      ;-)
12:38 paul      http://www.paulpoulain.com/photos/voyge_USA/p_img_0061.jpg.html
12:38 paul      aren't you here joshua ?
12:38 kados     (of course, my pic is not available :-))
12:37 kados     owen has created a very nice patronimages option in the NPL templates
12:37 kados     http://koha.liblime.com/cgi-bin/koha/circ/circulation.pl?findborrower=0054313
12:37 kados     in case no one has seen it:
12:37 hdl       Under Windows, Mysql and apache may be utf8 by default.
12:36 kados     owen: the patron images stuff looks nice :-)
12:36 owen      Hi
12:36 kados     morning owen
12:36 kados     pierrick_: what is your opinion?
12:36 kados     pierrick_: you have postgres experience, right?
12:36 kados     ahh
12:35 paul      whereas switching to Postgres will make some problems with DB structure & management
12:35 paul      yes because adding Encode is a boring but trivial task
12:35 kados     paul: in your opinion?
12:35 kados     paul: would switching to postgres be harder than putting in 'Encode' everywhere?
12:34 kados     (though much harder to use)
12:32 kados     (I have been working with postgres with Evergreen and I must say it is much nicer than mysql)
12:32 kados     paul: how many hours do you estimate it would take?
12:32 paul      in fact the fix for mysql you can find on the net is a port from the fix for Postgres !
12:31 paul      no.
12:31 paul      (+ complex for existing libraries)
12:31 kados     does the DBD::Postgres driver have the same problems?
12:31 paul      I just will say : why not, but that's a huge task !
12:31 paul      I won't
12:31 kados     no answer from mysql
12:31 paul      no answer from mysql ?
12:31 kados     paul: you will hate me to say this: what about switching to postgres? :-)
12:30 kados     interesting
12:30 paul      it worked it seems. I just Encode everything coming from mysql socket
12:30 paul      it's a Pure Perl mysql driver.
12:30 paul      I installed mysqlPP driver and hacked it a little.
12:30 paul      I had a solution that seemed to work :
12:29 kados     sure
12:29 paul      could you ask him ?
12:29 paul      :-(
12:29 kados     in that case, I suspect that as soon as he re-encodes mysql as utf-8 as in HEAD he will have the same problems we have
12:29 paul      pierrick_ and Sylvain : introduce yourself
12:29 paul      in fact I think he's working with 2.2 but I could be wrong.
12:28 kados     because I could also say 'everything is OK' on my server
12:28 paul      no.
12:28 kados     are you sure he is testing with HEAD?
12:28 paul      it can be the utf8 config of its server (+ he's under windows)
12:28 paul      as, for Tümer everything is OK, I was suspecting he had something that could explain this.
12:28 kados     so we must again set the flag
12:27 paul      and DBD::mySQL returns "non flagged" results.
12:27 kados     I see no way around this
12:27 kados     yep
12:27 paul      and Encode::decode does this for variables.
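A minimal sketch of the explicit decode being described, assuming a plain DBI handle ($dbh) and DBD::mysql handing back undecoded bytes; the query and $biblionumber are only illustrative:

    use Encode ();

    my $sth = $dbh->prepare("SELECT title FROM biblio WHERE biblionumber = ?");
    $sth->execute($biblionumber);
    while ( my ($title) = $sth->fetchrow_array ) {
        # set Perl's utf8 flag so the value behaves as characters, not bytes
        $title = Encode::decode( 'utf8', $title );
    }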
12:27 kados                open.
12:27 kados                layer.  Other encodings can be converted to Perl's encoding on input or from Perl's encoding on output by use of the ":encoding(...)"  layer.  See
12:27 kados                Perl knows when a filehandle uses Perl's internal Unicode encodings (UTF-8, or UTF-EBCDIC if in EBCDIC) if the filehandle is opened with the ":utf8"
12:27 kados     Input and Output Layers
12:27 paul      kados : you're right.
12:27 paul      another frenchy !
12:26 Sylvain   Hi all
12:26 kados     the bottom line is, we must explicitly tell perl we are working with utf-8
12:24 kados     when they have been so marked, perl will convert all byte-based characters to utf-8
12:23 kados     to use character-based you must 'mark' the character-based interfaces so that perl knows to expect character-oriented data
12:23 kados     to do this, perl uses bytes by default (at least in 5.6)
12:22 kados     2. allow byte-based programs to use character-based characters 'magically'
12:22 kados     when they were using byte-based characters
12:21 kados     1. not break old byte-based programs
12:21 kados     in perl 5.6 they wanted to:
12:21 kados     which is a major problem for unicode of course
12:20 kados     earlier versions of perl (before 5.6) did not distinguish between a byte and a character
12:20 kados     ok
12:20 paul      (i'm still here, but answering an RFP)
12:20 paul      for sure !
12:20 kados     paul: shall I explain what I have learned about utf-8?
12:20 paul      OK, I have to stop working on utf-8 atm
12:20 kados     maybe 4 gig isn't big enough?
12:19 hdl       shadow get full and no .mf files...
12:19 kados     hdl: i haven't had time to ...
12:19 pierrick_ (but my "problem" is really not important at all, I admit)
12:18 hdl       unless you commited a fix  very recently.
12:18 kados     hdl: it should use the z3950_extended_services() routine in the same way that bulkmarcimport.pl does
12:18 hdl       kados: Yet it seems to be.
12:18 kados     hdl: unless rebuild_zebra.pl does not use the correct subroutine in Biblio.pm to connect to z3950 extended services
12:18 pierrick_ kados: this is the reason why many applications offer a table of characters
12:17 kados     hdl: zebraidx commit should not be necessary
12:17 hdl       (commit all the stuff)
12:17 hdl       5) zebraidx commit
12:17 kados     pierrick_: the problem is, what if I want to search for that but don't have the correct keyboard?
12:17 pierrick_ we are talking about a stupidly simple example; imagine something like a Chinese character^W ideogram search
12:17 hdl       4) launch rebuild_zebra.pl -c (on an updated base) wait and wait and wait....
12:17 kados     hdl: looks correct
12:16 paul      (in real life I mean)
12:16 paul      yes, but WHO wants to search anaël and not anaes ?
12:16 pierrick_ but it might be a MySQL cleverness. I thought it was a Koha feature
12:15 pierrick_ paul: it's wrong from a general point of view because it becomes impossible to search for anaël and not anaes
12:15 hdl       3) zebrasrv localhost:2100/nameofyourbase
12:14 paul      :-(
12:14 paul      still not utf-8 when logged as paul
12:14 hdl       paul : working better?
12:14 hdl       zebraidx commit
12:14 hdl       then zebraidx create Nameofyourzebrabase  (define in the /etc/koha.conf)
12:13 hdl       And create tmp, shadow, lock directories.
12:12 hdl       1) modify zebracfg according to your US one.
12:12 kados     please do
12:12 hdl       kados : can I try and make a summary of operations needed to get a zebra base.
12:11 kados     paul: ok
12:11 kados     earlier versions of perl (before 5.6) did not distinguish between a byte and a character
12:11 paul      wait a little
12:10 kados     so here is what I have learned so far about utf-8 and perl
12:10 paul      otherwise, a forgotten or wrong accent would make many data disappear !
12:10 paul      because from a librarian point of view it's exactly what they want !
12:09 paul      pierrick_ : who told you it was false ?
12:09 kados     you'd have to check the manual though
12:09 kados     you can probably turn off this feature if you don't like it
12:09 kados     and I suspect that's mysql being clever :-)
12:08 pierrick_ kados: searching an accented name returns unaccented names
12:06 pierrick_ As I suspected, in the Koha 2.2 borrowers search, searching "anaë" returns "anaël" and "anaes". It might be smart, but it's wrong...
12:04 kados     yep
12:04 thd       kados: by import you mean authority building?
12:03 kados     thd: I suspect that's a template prob ... I noticed it as well
12:03 kados     thd: so we can refine it :-)
12:03 kados     thd: in the meantime, if you're finished with the MARC framework, you could start compiling a list of problems with the import :-)
12:03 thd       kados: What I do not know is why bib records show as 0
12:02 kados     thd: I hope to have some more time to work on authorities this afternoon
12:02 thd       kados: The values are not being normalised before the comparison is made.
12:02 kados     thd: I'm currently working on some utf-8 probs in 3.0
12:01 thd       kados: Some differences are because of the presence or absence of a full stop at the end of the field.
12:00 thd       kados: Only half of those records used authority control for 100.
11:59 hdl       Normally, French.
11:59 hdl       I had to do it myself by hand, in the /etc/sysconfig/keyboard file
11:58 thd       kados: look at William Faulkner.
11:58 hdl       I didn't see anything very conclusive.
11:58 kados     thd: you found such instances?
11:58 thd       kados: I know what happened for some issues of multiple authorities created where there should have only been one.
11:54 paul      and what do I choose?
11:54 paul      keyboard layout => you mean keyboard configuration?
11:53 hdl       There are also the KDE accessibility settings.
11:53 hdl       paul : under hardware / keyboard layout.
11:48 paul      "Configure your computer and go to keyboard" ==>>> I don't see what you mean
11:47 hdl       (I rebuilt the stuff, once again today).
11:47 kados     hehe
11:47 hdl       (Sh.....!!!) kados: I can't search my base again.
11:46 hdl       Test in console mode Ctrl Alt F2 and check if locale is the same.
11:44 hdl       Configure your computer and go to keyboard.
11:44 hdl       (If I told you, I don't remember ;) )
11:43 paul      (you already told me but I don't remember)
11:43 paul      how can I check ?
11:43 hdl       or in KDE control ?
11:42 hdl       Do you have loaded a keyboard or a system with your MCC ?
11:42 kados     strange indeed
11:41 paul      except after a su - that says fr_FR.UTF-8
11:41 paul      locale told me fr_FR
11:41 paul      my locale.
11:41 hdl       you say you are not... But is it your keyboard or your locale that is not?
11:41 paul      fi
11:41 paul              . /etc/bashrc
11:41 paul      if [ -f /etc/bashrc ]; then
11:41 paul      export PATH
11:41 paul      set $PATH=$PATH:/usr/local/kde/bin
11:41 paul      [ -z $INPUTRC ] && export INPUTRC=/etc/inputrc
11:40 paul      }
11:40 paul              . /etc/profile.d/alias.sh
11:40 paul      [ -n $DISPLAY ] && {
11:40 paul      alias cp='cp -i'
11:40 kados     I'm not sure then :/
11:40 paul      alias mv='mv -i'
11:40 paul      alias rm='rm -i'
11:40 paul      bashrc :
11:40 paul      and that's all
11:40 paul      xmodmap -e 'keycode 0x5B = comma'
11:40 paul      export USERNAME BASH_ENV PATH
11:40 paul      USERNAME=""
11:40 paul      BASH_ENV=$HOME/.bashrc
11:40 paul      PATH=$PATH:$HOME/bin
11:40 paul      fi
11:40 paul              . ~/.bashrc
11:40 paul      if [ -f ~/.bashrc ]; then
11:40 paul      # Get the aliases and functions
11:40 kados     in .bash_profile?
11:39 paul      yes I know, but in .bashrc I don't see anything
11:39 kados     there are several .bash* files
11:39 kados     paul: in your /home/paul dir
11:39 kados     is that what you've done as of now?
11:39 paul      how could I see ?
11:39 kados     paul: http://dev.mysql.com/doc/refman/4.1/en/charset-conversion.html?ff=nopfpls
11:38 kados     could be that your charset is specified there?
11:37 kados     paul: check your .bash* files
11:37 paul      any idea someone ?
11:37 paul      really strange...
11:37 paul      when I su - i am.
11:37 paul      when I log as "paul" i'm not.
11:36 paul      oups. no
11:30 hdl       you are used to mysql and perl ;)
11:29 hdl       paul : i18n should be fr_FR.UTF-8. You certainly missed the hyphen (-)
11:28 paul      yep
11:28 hdl       Did you restart your computer after your sysconfig modification ?
11:26 paul      s/set/locale/
11:26 paul      what am I doing wrong ?
11:26 paul      ...
11:26 paul      LC_MESSAGES=fr_FR
11:26 paul      LC_MEASUREMENT=fr_FR
11:26 paul      LC_IDENTIFICATION=fr_FR
11:26 paul      LC_CTYPE=fr_FR
11:26 paul      LC_COLLATE=fr_FR
11:26 paul      LC_ADDRESS=fr_FR
11:26 paul      LANGUAGE=fr_FR:fr
11:26 paul      LANG=fr_FR
11:25 paul      a set gives me:
11:25 paul      BUT :
11:25 paul      SYSFONT=lat0-16
11:25 paul      LC_PAPER=fr_FR.UTF8
11:25 paul      LC_MONETARY=fr_FR.UTF8
11:25 paul      LC_TELEPHONE=fr_FR.UTF8
11:25 paul      LC_CTYPE=fr_FR.UTF8
11:25 paul      LC_MESSAGES=fr_FR.UTF8
11:25 paul      LC_IDENTIFICATION=fr_FR.UTF8
11:25 paul      LANG=fr_FR.UTF8
11:25 paul      LC_TIME=fr_FR.UTF8
11:25 paul      LC_MEASUREMENT=fr_FR.UTF8
11:25 paul      LC_NUMERIC=fr_FR.UTF8
11:25 paul      LC_NAME=fr_FR.UTF8
11:25 paul      LC_COLLATE=fr_FR.UTF8
11:25 paul      LC_ADDRESS=fr_FR.UTF8
11:25 paul      LANGUAGE=fr_FR.UTF8:fr
11:25 paul      SYSFONTACM=iso15
11:25 paul      mmm... strange. my /etc/sysconfig/i18n contains :
11:24 hdl       kados : http://pastebin.com/592599
11:23 kados     on the kohatest machine
11:23 kados     I see mainly "en_US"
11:23 kados     hdl: what do you see when you type: locale ?
11:23 hdl       My locale, my AddDefaultCharset in Apache, my keyboard.
11:22 hdl       yes.
11:22 kados     hdl: and your locale is set to utf-8?
11:22 hdl       Yes.
11:21 kados     Bo0k52R3aD
11:21 kados     kohaadmin
11:21 hdl       login/pass ?
11:20 hdl       no trespassing :/
11:20 kados     hdl: this is the problem you're having?
11:19 kados     hdl: http://kohatest.liblime.com/cgi-bin/koha/admin/branches.pl
11:17 hdl       So if there is a perl/MySQL problem, when importing to zebra, you will import problems ;)
11:17 hdl       Since, when launching a rebuild_zebra.pl, you use MySQL data.
11:16 kados     first I must repair my HEAD box :-)
11:16 kados     hdl: no need to send data
11:16 kados     hdl: I'm ok
11:16 paul      yep
11:16 kados     in that case, I can just copy/paste from websites
11:16 hdl       But it is linked.
11:16 paul      yep.
11:16 kados     right ... so it's branch names, borrowers, etc.
11:16 paul      the zebra one is another thing.
11:15 paul      the problem we are speaking of is a MYSQL one for instance.
11:15 hdl       this channel is iso8859-1.
11:15 kados     I would need iso2709
11:15 kados     hmmm
11:15 paul      NONE :
11:15 paul      they are not "true" utf8.
11:15 hdl       xml or iso2709 ?
11:15 paul      mmm... no.
11:15 paul      éàùÏ
11:14 paul      just copy paste this :
11:14 kados     I will attempt to get it working on my test box
11:14 kados     hdl: do you have some MARC records with accented chars you can send me (the data you are having trouble with)?
11:14 hdl       or try to googleize : DBD::mysql utf8 support
11:13 pierrick_ OK
11:12 hdl       About DBD::Mysql and utf8
11:11 hdl       Read this post.
11:11 pierrick_ hdl: warning, there is character set for server, database, table, column and for the connection
11:11 hdl       http://lists.mysql.com/perl/3779
11:10 pierrick_ before installing phpmyadmin, I was using a perl script, with a "set names 'latin1'" before any query
11:10 hdl       Mysql may return latin1 whatever collation you set it to.... I read sthg about that yesterday.
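A sketch of the DBD::mysql side of the thread linked above, assuming a driver version recent enough to support the mysql_enable_utf8 attribute; the connection parameters are placeholders:

    use DBI;

    my $dbh = DBI->connect(
        "DBI:mysql:database=$db;host=$host",
        $user, $pass,
        { mysql_enable_utf8 => 1, RaiseError => 1 },
    );
    # values fetched through this handle come back with the utf8 flag set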
11:09 pierrick_ with latin1_general_cs, through phpmyadmin, it still returns Simone
11:09 hdl       either PERL or MySQL or perl with Mysql.
11:09 paul      and thought the .utf8 in env was a good solution.
11:08 paul      that's why I was thinking it was a perl problem.
11:08 hdl       This is quite embarrassing for me.
11:08 hdl       When playing with phpmyadmin I have NO display problems though.
11:07 paul      change it to "Case sensitive" and you should have a different result
11:07 paul      _ci means "Case Insensitive".
11:07 pierrick_ maybe I should test another collation, you're right
11:06 pierrick_ hdl: I'm using default collation with latin1 character set, this is latin1_swedish_ci
11:06 hdl       Is that not depending on collation or such ?
11:05 pierrick_ if Koha makes a transformation, that's OK, but MySQL should see a difference
11:05 pierrick_ paul: "select author from biblio where author like '%ère%'" should not return "Simone CAILLERE'
11:04 hdl       Would it be with biblio or with borrowers.
11:04 hdl       With accentuated letters, I encounter problems.
11:03 pierrick_ hdl: yes, google also transforms "é" to "e"
11:03 hdl       My pb is to clearly decode what is mysql pb and zebra.
11:03 kados     paul: on my rel_2_2 box I have no utf-8 problems
11:03 paul      pierrick : maybe not, as unicode is a hard feature, for mySQL, as well as for Perl !
11:02 hdl       some of them are quite old and always type data in capital letter.
11:02 pierrick_ paul: I've read more than twice MySQL documentation about mysql connection, maybe I missed something, gonna test one more time
11:02 paul      to all : I think we should clearly separate the UTF8 problem with MySQL from the UTF8 problem with Zebra. This one is a MySQL problem, isn't it ?
11:02 kados     pierrick_: http://indexdata.dk/zebra/doc/data-model.tkl
11:02 paul      (although I agree with hdl I don't understand why you consider this as wrong)
11:01 kados     pierrick_: with zebra, you have quite a bit of control over such behavior
11:01 paul      something like unicode_bin should be your friend !
11:01 hdl       but this is also what librarians expect.
11:01 paul      thus, you need a MySQL collation other than utf8_unicode_ci
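A sketch of forcing an accent-sensitive comparison per query, assuming an existing DBI handle ($dbh) and that the 2.2 tables are still latin1 (a binary collation is also case-sensitive, which may or may not be wanted; utf8_bin plays the same role once the data is utf8):

    my $sth = $dbh->prepare(
        "SELECT author FROM biblio WHERE author COLLATE latin1_bin LIKE ?"
    );
    $sth->execute('%ère%');   # no longer matches plain-'e' rows such as CAILLERE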
11:01 kados     hehe
11:01 pierrick_ this is quite smart of MySQL, but it is not what I'm expecting
11:01 paul      that's not a bug, that's a feature !
11:00 pierrick_ I don't know exactly what was hdl problem, but I'm working on utf-8 handling for 3.0. I already encounter a problem with 2.2 : when I search '%ère%' (like in 'ministère'), it also returns things like 'Simonne CAILLERE'