Time |
S |
Nick |
Message |
12:00 |
|
kados |
so there are two unrelated problems it seems: |
12:00 |
|
kados |
one is with your local computer |
12:00 |
|
kados |
one is with your zebra/koha system |
12:01 |
|
paul |
if you could connect to my box, it would be useful |
12:01 |
|
kados |
what is the link again? |
12:02 |
|
paul |
http://o19.bureau.paulpoulain.[…]RCdetail.pl?bib=5 |
12:02 |
|
paul |
(opac) |
12:02 |
|
paul |
or : |
12:02 |
|
paul |
http://i19.bureau.paulpoulain.[…]RCdetail.pl?bib=1 |
12:02 |
|
paul |
(login test/test) |
12:02 |
|
kados |
server not found |
12:02 |
|
kados |
:( |
12:02 |
|
kados |
Firefox can't find the server at o19.bureau.paulpoulain.com. |
12:03 |
|
paul |
does that one work better : |
12:03 |
|
paul |
http://i5.bureau.paulpoulain.c[…]/koha/mainpage.pl |
12:03 |
|
paul |
? |
12:03 |
|
kados |
no |
12:03 |
|
paul |
if yes, then i've a dns problem with o19 |
12:03 |
|
kados |
in traceroute I get 'host name not found' |
12:03 |
|
kados |
so it's definitely dns prob |
12:03 |
|
paul |
bureau.paulpoulain.com is better ? |
12:03 |
|
kados |
no |
12:04 |
|
paul |
wow... same thing for me... host unknown |
12:04 |
|
paul |
(when I try to connect from a distant server |
12:04 |
|
paul |
) |
12:04 |
|
paul |
www.koha-fr.org works ? |
12:05 |
|
kados |
yes |
12:05 |
|
paul |
mmm... hdl around ? |
12:09 |
|
paul |
ok, it's time to leave for dinner. + i've a headache. |
12:09 |
|
paul |
have a good day kados & see you tomorrow. |
12:09 |
|
kados |
you too |
12:09 |
|
paul |
(+ a last note : when displaying result list from zebra, i don't have any problem) |
12:10 |
|
kados |
in the browser? |
12:10 |
|
paul |
yep |
12:10 |
|
kados |
with the same font? |
12:10 |
|
paul |
yep, of course |
12:10 |
|
kados |
strange |
12:10 |
|
paul |
(still dev_week from CVS) |
12:10 |
|
kados |
then maybe it's mysql again :-) |
12:10 |
|
kados |
(where can you see results list from zebra in dev-week? |
12:11 |
|
kados |
(afaik it's all display from mysql) |
12:11 |
|
paul |
opac-zoomsearch => search for something with X results |
12:11 |
|
paul |
nope, this one is zebra, if i'm not mistaken |
12:11 |
|
kados |
ahh ... you're right |
12:12 |
|
paul_away |
this time, bye bye |
13:05 |
|
kados |
owen: you around? |
13:05 |
|
kados |
owen: adding a new subscription, I don't seem to be able to receive issues for it |
13:05 |
|
owen |
I'm getting full-serial-issues squared away finally I think. |
13:05 |
|
kados |
cool |
13:05 |
|
kados |
http://wipokoha.liblime.com/cg[…]?subscriptionid=1 |
13:08 |
|
owen |
No errors or anything, I take it? |
13:08 |
|
kados |
nope ... the form is just blank |
13:08 |
|
kados |
hang on, looks like you committed something to statecollection this morning |
13:09 |
|
kados |
nope ... |
13:09 |
|
kados |
only the Manual Issue option is showing up |
13:09 |
|
kados |
there are no auto-generated issues showing up :( |
13:10 |
|
kados |
:-) |
13:10 |
|
owen |
I'll finish with full-serial-issues.tmpl and then tackle that. |
13:10 |
|
kados |
k |
13:11 |
|
kados |
location not getting filled in either |
13:11 |
|
kados |
on that page |
13:54 |
|
owen |
Okay kados, I committed an updated full-serial-issues.tmpl as well as updated intranet.css and colors.css. That page is still giving a javascript error, but looks and functions much better than before |
13:55 |
|
kados |
owen: looks like my problem with the statecollection was due to not updating the database |
13:55 |
|
kados |
http://wipokoha.liblime.com/cg[…]?subscriptionid=1 |
14:01 |
|
owen |
Great |
14:04 |
|
owen |
kados: did you ever get a chance to work on the OPAC facets so that the template could output stuff as a list? |
17:24 |
|
thd |
kados: are you there? |
17:25 |
|
kados |
thd: yes |
17:25 |
|
kados |
thd: always :-) |
17:25 |
|
thd |
kados: are you always awake too? |
17:26 |
|
kados |
thd: almost always :-) |
17:26 |
|
thd |
kados: I hope my reorganised koha-zebra list question is clearer |
17:27 |
|
thd |
kados: I was expecting some answer of just do it like we told paul to do it in the thread you quoted from February |
17:28 |
|
kados |
you mean expecting from ID? |
17:28 |
|
thd |
yes |
17:29 |
|
thd |
kados: paul's character display problems seem like X-windows problems with some fonts if he had a different result when changing fonts in CSS |
17:31 |
|
thd |
kados: Even if MS Windows and OSX work fine we should not have paul and I wondering if something else is wrong every time the characters do not look correct. |
17:32 |
|
thd |
kados: all the standard CSS should use only the fonts which have the fewest problems in the least capable environment |
17:34 |
|
thd |
kados: paul should be worrying about real encoding issues instead of chasing CSS phantoms |
17:40 |
|
thd |
kados: are you unconcerned because it does not happen on your system and X-windows is a minority system that most of your prospective customers are not running? |
17:43 |
|
thd |
kados: for proportional fonts I trust nothing but Arial, Helvetica, Geneva, sans-serif on X-windows |
17:44 |
|
kados |
I was having trouble with Arial on OSX |
17:44 |
|
kados |
using firefox |
17:44 |
|
kados |
(safari was ok) |
17:49 |
|
thd |
kados: well then Helvetica, Geneva, sans-serif, Arial |
17:49 |
|
thd |
kados: if we put the somewhat problematic ones last, that will be an improvement |
17:51 |
|
thd |
in case the environment is MS Windows where at least fonts work but maybe some fonts will not be present |
17:56 |
|
thd |
kados: if I remember, only 2 monospaced fonts worked for me in X-Windows. and none was in the Koha list |
18:05 |
|
kados |
thd: we should at some point, try to compile a scratchpad of which fonts work well with which language scripts and encodings on which operating systems and browsers :-) |
18:06 |
|
thd |
kados: meanwhile paul is chasing phantom problems instead of real ones |
18:06 |
|
kados |
phantom problems? |
18:06 |
|
kados |
I think the problem he was having was actually a font problem |
18:06 |
|
kados |
because it was only happening for utf8 combining characters |
18:07 |
|
thd |
kados: exactly, a phantom problem, not a real underlying problem |
18:07 |
|
kados |
ahh, right |
18:08 |
|
thd |
kados: paul and I should not wonder if something is wrong every time the characters do not display correctly |
18:08 |
|
thd |
kados: if the characters do not display correctly we should know that something is wrong |
18:09 |
|
thd |
at a lower level than fonts |
18:09 |
|
Burgwork |
somebody was looking for me? |
18:09 |
|
kados |
not me I don't think |
18:10 |
|
thd |
maybe in another channel Burgwork |
18:21 |
|
thd |
kados: testing monospaced fonts again, the 2 named Courier fonts in Koha CSS do not work for me |
18:25 |
|
Burgwork |
thd, kados hmm, ok. Might have been a mistaken ping, or somebody looking for me at home |
18:27 |
|
thd |
kados: I think that monospaced should be FreeMono, "Courier New", Courier, monospace. None of those but FreeMono works for me |
05:34 |
|
slef |
can someone boot From: "KISS Madeleine \(OPOCE\)" <Madeleine.Kisscec.eu.int> |
05:34 |
|
slef |
off the lists until next month... there's a dumb auto-reply replying to the From address of every list email |
06:38 |
|
osmoze |
hello #koha |
06:44 |
|
osmoze |
just one question, who is the designer for koha? (tee-shirt and logo?) |
07:02 |
|
paul |
hello osmoze. It's katipo |
07:02 |
|
kados |
morning paul |
07:02 |
|
paul |
hello kados |
07:02 |
|
paul |
seems I still have DNS problems. |
07:03 |
|
kados |
strange |
07:03 |
|
paul |
but we can solve them easily for you : |
07:03 |
|
paul |
in /etc/hosts, add |
07:03 |
|
paul |
213.41.245.208 o19.bureau.paulpoulain.com |
07:03 |
|
paul |
and you should be able to reach |
07:04 |
|
paul |
http://o19.bureau.paulpoulain.[…]RCdetail.pl?bib=1 |
07:04 |
|
paul |
and tell me if combined chars are OK for you |
07:05 |
|
kados |
200$a is not ok |
07:05 |
|
kados |
a titre propre Dogmatique chr?etienne |
07:05 |
|
paul |
??? |
07:05 |
|
kados |
same with 606 $a |
07:05 |
|
paul |
I just see dogmatique ch,tienne |
07:05 |
|
kados |
I see question marks |
07:05 |
|
kados |
I see: |
07:05 |
|
osmoze |
hi paul and kados |
07:06 |
|
kados |
sujet Th?eologie dogmatique |
07:06 |
|
osmoze |
paul, it's katipo, but is there one person in particular? it's for a logo suggestion |
07:06 |
|
paul |
kados : who wrote koha logo at katipo ? do you know ? |
07:06 |
|
kados |
rachel |
07:07 |
|
paul |
I see the same thing for chr,tienne & th,ologie |
07:07 |
|
paul |
a kind of comma |
07:07 |
|
paul |
you see different things ? |
07:07 |
|
paul |
(maybe you could screen copy your page & show it to me) |
07:08 |
|
osmoze |
rachel is rach ? |
07:09 |
|
paul |
you can see mine at 213.41.245.208/kados.png |
07:09 |
|
paul |
osmoze: yes. |
07:10 |
|
kados |
http://kados.org/desktop.png |
07:10 |
|
paul |
ok, we have the same thing, except that i see a kind of comma instead of a ? |
07:10 |
|
paul |
a question : |
07:11 |
|
paul |
when you pasted the "titre propre", I got many A+square as é/?/, |
07:11 |
|
osmoze |
ok, in fact, i'd suggest a logo like http://www.debian.org/logos/button-mini.png for linking in a blog, so perhaps it already exists |
07:11 |
|
paul |
how did you do your copy/paste ? |
07:11 |
|
paul |
(I have 11 A+square) |
07:11 |
|
kados |
just with copy paste of OSX :-) |
07:12 |
|
kados |
right |
07:12 |
|
kados |
it's because this character is a combining character |
07:12 |
|
paul |
but it should appear correctly on your osX isn't it ? |
07:12 |
|
paul |
(or should I update the stylesheet ? |
07:12 |
|
paul |
) |
07:12 |
|
kados |
well ... not always |
07:13 |
|
kados |
let me check with safari |
07:13 |
|
kados |
ok, I can verify, it's not OK |
07:13 |
|
kados |
in safari I also get a question mark |
07:13 |
|
paul |
which means what ? |
07:14 |
|
kados |
it means that perhaps the encoding is still wrong |
07:14 |
|
kados |
and it might not just be the font |
07:17 |
|
paul |
very nice to read :-( |
07:17 |
|
kados |
paul: one thing that troubles me about this |
07:17 |
|
kados |
in UTF8, combining characters are written: |
07:17 |
|
kados |
e/ |
07:17 |
|
kados |
but it seems in these examples, the ? is _before_ the base character |
07:17 |
|
kados |
as in /e |
07:18 |
|
kados |
/ standing for acute above the e |
07:18 |
|
kados |
in MARC8 it would be written /e |
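To make the ordering concrete, here is a minimal Python sketch (Python is only an illustration language here; the discussion itself is about Perl's MARC::* modules). It decomposes é into base letter plus combining acute, as Unicode orders them, and contrasts that with the MARC-8 byte order, where the diacritic byte comes before the base letter. The variable names are made up for this example.

```python
import unicodedata

# Decompose the precomposed letter (U+00E9) into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", "\u00e9")

# Unicode order: base letter first, then the combining acute accent.
unicode_order = [f"U+{ord(ch):04X}" for ch in decomposed]

# UTF-8 bytes of the decomposed form: 'e' (0x65), then CC 81 for U+0301.
utf8_bytes = decomposed.encode("utf-8")

# MARC-8 order: the acute diacritic byte (0xE2) precedes the base letter.
marc8_bytes = bytes([0xE2, 0x65])

print(unicode_order)        # ['U+0065', 'U+0301']
print(utf8_bytes.hex())     # 65cc81
print(marc8_bytes.hex())    # e265
```

So "e/" in kados's notation corresponds to 65 CC 81, and "/e" to E2 65.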
07:18 |
|
kados |
so I wonder did you run these records through MARC::* before displaying them? |
07:19 |
|
paul |
yep, as MARCgetrecord read mySQL & uses MARC::Record |
07:20 |
|
paul |
(note that this database comes from Koha 2.2, and previously it was a marc21 database on a proprietary software, so maybe encoding is wrong since the migration to koha, but it worked correctly until now) |
07:21 |
|
kados |
paul: sometimes a character can be correctly encoded and still show up as a ? |
07:21 |
|
kados |
paul: http://www.mezzoblue.com/archi[…]005/07/25/glyphs/ |
07:21 |
|
kados |
paul: it was marc21 and you converted to unimarc? |
07:21 |
|
paul |
yep |
07:21 |
|
kados |
why? |
07:21 |
|
paul |
(but I did nothing specific for encoding |
07:21 |
|
paul |
because the library requested it ;-) |
07:22 |
|
kados |
very strange |
07:22 |
|
paul |
(it was specified in RFP, I did not suggest anything) |
07:22 |
|
kados |
it's a french library? |
07:22 |
|
paul |
of course |
07:22 |
|
paul |
(Institut Protestant de Théologie) |
07:22 |
|
kados |
wow, I thought most french libraries were anticipating moving from unimarc to usmarc :-) |
07:22 |
|
paul |
where did you get this idea ? |
07:23 |
|
kados |
from dev week :-) |
07:23 |
|
kados |
so you've done nothing specific for encoding? |
07:23 |
|
kados |
of these records? |
07:24 |
|
kados |
did you at least convert to UTF8 using MARC::Charset? |
07:24 |
|
kados |
(otherwise, how could it be done ... marc21 only has two encodings, MARC8 and UTF8) |
07:24 |
|
paul |
no, I strictly did nothing. |
07:25 |
|
paul |
just read the iso file, put it in MARC::Record, get subfield by subfield and rearrange them in a UNIMARC way |
07:25 |
|
kados |
ok, so you have MARC8 data of course :-) |
07:25 |
|
paul |
(but i don't know what was the real encoding in previous ils) |
07:26 |
|
paul |
but why does it work with koha 2.2 ? |
07:26 |
|
kados |
what is leader / 09? |
07:26 |
|
paul |
mmm... i rebuilt the leader, it's no longer the marc21 one. |
07:27 |
|
kados |
if leader / 09 is 'a', it means UNICODE, otherwise it means MARC8 |
07:27 |
|
paul |
but for instance, it's : 005767nam 22001813 4500 |
07:27 |
|
kados |
so , you have MARC8 ... or else you don't have USMARC records to start with :-) |
07:28 |
|
paul |
(it's the space between m and 2 right ?) |
07:28 |
|
kados |
no, that's position 7 |
07:28 |
|
paul |
the 'a' ? |
07:28 |
|
kados |
ahh ... yes |
07:28 |
|
kados |
sorry ... between m and 2 |
07:29 |
|
kados |
marc8 and latin1 share some codepoints |
07:29 |
|
paul |
but it's a unimarc record, so MARC::Record should read : |
07:29 |
|
paul |
100 _a d u y0frey50 |
07:29 |
|
paul |
the 50 meaning it's utf8 data |
07:29 |
|
paul |
(it's 100$a) |
07:30 |
|
kados |
but it's _not_ utf8 data in the marc21 record! |
07:30 |
|
kados |
it can't be ... unless it's not MARC21 |
07:30 |
|
kados |
it's really quite simple |
07:31 |
|
kados |
with MARC21, you either have MARC8 or UTF8 |
07:31 |
|
kados |
and you look in the leader to see which one you have |
07:31 |
|
kados |
it's MARC8 in your case |
07:32 |
|
kados |
we know this for two reasons: |
07:32 |
|
kados |
1. leader position 9 is ' ' |
07:32 |
|
kados |
2. your combining characters are structured as /e instead of e/ |
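The first of these two checks can be sketched in a few lines of Python. This is only an illustration of the MARC21 rule kados states (leader/09 equal to 'a' means Unicode, anything else means MARC-8); the function name and both sample leaders are invented for the example and are not taken from paul's records.

```python
def leader_encoding(leader: str) -> str:
    """Return the character coding scheme implied by a MARC21 leader."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    # Leader/09 is a zero-based offset, so it is the 10th character.
    return "UTF-8" if leader[9] == "a" else "MARC-8"


# Hypothetical leaders that differ only in position 09.
marc8_leader = "00578nam  2200181   4500"
utf8_leader = "00578nam a2200181   4500"

print(leader_encoding(marc8_leader))  # MARC-8
print(leader_encoding(utf8_leader))   # UTF-8
```

As paul points out in the chat, a UNIMARC record states its character set in field 100$a instead, so this check only settles the question for MARC21 data.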
07:32 |
|
dewey |
Hmm. No matches for that, kados. |
07:33 |
|
kados |
paul: so you must convert from MARC8 to UTF8 using (probably) MARC::* |
07:33 |
|
kados |
or ... MARCEdit does it too |
07:33 |
|
kados |
hey owen |
07:33 |
|
owen |
Hi |
07:33 |
|
dewey |
what's up, owen |
07:33 |
|
owen |
dewey: you're chatty this morning! |
07:33 |
|
dewey |
owen: huh? |
07:33 |
|
paul |
and how do I do that ? (convert from marc8 to utf8) ? |
07:34 |
|
paul |
(MARCedit seems to require windows, which I don't have :-( ) |
07:34 |
|
kados |
here is one example: http://liblime.com/public/roundtrip.pl |
07:35 |
|
kados |
but ... it's just an example |
07:35 |
|
kados |
sometimes you need to also check the leader length |
07:36 |
|
kados |
also check Opening Files here: http://wiki.koha.org/doku.php?[…]pad#opening_files |
07:36 |
|
paul |
strange, wiki.koha.org still doesn't work for me :( |
07:37 |
|
paul |
mmm... really strange ... |
07:37 |
|
paul |
works on konqueror but not on firefox ! |
07:37 |
|
kados |
hehe |
07:37 |
|
kados |
I bet firefox has cached something wrong |
07:38 |
|
paul |
I wrote the trick for opening_file, so I knew it ;-) |
07:38 |
|
kados |
hehe |
07:38 |
|
paul |
my problem with your script is that I don't have an iso file, I just have a koha database. |
07:39 |
|
kados |
so you have to use export.pl |
07:39 |
|
paul |
but export with or without :utf8 when opening the file ??? |
07:40 |
|
kados |
in fact, I don't think you can do it properly |
07:40 |
|
kados |
it's the fault of Koha |
07:41 |
|
kados |
I tried many hours to fix NPL's encoding probs |
07:41 |
|
kados |
but mysql doesn't understand MARC8 |
07:41 |
|
kados |
and if your table defs are set to latin1 |
07:42 |
|
kados |
the characters will be mangled when you export I think |
07:42 |
|
kados |
especially the combining characters |
07:45 |
|
paul |
OK, i've exported the 1st biblio only. once with utf8, once without |
07:45 |
|
paul |
the .utf8 is 578 bytes long, the leader says 578 bytes and I can see \xc3\xa2 for é |
07:46 |
|
kados |
Dogmatique chr<e2>etienne |
07:46 |
|
paul |
the .latin is 576 bytes long, the leader still says 578, and I see \xe2\x65 (âe) |
07:46 |
|
kados |
that is the code of the character |
07:46 |
|
kados |
00e2 |
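The two exports paul describes can be reproduced with a short Python sketch. It assumes, as the bytes he quotes suggest, that the stored value contains the raw byte E2 followed by "e", and that the UTF-8 export treated each byte as a Latin-1 character before encoding it. The sample string is shortened from the record title; the variable names are made up.

```python
# Raw bytes as paul sees them in the export opened without :utf8.
raw = b"chr\xe2etienne"

# Read as Latin-1, byte E2 is the letter a-circumflex (U+00E2).
as_latin1 = raw.decode("latin-1")

# Encoding that text as UTF-8 turns E2 into the two bytes C3 A2.
reencoded = as_latin1.encode("utf-8")

print(as_latin1)                   # chrâetienne
print(reencoded.hex())             # 636872c3a2657469656e6e65
print(len(reencoded) - len(raw))   # 1
```

Each such byte adds one byte to the UTF-8 file, so the two-byte difference between paul's 576-byte and 578-byte exports would be consistent with two of these characters in the record.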
07:49 |
|
kados |
paul: <code> |
07:49 |
|
kados |
<isCombining>true</isCombining> |
07:49 |
|
kados |
<marc>E2</marc> |
07:49 |
|
kados |
<ucs>0301</ucs> |
07:49 |
|
kados |
<utf-8>CC81</utf-8> |
07:49 |
|
kados |
<name>ACUTE / COMBINING ACUTE ACCENT (Oxia)</name> |
07:49 |
|
kados |
so the MARC is E2 for Acute / Combining Acute Accent |
07:49 |
|
kados |
meaning you have MARC8 data :-) |
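Using only the mapping kados quotes (MARC-8 E2 maps to U+0301, UTF-8 CC 81), a conversion can be sketched in Python as follows. This is a deliberately tiny stand-in, not how MARC::Charset or MARCEdit work internally: it knows a single diacritic, ignores MARC-8 escape sequences, and the function name is invented for the example.

```python
import unicodedata

# The one mapping quoted in the chat: MARC-8 0xE2 -> COMBINING ACUTE ACCENT.
MARC8_COMBINING = {0xE2: "\u0301"}


def marc8_to_unicode(data: bytes) -> str:
    """Convert a MARC-8 byte string (ASCII plus acute only) to Unicode."""
    out = []
    pending = []  # combining marks waiting for their base letter
    for byte in data:
        if byte in MARC8_COMBINING:
            pending.append(MARC8_COMBINING[byte])
        elif byte < 0x80:
            out.append(chr(byte))
            out.extend(pending)  # in Unicode the mark follows its base
            pending.clear()
        else:
            raise ValueError(f"unmapped MARC-8 byte 0x{byte:02X}")
    if pending:
        raise ValueError("combining mark without a base letter")
    # Compose base + mark into a single precomposed letter where one exists.
    return unicodedata.normalize("NFC", "".join(out))


text = marc8_to_unicode(b"Dogmatique chr\xe2etienne")
print(text)                          # Dogmatique chrétienne
print(text[14].encode("utf-8").hex())  # c3a9
```

The key step is the reordering: the mark is held back until its base letter has been emitted.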
07:50 |
|
paul |
but why is it shown correctly with koha 2.2 ??? |
07:50 |
|
paul |
(unimarc, default templates, french, iso8859-1) |
07:51 |
|
kados |
where is the catalog? |
07:51 |
|
kados |
I can look at the same record there? |
07:52 |
|
paul |
you mean the real life catalogue ? |
07:52 |
|
kados |
yes |
07:52 |
|
paul |
http://catalogue.iptheologie.f[…]koha/opac-main.pl |
07:55 |
|
kados |
it's the same code point there |
07:55 |
|
paul |
which means it's marc21 ? |
07:55 |
|
kados |
yes, it seems so |
07:55 |
|
paul |
so why the hell does it work ??? |
07:55 |
|
kados |
:-) |
07:56 |
|
kados |
very interesting question :-) |
07:57 |
|
kados |
wait ... |
07:57 |
|
kados |
it's not the same |
07:57 |
|
kados |
before it was E2, now it's E9 |
07:57 |
|
kados |
and it's not a combining character! |
07:58 |
|
kados |
latin small letter e with acute, U+00E9 ISOlat1 |
07:58 |
|
paul |
so, something between my 2.2. and my dev_week transformed the E9 in something else... |
07:58 |
|
kados |
so it's not MARC8 :-) and not MARC21 either :-) |
07:58 |
|
kados |
yes, it seems so |
07:58 |
|
paul |
what I did : |
07:59 |
|
paul |
- copy the 2.2 database (mysqldump => import) |
07:59 |
|
kados |
do you have the mysqldump? |
07:59 |
|
paul |
- alter table biblioitems ... collate utf8 |
07:59 |
|
kados |
check the codepoint in there |
07:59 |
|
paul |
mmm... I can do it again, as i have the 2.2 on my computer |
08:01 |
|
kados |
what is very strange to me |
08:01 |
|
paul |
in the dump of marc_subfield_table, I can see : |
08:01 |
|
kados |
is how did latin1 data turn into valid marc8 data |
08:01 |
|
paul |
\xc3\xa9 |
08:02 |
|
paul |
but that does not mean anything, as the dump always produces utf8... |
08:02 |
|
paul |
so, how can I check ? |
08:02 |
|
kados |
you can view the hex in mysql itself |
08:02 |
|
paul |
how ? |
08:02 |
|
kados |
http://mysql.he.net/doc/refman[…]ng-functions.html |
08:03 |
|
kados |
something like: |
08:03 |
|
paul |
BIN(subfield_value) ? |
08:03 |
|
paul |
no |
08:03 |
|
kados |
SELECT subfieldvalue, HEX(subfieldvalue) from marc_subfield_table where ...; |
08:04 |
|
kados |
maybe even use substr |
08:04 |
|
kados |
to isolate a specific place in subfieldvalue |
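The same kind of check can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in: SQLite also has hex() and substr(), but it is not MySQL, and its behaviour with table character sets differs. The table and column names come from the query above; the subfieldid column and the sample value are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE marc_subfield_table ("
    "subfieldid INTEGER PRIMARY KEY, subfieldvalue TEXT)"
)
conn.execute(
    "INSERT INTO marc_subfield_table (subfieldvalue) VALUES (?)",
    ("chr\u00e9tienne",),
)

# Hex dump of the whole value, then of the 4th character only.
value, full_hex, char_hex = conn.execute(
    "SELECT subfieldvalue, hex(subfieldvalue), hex(substr(subfieldvalue, 4, 1)) "
    "FROM marc_subfield_table WHERE subfieldid = 1"
).fetchone()

print(full_hex)   # 636872C3A97469656E6E65
print(char_hex)   # C3A9
conn.close()
```

Looking at the hex of the stored value avoids any re-encoding done by the terminal or by a dump tool.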
08:06 |
|
paul |
é = E9 |
08:06 |
|
kados |
E9 = latin small letter e with acute |
08:12 |
|
paul |
ok, i've run select hex() on marcxml from dev_week database, and I have C3A9 |
08:12 |
|
paul |
for é |
08:12 |
|
paul |
which means something added a C3. |
08:12 |
|
paul |
the question being : who ! |
08:12 |
|
kados |
? |
08:12 |
|
paul |
oups... |
08:12 |
|
kados |
A9? |
08:13 |
|
paul |
yes, A9 (and not E9 |
08:13 |
|
kados |
that's the (c) symbol |
08:13 |
|
paul |
it's C3A9, not A9 alone |
08:13 |
|
kados |
C3 is latin capital letter A with tilde |
08:13 |
|
kados |
in latin1 |
08:14 |
|
kados |
ahh |
08:14 |
|
paul |
(how do you know so quickly what means what ? you learned tables by heart ?) |
08:14 |
|
kados |
in utf8 C3A9 is LATIN SMALL LETTER E WITH ACUTE |
08:14 |
|
kados |
google :-) |
08:14 |
|
paul |
ah, so it's correct utf8 ? |
08:14 |
|
kados |
yes |
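The two readings of C3A9 discussed here are easy to verify in Python (used only for illustration; variable names are made up). The same two bytes are one character when decoded as UTF-8 and two characters when decoded as Latin-1.

```python
import unicodedata

data = bytes.fromhex("C3A9")

as_utf8 = data.decode("utf-8")      # one character
as_latin1 = data.decode("latin-1")  # two characters

utf8_names = [unicodedata.name(ch) for ch in as_utf8]
latin1_names = [unicodedata.name(ch) for ch in as_latin1]

print(utf8_names)    # ['LATIN SMALL LETTER E WITH ACUTE']
print(latin1_names)  # ['LATIN CAPITAL LETTER A WITH TILDE', 'COPYRIGHT SIGN']
```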
08:15 |
|
paul |
which is good news, isn't it ? |
08:16 |
|
paul |
as we just have to find why it's no longer correct utf8 in my browser. |
08:16 |
|
kados |
I think so |
08:16 |
|
paul |
so it can be : mySQL, perl DBI, MARC::Record, MARC::File::XML |
08:16 |
|
kados |
but how did it get to UTF8 from MARC8? |
08:17 |
|
paul |
(or my browser, but we eliminated it through your test) |
08:17 |
|
paul |
maybe it was utf8 data already ? |
08:17 |
|
paul |
(in marc21 I mean) |
08:17 |
|
kados |
no ... we checked original catalog |
08:17 |
|
paul |
mmm... good point... |
08:17 |
|
kados |
it's MARC8 to start with |
08:17 |
|
paul |
I just ran updatedatabase from head (which translate tables to utf8) |
08:18 |
|
kados |
so maybe mysql is MARC8 aware? |
08:18 |
|
kados |
and we just didn't know it? |
08:18 |
|
paul |
(and yesterday, i did the text => blob => text manip that you requested) |
08:23 |
|
kados |
http://dev.mysql.com/doc/refma[…]set-charsets.html |
08:23 |
|
kados |
mysql doesn't know marc8 |
08:27 |
|
paul |
mmm... i've dumped in a file what is read from marcxml => it seems that I still get c3a9 |
08:27 |
|
paul |
then i've dumped the MARC record made from the xml => it's transformed to e265 |
08:37 |
|
kados |
paul: here's the problem: |
08:37 |
|
kados |
you have: |
08:37 |
|
kados |
Dogmatique chr<e1>etienne |
08:38 |
|
kados |
s/e1/e2/ |
08:38 |
|
paul |
<e2> you mean ? |
08:38 |
|
kados |
you should have: |
08:38 |
|
kados |
Dogmatique chre<e2>tienne |
08:39 |
|
paul |
??? |
08:40 |
|
kados |
so this was never touched by MARC::*? |
08:40 |
|
paul |
sorry, I didn't understand what you said previously : |
08:41 |
|
paul |
I have <e1> and I should have <e2> ? |
08:41 |
|
kados |
no |
08:41 |
|
paul |
i have <e2> and I should have <c3><a9> |
08:41 |
|
paul |
right ? |
08:41 |
|
kados |
you have <e2>e and you should have e<e2> |
08:42 |
|
kados |
in MARC8, combining characters are written as /e but in UTF8 as e/ |
08:42 |
|
dewey |
Hmm. No matches for that, kados. |
08:42 |
|
paul |
??? I thought c3a9 was e with acute ? |
08:42 |
|
kados |
hmmm |
08:42 |
|
paul |
we said some lines ago that I had correct utf8 in mySQL ? |
08:43 |
|
kados |
yes, in mysql |
08:43 |
|
paul |
and I still have c3a9 when retrieving the marcxml field. |
08:43 |
|
kados |
but I'm looking at the webpage |
08:43 |
|
paul |
but when I do new_as_xml, i get e265 |
08:43 |
|
kados |
65 is e |
08:44 |
|
kados |
because you didn't tell MARC::* that you have UTF-8 |
08:44 |
|
kados |
there are two ways to do it: |
08:44 |
|
kados |
1. leader position 9 is a |
08:44 |
|
kados |
2. call as: |
08:44 |
|
paul |
ah, OK, I think I begin to understand : you're saying my MARC::Record has been low endian encoded instead of hi endian |
08:44 |
|
paul |
(or something like that) |
08:44 |
|
kados |
new_from_xml($marcxml,'UTF-8','UNIMARC'); |
08:44 |
|
kados |
yes |
08:45 |
|
paul |
you want the bad news ? that's already what I do... |
08:45 |
|
paul |
$record = MARC::Record::new_from_xml( $marcxml,'utf8','UNIMARC' ) if $marcxml; |
08:45 |
|
kados |
hmm |
08:45 |
|
kados |
which version of MARC::File::XML ? |
08:45 |
|
kados |
and MARC::Record |
08:45 |
|
kados |
(not tumer's version I hope) |
08:46 |
|
paul |
what is tumer's version ? |
08:46 |
|
kados |
one he posted to koha-devel |
08:46 |
|
paul |
mmm... iirc, yes, I commented the decode_utf8 line... |
08:46 |
|
kados |
I think only SF version knows about UNIMARC |
08:46 |
|
paul |
i've MARC::Record from sf |
08:46 |
|
paul |
$VERSION = '2.0'; |
08:47 |
|
kados |
MARC::File::XML from SF? |
08:47 |
|
paul |
$VERSION = '0.83'; |
08:47 |
|
kados |
mike rylander asked you to test many months ago, unimarc support |
08:47 |
|
kados |
because he added this function |
08:47 |
|
kados |
to not touch the encoding if unimarc |
08:47 |
|
kados |
?? |
08:48 |
|
kados |
that's how we got the ,UNIMARC flag in the first place |
08:48 |
|
kados |
so if I were you, I'd post to perl4lib, explain that: |
08:48 |
|
kados |
1. you are using MARC::* from SF |
08:49 |
|
kados |
2. you have a UNIMARC record, encoded as UTF-8 but with no position 9=a |
08:49 |
|
kados |
3. the new_from_xml($record,'UTF-8','UNIMARC'); isn't working |
08:49 |
|
paul |
you mean that with position9=a it should work ? |
08:49 |
|
paul |
(because I could force position 9 to check) |
08:49 |
|
kados |
because it reverses the code points for combining characters |
08:50 |
|
kados |
(yes, with position 9=a it will work fine) |
08:50 |
|
kados |
(not that leader/09 is actually 10th position) |
08:50 |
|
kados |
s/not/note/ |
08:50 |
|
paul |
of course, I remember |
08:50 |
|
kados |
I suspect that |
08:51 |
|
kados |
mike rylander's solution for unimarc works perfectly |
08:51 |
|
kados |
except for combining characters |
08:52 |
|
paul |
mmm... strange : even if I force a, it is replaced by a space |
08:53 |
|
paul |
(i put 4 'a', and only 3 are shown; the last one, at position 9, is replaced by a space !) |
08:53 |
|
kados |
did you turn off the unicode flag? |
08:53 |
|
kados |
I mean unimarc flag |
08:54 |
|
paul |
I tried without unimarc & without unimarc and utf8 |
08:54 |
|
paul |
and none of them work |
08:55 |
|
kados |
what are you working with, a file? |
08:55 |
|
kados |
or data from mysql? |
08:55 |
|
paul |
data from mysql |
08:55 |
|
kados |
and export.pl? |
08:55 |
|
paul |
export.pl ? what do you want with export.pl ? |
08:55 |
|
kados |
ahh ... |
08:55 |
|
kados |
you are just refreshing your browser page? |
08:56 |
|
paul |
yep |
08:56 |
|
kados |
try to change in mysql |
08:56 |
|
kados |
the leader |
08:56 |
|
paul |
that's what I did ;-) |
08:56 |
|
kados |
hehe |
08:57 |
|
kados |
wow, it's quite strange |
08:58 |
|
kados |
are you 100% sure you don't have tumer's MARC::Record? |
09:00 |
|
kados |
paul: did you comment out line 171 in USMARC.pm? |
09:00 |
|
kados |
because I think this is probably the problem :-) |
09:01 |
|
kados |
I have suspected that tumer was wrong about this from the beginning |
09:01 |
|
paul |
I commented it, but even without commenting, i have the problem |
09:01 |
|
kados |
hmmm |
09:01 |
|
paul |
now i have a 100% official MARC::Record package |
09:01 |
|
paul |
(unless i'm missing something) |
09:03 |
|
kados |
and you checked the code points in the webpage? |
09:03 |
|
paul |
?? |
09:03 |
|
kados |
it's still ?e instead of e? |
09:03 |
|
kados |
yes, it is |
09:03 |
|
kados |
strange |
09:04 |
|
kados |
but your encoding is still wrong in leader |
09:04 |
|
kados |
someone is setting it to ' ' ... but who? |
09:04 |
|
kados |
ahh ... |
09:04 |
|
paul |
yes, and what is strange is that I can absolutely assure you I put 'a' in mySQL ! |
09:04 |
|
kados |
you need to tell new_from_xml that you want UTF-8 |
09:04 |
|
paul |
ahh... |
09:05 |
|
kados |
new_from_xml($record,'UTF-8'); |
09:05 |
|
kados |
otherwise, it will give you MARC8 |
09:05 |
|
paul |
reminder : |
09:05 |
|
paul |
$record = MARC::Record::new_from_xml( $marcxml,'UTF-8','UNIMARC' ) if $marcxml; |
09:06 |
|
kados |
can you 'warn marcxml'? |
09:06 |
|
kados |
maybe it is converted to marc8 before it becomes $marcxml |
09:06 |
|
paul |
I already have saved it in a file. |
09:06 |
|
kados |
and? |
09:06 |
|
paul |
(/tmp/xmldump.iso & /tmp/xmldump.utf) |
09:06 |
|
paul |
.iso being opened without anything |
09:07 |
|
paul |
.utf being open with :utf8 |
09:07 |
|
paul |
xmldump.utf has reencoded utf8 : c383c2a9 |
09:08 |
|
paul |
xmldump.iso is unchanged : c3a9 |
09:09 |
|
paul |
+ if I warn the XML, I see in my logs : \xc3\xa9 |
09:09 |
|
paul |
so I would vote : $marcxml is correct & new_from_xml did something unexpected |
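The c383c2a9 that paul sees in xmldump.utf is what double encoding looks like: bytes that are already UTF-8 get treated as single Latin-1 characters and encoded a second time. A minimal Python sketch of that effect, and of how it can be undone, is below. That the :utf8 layer is what triggers it in paul's setup is an assumption based on his description; the variable names are made up.

```python
# Correct UTF-8 for e-acute, as found in the marcxml column.
original = "\u00e9".encode("utf-8")

# Treat each byte as a Latin-1 character, then encode as UTF-8 again.
double_encoded = original.decode("latin-1").encode("utf-8")

# The damage is reversible as long as nothing else touched the bytes.
repaired = double_encoded.decode("utf-8").encode("latin-1")

print(original.hex())        # c3a9
print(double_encoded.hex())  # c383c2a9
print(repaired.hex())        # c3a9
```

This supports paul's reading: the file opened without a layer shows what $marcxml really contains, and that content is valid UTF-8.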
09:10 |
|
kados |
I would remove MARC::File::XML |
09:10 |
|
kados |
and install new one from SF |
09:10 |
|
paul |
OK, let's try it |
09:14 |
|
kados |
it's from cvs |
09:14 |
|
paul |
OK, that's what is wrong : I never compiled this package from cvs. |
09:14 |
|
kados |
cvs -z3 -d:pserver:anonymous@marcpm.cvs.sourceforge.net:/cvsroot/marcpm co -P marc-xml |
09:14 |
|
paul |
100% sure |
09:15 |
|
kados |
maybe you can test ,UNIMARC flag also now :-) |
09:15 |
|
kados |
and tell Mike if it works :-) |
09:16 |
|
kados |
ok, I've got to get some breakfast ... |
09:16 |
|
kados |
I will be back in about an hour or so |
09:17 |
|
paul |
i'll leave in around 30 min |
09:17 |
|
paul |
I may be back in some hours. |
09:17 |
|
paul |
but it seems that this does not fix the problem |
09:17 |
|
kados |
ok ... tell me if it works for you |
09:17 |
|
kados |
:( |
09:17 |
|
paul |
(& that I already had the same version,even if not from SF) |
09:18 |
|
kados |
(the versions are not maintained in SF, only on CPAN) |
09:18 |
|
kados |
(so 0.83 can be several versions) |
09:18 |
|
kados |
so the problem seems to be that: |
09:19 |
|
kados |
koha is turning 'a' in leader to ' ' |
09:19 |
|
kados |
before printing to the browser |
09:19 |
|
kados |
and as a result, MARC::File::XML isn't preserving the order of the combining characters |
09:20 |
|
kados |
just to be safe, try installing MARC::Record from SF too: |
09:20 |
|
kados |
cvs -z3 -d:pserver:anonymous@marcpm.cvs.sourceforge.net:/cvsroot/marcpm co -P marc-record |
09:20 |
|
kados |
cvs -z3 -d:pserver:anonymous@marcpm.cvs.sourceforge.net:/cvsroot/marcpm co -P marc-charset |
09:20 |
|
kados |
for the heck of it too :-) |
09:20 |
|
kados |
ok, now I must go |
09:20 |
|
kados |
good luck paul |
09:20 |
|
paul |
OK, bye |
09:20 |
|
paul |
and many many thanks |
09:20 |
|
kados |
np |
09:21 |
|
kados |
paul++ :-) |
09:26 |
|
slef |
hello |