Time Nick Message
12:46 kados sanspach: hi there
12:46 sanspach hey!
12:46 kados sanspach: it seems I'm having some problems with the data
12:46 kados zebra is complaining when I index
12:46 kados but I don't have any details yet
12:46 kados I'm thinking of running a check on the records using MARC::Record
12:46 kados later today
12:46 sanspach there may be some usmarc rules that aren't followed
12:47 sanspach I'm thinking in particular that there may be repeated 001 fields
12:47 kados sanspach: it's strange since it gets through quite a few records before crashing
12:47 kados hmmm, that might be a problem
12:48 sanspach our system doesn't natively store in MARC and therefore doesn't enforce usmarc rules
12:48 sanspach I knew I needed to strip out our non-standard subfields
12:48 sanspach the newer records (created on the current system, ca. 4.5 yrs) would have only a single 001
12:49 sanspach but records imported from the old system went through a merge algorithm and ended up with multiple 001 fields
12:49 sanspach didn't think about it until after I'd sent the file
12:49 kados interesting
12:50 kados do you think it would be easier to modify them from the big raw file or on your end?
12:50 sanspach depends on your tools; I can't natively work with them in MARC, so I do the edits then convert
12:51 sanspach if you can work with them in MARC, it might be better for you to manipulate within the file
12:52 sanspach also, when I looked at bits of the files I noticed that fields aren't always in usmarc order--
12:52 sanspach specifically 008 seems to be in odd places (sometimes at the end)
12:54 kados I'll give it a shot and if I can't do it I'll let you know
12:55 sanspach great; I've got the data all extracted now, so it will just be a matter of re-parsing the records and converting
12:55 kados sweet
13:04 kados sanspach: if it's not extracted as MARC, what is it extracted as?
13:04 kados (out of curiosity)
13:04 sanspach flat ascii!
(I live and die by perl but I use activestate on win32, so the Marc::... modules aren't available)
13:10 kados sanspach: right
13:10 kados sanspach: so how big is the flat ascii file?
13:10 kados sanspach: it might actually be easier for me to import that with MARC::Record (as it will automatically enforce correct MARC syntax)
13:11 sanspach don't have one (merged them after converting to MARC) but could easily do it; in fact,
13:11 sanspach I could remove the duplicate 001's as I'm merging
13:14 kados hmmm
13:14 kados well it's up to you
13:14 kados I owe you already for providing the records ;-)
13:15 kados so a little work to tweak them isn't really a problem
13:15 kados on the other hand, if you've got the proc time and bandwidth to do another export and ftp, that'd be ok too ;-)
13:15 sanspach I'll try to figure out good compression; re-sending them in ASCII is going to be no problem at all! (MARC's the hard part)
13:15 kados gzip is pretty good
13:15 kados if you'd like to cut back on bandwidth
13:19 sanspach can MARC::Record read them in from LC's marcbreaker/marcmaker format?
13:21 kados no idea
13:36 sanspach kados: just reviewed the MARC::Record docs at cpan and it looks like those tools are for MARC records
13:37 sanspach so you have a script that reads in flat files and does the creating?
13:39 kados sort of ...
I can use MARC::Record to take data in any format and feed it in to construct a valid MARC record
13:40 kados and export as iso2709
13:40 kados I've done this in the past for various projects
13:40 kados like Koha's Z39.50 server
13:40 sanspach OK; marcbreaker has three idiosyncrasies:
13:40 sanspach 1) each line (=field) starts with the = character
13:41 sanspach 2) next comes the tag (the leader has the name LDR rather than 000) and two spaces
13:41 sanspach 3) next come the indicators, with spaces substituted by the \ character (backslash)
13:43 sanspach each line is thus /^=(LDR|\d{3})  (.*)$/
13:43 sanspach with $1 being the tag and
13:44 sanspach $2 being the data (where tag<10, all data; where tag>9, /^(.)(.)(.*)$/ for ind1=$1, ind2=$2, field=$3)
13:46 sanspach OK, done; I've removed the dup 001s (can't say for sure tag order is up to standard); text file slightly smaller
13:47 sanspach than the MARC file was (makes sense--no directories)
13:47 kados right
13:57 kados sanspach: if you take a look at http://liblime.com/zap/advanced.html you can see the latest results I'm getting from the data
13:57 kados sanspach: it looks like the search is working but the data coming back isn't displaying normally
13:57 kados (choose the LibLime target to see your data)
13:58 sanspach hmm
13:58 kados (also notice that it's extremely fast ;-)
13:58 kados (which is good news)
13:58 kados i'd be interested in comparing its speed and results to your current system
13:58 kados do you have a link for that?
14:00 sanspach specs for the z39.50 connection are at http://kb.iu.edu/data/ajhr.html
14:01 kados k ... just a sec
14:03 kados heh ... ok ... try that
14:04 kados so the result set numbers aren't adding up
14:04 kados interestingly
14:04 sanspach yeah, saw that from my favorite author search (durrenmatt)
14:04 sanspach looks like field and/or record boundaries are all messed up
14:04 kados yea probably
14:05 sanspach maybe from multiple 001s?
14:05 kados could be ... wanna send me the updated one and we'll try that?
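[editor's note] The line grammar sanspach spells out above (leading =, tag of LDR or three digits, two spaces, then data, with \ standing in for a blank indicator) can be sketched as a small parser. The actual script kados wrote was Perl with MARC::Record; this is an illustrative Python equivalent, and the sample field values are made up.

```python
import re

# MARCBreaker line grammar as described in the chat:
#   "=" + tag (LDR or 3 digits) + two spaces + data;
# for tags above 009 the data starts with two indicator
# characters, where "\" stands in for a blank.
LINE_RE = re.compile(r'^=(LDR|\d{3})  (.*)$')

def parse_line(line):
    m = LINE_RE.match(line.rstrip('\n'))
    if m is None:
        raise ValueError('not a MARCBreaker field line: %r' % line)
    tag, data = m.groups()
    if tag == 'LDR' or tag < '010':
        # leader and control fields have no indicators
        return {'tag': tag, 'data': data}
    ind1, ind2, field = data[0], data[1], data[2:]
    # restore the blanks that MARCBreaker writes as backslashes
    ind1 = ' ' if ind1 == '\\' else ind1
    ind2 = ' ' if ind2 == '\\' else ind2
    return {'tag': tag, 'ind1': ind1, 'ind2': ind2, 'data': field}

print(parse_line('=245  10$aA hypothetical title'))
```

Feeding each parsed field into a record builder (MARC::Record's `append_fields` in the Perl original) and serializing yields the iso2709 output kados mentions.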
14:05 sanspach working on compressing now
14:05 kados cool
14:07 kados sanspach: it'll be neat to compare Indiana's Zserver to Zap/Zebra
14:07 sanspach our server's big (/fast) but I'm not sure how optimized we are for z39.50 connections--that's never been very high priority
14:08 kados sanspach: esp since you're prolly paying about 4-6K per year for that module
14:09 sanspach mostly 'cause we think we need it (as state inst. / major research lib. / etc.) not 'cause we actually want to support it
14:09 sanspach don't think we've got anybody knockin' down our door when it goes down!
14:09 kados right ... still ... it'd be neat if you were able to propose cutting back on the ILS budget a bit
14:21 sanspach kados: compressed file is 26% of the original; ftp begun but will take ca. 40 minutes
14:22 kados sweet
14:23 kados let me know when it's done
14:23 kados (FYI the indexing takes about 4 min too)
14:23 kados s/4/40/
14:23 sanspach will do
14:24 sanspach still slays me it goes so fast
15:00 sanspach kados: ftp is done, right on schedule; let me know if there are any problems with the file or record format
15:02 kados sanspach: sweet ... I'll get started on the indexing
15:04 kados unzipping now
15:04 kados tar -xzvf /home/sanspach/all.tar.gz
15:04 kados all.txt
15:04 kados tar: Skipping to next header
15:04 kados tar: Archive contains obsolescent base-64 headers
15:07 sanspach working on it; google says it's a common error; a workaround should be possible...
15:09 kados sanspach: tar: Read 3790 bytes from /home/sanspach/all.tar.gz
15:10 kados tar: Error exit delayed from previous errors
15:10 kados sanspach: any clue why that's happening?
15:11 sanspach 'cause I used a win32 tool to tar/gzip?!
15:11 kados could be :-(
15:11 sanspach workaround is to unzip first then tar, but I'm seeing an error there, too; but maybe it will finish ok
15:12 sanspach ls
15:12 kados all.txt
15:12 sanspach oops, wrong window :)
15:12 kados hehe
15:13 kados so all.txt is it?
15:13 sanspach not good; the all.txt file should be about the same size as all.tar (ever so slightly smaller: without the tar header)
15:13 sanspach way too small--it is choking partway through or something
15:14 kados right ..
15:14 kados look at the output from tail
15:15 kados it's choking here:
15:15 kados =505 1\[v. 1.] Theoretical and empiri
15:15 kados for some reason
15:15 sanspach the data is probably irrelevant; most likely a bad length from the header, etc.
15:16 kados fair enough
15:24 sanspach kados: the all.tar file should be good to use if you can just strip the first few bytes
15:24 sanspach maybe read in the first line and dump everything before the = that is the start of the data?
15:25 sanspach don't know what text editing tools you might have that can handle a file that large; don't want to read it all into memory!
15:31 kados grep, sed, awk, bash ;-)
15:31 kados perl even ;-)
15:36 kados sed 's/*=//' all.tar
15:36 kados I'm making a backup first
15:36 kados :-)
15:40 kados hmmm, seems it didn't work
16:05 sanspach OK, think I've got it with perl
16:09 kados sweet
16:09 kados let me know when it's done uncompressing
16:12 kados cool ... done eh?
16:12 sanspach looks like the right size...
16:12 sanspach seems right
16:12 kados ok ... I'm gonna index it (I'll move it first)
16:12 sanspach sorry for the hassle
16:14 kados hmmm, strange error: 14:05:08-07/06 ../../index/zebraidx(32333) [warn] records/sample-records:0 MARC record length < 25, is 0
16:14 kados it's not indexing the file
16:15 sanspach it's flat ascii, not marc
16:15 kados well that would explain it ;-)
16:24 kados sanspach: so ... just so I have this straight
16:24 kados the file is currently in MARCBreaker format
16:24 kados you already tried using MARCMaker and it didn't produce valid MARC records
16:25 kados so now we're going to try to use MARC::Record to create a valid MARC record
16:25 kados sound right?
16:25 sanspach well, only sort of
16:25 sanspach I had separate small files which I converted into MARC
16:26 sanspach I'm guessing the problem was the repeated 001's
16:26 kados using MarcMaker for the conversion (and join)
16:26 kados right
16:27 sanspach I used MarcMaker for the conversion; I joined them afterward
16:27 kados ok ... how big was each file (approx)
16:28 sanspach 100mb
16:54 kados sanspach: I'm headed home now ... I hacked together the start of a script to convert from marcmaker to usmarc using MARC::Record and I'll try to finish it up tonight
16:55 sanspach OK; if I think of anything brilliant, I'll let you know :)
16:56 kados sanspach: sounds good ;-)
02:37 osmoze hello
02:37 paul hi js
02:37 osmoze hey paul
02:37 osmoze got two minutes?
02:38 paul go ahead, I'm listening
02:39 osmoze I have a question: is there a simple way to get the overdues list in the form person1 --> book1, book2, book3 instead of person1 --> book1; person1 --> book2 etc. etc.
02:39 osmoze this is for a mailing
02:40 paul in the next version we do have the list of overdue items.
02:40 osmoze I had written a little php script, but I have an error I can't fix :(
02:40 osmoze so I'm turning to your services :)
02:40 paul it was an obvious gap.
02:40 osmoze how so?
02:40 osmoze there was already a module (overdue.pl)
02:41 paul overduenotice.pl has been improved.
02:41 paul it sends an email to every borrower who has an email address, giving them their list of overdues.
02:41 paul and it sends an email to the library listing all the borrowers who have overdues but no email address
02:41 osmoze the problem is that you can't reuse that data
02:42 osmoze from the email to the library
02:42 osmoze because my goal was to create a mail-merge form letter and include the names automatically
02:42 osmoze however, that only works with a database or a well-structured text file
02:43 paul exactly.
Actually, we should attach a CSV file with the info
02:43 osmoze exactly that
02:43 osmoze but there will still be the problem of duplicated names
02:44 osmoze for borrowers who have more than one overdue book
02:46 paul yes. We could imagine doing it with the titles separated by a ,
02:46 paul they would appear on a single line in the mailing.
02:48 osmoze that's exactly what I'm looking for :)
02:50 osmoze that way I can do a quick and efficient mail merge for sending letters ^^
02:50 paul with OpenOffice?
02:51 paul if so, we'll need to put that doc in CVS.
02:51 osmoze I had tested with word, I'll test with openoffice
02:51 osmoze (the front-desk machines run windows + word, but I don't rule out installing openoffice-win32)
02:53 hdl_away hi.
02:53 osmoze hello hdl
02:54 osmoze paul, you wouldn't happen to have a ready-made little csv file at hand? ^^
02:54 hdl osmoze: it's just a text file separated by semicolons. ;)
02:55 osmoze so that's fine
02:55 osmoze it works well
04:55 jean hi/hello
05:04 paul Wednesday is children's day AND Jean's day on Koha ;-)
05:04 paul hello Jean. How are you?
05:05 jean :)
05:05 jean very well, thanks
05:06 paul is your doc on optimization coming along well?
05:06 jean I think I'll release it today
05:06 paul great!
05:06 paul I can't wait to read it.
05:06 jean but I've only worked on it one day a week, and even then with lots of slowdowns
05:06 jean that's why it took a while :)
05:07 jean anyway, I'm counting on you to give me your opinion
05:07 paul you can count on it.
05:10 paul right, lunchtime. See you in a bit
10:29 Sylvain hi
10:29 hdl hi Sylvain.
10:34 Sylvain is it envisaged to include xml in koha in any way?
10:35 hdl No, as far as I know. Are you interested in doing it ;) ?
10:36 Sylvain no, just because a customer was asking ...
I hadn't heard anything about it so I wondered
10:38 paul sylvain: "include xml" is not enough.
10:39 paul what does he want: exporting XML, importing xml, showing xml...
10:39 hdl ... using xml?
10:39 Sylvain I know paul it's not enough :) But the customer is a librarian and didn't say more in his mail. So I was asking if anything was envisaged in xml
10:40 paul zebra seems to speak xml pretty well...
10:40 Sylvain meh, nothing very specific concerning xml then, in any case
10:41 paul we're in the "bazaar" phase, and the roadmap should be ready by the end of the month
10:42 Sylvain ok
10:42 Sylvain and 2.2.3, a firm date? (it may have gone by on the mailing lists but I wasn't paying attention)
10:43 paul I'm going to announce it for next week.
10:43 paul mostly translation left to do, and possibly some polishing.
10:43 paul (for example, I need to copy your unimarc plugins into 2.2)
10:45 Sylvain matthieu is the one who did that, but ok
11:13 owen Hi sanspach
11:33 Sylvain can someone explain to me the meaning of "datelastseen"?
11:36 hdl the latest date when you saw the book... For inventory purposes, IMHO.
11:36 Sylvain last time it was "barcoded"?
11:37 Sylvain scanned with the barcode reader ;)
11:37 sanspach hi owen (sorry, started IRC then walked away!)
11:38 hdl Not only that, see the Inventory/Stocktaking tab of the stats ;)
11:38 hdl English: Not only, have a look at Inventory/StockTaking in reports
11:38 Sylvain "01:53 +1d chris datelastseen is the last time the item was issued, or returned, or transferred between branches"
11:39 Sylvain hdl the stats are too powerful and have too many things, I haven't had time yet to explore them ;)
11:39 hdl That's why I told you about that.
11:40 hdl And *I* am not the only one to have worked on that ;)
11:40 Sylvain ok, I thought you had done it all alone
11:40 hdl So long ;)
11:47 owen sanspach, you're in Indiana?
11:47 sanspach yes
11:51 kados sanspach: I tried indexing the new marc file
11:51 kados sanspach: results are displaying weirdly
11:51 kados similar to before
11:51 kados http://liblime.com/zap/advanced.html
11:52 sanspach kados: yes, I see; very odd
11:52 sanspach almost like the directory is off and the fields are getting all mangled
11:52 kados yea
11:53 sanspach only this time the MARC was generated w/MARC::Record
11:53 kados right ... so it should be valid
11:53 sanspach how can both (very different) methods produce the same problems?
11:53 kados well it may be the indexer
11:53 kados but I haven't had trouble with it using other MARC records
11:54 sanspach do you want batches of smaller sections of the db? I still have the original 54 files
11:54 kados sure ... send em over
11:54 kados maybe if we do them one-by-one we can catch the problem
11:55 sanspach I'll ftp in batches of 10, in numeric order (you'll see the pattern)
11:55 kados k ...
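[editor's note] Sanspach's hunch that "the directory is off" can be tested independently of any MARC library: in ISO 2709 output, bytes 0-4 of each record's leader state the total record length, and each record must end with the record terminator byte 0x1D. A sketch of that framing check in Python (the actual debugging here used MARC::Record in Perl; this is illustrative only):

```python
# Walk a file of concatenated ISO 2709 / MARC records and report the
# index of the first record whose stated length does not land on a
# record terminator -- the kind of off-by-some-bytes error that would
# mangle every field after it, as seen in the search results above.
RT = b'\x1d'  # ISO 2709 record terminator

def check_marc_boundaries(data):
    pos, n = 0, 0
    while pos < len(data):
        # leader bytes 0-4: total record length, zero-padded ASCII digits
        length = int(data[pos:pos + 5].decode('ascii'))
        record = data[pos:pos + length]
        if not record.endswith(RT):
            return n          # index of the first misframed record
        pos += length
        n += 1
    return None               # every record's framing is consistent
```

A `None` result would point the finger back at the indexer; an early index would confirm that the conversion (or the duplicate 001s) corrupted the record directory.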