IRC log for #koha, 2005-06-11

All times shown according to UTC.

Time S Nick Message
15:01 rach it's halloween?
15:02 owen No, I checked.
15:02 owen I haven't checked for the possibility of a mummy's curse, though.
15:10 kados hehe
15:18 rach :-)
15:24 rach hi
15:24 sanspach hello
15:28 rach you guys have had a busy night :-)
15:28 rach well - day for you :-)
15:28 sanspach yeah; still nothing solved, though :(
15:30 rach you worked out the ^m tho
15:30 rach they are windows line breaks
15:30 rach end of line markers
15:30 sanspach yeah, but still not certain which files are affected by it
15:31 sanspach and not exactly clear why some of the files that don't have them still don't work
15:31 rach and if you change one record to get rid of them that doesn't help - so make a 1 record clean file?
15:32 rach but I see gavin tried that
15:33 rach so gavin didn't manage to get it in?
15:34 gavin hi
15:34 rach hi
15:34 gavin the stuff was inserting for me at the end
15:36 gavin i took one of sanspach's files which he emailed me (small.sample2.mrc) and substituted one delimiter for another after which it worked
15:36 rach ah cool
15:36 gavin then a newer file failed due to having some weird win32 linebreaks stuck in the middle
15:36 gavin no idea why they're there
15:36 rach yep you'll have to take them out too, in the same sort of way
15:36 rach the magic of windows :-)
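
The ^M characters discussed above are carriage-return bytes (0x0D). As a rough sketch of how one might check whether a file is affected and strip them, the Python below works on raw bytes. The function names are illustrative only, and it assumes a carriage return never legitimately belongs in the record data. It also does not recompute any lengths stored in the MARC leader or directory, so it only helps if the bytes were added after the records were written.

```python
def count_carriage_returns(data: bytes) -> int:
    """Return how many CR (^M, 0x0D) bytes appear in the data."""
    return data.count(b"\r")


def strip_carriage_returns(data: bytes) -> bytes:
    """Return a copy of the data with every CR byte removed.

    Assumption: CR is never a legitimate part of the records.
    """
    return data.replace(b"\r", b"")


def clean_file(src_path: str, dst_path: str) -> int:
    """Copy src_path to dst_path without CR bytes; return how many were removed.

    Both paths are hypothetical; the file is opened in binary mode so that
    no newline translation happens on the way in or out.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    removed = count_carriage_returns(data)
    with open(dst_path, "wb") as dst:
        dst.write(strip_carriage_returns(data))
    return removed
```

Reading the whole file into memory is fine for a small sample; a multi-gigabyte batch would need to be processed in chunks instead.
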
15:37 gavin I haven't seen kados big file so I don't know what is wrong with what he got
15:37 sanspach it is probably messed up in exactly the same way
15:37 sanspach it seems that MARC::Record doesn't strip the trailing ^M from the leader field when it re-writes it
15:38 sanspach or maybe I messed it up; I'll have to check
15:38 gavin yes if i remove the ^M out that one works too
15:38 gavin the ^Ms are all over the middle of records
15:38 gavin it almost looks like an editor wrapped them or something
15:39 sanspach they're *all* separate lines to begin with ("flat" format)
15:39 sanspach but when MARC::Record writes them out, I figured all the formatting would be fixed
15:39 gavin not any of the ones i've seen
15:39 rach you wish :-)
15:40 gavin do you mean marc format should have linebreaks? none that I've seen have them
15:40 sanspach no, no just for me
15:40 gavin but i know little or nothing about marc
15:41 sanspach I get the data out of our system db (Oracle, but same for mysql) as separate lines
15:41 gavin i see, and you patch them up together?
15:41 sanspach then I put everything back together and have MARC::Record create true marc format out of them
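
For readers unfamiliar with what "true marc format" means here, the sketch below assembles one record in the ISO 2709 transmission layout: a 24-byte leader, a directory of 12-byte entries, then the field data. It is written in Python for illustration and is not how MARC::Record (a Perl module) is implemented. The function name, the fixed leader values, and the sample fields are assumptions. The terminator bytes 0x1E, 0x1D and 0x1F are the standard field terminator, record terminator and subfield delimiter.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
SUBFIELD_DELIMITER = b"\x1f"


def build_record(fields):
    """Build one MARC record from (tag, body) pairs.

    tag is a 3-character string; body is bytes without the field terminator.
    """
    directory = b""
    data = b""
    for tag, body in fields:
        chunk = body + FIELD_TERMINATOR
        # Directory entry: tag (3), field length (4), starting offset (5).
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(chunk), len(data))
        data += chunk
    # The data starts after the leader, the directory and its terminator.
    base_address = 24 + len(directory) + 1
    record_length = base_address + len(data) + 1
    # Illustrative leader values; only the two lengths are computed.
    leader = b"%05dnam a22%05d a 4500" % (record_length, base_address)
    return leader + directory + FIELD_TERMINATOR + data + RECORD_TERMINATOR


record = build_record([
    ("001", b"12345"),
    ("245", b"10" + SUBFIELD_DELIMITER + b"aA title"),
])
```

Because the leader and directory store byte counts, any stray byte that ends up inside a record after it is written leaves those counts out of step with the data.
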
15:42 gavin Oracle. that's an expensive library system!
15:42 sanspach not for a univ. that has a site license already (!)
15:43 sanspach but yes, actually, Sirsi's Unicorn product isn't the cheapest out there
15:43 gavin universities are indeed wonderful places
15:43 rach ah well, at least it sounds like you know how to work on the data now
15:44 gavin sanspach: what do you think we need to do with kados data?
15:45 sanspach rm *    and start over
15:45 gavin not fixable?
15:45 sanspach I've lost track of what the problems might be.
15:45 sanspach if it is just ^M we could strip those
15:46 sanspach if it is subfield delimiters too, we could do that
15:46 gavin as far as I can tell it boils down to ^M and possibly delimiter substitution which would be very quick
15:46 gavin rather than go through the pain of downloading 2GB again
15:48 sanspach problem is, I think the delimiter that's wrong is used elsewhere in the data, which means no global replace
15:48 sanspach I think the data's got to be processed again
15:48 gavin ah.
15:49 gavin in that case I guess we'd better get the recreation process moving
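
A small illustration of why a global replace is unsafe when the wrong delimiter also occurs in the data. The log does not say which character was used, so the "$" below is purely hypothetical, as are the sample field and the helper name.

```python
SUBFIELD_DELIMITER = "\x1f"   # the standard MARC subfield delimiter
WRONG_DELIMITER = "$"         # hypothetical stand-in for the wrong character


def split_subfields(field: str):
    """Split a field on the real delimiter into (code, value) pairs."""
    parts = field.split(SUBFIELD_DELIMITER)[1:]
    return [(part[0], part[1:]) for part in parts if part]


# Two subfields were intended: a title and a price that contains "$".
field = "$aAnnual report$c$12.50"

# A global replace also converts the "$" that is part of the price.
naive = field.replace(WRONG_DELIMITER, SUBFIELD_DELIMITER)
subfields = split_subfields(naive)
```

Here `subfields` ends up with three entries instead of two: the price has been split into an empty "c" subfield and a bogus "1" subfield, which is the kind of damage that makes reprocessing from the source the safer route.
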
15:50 gavin would it help if we rehearsed on a small data set?
15:50 sanspach definitely!
15:51 gavin well if you want to give it a go and send me some stuff I'll try it out
15:51 gavin then we can organise getting the 2GB batch off you
15:52 gavin i have a good amount of bandwidth in my university which I can use for that
15:55 sanspach OK, how should I get you the test files?  I don't think putting them on my windows box and then sending them through email is good ?!
15:56 gavin you were able to put it on a web server before
15:56 gavin if you bzip it, windows will just treat it as a blob and it should be safe
15:56 sanspach I'll work on that
15:56 gavin so whatever works
15:58 sanspach OK, same place: two files--one with 2 records, one with 100
16:01 gavin those seem fine to me
16:04 sanspach want to try 10K ?
16:05 gavin yeah if you like.  whatever size
16:05 gavin but start thinking about bzipping it
16:05 gavin it'll save both of us time and bandwith
16:06 gavin width..
16:06 sanspach gzip?
16:06 gavin yeah, that's fine too, bzip2 just gets greater compression (although it takes more cpu time)
16:07 gavin if we step up to 2gb that'll make a whale of a difference
16:08 sanspach don't seem to find bzip/bzip2 so I'll have to use gzip
16:08 gavin no prob
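
To get a feel for the gzip versus bzip2 trade-off on a given sample before committing to a large transfer, something like the following Python sketch could be used. It relies only on the standard `gzip` and `bz2` modules. The function name and the sample data are made up, and real MARC data will compress differently from this repetitive stand-in.

```python
import bz2
import gzip


def compressed_sizes(data: bytes) -> dict:
    """Return the raw, gzip and bzip2 sizes of the data, in bytes."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "bzip2": len(bz2.compress(data)),
    }


# Made-up, highly repetitive sample; substitute a slice of the real export.
sample = b"245 10 $a Some title / $c Some author.\n" * 5000
sizes = compressed_sizes(sample)
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Both formats are lossless, so either one protects the file from newline translation in transit; the choice only affects size and CPU time.
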
19:23 kados well that's a trick ;-)
19:24 chris whats that then?
22:07 sanspach kados: problems?
22:14 kados sanspach: you still around?
22:14 sanspach yeah
22:14 kados sanspach: What's the deal with the latest conversion?
22:14 kados (looks like the process stopped)
22:14 sanspach looks like the script stopped executing; I got disconnected a couple times, but I thought it would keep going
22:15 sanspach it was only about 1/4 done
22:15 kados hmmm, guess not ...
22:15 kados I can start it on my end -- sound good?
22:15 sanspach I removed the partial files
22:15 sanspach I had it running on my machine and it has finished
22:15 kados sweet
22:15 sanspach I'm bzip2'ing it now
22:15 kados great
22:17 sanspach as soon as it is done I'll start it transferring, but then I'm going to bed
22:17 kados that's cool
22:18 kados shoot me an email with the size and I'll start indexing when it's finished uploading
22:18 sanspach will do
22:32 Genji kados: tried my search options sidebar?
02:31 paul hi hdl
02:39 hdl hi paul
