Time  Nick     Message
02:39 hdl      hi paul
02:31 paul     hi hdl
22:32 Genji    kados: tried my search options sidebar?
22:18 sanspach will do
22:18 kados    shoot me an email with the size and I'll start indexing when it's finished uploading
22:17 kados    that's cool
22:17 sanspach as soon as it is done I'll start it transferring, but then I'm going to bed
22:15 kados    great
22:15 sanspach I'm bzip2'ing it now
22:15 kados    sweet
22:15 sanspach I had it running on my machine and it has finished
22:15 sanspach I removed the partial files
22:15 kados    I can start it on my end -- sound good?
22:15 kados    hmmm, guess not ...
22:15 sanspach it was only about 1/4 done
22:14 sanspach looks like the script stopped executing; I got disconnected a couple times, but I thought it would keep going
22:14 kados    (looks like the process stopped)
22:14 kados    sanspach: What's the deal with the latest conversion?
22:14 sanspach yeah
22:14 kados    sanspach: you still around?
22:07 sanspach kados: problems?
19:24 chris    what's that then?
19:23 kados    well that's a trick ;-)
16:08 gavin    no prob
16:08 sanspach don't seem to find bzip/bzip2 so I'll have to use gzip
16:07 gavin    if we step up to 2gb that'll make a whale of a difference
16:06 gavin    yeah, that's fine either way, bzip2 just gets a greater compression (although it takes more cpu time)
16:06 sanspach gzip?
16:06 gavin    width..
16:05 gavin    it'll save both of us time and bandwith
16:05 gavin    but start thinking about bzipping it
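A rough way to check the gzip versus bzip2 trade-off gavin describes, using Python's standard `gzip` and `bz2` modules. The function name `compare_compression` and the sample data are made up for illustration; real MARC files may compress quite differently, so treat this as a sketch only.

```python
import bz2
import gzip


def compare_compression(data: bytes) -> dict:
    """Return the raw size and the gzip / bzip2 compressed sizes in bytes."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "bzip2": len(bz2.compress(data)),
    }


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; not taken from the real export.
    sample = b"245 10 $a Example title\n" * 10000
    for name, size in compare_compression(sample).items():
        print(f"{name:>5}: {size} bytes")
```

Running it on a slice of the actual export would show whether the extra CPU time for bzip2 is worth it for this data.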
16:05 gavin    yeah if you like.  whatever size
16:04 sanspach want to try 10K ?
16:01 gavin    those seem fine to me
15:58 sanspach OK, same place: two files--one with 2 records, one with 100
15:56 gavin    so whatever works
15:56 sanspach I'll work on that
15:56 gavin    if you bzip it, windows will just treat it as a blob and it should be safe
15:56 gavin    you were able to put it on a web server before
15:55 sanspach sending them through email is good ?!
15:55 sanspach OK, how should I get you the test files?  I don't think putting them on my windows box and then
15:52 gavin    i have a good amount of bandwidth in my university which I can use for that
15:51 gavin    then we can organise getting the 2GB batch off you
15:51 gavin    well if you want to give it a go and send me some stuff I'll try it out
15:50 sanspach definitely!
15:50 gavin    would it help if we rehearsed on a small data set?
15:49 gavin    in that case I guess we'd better get the recreation process moving
15:48 gavin    ah.
15:48 sanspach I think the data's got to be processed again
15:48 sanspach problem is, I think the delimiter that's wrong is used elsewhere in the data, which means no global replace
15:46 gavin    rather than go through the pain of downloading 2GB again
15:46 gavin    as far as I can tell it boils down to ^M and possibly delimiter substitution which would be very quick
15:46 sanspach if it is subfield delimiters too, we could do that
15:45 sanspach if it is just ^M we could strip those
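A minimal sketch of the "just strip the ^M" option, in Python. The helper name `strip_carriage_returns` is hypothetical. It also reports how many ^M bytes it found, which might help tell which files are affected. Two caveats: a binary MARC record stores its own length and field offsets, so removing bytes from inside a record may leave those numbers out of step with the data, and this does nothing about the delimiter problem, which sanspach notes cannot be fixed with a global replace.

```python
from typing import Tuple

CR = b"\r"  # the byte shown as ^M in most editors


def strip_carriage_returns(data: bytes) -> Tuple[bytes, int]:
    """Remove every carriage return and report how many were found."""
    count = data.count(CR)
    return data.replace(CR, b""), count


if __name__ == "__main__":
    # Hypothetical sample bytes; not taken from the real files.
    cleaned, found = strip_carriage_returns(b"00026nam\r  22\r\n")
    print(found, cleaned)
```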
15:45 sanspach I've lost track of what the problems might be.
15:45 gavin    not fixable?
15:45 sanspach rm *    and start over
15:44 gavin    sanspach: what do you think we need to do with kados' data?
15:43 rach     ah well, at least it sounds like you know how to work on the data now
15:43 gavin    universities are indeed wonderful places
15:43 sanspach but yes, actually, Sirsi's Unicorn product isn't the cheapest out there
15:42 sanspach not for a univ. that has a site license already (!)
15:42 gavin    Oracle. that's an expensive library system!
15:41 sanspach then I put everything back together and have MARC::Record create true marc format out of them
15:41 gavin    i see, and you patch them up together?
15:41 sanspach I get the data out of our system db (Oracle, but same for mysql) as separate lines
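To illustrate the flat-lines-to-MARC step sanspach describes, here is a small pure-Python sketch. It is not MARC::Record (which is Perl) and not sanspach's script; it only shows the general shape of an ISO 2709 record: a 24-character leader, a directory of 12-character entries, then the field data. The function name `build_record`, the fixed leader values and the sample fields are assumptions. It strips trailing ^M and newline characters from each flat line before assembling, so the lengths written into the leader and directory match the cleaned data.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"


def build_record(fields):
    """Assemble one ISO 2709 record from (tag, data) pairs.

    `fields` is a list of (3-character tag, bytes) tuples, one per flat line.
    """
    directory = b""
    body = b""
    for tag, data in fields:
        # Drop trailing CR/LF left over from the flat export.
        data = data.rstrip(b"\r\n") + FIELD_TERMINATOR
        # Directory entry: tag (3), field length (4), starting offset (5).
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    directory += FIELD_TERMINATOR
    base_address = 24 + len(directory)
    total_length = base_address + len(body) + len(RECORD_TERMINATOR)
    # Leader: record length, assumed status/type codes, base address, entry map.
    leader = f"{total_length:05d}nam  22{base_address:05d}   4500".encode("ascii")
    return leader + directory + body + RECORD_TERMINATOR


if __name__ == "__main__":
    # Hypothetical flat lines with Windows line endings still attached.
    record = build_record([("001", b"12345\r"), ("245", b"10\x1faTitle\r\n")])
    print(len(record), record[:24])
```

A ^M that sits in the middle of a line rather than at the end would not be caught by `rstrip`, which may be relevant given that the stray characters were reported in the middle of records.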
15:40 gavin    but i know little or nothing about marc
15:40 sanspach no, no just for me
15:40 gavin    do you mean marc format should have linebreaks? none that I've seen have them
15:39 rach     you wish :-)
15:39 gavin    not any of the ones i've seen
15:39 sanspach but when MARC::Record writes them out, I figured all the formatting would be fixed
15:39 sanspach they're *all* separate lines to begin with ("flat" format)
15:38 gavin    it almost looks like an editor wrapped them or something
15:38 gavin    the ^Ms are all over the middle of records
15:38 gavin    yes if i remove the ^M that one works too
15:38 sanspach or maybe I messed it up; I'll have to check
15:37 sanspach it seems that MARC::Record doesn't strip the trailing ^M from the leader field when it re-writes it
15:37 sanspach it is probably messed up in exactly the same way
15:37 gavin    I haven't seen kados' big file so I don't know what is wrong with what he got
15:36 rach     the magic of windows :-)
15:36 rach     yep you'll have to take them out too, in the same sort of way
15:36 gavin    no idea why they're there
15:36 gavin    then a newer file failed due to having some weird win32 linebreaks stuck in the middle
15:36 rach     ah cool
15:36 gavin    i took one of sanspach's files which he emailed me (small.sample2.mrc) and substituted one delimiter for another after which it worked
15:34 gavin    the stuff was inserting for me at the end
15:34 rach     hi
15:34 gavin    hi
15:33 rach     so gavin didn't manage to get it in?
15:32 rach     but I see gavin tried that
15:31 rach     and if you change one record to get rid of them that doesn't help - so make a 1 record clean file?
15:31 sanspach and not exactly clear why some of the files that don't have them still don't work
15:30 sanspach yeah, but still not certain which files are affected by it
15:30 rach     end of line markers
15:30 rach     they are windows line breaks
15:30 rach     you worked out the ^m tho
15:28 sanspach yeah; still nothing solved, though :(
15:28 rach     well - day for you :-)
15:28 rach     you guys have had a busy night :-)
15:24 sanspach hello
15:24 rach     hi
15:18 rach     :-)
15:10 kados    hehe
15:02 owen     I haven't checked for the possibility of a mummy's curse, though.
15:02 owen     No, I checked.
15:01 rach     it's halloween?