Time Nick Message
15:01 rach it's halloween?
15:02 owen No, I checked.
15:02 owen I haven't checked for the possibility of a mummy's curse, though.
15:10 kados hehe
15:18 rach :-)
15:24 rach hi
15:24 sanspach hello
15:28 rach you guys have had a busy night :-)
15:28 rach well - day for you :-)
15:28 sanspach yeah; still nothing solved, though :(
15:30 rach you worked out the ^m tho
15:30 rach they are windows line breaks
15:30 rach end of line markers
15:30 sanspach yeah, but still not certain which files are affected by it
15:31 sanspach and not exactly clear why some of the files that don't have them still don't work
15:31 rach and if you change one record to get rid of them that doesn't help - so make a 1 record clean file?
15:32 rach but I see gavin tried that
15:33 rach so gavin didn't manage to get it in?
15:34 gavin hi
15:34 rach hi
15:34 gavin the stuff was inserting for me at the end
15:36 gavin i took one of sanspach's files which he emailed me (small.sample2.mrc) and substituted one delimiter for another after which it worked
15:36 rach ah cool
15:36 gavin then a newer file failed due to having some weird win32 linebreaks stuck in the middle
15:36 gavin no idea why they're there
15:36 rach yep you'll have to take them out too, in the same sort of way
15:36 rach the magic of windows :-)
15:37 gavin I haven't seen kados big file so I don't know what is wrong with what he got
15:37 sanspach it is probably messed up in exactly the same way
15:37 sanspach it seems that MARC::Record doesn't strip the trailing ^M from the leader field when it re-writes it
15:38 sanspach or maybe I messed it up; I'll have to check
15:38 gavin yes if i remove the ^M out that one works too
15:38 gavin the ^Ms are all over the middle of records
15:38 gavin it almost looks like an editor wrapped them or something
15:39 sanspach they're *all* separate lines to begin with ("flat" format)
15:39 sanspach but when MARC::Record writes them out, I figured all the formatting would be fixed
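One way to answer the question of which files are affected by the ^M characters is to scan a file record by record. The following is a minimal sketch in Python, not the Perl MARC::Record tooling used in this conversation. It assumes the files are binary MARC (ISO 2709), where every record ends with the byte 0x1D; the function name and the sample bytes are hypothetical and only illustrate the idea.

```python
# Assumption: binary MARC (ISO 2709), where 0x1D terminates each record.
RECORD_TERMINATOR = b"\x1d"


def find_records_with_line_breaks(data: bytes) -> list[int]:
    """Return 0-based indexes of records that contain a CR (^M) or LF byte."""
    # Drop the empty chunk that follows the final record terminator.
    records = [chunk for chunk in data.split(RECORD_TERMINATOR) if chunk]
    return [
        index
        for index, record in enumerate(records)
        if b"\r" in record or b"\n" in record
    ]


# Hypothetical sample: two made-up records, the second with a CR/LF inside it.
sample = b"first record\x1d" + b"second\r\nrecord\x1d"
print(find_records_with_line_breaks(sample))  # [1]
```

Removing the offending bytes shortens a record, so the record length stored in the leader and the offsets in the directory may need to be regenerated afterwards, depending on where in the pipeline the line breaks crept in.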
15:39 gavin not any of the ones i've seen
15:39 rach you wish :-)
15:40 gavin do you mean marc format should have linebreaks? none that I've seen have them
15:40 sanspach no, no just for me
15:40 gavin but i know little or nothing about marc
15:41 sanspach I get the data out of our system db (Oracle, but same for mysql) as separate lines
15:41 gavin i see, and you patch them up together?
15:41 sanspach then I put everything back together and have MARC::Record create true marc format out of them
15:42 gavin Oracle. that's an expensive library system!
15:42 sanspach not for a univ. that has a site license already (!)
15:43 sanspach but yes, actually, Sirsi's Unicorn product isn't the cheapest out there
15:43 gavin universities are indeed wonderful places
15:43 rach ah well, at least it sounds like you know how to work on the data now
15:44 gavin sanspach: what do you think we need to do with kados data?
15:45 sanspach rm * and start over
15:45 gavin not fixable?
15:45 sanspach I've lost track of what the problems might be.
15:45 sanspach if it is just ^M we could strip those
15:46 sanspach if it is subfield delimiters too, we could do that
15:46 gavin as far as I can tell it boils down to ^M and possibly delimiter substitution which would be very quick
15:46 gavin rather than go through the pain of downloading 2GB again
15:48 sanspach problem is, I think the delimiter that's wrong is used elsewhere in the data, which means no global replace
15:48 sanspach I think the data's got to be processed again
15:48 gavin ah.
15:49 gavin in that case I guess we'd better get the recreation process moving
15:50 gavin would it help if we rehearsed on a small data set?
15:50 sanspach definitely!
15:51 gavin well if you want to give it a go and send me some stuff I'll try it out
15:51 gavin then we can organise getting the 2GB batch off you
15:52 gavin i have a good amount of bandwidth in my university which I can use for that
15:55 sanspach OK, how should I get you the test files? I don't think putting them on my windows box and then
15:55 sanspach sending them through email is good ?!
15:56 gavin you were able to put it on a web server before
15:56 gavin if you bzip it, windows will just treat it as a blob and it should be safe
15:56 sanspach I'll work on that
15:56 gavin so whatever works
15:58 sanspach OK, same place: two files--one with 2 records, one with 100
16:01 gavin those seem fine to me
16:04 sanspach want to try 10K ?
16:05 gavin yeah if you like. whatever size
16:05 gavin but start thinking about bzipping it
16:05 gavin it'll save both of us time and bandwith
16:06 gavin width..
16:06 sanspach gzip?
16:06 gavin yeah, that's fine either, bzip2 just gets a greater compression (although it takes more cpu time)
16:07 gavin if we step up to 2gb that'll make a whale of a difference
16:08 sanspach don't seem to find bzip/bzip2 so I'll have to use gzip
16:08 gavin no prob
19:23 kados well that's a trick ;-)
19:24 chris whats that then?
22:07 sanspach kados: problems?
22:14 kados sanspach: you still around?
22:14 sanspach yeah
22:14 kados sanspach: What's the deal with the latest conversion?
22:14 kados (looks like the process stopped)
22:14 sanspach looks like the script stopped executing; I got disconnected a couple times, but I thought it would keep going
22:15 sanspach it was only about 1/4 done
22:15 kados hmmm, guess not ...
22:15 kados I can start it on my end -- sound good?
22:15 sanspach I removed the partial files
22:15 sanspach I had it running on my machine and it has finished
22:15 kados sweet
22:15 sanspach I'm bzip2'ing it now
22:15 kados great
22:17 sanspach as soon as it is done I'll start it transferring, but then I'm going to bed
22:17 kados that's cool
22:18 kados shoot me an email with the size and I'll start indexing when it's finished uploading
22:18 sanspach will do
22:32 Genji kados: tried my search options sidebar?
02:31 paul hi hdl
02:39 hdl hi paul
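The gzip versus bzip2 trade-off mentioned at 16:06 can be checked on a small sample before committing to a 2GB transfer. The following is a minimal sketch using Python's standard gzip and bz2 modules; the function name and the sample text are hypothetical, and the actual ratio depends entirely on the data being compressed.

```python
import bz2
import gzip


def compare_compression(data: bytes) -> dict[str, int]:
    """Return the size in bytes of the raw, gzip and bzip2 forms of data."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "bz2": len(bz2.compress(data)),
    }


# Hypothetical, highly repetitive sample; real MARC data will compress differently.
sample = b"245 10 $a Example title $c Example author\n" * 1000
for name, size in compare_compression(sample).items():
    print(f"{name}: {size} bytes")
```

Running this on a representative slice of the export gives a rough idea of how much upload time each format would save.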