Time |
S |
Nick |
Message |
12:01 |
|
sanspach |
first file's done; if that breaks it, then we'll know it is the whole thing |
13:08 |
|
kados |
sanspach: about 15 minutes for the first index to finish up |
14:32 |
|
sanspach |
kados: how goes it |
14:32 |
|
sanspach |
? |
14:58 |
|
kados |
sanspach: didn't work :-( |
14:58 |
|
kados |
looks the same as before |
15:07 |
|
sanspach |
argh! |
15:11 |
|
chris |
fun with marc records I see? |
15:14 |
|
sanspach |
kados: did *any* of them index correctly? the last file is actually the smallest if you want to try a tiny subset |
15:15 |
|
kados |
k ... I'll give that a shot |
15:15 |
|
kados |
hey chris |
15:15 |
|
kados |
http://opac.liblime.com/cgi-bi[…]rogramming%20perl |
15:15 |
|
kados |
check this out |
15:15 |
|
kados |
are you familiar with opensearch? |
15:15 |
|
kados |
A9's opensearch? |
15:16 |
|
kados |
http://opensearch.a9.com/ |
15:16 |
|
kados |
I'm close to having a valid opensearch for Koha |
15:17 |
|
kados |
(need to get the bibid returned) |
15:17 |
|
chris |
cool |
15:17 |
|
kados |
the actual search method will need to be discussed |
15:17 |
|
kados |
like we can decide what happens depending on the data entered |
15:17 |
|
kados |
(if an ISBN comes in, do an ISBN search, etc.) |
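A minimal sketch of the dispatch kados describes, choosing a search type from the shape of the input. The function name `classify_query` and the two categories are assumptions for illustration; the log does not say how the Koha OpenSearch script actually decides, and no checksum validation is attempted here.

```python
import re


def classify_query(query: str) -> str:
    """Guess the search type from the raw query string (hypothetical rule set)."""
    # Drop spaces and hyphens so "0-596-00027-8" and "0596000278" look alike.
    compact = re.sub(r"[\s-]", "", query)
    # ISBN-10: nine digits plus a digit or X; ISBN-13: thirteen digits.
    if re.fullmatch(r"\d{9}[\dXx]", compact) or re.fullmatch(r"\d{13}", compact):
        return "isbn"
    # Anything else falls back to a general keyword search.
    return "keyword"
```

Further categories (author, title, call number) could be added as extra branches once the search method has been agreed on.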
15:18 |
|
chris |
yep |
15:18 |
|
kados |
do you happen to know if MARC stores bibids or biblionumbers somewhere? |
15:18 |
|
kados |
(and if so, where?) |
15:20 |
|
chris |
umm |
15:20 |
|
chris |
you mean marc from koha? |
15:20 |
|
kados |
ya |
15:20 |
|
chris |
probably if you tell it to |
15:20 |
|
kados |
yea ... we should do that |
15:20 |
|
kados |
so we can link directly to detail screens and such |
15:21 |
|
chris |
according to cgi-bin/koha/admin/koha2marclinks.pl |
15:21 |
|
kados |
it would also help out with integrating our Z-server searching with current catalogsearch routine |
15:21 |
|
chris |
on a default install |
15:21 |
|
chris |
090c |
15:22 |
|
chris |
i dunno if it actually does or not |
15:22 |
|
chris |
ill look in the db |
15:23 |
|
chris |
just used the acquisition routines to import 40k items yesterday, so ill see what if it did |
15:23 |
|
chris |
-if |
15:23 |
|
kados |
doesn't seem to be working on NPL |
15:23 |
|
chris |
yeah but u dont use koha for acquisitions |
15:23 |
|
chris |
you import from marc |
15:24 |
|
chris |
so ill check one where the koha routines where used |
15:24 |
|
chris |
were even |
15:24 |
|
chris |
(too early yet and not enough coffee) |
15:24 |
|
kados |
:-) |
15:24 |
|
kados |
I'm wondering whether the updatedatabase and the koha routines use the same baseline subs |
15:24 |
|
kados |
they ought to |
15:25 |
|
chris |
if you edited each of your biblios in koha that might make them |
15:25 |
|
chris |
so where do you reckon id find them in the db? |
15:25 |
|
chris |
marc_biblio ? |
15:26 |
|
kados |
probably marc_word |
15:26 |
|
kados |
where tagsubfield = 090c |
15:27 |
|
chris |
yep |
15:27 |
|
chris |
+-------+-------------+----------+---------------+-------+-----------+ |
15:27 |
|
chris |
| bibid | tagsubfield | tagorder | subfieldorder | word | sndx_word | |
15:27 |
|
chris |
+-------+-------------+----------+---------------+-------+-----------+ |
15:27 |
|
chris |
| 1 | 090c | 3 | 1 | 1 | | |
15:27 |
|
chris |
| 10 | 090c | 2 | 1 | 10 | | |
15:27 |
|
chris |
| 100 | 090c | 1 | 1 | 100 | | |
15:27 |
|
chris |
| 1000 | 090c | 2 | 1 | 1000 | | |
15:27 |
|
chris |
| 10000 | 090c | 2 | 1 | 10000 | | |
15:27 |
|
chris |
| 10001 | 090c | 1 | 1 | 10001 | | |
15:27 |
|
chris |
| 10002 | 090c | 3 | 1 | 10002 | | |
15:27 |
|
chris |
| 10003 | 090c | 2 | 1 | 10003 | | |
15:27 |
|
chris |
| 10004 | 090c | 1 | 1 | 10004 | | |
15:27 |
|
chris |
| 10005 | 090c | 1 | 1 | 10005 | | |
15:27 |
|
chris |
+-------+-------------+----------+---------------+-------+-----------+ |
15:28 |
|
chris |
looks like it does |
15:28 |
|
chris |
thats using C4::Biblio and the newbiblio, newbiblioitem, newitem routines to import from my xml |
15:28 |
|
kados |
sweet ... so they're there |
15:28 |
|
chris |
so i reckon in npl's case |
15:28 |
|
chris |
bulkmarcimport needs to be tweaked |
15:28 |
|
chris |
to store that |
15:29 |
|
chris |
if they arent there for you |
15:29 |
|
kados |
yea |
15:30 |
|
kados |
they are in there |
15:30 |
|
chris |
yay |
15:30 |
|
kados |
but I can't get them to display for some reason |
15:30 |
|
kados |
ahh ... I know |
15:30 |
|
kados |
because I didn't index them |
15:31 |
|
kados |
with zebra |
15:31 |
|
chris |
that'd do it |
15:31 |
|
chris |
need to edit your .abs files or something |
15:31 |
|
kados |
yep |
15:31 |
|
chris |
zebra really does look promising |
15:31 |
|
kados |
yep |
15:32 |
|
kados |
i started with the A9 search interface for Zebra because I figured it'd be real simple |
15:32 |
|
kados |
and it was |
15:32 |
|
kados |
I'll commit it tonight (unless you want to see it now) |
15:33 |
|
chris |
nope thats fine, it will just sidetrack me from work :) |
15:35 |
|
kados |
:-) |
15:39 |
|
sanspach |
kados: I'll be back later if there's anything more I can help with |
15:40 |
|
chris |
heya sanspach |
15:40 |
|
chris |
i dont think weve met yet |
15:40 |
|
kados |
sanspach: chris wrote the original Koha |
15:40 |
|
sanspach |
great to meet you |
15:40 |
|
sanspach |
I've been tying up bandwidth throughout the central US sending kados huge MARC files for testing |
15:41 |
|
chris |
hehe nice |
15:41 |
|
sanspach |
indexing with :) |
15:41 |
|
chris |
cant think of a better use for the internet :) |
15:41 |
|
kados |
hehe |
15:41 |
|
kados |
we've probably transferred about 20 gig thus far ;-) |
15:41 |
|
chris |
lucky ur not on my home isp's plan |
15:42 |
|
chris |
if i go over my monthly 5 gig cap, its .35 cents a meg |
15:42 |
|
kados |
yikes! |
15:42 |
|
chris |
i havent ever yet |
15:42 |
|
kados |
I bet not |
15:43 |
|
chris |
most of my traffic is just ssh to the katipo servers |
15:43 |
|
kados |
do they monitor it so you can at least check your usage? |
15:43 |
|
chris |
yep |
15:43 |
|
chris |
they have a nice webpage |
15:43 |
|
chris |
broken down by hour |
15:43 |
|
kados |
sweet |
15:43 |
|
chris |
and they warn you when you get to about 80% |
15:44 |
|
chris |
its a racket tho, as there is a duopoly here on broadband in reality |
15:44 |
|
kados |
yea ... by air or under the sea right? |
15:44 |
|
chris |
two big telco's .. so they pretty much can charge whatever they want |
15:45 |
|
kados |
hum ... I think maybe Koha's export routine isn't exporting 090c |
15:45 |
|
chris |
ahh |
15:45 |
|
kados |
which would make sense |
15:45 |
|
chris |
yep |
15:45 |
|
kados |
I couldn't get that to run right from the command line either |
16:07 |
|
kados |
006909000090008210000140009124500340010594200240013995200330016377325ACLS1999032 |
16:07 |
|
kados |
6000000.0950324s19uu |
16:31 |
|
owen |
kados, you still there? |
16:31 |
|
kados |
owen: yea |
16:31 |
|
kados |
owen: knee high in MARC |
16:31 |
|
kados |
owen: what's up? |
16:31 |
|
owen |
Do you know anything about this new file: /search.marc/dictionary.pl |
16:32 |
|
kados |
hmmm, I asked paul about it |
16:32 |
|
kados |
but I don't remember the answer |
16:32 |
|
kados |
should be in the logs for this week |
16:32 |
|
kados |
(I think it has something to do with authorized values) |
16:33 |
|
owen |
Oh, I see now that it's linked to from search.pl |
16:36 |
|
owen |
Seems to be a way to do a 'pre-search' for existing terms before building a search. |
16:40 |
|
kados |
that's a sweet features |
16:40 |
|
kados |
feature even |
16:44 |
|
rach |
hello |
16:44 |
|
kados |
hey rach |
16:45 |
|
rach |
how are things with you |
16:46 |
|
kados |
oh pretty good |
16:46 |
|
kados |
I'm very close to getting a working opensearch going for Koha |
16:47 |
|
kados |
which will be the first step in getting our catalogsearch subroutine to use Z39.50 for searches instead of marc_word |
16:47 |
|
rach |
ah cool |
16:47 |
|
kados |
plus it adds a feature that won't appear in proprietary systems for months or years ;-) |
16:48 |
|
chris |
being able to get your own data out? |
16:48 |
|
chris |
:-) |
16:49 |
|
kados |
hehe |
16:54 |
|
rach |
nice |
17:30 |
|
chris |
hi mason |
17:30 |
|
mason |
hi chris |
17:31 |
|
chris |
nice and crisp this morning eh |
17:33 |
|
mason |
its freezing up here in northland, im wearing skiing trousers, and fingerless gloves |
17:34 |
|
chris |
fingerless gloves are a good idea |
17:39 |
|
kados |
ooh interesting ... |
17:39 |
|
kados |
18:38:34-08/06 ../../index/zebrasrv(19103) [warn] usmarc.abs:36: Couldn't find att 'Item-number' in attset |
17:40 |
|
kados |
I added Item-number ;-) |
17:40 |
|
kados |
hmmm |
17:46 |
|
kados |
hmmm, I'm getting 090 from my web-based z39.50 client |
17:46 |
|
kados |
so there must be some problem with the way I'm doing Net::Z3950 |
17:51 |
|
chris |
kados: mason is the person who helped me out a lot with getting the data out of the dynix db |
17:51 |
|
kados |
sweet ... hey there mason |
17:51 |
|
chris |
mason: kados is joshua, from Ohio, who is the release manager for version 2.4 of koha .. and one of the principles of http://liblime.com |
17:52 |
|
kados |
recent author of a beta opensearch for Koha |
17:52 |
|
kados |
http://opac.liblime.com/cgi-bi[…]rogramming%20perl |
17:52 |
|
chris |
is that principle, or principal? |
17:52 |
|
kados |
chris: got it! |
17:52 |
|
kados |
principle |
17:52 |
|
chris |
ta |
17:52 |
|
kados |
no ... I'm wrong |
17:52 |
|
kados |
I always go the other way ;-) |
17:53 |
|
kados |
chris: got bibids showing up ;-) |
17:53 |
|
chris |
sweet, that works a treat |
17:53 |
|
kados |
yep |
17:53 |
|
kados |
couple of things to fix still |
17:53 |
|
chris |
mason, we are currently investigating using zebra http://www.indexdata.dk/zebra/ as a way to search koha faster |
17:54 |
|
kados |
mason quit ;-) |
17:54 |
|
chris |
doh |
17:54 |
|
kados |
heh |
17:54 |
|
chris |
ping timeout, prolly someone rang him and bumped him off his dialup |
17:54 |
|
chris |
ahh i remember those days |
17:55 |
|
chris |
i had to work to school in barefeet for miles |
17:55 |
|
kados |
hehe |
17:55 |
|
chris |
and crack stones together to make electricity for the computer |
17:55 |
|
kados |
uphill both ways in the snow all year long ;-) |
17:55 |
|
kados |
hehe |
17:55 |
|
kados |
man that was tough |
17:55 |
|
chris |
heh |
17:57 |
|
kados |
pages are working too: |
17:57 |
|
kados |
http://opac.liblime.com/cgi-bi[…]q=new&startPage=2 |
18:11 |
|
mason |
back again |
18:12 |
|
kados |
ok ... I'm experimenting with descriptions using 500a and 520a |
18:12 |
|
kados |
notes fields |
18:13 |
|
kados |
I'm not sure the best approach |
18:14 |
|
michael |
quote : "i had to work to school in barefeet for miles and crack stones together to make electricity for the computer"... ;-) very funny |
18:14 |
|
kados |
:-) |
18:15 |
|
chris |
hmm i typoed again, that should have been walk to school :) |
18:42 |
|
kados |
ok everyone |
18:42 |
|
kados |
chris, mason |
18:42 |
|
kados |
give this a shot: |
18:42 |
|
kados |
http://a9.com/-/search/moreCol[…]0Public%20Library |
18:42 |
|
kados |
rach |
18:43 |
|
kados |
in the 'search columns' section |
18:43 |
|
kados |
type Nelsonville Public Library |
18:44 |
|
kados |
error messages aren't handled well yet ... but the searching works great |
18:45 |
|
kados |
(general keyword searching) |
18:45 |
|
kados |
I'm gonna work on getting it to display correct error codes |
18:45 |
|
kados |
then I'll commit it and write up a description |
18:48 |
|
rach |
sorry phone then heading to lunch will look when I get back |
19:12 |
|
kados |
well ... not working perfectly yet |
19:12 |
|
kados |
page 2 doesn't display for instance ;-) |
20:53 |
|
kados |
chris: got holdings displaying in A9 |
20:54 |
|
chris |
show off :) |
20:54 |
|
kados |
http://a9.com/cryptonomicon |
20:54 |
|
kados |
:-) |
20:54 |
|
kados |
for some reason, even though the image links are there, they aren't showing up |
20:55 |
|
kados |
I just noticed that there's three of everything in the db ;-) |
20:55 |
|
chris |
:) |
20:55 |
|
kados |
must have something to do with my mad indexing skills ;-) |
21:03 |
|
kados |
hum ... also, <description> isn't taking my <a hrefs> |
21:04 |
|
kados |
oooh ... I think i see the problem |
21:05 |
|
kados |
chris got a sec? |
21:05 |
|
kados |
http://search.athenscounty.lib[…]opensearch?q=cats |
21:05 |
|
kados |
https://catalog.spl.org/rss?te[…]s&type=opensearch |
21:05 |
|
chris |
yep |
21:05 |
|
kados |
if you do, compare those two and pay attention to the <description><img> elements |
21:05 |
|
kados |
in mozilla |
21:06 |
|
kados |
for some reason, in the spl one <img> doesn't show up as a major element |
21:06 |
|
kados |
(an XML element) |
21:06 |
|
kados |
do you know why that is? |
21:06 |
|
kados |
(same thing happens with <a) |
21:07 |
|
chris |
hmm no idea |
21:09 |
|
Genji |
hiya all |
21:09 |
|
kados |
hey there Genji |
21:10 |
|
kados |
Genji: have you ever used A9's search, particularly opensearch? |
21:10 |
|
Genji |
A9? |
21:10 |
|
kados |
http://opensearch.a9.com/ |
21:10 |
|
kados |
check that out ... |
21:12 |
|
kados |
when you get a feel for what it is, do a search on Nelsonville Public Library in the 'columns' section (if you're interested) |
21:17 |
|
kados |
chris: here's an interesting article on escaping XML: |
21:17 |
|
kados |
http://www.xml.com/pub/a/2003/08/20/embedded.html |
21:17 |
|
kados |
maybe that's what's going on? |
21:18 |
|
chris |
sounds logical |
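A small sketch of the escaping issue kados suspects. If markup inside `<description>` is escaped, an XML parser sees a single text node; if it is left raw, `<img>` and `<a>` become child elements. The helper name `description_element` and the sample fragment are made up for illustration, and which of the two feeds above escapes its markup is an assumption, not something the log confirms.

```python
from html import escape
from xml.etree import ElementTree as ET


def description_element(html_fragment: str) -> str:
    """Wrap an HTML fragment in <description>, escaping it so it stays plain text."""
    return "<description>" + escape(html_fragment, quote=False) + "</description>"


raw = '<img src="cover.jpg" /> <a href="/detail?bib=1">Details</a>'

# Escaped: the parser sees one text node and no child elements.
escaped_doc = ET.fromstring(description_element(raw))

# Unescaped: <img> and <a> are parsed as real XML child elements.
unescaped_doc = ET.fromstring("<description>" + raw + "</description>")
```

A browser's XML tree view would show the second form with `<img>` and `<a>` as expandable elements, which matches the difference kados noticed in Mozilla.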
21:30 |
|
kados |
hehe ... |
21:30 |
|
kados |
chris: http://a9.com/neal%20stephenson |
21:31 |
|
chris |
what am i looking at? |
21:33 |
|
kados |
ahh ... you'll need to setup your search prefs first |
21:33 |
|
kados |
to search NPLKoha |
21:33 |
|
kados |
(just type NPLKoha in the search) |
21:33 |
|
chris |
right |
21:34 |
|
kados |
shoot |
21:35 |
|
kados |
ok ... back up |
21:35 |
|
kados |
$holdings=."<br /><b>($copies) Copies at:</b> "; |
21:35 |
|
kados |
that wrong syntax? |
21:36 |
|
kados |
shoot ... none of the holdings are showing up |
21:38 |
|
kados |
there we go |
21:38 |
|
kados |
count, list of places and link to reserve all showing up |
22:26 |
|
kados |
there are only two other libraries listed: Seattle Public and the British Library |
22:27 |
|
kados |
neither of them list holdings, have links to 'read it now', or link directly to reserves or author searches |
03:15 |
|
osmoze |
hello |
06:20 |
|
kados |
paul_away: around? |
06:24 |
|
gavin |
kados: any chance I could grab sanspach's files off you? |
06:25 |
|
gavin |
you seem to have the better bandwidth |
06:27 |
|
kados |
gavin: it's not valid marc ... alas ;-) |
06:27 |
|
kados |
gavin: I've not been able to do anything useful with it yet |
06:27 |
|
gavin |
ah :( |
06:28 |
|
gavin |
would you be able to try something for me some time |
06:29 |
|
gavin |
specifically, to run that fulltext search again and then run it in the monitor and see what the query time is |
06:29 |
|
gavin |
only thing is, we need to avoid it being cached. |
06:32 |
|
gavin |
which is a bit of a hassle |
06:39 |
|
Sylvain |
hi all |
06:42 |
|
kados |
gavin: it's not that hard ... I can just flush the cache once in a while ;-) |
06:43 |
|
kados |
gavin: I don't have time for that atm ... maybe later today? (I've got meetings tomorrow so Sat would work also) |
06:43 |
|
kados |
gavin: wanna see something cool? |
06:44 |
|
kados |
gavin: http://opensearch.a9.com/ |
06:44 |
|
gavin |
kados: you could mail me (or the list) with results any time you like |
06:44 |
|
kados |
gavin: sounds good |
06:44 |
|
kados |
gavin: i just wrote an opensearch interface for Koha |
06:44 |
|
gavin |
kados: i'll have a look |
06:44 |
|
kados |
gavin: using a Zebra index engine |
06:45 |
|
kados |
gavin: (search for Nelsonville Public Library under 'Columns') |
06:45 |
|
kados |
it's super fast |
06:45 |
|
gavin |
I saw you guys talking about it before |
06:45 |
|
gavin |
looks very impressive |
06:46 |
|
kados |
the same engine is available here (if you don't want the hassle of setting up A9) |
06:46 |
|
kados |
http://liblime.com/zap/advanced.html |
06:46 |
|
kados |
(the LibLime check-box isn't working just yet ... try the Nelsonville one) |
06:47 |
|
gavin |
seems very quick alright |
06:47 |
|
kados |
adds boolean too (and some other stuff i haven't configured yet) |
06:47 |
|
gavin |
cool! |
06:47 |
|
kados |
relevance ranking, stemming |
06:48 |
|
gavin |
how big a deal is it to set up |
06:48 |
|
gavin |
? |
06:48 |
|
kados |
pretty easy |
06:48 |
|
kados |
install zebra |
06:48 |
|
kados |
export your MARC records |
06:48 |
|
kados |
run the indexer |
06:48 |
|
kados |
start the server |
06:48 |
|
gavin |
i mean, can a library just switch it on as part of koha or is that a big deal? |
06:48 |
|
kados |
well ... there's a lot of integration work to be done |
06:49 |
|
gavin |
of course, but once it's part of a release |
06:49 |
|
kados |
before Zebra could be used officially in Koha |
06:49 |
|
kados |
yea ... we could set it up that way |
06:49 |
|
gavin |
interesting |
06:49 |
|
gavin |
i'm convinced i was doing something wrong with that fulltext search so I'd like to chase it down |
06:49 |
|
kados |
sure ... I understand ;-) |
06:50 |
|
gavin |
easiest way would be to get sanspach's data |
06:50 |
|
gavin |
I think there must've been something wrong with the indexes but I'm not sure |
06:50 |
|
kados |
that's if sanspach's data were actually readable ;-) |
06:50 |
|
gavin |
yes |
06:50 |
|
kados |
sanspach doesn't use Koha |
06:50 |
|
gavin |
is his existing tool bad at exports? |
06:50 |
|
kados |
so it seems |
06:51 |
|
gavin |
that's really bad |
07:25 |
|
Genji |
hiya all. |
07:54 |
|
gavin |
kados: any idea what is wrong with sanspach's data. could it be fixed? |
08:09 |
|
sanspach |
talking about me, I see |
08:10 |
|
sanspach |
or rather, my data :) |
08:11 |
|
gavin |
sanspach: hi! |
08:11 |
|
gavin |
sanspach: I'm guessing that'll make your migration process more than a bit tough |
08:11 |
|
sanspach |
actually, the data I have available (from my university's catalog) isn't migrating |
08:12 |
|
sanspach |
I'm using Koha for a special-purpose catalog, other data |
08:12 |
|
gavin |
i see. not such an issue then |
08:12 |
|
sanspach |
and my sysadmin finally got it up last night!!! I'm *so* anxious to finally see it in action |
08:13 |
|
gavin |
still, i wonder if it might be possible to repair the marc data |
08:13 |
|
sanspach |
I am really puzzled about why it is a problem. |
08:13 |
|
sanspach |
every tool I've used can read it just fine |
08:14 |
|
gavin |
by the sounds of it there's a subtle bug either in the koha import or in your export |
08:15 |
|
sanspach |
if I only knew what, I could probably fix it (in my experience, perl can fix anything!) |
08:15 |
|
gavin |
indeed |
08:15 |
|
gavin |
although perl can break anything too : |
08:16 |
|
gavin |
:) |
08:16 |
|
sanspach |
yeah, that's where the knowing what you're doing part comes in, isn't it?! |
08:16 |
|
gavin |
i guess so |
08:17 |
|
gavin |
could you send me a small piece of data with just one or two records? |
08:17 |
|
sanspach |
sure; hold on |
08:18 |
|
sanspach |
oh my |
08:18 |
|
gavin |
? |
08:18 |
|
sanspach |
is kados still around? |
08:18 |
|
gavin |
uh oh ! :) |
08:19 |
|
sanspach |
I may have just found the problem |
08:21 |
|
gavin |
if you redo the export I can take it from you over ssh and I'll get it to kados |
08:23 |
|
sanspach |
OK, I've prepared two files: |
08:23 |
|
sanspach |
one is just three records, in the format I had them before |
08:23 |
|
sanspach |
the other is the same three records, with what I think is the fix for the problem |
08:24 |
|
gavin |
cool |
08:25 |
|
sanspach |
can I email you where they are? |
08:25 |
|
gavin |
yeah, that's fine. either that or just attach them if they're not too big |
08:25 |
|
sanspach |
can do that; they're not big at all |
08:25 |
|
gavin |
great |
08:28 |
|
gavin |
got them |
08:35 |
|
gavin |
hmmm. both files seemed to be successfully imported |
08:39 |
|
gavin |
oh. except they have NULL biblio fields |
08:39 |
|
gavin |
(both of them) |
08:40 |
|
sanspach |
the second file should be fine; I can't see anything non-standard about it (it is marc21, not unimarc) |
08:40 |
|
sanspach |
the first one is missing the |a at the beginning of each field (except those that start with another subfield) |
08:40 |
|
sanspach |
that's how our system stores the data --I forgot to add them back in when converting |
08:40 |
|
gavin |
I ran "perl misc/migration_tools/bulkmarcimport.pl -file orig/sanspach/small.sample.mrc" |
08:41 |
|
gavin |
and got "3 MARC record done in 1.23205494880676 seconds" |
08:41 |
|
gavin |
same for both files |
08:41 |
|
gavin |
the records are in the database but the fields seem to be null |
08:42 |
|
gavin |
gavinrobin sanspach> marclint small.sample.mrc |
08:42 |
|
gavin |
small.sample.mrc |
08:42 |
|
gavin |
245: No 245 tag. |
08:42 |
|
gavin |
245: No 245 tag. |
08:42 |
|
gavin |
245: No 245 tag. |
08:42 |
|
gavin |
Recs Errs Filename |
08:42 |
|
gavin |
----- ----- -------- |
08:42 |
|
gavin |
3 3 small.sample.mrc |
08:42 |
|
gavin |
same errors for both |
08:43 |
|
gavin |
I'm not very familiar with marc so I'm not sure what this means |
08:43 |
|
sanspach |
and I don't know the koha tools yet, so I'm a little lost |
08:43 |
|
sanspach |
go ahead and dump the first file; it is definitely wrong |
08:43 |
|
gavin |
do you have the spec of marc? |
08:44 |
|
sanspach |
oh, it is complicated; my favorite resource is |
08:44 |
|
sanspach |
http://lcweb.loc.gov/marc/ |
08:44 |
|
sanspach |
from there select "bibliographic" |
08:44 |
|
sanspach |
but you kind of have to know what you're doing before you can really make good use of it |
08:45 |
|
gavin |
i see |
08:46 |
|
gavin |
according to that "245 - TITLE STATEMENT " |
08:46 |
|
gavin |
is what marc_lint is complaining is missing |
08:47 |
|
sanspach |
yeah, the 245 is basically the one required field in all records |
08:47 |
|
sanspach |
all 3 have the 245 field, with subfields a, b, h, and c populated |
08:48 |
|
gavin |
did you say you have koha running? |
08:48 |
|
sanspach |
my sysadmin just got the OS problems worked out and koha installed last night |
08:49 |
|
sanspach |
haven't seen any of it except to log in to the shell to make sure my pw worked |
08:49 |
|
gavin |
do you think you could input one of those records and export it? |
08:49 |
|
gavin |
if we could see the differences in the marc records we might see what's causing the problem |
08:49 |
|
gavin |
(i mean input by hand of course) |
08:51 |
|
sanspach |
will try |
08:54 |
|
gavin |
is there anyone here who is a marc expert? |
08:55 |
|
kados |
I'm probably as close as you'll get at the moment (paul wrote the marc stuff) |
08:57 |
|
sanspach |
there's not really any way to reproduce this record in our basic install of koha |
08:57 |
|
sanspach |
lots of fields and subfields not defined |
08:58 |
|
kados |
right ... best thing to do is run bulkmarcimport on your records and import them that way |
08:58 |
|
kados |
that way you won't lose any of that data |
08:58 |
|
gavin |
we're trying to go the other way though, to see why the records he exported won't read into koha |
08:58 |
|
sanspach |
anyone want to walk me through that? |
08:59 |
|
sanspach |
starting with where to put the file to be input? |
08:59 |
|
kados |
shouldn't matter where it is |
08:59 |
|
kados |
navigate to [kohainstalldir]/misc/migration_tools |
08:59 |
|
kados |
you'll find a file there called bulkmarcimport |
09:10 |
|
sanspach |
I find /usr/local/koha/intranet/scripts/misc/bulkmarcimport.pl |
09:17 |
|
kados |
that's it |
09:17 |
|
kados |
sanspach: you sure you've got 2.2? |
09:17 |
|
kados |
(it may be that the file structure only changed in CVS, but iirc it should be in migration_tools) |
09:18 |
|
sanspach |
how do I tell? |
09:18 |
|
gavin |
i have 2.2.2b and /usr/local/koha/intranet/scripts/misc/migration_tools/bulkmarcimport.pl |
09:21 |
|
kados |
check koha.conf |
09:21 |
|
kados |
should be in /etc/ |
09:21 |
|
sanspach |
2.0.2 |
09:23 |
|
kados |
k... |
09:23 |
|
kados |
you need to grab 2.2.2b |
09:23 |
|
kados |
or just wait until next week for 2.3 |
09:23 |
|
kados |
err ... 2.2.3 |
09:23 |
|
sanspach |
i think that was the plan |
09:24 |
|
kados |
well 2.0.2 is quite different than the 2.2 |
09:24 |
|
kados |
series |
09:24 |
|
kados |
if you're going to start working on something I'd suggest 2.2 |
09:24 |
|
sanspach |
that makes sense; easy to upgrade? |
09:25 |
|
kados |
should be |
09:25 |
|
kados |
but the best thing to do is setup |
09:25 |
|
kados |
a CVS repo so you can always grab the latest bugfixes, etc. |
09:25 |
|
kados |
I've documented it on kohadocs.org |
09:25 |
|
kados |
under "Updating Koha" |
09:25 |
|
kados |
(symlink your install dirs to the CVS repo) |
09:26 |
|
kados |
that way you get 2.2.2b + any bugfixes |
09:27 |
|
gavin |
I got a marc export from the liblime site and it seems quite different from sanspach's one |
09:32 |
|
gavin |
one seems to use the |[a-z] as a field label, the other is using _[a-z] |
09:35 |
|
kados |
interesting |
09:35 |
|
kados |
maybe that's the prob with sanspach's records |
09:41 |
|
sanspach |
the second full file I wrote out with MARC::Record; that would have used whatever it uses |
09:46 |
|
gavin |
I presume the two use the same version of Marc? |
09:47 |
|
sanspach |
the marc standard carries a value for how many characters delimit the subfields (2); |
09:47 |
|
sanspach |
aside from that, it is implementation-specific |
09:48 |
|
gavin |
you mean the delimiter is specified within the marc file headers? |
09:48 |
|
sanspach |
no; it is up to the application to handle; each marc record carries the value "2" to indicate |
09:48 |
|
sanspach |
|a or _a or $a or !a or whatever is two characters |
09:49 |
|
sanspach |
it is up to the application to take the first as the delimiter and the second as something of value |
09:50 |
|
gavin |
but where is it dictated which of !_$ are used? |
09:54 |
|
sanspach |
according to LoC's docs, it should be hex 1F which wouldn't be any of those |
09:55 |
|
gavin |
hmmm. |
09:56 |
|
gavin |
as it appears to me there are several delimiters |
09:56 |
|
gavin |
^] (the control char) delimits biblio entries |
09:56 |
|
gavin |
^^ (also control char) is a sub-delimiter |
09:57 |
|
gavin |
|[a-z] or ^_[a-z] seem to be sub-sub-delimiters. |
09:57 |
|
gavin |
i may be way off track here |
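Gavin's reading matches the MARC 21 / ISO 2709 record structure: 0x1D (shown as ^]) ends a record, 0x1E (^^) ends a field, and 0x1F (^_) introduces a subfield code. The sketch below splits one raw record on those bytes. It deliberately ignores the leader and directory, so tags are not recovered; the function name and the toy record are illustrative only, and real work is better left to a library such as MARC::Record.

```python
RECORD_TERMINATOR = b"\x1d"   # displayed as ^] in many editors
FIELD_TERMINATOR = b"\x1e"    # displayed as ^^
SUBFIELD_DELIMITER = b"\x1f"  # displayed as ^_


def split_fields(record: bytes):
    """Return [(indicators, [(code, value), ...]), ...] for one raw record."""
    body = record.rstrip(RECORD_TERMINATOR)
    chunks = body.split(FIELD_TERMINATOR)
    fields = []
    # chunks[0] is the leader plus directory; skip it here.
    for chunk in chunks[1:]:
        if not chunk:
            continue  # empty piece after the final field terminator
        parts = chunk.split(SUBFIELD_DELIMITER)
        # parts[0] holds the indicators; each later part is code + value.
        subfields = [
            (p[:1].decode("ascii"), p[1:].decode("utf-8", "replace"))
            for p in parts[1:]
            if p
        ]
        fields.append((parts[0].decode("ascii", "replace"), subfields))
    return fields


# Toy record: a placeholder header followed by one title-like field.
sample = (
    b"LEADER+DIRECTORY" + FIELD_TERMINATOR
    + b"10" + SUBFIELD_DELIMITER + b"aProgramming Perl"
    + SUBFIELD_DELIMITER + b"cLarry Wall"
    + FIELD_TERMINATOR + RECORD_TERMINATOR
)
parsed_fields = split_fields(sample)
```

A file that uses a printable `|` where 0x1F belongs would come back from this function with no subfields at all, which is one way to spot the problem discussed here.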
10:02 |
|
sanspach |
gavin: try the file I just re-sent |
10:09 |
|
gavin |
sanspach: okay just waiting for it |
10:17 |
|
gavin |
sanspach: still no sign of it? |
10:20 |
|
sanspach |
I could put it on our webserver for you to retrieve |
10:20 |
|
gavin |
whatever suits, did the email fail? |
10:21 |
|
sanspach |
not that I've seen; seems to have been sent just like the first |
10:21 |
|
gavin |
i wonder is the greylisting slowing it down. a web link is fine |
10:23 |
|
sanspach |
http://www.indiana.edu/~glbtlib/koha/tmp/ |
10:26 |
|
gavin |
got it, thanks |
10:28 |
|
gavin |
3 MARC record done in 3.77325296401978 seconds |
10:28 |
|
gavin |
same problem |
10:29 |
|
sanspach |
argh!! |
10:29 |
|
gavin |
marclint complains a little more this time but basically it's still "245: No 245 tag." |
10:29 |
|
sanspach |
that file was written out with MARC::Record ! how can the format be wrong? |
10:30 |
|
sanspach |
oh, wait |
10:33 |
|
sanspach |
it seems the fields are separated by (hex) 0D 1E, not the 2E 1E I'm used to |
10:40 |
|
sanspach |
so, how about I contribute a bulkasciiimport script to the koha project? |
10:40 |
|
sanspach |
I'm so tired of marc I could scream |
10:41 |
|
gavin |
agreed |
10:43 |
|
gavin |
do you think this difference in separator is the problem? |
10:43 |
|
gavin |
cos that would be easy to fix |
10:43 |
|
sanspach |
if you want to try, you could change it (or I could) and try the import again |
10:46 |
|
gavin |
i'm trying to decide what needs changing. you're talking in ascii, and I'm seeing control chars :) |
10:50 |
|
gavin |
ooh! that might be it |
10:50 |
|
sanspach |
? |
10:51 |
|
gavin |
i think we're in business |
10:51 |
|
gavin |
just search and replace | for ctrl-underscore |
10:51 |
|
gavin |
I have a tiny perl script for it if you want it |
10:53 |
|
gavin |
http://mccullagh.homeip.net/~gavin/trans_mrc |
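Gavin's Perl script is not reproduced in the log. The following is a Python sketch of the same idea, not his code: swap the printable `|` for the 0x1F subfield delimiter. It only touches a pipe that is followed by a lowercase letter or digit (the `|[a-z]` pattern noted earlier), which is an assumption meant to spare pipes that are real data. Because one byte replaces one byte, the lengths recorded in the leader and directory stay valid.

```python
import re

SUBFIELD_DELIMITER = b"\x1f"


def pipes_to_subfield_delimiters(data: bytes) -> bytes:
    """Replace '|' with 0x1F wherever it looks like a subfield marker."""
    # Lookahead keeps the subfield code itself in place.
    return re.sub(rb"\|(?=[a-z0-9])", SUBFIELD_DELIMITER, data)
```

A literal pipe that happens to sit directly before a letter or digit would still be converted, so the output should be checked with marclint before a bulk import.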
10:53 |
|
gavin |
kados: are you still about? |
10:54 |
|
kados |
gavin: just for another second |
10:55 |
|
gavin |
i think we might have figured out that problem |
10:55 |
|
gavin |
it may just be a delimiter that needs search and replace |
10:55 |
|
kados |
sweet ... I'll try it out after my meeting -- gotta run |
10:55 |
|
kados |
(be back in about an hour or two I hope ;-)) |
10:55 |
|
gavin |
no bother |
10:56 |
|
gavin |
http://mccullagh.homeip.net/~gavin/trans_mrc |
10:56 |
|
sanspach |
I'm lost/confused |
10:57 |
|
gavin |
sorry, I took one of your samples and ran a search and replace |
10:57 |
|
gavin |
I then pointed bulkmarcimport at the output |
10:57 |
|
gavin |
and it seems to have inserted properly and marclint doesn't complain about it |
10:58 |
|
gavin |
the replacement is done by the script above |
10:58 |
|
sanspach |
hmm the last version (via webserver) should have everything correct (it was generated by marc::record) |
10:59 |
|
sanspach |
even if the earlier files didn't |
11:00 |
|
gavin |
marclint complains a lot about it |
11:00 |
|
gavin |
there are lots of ^M characters in it which is odd |
11:01 |
|
sanspach |
too much transferring between machines--win32 to linux and back--will cause that |
11:01 |
|
sanspach |
I wondered if some of the moving files about was causing some of the problems |
11:01 |
|
gavin |
possibly but I certainly haven't done that |
11:02 |
|
gavin |
are you on windows? |
11:02 |
|
sanspach |
yes; also have linux (gentoo) |
11:02 |
|
sanspach |
I have the flat files (extracts from our system) on windows, but marc::record on linux |
11:02 |
|
gavin |
well, when I look at your last entry in vim I get strange ^Ms here and there |
11:04 |
|
sanspach |
my first thought this morning (before I even got out of bed) was wondering if |
11:04 |
|
sanspach |
kados' problems could be due to line breaks introduced during file transfers |
11:05 |
|
kados |
could be |
11:05 |
|
gavin |
i guess it's possible. as i understood it there are no linebreaks in these files/ |
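One way to test the line-break theory is to count carriage returns and line feeds in the raw bytes, including the 0D 1E pairing sanspach spotted at field ends. This is a diagnostic sketch only; the function name and report keys are assumptions, and it does not attempt a repair, since removing bytes would invalidate the lengths stored in each record.

```python
def line_break_report(data: bytes) -> dict:
    """Count CR/LF bytes that may have crept in during file transfers."""
    return {
        "cr": data.count(b"\r"),                       # the ^M characters seen in vim
        "lf": data.count(b"\n"),
        "cr_before_field_end": data.count(b"\r\x1e"),  # the 0D 1E pattern
    }
```

Reading the file in binary mode (`open(path, "rb")`) matters here, because text mode may translate line endings before they can be counted.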
11:05 |
|
kados |
though I had problems even with the single file |
11:05 |
|
kados |
(but now that I think of it I don't think I deleted the previous index) |
11:05 |
|
kados |
(on phone -- conference call) |
11:06 |
|
sanspach |
kados: kinda wondered--certain searches were giving too many hits for the small file we ended with |
11:22 |
|
sanspach |
I'm gone for a couple hours; back later |