IRC log for #koha, 2010-01-01

All times shown according to UTC.

Time Nick Message
00:22 cait left #koha
00:34 cait joined #koha
00:53 cait left #koha
00:57 tirabo left #koha
01:11 imp happy new year :)
04:44 greenmang0 joined #koha
05:15 thd-away is now known as thd
05:17 thd chris: I have been speaking to someone who claims that storing blobs in the database, as we do for bibliographic records, is much less efficient than storing them as individual files on the filesystem
06:01 imp thd: each record as own file?
06:02 moodaepo Happy New Year!
06:02 imp you'll have a lot of filesystem overhead and may run into trouble with your filesystem (maximum number of files per fs / dir)
06:02 imp happy new year moodaepo :)
06:23 thd yes imp
06:23 thd each record as its own file
06:24 thd I built such a system myself at the same time as Koha was being created but my system was certainly not efficient for other reasons
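To make the layout under discussion concrete, here is a minimal sketch of a one-record-per-file store of the kind thd describes, in Python, using a two-level hash fan-out so that no single directory accumulates millions of entries (the per-directory limit imp mentions above). The storage root, file naming, and function names are illustrative assumptions, not Koha code.

    import hashlib
    import os

    STORE_ROOT = "/var/lib/records"  # hypothetical storage root

    def record_path(record_id: str) -> str:
        # Fan the files out as ab/cd/<id>.xml so that no single
        # directory holds millions of entries at once.
        digest = hashlib.sha1(record_id.encode("utf-8")).hexdigest()
        return os.path.join(STORE_ROOT, digest[:2], digest[2:4], record_id + ".xml")

    def write_record(record_id: str, marcxml: bytes) -> None:
        path = record_path(record_id)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(marcxml)

    def read_record(record_id: str) -> bytes:
        with open(record_path(record_id), "rb") as fh:
            return fh.read()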
06:51 chris thd that may be true for some actions
06:51 chris but definitely wrong for the way we use them
06:51 chris we dont ever search the blobs
06:52 chris and the marcxml is actually text anyway, and that's where what little db interaction we do is done
06:54 thd chris: what would be the difference in performance between storing the blobs in the DB or on the filesystem?
06:54 chris thd: in a sufficiently ram endowed system, mysql would cache the most used blobs in ram
06:54 chris thus outperforming most filesystems
06:55 thd yet filesystems also do caching
06:55 chris some do
06:55 chris not all
06:55 chris and not often into userspace ram
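To illustrate chris's distinction: the kernel page cache is shared with everything else on the box and is dropped first under memory pressure, while a database holds its hot pages in its own userspace RAM. A rough Python analogy, assuming the hypothetical read_record from the sketch above, keeps hot records in process memory with functools.lru_cache; this is only an analogy for a buffer pool, not how MySQL's caching actually works.

    from functools import lru_cache

    @lru_cache(maxsize=1024)  # keep up to 1024 hot records in userspace RAM
    def cached_record(record_id: str) -> bytes:
        # A miss falls through to the filesystem (and the kernel page
        # cache); repeat reads are served from process memory, which the
        # kernel does not drop the way it drops page cache (though it
        # can still be swapped out).
        return read_record(record_id)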
06:56 chris the upshot is, there are about 10 million things that i would work on to speed up koha before i even bothered looking at that one
06:56 chris ie, that is the least of our worries
06:57 chris the other thing is
06:57 thd I am actually not asking necessarily in relation to Koha
06:58 chris i would agree it could be less efficient .. but "could" and "less" ... not "definitely" and "much"
06:59 chris and it depends entirely on the application, the filesystem and the OS
06:59 chris (blanket statements like the one above annoy me)
06:59 chris brb putting kids to bed
07:08 chris back
07:11 thd In Koha we are not doing many simultaneous writes of bibliographic records in real time.
07:11 imp thd: you'll run into some other problems if you store stuff in small files; mysql performs smarter caching than a filesystem cache does (if free memory is needed, the os drops the fs cache first)
07:12 imp and you may have a new file which is not yet written to the disk when it's created (write buffer); if the host fails, it might be lost
07:13 imp a database may enforce writing what it stores back to disk directly
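imp's durability point, sketched in Python: a plain write may sit in the OS write buffer and vanish if the host fails before write-back, which is why databases typically force their data to disk at commit time. The path and function name here are arbitrary.

    import os

    def durable_write(path: str, data: bytes) -> None:
        with open(path, "wb") as fh:
            fh.write(data)         # data may only reach kernel buffers here
            fh.flush()             # push Python's userspace buffer to the kernel
            os.fsync(fh.fileno())  # force the kernel to put the blocks on disk
        # Without the fsync, a crash shortly after this returns could leave
        # an empty or missing file. (A fully careful version would also
        # fsync the containing directory so the new directory entry itself
        # survives a crash.)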
07:13 thd What would be the effect on DB vs. filesystem performance if thousands of bibliographic records were being automatically captured and written within a few minutes?
07:15 thd Libraries tend to write few records in real time. Most activity is reading. How would the issue differ if most activity were writing?
07:16 imp a database is just one open file (or at least not many); with individual files, your system has to create a new file descriptor, write the data, and close the file handle
07:16 imp (for each one)
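One way to see the per-file cost imp describes is to time N records written as N individual files against the same data appended to one already-open file, which is roughly what a single table file looks like to the filesystem. A sketch with arbitrary sizes and counts:

    import os
    import tempfile
    import time

    N = 1000
    payload = b"x" * 2048  # a small record-sized blob

    with tempfile.TemporaryDirectory() as tmp:
        # one open/write/close cycle (one file descriptor) per record
        t0 = time.perf_counter()
        for i in range(N):
            with open(os.path.join(tmp, f"rec{i}.bin"), "wb") as fh:
                fh.write(payload)
        many_files = time.perf_counter() - t0

        # one file handle, N appends
        t0 = time.perf_counter()
        with open(os.path.join(tmp, "table.bin"), "wb") as fh:
            for i in range(N):
                fh.write(payload)
        one_file = time.perf_counter() - t0

    print(f"{N} small files: {many_files:.3f}s  one big file: {one_file:.3f}s")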
07:18 thd Ok, another question about scalability: what happens when the size of the DB table storage exceeds the maximum file size for a single file on the filesystem?
07:19 imp let's assume both operations would take the same amount of (cpu) time. you'll still need an index to store the rest of the metadata (it would be quite stupid to implement that in a flat file as well (-> reimplementing a db)), so you need a db anyway
07:19 thd Very good point
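imp's 'you need a db anyway' point in miniature: even if the record blobs lived in flat files, the searchable metadata wants a real table with indexes. A sketch using sqlite3 from Python's standard library; the column names are invented for illustration and are not Koha's schema.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE records (
            record_id INTEGER PRIMARY KEY,
            title     TEXT,
            author    TEXT,
            marcxml   BLOB      -- the record itself can live here too
        );
        CREATE INDEX idx_records_title  ON records (title);
        CREATE INDEX idx_records_author ON records (author);
    """)
    conn.execute(
        "INSERT INTO records (title, author, marcxml) VALUES (?, ?, ?)",
        ("Example Title", "Example Author", b"<record>...</record>"),
    )
    for row in conn.execute(
        "SELECT record_id, title FROM records WHERE author = ?",
        ("Example Author",),
    ):
        print(row)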
07:20 imp it's up to the database how it'll handle such a situation (never had this problem), maybe it'll split the db into two files? dunno
07:21 thd How would you implement a split?
07:24 thd Would the database perform such a split automatically?
07:25 imp another good question :). i would maybe take a random (yes, random) set of lines from the old file, open a new one, and place those lines into the new file (so you'd gain some space to update the lines in the old one). on a search, i would look into both files in parallel
07:25 imp i don't know
07:26 imp mysql has a file size limit -> http://dev.mysql.com/doc/refma[…]n/full-table.html
07:30 thd Do terabyte files on ext2-4 require building the filesystem with the large file option?
07:46 imp dunno
07:46 imp it's late here ;)
07:48 imp but i never had any problems with linux storing complete hdd images on any filesystem i used
07:49 imp (like, no problem with 150gb in one file on ext3; anyway, that's still far below the maximum file size limit)
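If you want to test a given filesystem's large-file behaviour yourself, a quick probe is to create a sparse file past the 2 GiB mark (the old 32-bit boundary) and stat it. A sketch; the path is arbitrary, and since the file is sparse it costs almost no actual disk space.

    import os

    TEST_PATH = "/tmp/largefile_probe"  # put this on the fs you want to test
    TWO_GIB = 2 * 1024**3

    with open(TEST_PATH, "wb") as fh:
        fh.seek(TWO_GIB)  # seek past the old 32-bit file size boundary
        fh.write(b"\0")   # writing one byte extends the (sparse) file

    print("apparent size:", os.stat(TEST_PATH).st_size, "bytes")
    os.remove(TEST_PATH)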
11:36 mib_t7ov5l joined #koha
11:37 mib_t7ov5l left #koha
12:26 Ropuch Good morning #koha
13:01 francharb joined #koha
13:15 greenmang0 left #koha
13:18 francharb1 joined #koha
13:18 francharb left #koha
13:19 francharb joined #koha
13:19 francharb1 left #koha
13:30 francharb left #koha
13:44 francharb joined #koha
15:14 bebbi joined #koha
15:33 bebbi left #koha
15:58 imp thd: a friend just told me that oracle and db2 are able to start a new contain for the tables and continue working (but he has no idea either whether mysql can do such things)
16:02 imp s/contain/container/
16:46 francharb left #koha
19:10 CGI194 joined #koha
19:11 CGI194 left #koha
22:17 CGI376 joined #koha
22:17 CGI376 left #koha
22:39 tirabo joined #koha
