Time |
S |
Nick |
Message |
00:22 |
|
|
cait left #koha |
00:34 |
|
|
cait joined #koha |
00:53 |
|
|
cait left #koha |
00:57 |
|
|
tirabo left #koha |
01:11 |
|
imp |
happy new year :) |
04:44 |
|
|
greenmang0 joined #koha |
05:15 |
|
|
thd-away is now known as thd |
05:17 |
|
thd |
chris: I have been speaking to someone who claims that storing blobs in the database, as we do for bibliographic records, is much less efficient than storing them as individual files on the filesystem |
06:01 |
|
imp |
thd: each record as own file? |
06:02 |
|
moodaepo |
Happy New Year! |
06:02 |
|
imp |
you'll have a lot of filesystem overhead and maybe get some trouble with your filesystem (max number of files per fs / dir) |
06:02 |
|
imp |
happy new year moodaepo :) |
06:23 |
|
thd |
yes imp |
06:23 |
|
thd |
each record as its own file |
06:24 |
|
thd |
I built such a system myself at the same time as Koha was being created but my system was certainly not efficient for other reasons |
06:51 |
|
chris |
thd that may be true for some actions |
06:51 |
|
chris |
but definitely wrong for the way we use them |
06:51 |
|
chris |
we don't ever search the blobs |
06:52 |
|
chris |
and the marcxml is actually text anyway, and that's where what little db interaction we do is done |
06:54 |
|
thd |
chris: what would be the difference in performance between storing the blobs in the DB or on the filesystem? |
06:54 |
|
chris |
thd: in a sufficiently ram endowed system, mysql would cache the most used blobs in ram |
06:54 |
|
chris |
thus outperforming most filesystems |
06:55 |
|
thd |
yet filesystems also do caching |
06:55 |
|
chris |
some do |
06:55 |
|
chris |
not all |
06:55 |
|
chris |
and not often into userspace ram |
06:56 |
|
chris |
the upshot is, there are about 10 million things that i would work on to speed up koha before i even bothered looking at that one |
06:56 |
|
chris |
ie, that is the least of our worries |
06:57 |
|
chris |
the other thing is |
06:57 |
|
thd |
I am actually not asking necessarily in relation to Koha |
06:58 |
|
chris |
i would agree it could be less efficient .. but could and less ... not definitely and much |
06:59 |
|
chris |
and it depends entirely on the application, the filesystem and the OS |
06:59 |
|
chris |
(blanket statements like the one above annoy me) |
06:59 |
|
chris |
brb putting kids to bed |
07:08 |
|
chris |
back |
07:11 |
|
thd |
In Koha we are not doing many simultaneous writes of bibliographic records in real time. |
07:11 |
|
imp |
thd: you'll get some other problems if you store stuff in small files. mysql performs smarter caching than a filesystem cache will do (if free memory is needed, the os drops the fs cache first) |
07:12 |
|
imp |
and you may have a new file which is not written to disk when it's created (write buffer); if the host fails, it might be lost |
07:13 |
|
imp |
a database may enforce writing back the stuff it stores directly |
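The write-buffer point above can be illustrated with a minimal sketch, assuming Python and a hypothetical helper name `write_durably` (nothing here is Koha or MySQL code). A plain `write()` may leave data in a buffer; flushing and calling `os.fsync` asks the operating system to push it to the storage device before continuing, which is roughly what a database does when it forces its writes out.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data, then ask the OS to flush it to the storage device."""
    with open(path, "wb") as fh:
        fh.write(data)
        fh.flush()             # push Python's own buffer to the OS
        os.fsync(fh.fileno())  # ask the OS to push its buffer to the device


def demo():
    """Write one hypothetical record durably and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "record.xml")
        write_durably(path, b"<record/>")
        with open(path, "rb") as fh:
            return fh.read()
```

Without the `flush` and `fsync` calls the file would still be readable afterwards on a healthy host; the difference only matters if the host fails before the buffered data reaches the disk.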
07:13 |
|
thd |
What would be the effect for DB vs. filesystem performance where thousands of bibliographic records were being automatically captured and written within a few minutes? |
07:15 |
|
thd |
Libraries tend to be writing few records in real time. Most activity is reading. How would the issue differ if most activity were writing? |
07:16 |
|
imp |
a database is just one open file (or at least not many). using single files, your system has to create a new file descriptor, write the stuff and close the file handle |
07:16 |
|
imp |
(for each one) |
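A rough sketch of the two write paths being compared, assuming Python with the standard `sqlite3` module standing in for MySQL, and a hypothetical `biblio` table and record contents. It only shows the shape of the work (one open/write/close per record versus one connection and one transaction); it makes no claim about which is faster on a given system.

```python
import os
import sqlite3
import tempfile


def write_as_files(records, directory):
    """One file per record: a new file descriptor is opened and closed each time."""
    for i, data in enumerate(records):
        path = os.path.join(directory, f"record_{i}.xml")
        with open(path, "wb") as fh:
            fh.write(data)


def write_to_db(records, db_path):
    """All records go through one connection, committed in a single transaction."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS biblio (id INTEGER PRIMARY KEY, marcxml BLOB)"
        )
        with conn:  # commits once when the block exits without error
            conn.executemany(
                "INSERT INTO biblio (marcxml) VALUES (?)", ((r,) for r in records)
            )
        return conn.execute("SELECT COUNT(*) FROM biblio").fetchone()[0]
    finally:
        conn.close()


def demo(n=1000):
    """Store n hypothetical records both ways and return (file count, row count)."""
    records = [f"<record><id>{i}</id></record>".encode() for i in range(n)]
    with tempfile.TemporaryDirectory() as tmp:
        files_dir = os.path.join(tmp, "files")
        os.mkdir(files_dir)
        write_as_files(records, files_dir)
        file_count = len(os.listdir(files_dir))
        db_count = write_to_db(records, os.path.join(tmp, "records.db"))
    return file_count, db_count
```

Wrapping each call in a timer such as `time.perf_counter()` would give a first impression for a particular machine, though the result depends on the filesystem, the OS and the database settings.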
07:18 |
|
thd |
Ok. another question about scalability: What happens when the size of the DB table storage exceeds the maximum file size for a single file on the file system. |
07:18 |
|
thd |
? |
07:19 |
|
imp |
let's assume both operations would take the same amount of (cpu) time. you'll still need an index to store the rest of the metadata (it would be quite stupid to implement that in a flat file as well (-> reimplementing a db)), so you need a db anyway |
07:19 |
|
thd |
Very good point |
07:20 |
|
imp |
it's up to the database how it'll handle such a situation (never had this problem), maybe it'll split the db into two files? dunno |
07:21 |
|
thd |
How would you implement a split? |
07:24 |
|
thd |
Would the database perform such a split automatically? |
07:25 |
|
imp |
another good question :). i would maybe take a random (yes, random) set of lines from the old file, open a new one and place those lines into the new file (so you'll gain some space to update the lines in the old one). on a search, i would look into both in parallel |
07:25 |
|
imp |
i don't know |
07:26 |
|
imp |
mysql has a file size limit -> http://dev.mysql.com/doc/refma[…]n/full-table.html |
07:30 |
|
thd |
Do terabyte files on ext2-4 require building the filesystem with the large file option? |
07:46 |
|
imp |
dunno |
07:46 |
|
imp |
it's late here ;) |
07:48 |
|
imp |
but i never had any problems with linux storing complete hdd images on any filesystem i used |
07:49 |
|
imp |
(like, no problem with 150gb in one file on ext3; anyway, it's still far below the limit for the maximum file size) |
11:36 |
|
|
mib_t7ov5l joined #koha |
11:37 |
|
|
mib_t7ov5l left #koha |
12:26 |
|
Ropuch |
Good morning #koha |
13:01 |
|
|
francharb joined #koha |
13:15 |
|
|
greenmang0 left #koha |
13:18 |
|
|
francharb1 joined #koha |
13:18 |
|
|
francharb left #koha |
13:19 |
|
|
francharb joined #koha |
13:19 |
|
|
francharb1 left #koha |
13:30 |
|
|
francharb left #koha |
13:44 |
|
|
francharb joined #koha |
15:14 |
|
|
bebbi joined #koha |
15:33 |
|
|
bebbi left #koha |
15:58 |
|
imp |
thd: a friend just told me that oracle and db2 are able to start a new contain for the tables and continue working (but he has no idea either whether mysql can do such things) |
16:02 |
|
imp |
s/contain/container/ |
16:46 |
|
|
francharb left #koha |
19:10 |
|
|
CGI194 joined #koha |
19:11 |
|
|
CGI194 left #koha |
22:17 |
|
|
CGI376 joined #koha |
22:17 |
|
|
CGI376 left #koha |
22:39 |
|
|
tirabo joined #koha |