IRC log for #koha, 2010-01-01

All times shown according to UTC.

Time	Nick	Message
00:22		cait left #koha
00:34		cait joined #koha
00:53		cait left #koha
00:57		tirabo left #koha
01:11	imp	happy new year :)
04:44		greenmang0 joined #koha
05:15		thd-away is now known as thd
05:17	thd	chris: I have been speaking to someone who claims that storing blobs in the database as we do for bibliographic records is much less efficient than storing them as individual files on the filesystem
06:01	imp	thd: each record as own file?
06:02	moodaepo	Happy New Year!
06:02	imp	you'll have much filesystem overhead and maybe get some trouble with your filesystem (max amount of files per fs / dir)
06:02	imp	happy new year moodaepo :)
06:23	thd	yes imp
06:23	thd	each record as its own file
06:24	thd	I built such a system myself at the same time as Koha was being created but my system was certainly not efficient for other reasons
06:51	chris	thd that may be true for some actions
06:51	chris	but definitely wrong for the way we use them
06:51	chris	we don't ever search the blobs
06:52	chris	and the marcxml is actually text anyway and that's where what little db interaction we do is done
06:54	thd	chris: what would be the difference in performance between storing the blobs in the DB or on the filesystem?
06:54	chris	thd: in a sufficiently ram endowed system, mysql would cache the most used blobs in ram
06:54	chris	thus outperforming most filesystems
06:55	thd	yet filesystems also do caching
06:55	chris	some do
06:55	chris	not all
06:55	chris	and not often into userspace ram
06:56	chris	the upshot is, there are about 10 million things that i would work on to speed up koha before i even bothered looking at that one
06:56	chris	ie, that is the least of our worries
06:57	chris	the other thing is
06:57	thd	I am actually not asking necessarily in relation to Koha
06:58	chris	i would agree it could be less efficient .. but could and less ... not definitely and much
06:59	chris	and it depends entirely on the application, the filesystem and the OS
06:59	chris	(blanket statements like the one above annoy me)
06:59	chris	brb putting kids to bed
07:08	chris	back
07:11	thd	In Koha we are not doing many simultaneous writes of bibliographic records in real time.
07:11	imp	thd: you'll get some other problems if you store stuff in small files, mysql performs smarter caching than a filesystem cache will (if free memory is needed, the os drops the fs cache first)
07:12	imp	and you may have a new file which is not written to the disk when it's created (write buffer); if the host fails, it might be lost
07:13	imp	a database may enforce writing back the stuff it stores directly
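imp's point about the write buffer can be illustrated with a minimal Python sketch. It assumes a POSIX-like system; the function name `write_record_durably` is hypothetical and not part of Koha or MySQL. A plain write may sit in memory buffers, so a program that stores one record per file has to ask for the flush itself, which is roughly what a database does on its own for committed data.

```python
import os


def write_record_durably(path: str, data: bytes) -> None:
    """Write one record to its own file and force it to stable storage.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        # Push Python's userspace buffer down to the operating system.
        f.flush()
        # Ask the OS to write its cached pages for this file to the device.
        os.fsync(f.fileno())
```

On some filesystems the directory entry for a newly created file may also need its own sync before the file is guaranteed to survive a crash; the sketch leaves that out.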
07:13	thd	What would be the effect for DB vs. filesystem performance where thousands of bibliographic records were being automatically captured and written within a few minutes?
07:15	thd	Libraries tend to be writing few records in real time. Most activity is reading. How would the issue differ if most activity were writing?
07:16	imp	a database is just one open file (or at least not many); using single files, your system has to create a new file descriptor, write the stuff and close the file handle
07:16	imp	(for each one)
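The difference imp describes can be sketched in Python as follows. All function names here are hypothetical, and the single-file variant is a toy stand-in, not how MySQL actually lays out its table storage. The first function opens and closes one file descriptor per record; the second keeps a single file open and records where each record starts.

```python
import os


def write_one_file_per_record(directory: str, records: list[bytes]) -> None:
    """One open/write/close cycle (and one directory entry) per record."""
    for i, data in enumerate(records):
        with open(os.path.join(directory, f"{i}.rec"), "wb") as f:
            f.write(data)


def write_single_container(path: str, records: list[bytes]) -> list[tuple[int, int]]:
    """Append every record to one open file; return (offset, length) pairs."""
    index = []
    with open(path, "wb") as f:
        for data in records:
            index.append((f.tell(), len(data)))
            f.write(data)
    return index


def read_from_container(path: str, index: list[tuple[int, int]], i: int) -> bytes:
    """Fetch record i using the offset index built while writing."""
    offset, length = index[i]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Even this toy single-file version needs an index to find a record again, which is a small step toward reimplementing a database.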
07:18	thd	Ok. another question about scalability: What happens when the size of the DB table storage exceeds the maximum file size for a single file on the file system.
07:18	thd	?
07:19	imp	let's assume both operations would take the same amount of (cpu) time. you'll still need an index to store the rest of the metadata (it would be quite stupid to implement that stuff as well in a flat file (-> reimplementing a db)), so you need a db anyway
07:19	thd	Very good point
07:20	imp	it's up to the database how it'll handle such a situation (never had this problem), maybe it'll split the db into two files? dunno
07:21	thd	How would you implement a split?
07:24	thd	Would the database perform such a split automatically?
07:25	imp	another good question :). i would maybe get a random (yes, random) set of lines from the old file, open a new one and place those lines into the new file (so you'll gain some space to update the lines in the old one). on a search, i would look into both in parallel
07:25	imp	i don't know
07:26	imp	mysql has a file size limit -> http://dev.mysql.com/doc/refma[…]n/full-table.html
07:30	thd	Do terabyte ext2-4 files require building the filesystem with large file option?
07:46	imp	dunno
07:46	imp	it's late here ;)
07:48	imp	but i never had any problems with linux storing complete hdd images on any filesystem i used
07:49	imp	(like, no problem with 150gb in one file on ext3, anyway, it's still far below the limit for the maximal filesize)
11:36		mib_t7ov5l joined #koha
11:37		mib_t7ov5l left #koha
12:26	Ropuch	Good morning #koha
13:01		francharb joined #koha
13:15		greenmang0 left #koha
13:18		francharb1 joined #koha
13:18		francharb left #koha
13:19		francharb joined #koha
13:19		francharb1 left #koha
13:30		francharb left #koha
13:44		francharb joined #koha
15:14		bebbi joined #koha
15:33		bebbi left #koha
15:58	imp	thd: a friend just told me that oracle and db2 are able to start a new contain for the tables and continue working (but he has no idea either whether mysql can do such things)
16:02	imp	s/contain/container/
16:46		francharb left #koha
19:10		CGI194 joined #koha
19:11		CGI194 left #koha
22:17		CGI376 joined #koha
22:17		CGI376 left #koha
22:39		tirabo joined #koha