Time Nick Message
06:46 drojf hi #koha
20:10 archer121 Hi, our koha server is crashing occasionally for some reason. Where can I look to get an idea as to why this is happening? I have described my issue in detail over here: http://stackoverflow.com/questions/683052/why-am-i-getting-an-apache-proxy-503-error Please see the answer to the post too.
20:11 archer121 Fellas, extremely sorry, I posted the wrong link. Here is the correct one: https://serverfault.com/questions/801029/apache-crashing-occasionally-with-error-503/801032
20:12 archer121 I don't see anything else in /var/logs/koha
20:14 cait1 hi archer121
20:14 cait1 still sunday in some parts of the world - quiet here usually
20:15 archer121 monday morning 2 AM here... :-)
20:15 cait1 did you see the answer on the mailing list?
20:17 archer121 yeah, but in my case, the backed service, ie koha has completely crashed.
20:17 archer121 backend*
20:18 archer121 Do you still think it will solve my issue?
20:21 cait1 i am not a sysadmin
20:22 cait1 sorry, probably not much help there
20:22 cait1 how did you install Koha? which OS? I think that's what you will be asked first
20:22 archer121 me neither, but thanks anyway
20:22 archer121 debian, apt
20:24 cait1 is there something special about your setup?
20:25 cait1 mostly feeding the chat logs here - we are running with packages on Debian too, but haven't experienced problems like that so far
20:25 archer121 We have it integrated to RFID hardware on our own, but in a safe manner
20:26 archer121 we have memcached and plack running
20:27 cait1 hm maybe you should add all that information to the mailing list thread
20:27 cait1 together with the exact version of koha you are using
20:28 archer121 It's been just 2 months since we migrated to koha, and the issue was there since the beginning. In these two months, we have upgraded twice, and once purged and reinstalled koha.
20:28 archer121 yes I will.
20:29 cait1 hm sounds not like fun :(
20:30 rangi hmm i haven't run into plack crashing, on any of the 50 or so sites i look after
20:30 rangi is it OOMing?
20:30 archer121 restarting apache won't help, but what the technicians do about it is to restart the entire system.
20:30 archer121 nope
20:30 rangi yeah restarting apache won't do anything to plack
20:30 archer121 4.2G free
20:30 rangi i dont think it's anything to do with apache
20:30 archer121 why do you guys think it is plack?
20:31 rangi because that is what the error is telling you
20:31 archer121 why do you guys think it has something to do with plack?
20:31 rangi HTTP: attempt to connect to Unix domain socket /var/run/koha/nitc/plack.sock (localhost) failed
20:31 rangi next time it happens run
20:32 archer121 I currently have it in the crashed state
20:32 rangi sudo koha-plack --restart instancename
20:32 rangi if that doesnt work
20:32 rangi try --stop
20:32 rangi then --start
20:34 archer121 alright, so plack was not running, so I could not restart or stop it (as it was not running)
20:34 rangi but starting it worked?
20:35 archer121 half worked.
20:36 archer121 the 503 goes away, but it is as if the zebra indexing is not done, you know, like all searches on opac returning empty
20:36 rangi restart zebra too then
20:37 rangi sudo koha-restart-zebra instancename
20:38 archer121 yeah, that worked.
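The error rangi quotes means nothing was listening on the plack Unix domain socket. A minimal sketch of how that could be checked from a script, assuming Python 3 on Linux; the function name `unix_socket_accepts` is hypothetical, and the path is simply the one from the quoted error message:

```python
import socket


def unix_socket_accepts(path: str, timeout: float = 2.0) -> bool:
    """Return True if a connection to the Unix domain socket at `path` succeeds."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Missing socket file, nothing listening, permission denied, timeout...
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    # Path taken from the Apache error quoted in the log; adjust per instance.
    path = "/var/run/koha/nitc/plack.sock"
    print(path, "is up" if unix_socket_accepts(path) else "is down")
```

A check like this only reports whether plack is reachable; it does not explain why it stopped, and it may need to run as a user allowed to access the socket.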
20:38 rangi it really does feel like that your machine OOMed at some point in the past and killed zebra and plack but that is just a guess you'd have to go back through syslogs looking
20:38 rangi really odd for both plack and zebra to have died
20:38 archer121 I don't think so, because OOMs should come in the dmesg
20:38 rangi so yeah, i think you are going to need to do some forensics to find out what is killing those 2 things
20:39 rangi bad ram, ooming, something else
20:39 rangi id set up some monitoring with monit, or icinga2 or something
20:39 rangi to monitor zebra and plack, see if you can pinpoint when it happens
20:40 rangi but, at least restarting works, which is a zillion times better than rebooting
20:40 archer121 I just found out when it happens from the plack-error.log: 2016/09/04-07:37:06 Server closing!
20:41 archer121 And at that time the library is closed.
20:41 rangi yeah, so you will want to track down what is doing it, see if you can find out when zebra was turned off/crashed too
20:50 archer121 at the same time: zebra-error.log: 20160904 07:37:06 nitc-koha-zebra: client (pid 19241) killed by signal 15, stopping
20:53 rangi yeah
20:53 rangi something did that
20:54 rangi thats not a crash
20:54 rangi so you need to find what was running at 7.30ish
20:54 rangi maybe logrotate
20:59 archer121 ah, I found something sweet in syslog!
20:59 archer121 cron.daily was executed at the same time.
20:59 rangi yeah, so its most likely logrotate
21:00 rangi that stops stuff, rotates the logs, and supposed to restart it
21:00 archer121 here, take a look: https://paste.ubuntu.com/23134317/
21:02 rangi look at the syslog before that
21:02 rangi because syslog gets restarted as part of the logrotate too
21:03 cait1 archer121: i think what rangi is trying to tell you is that it doesn't crash
21:03 cait1 it's shut down intentionally
21:03 cait1 to do some system tasks - but it doesn't come back like it should
21:05 archer121 i see.
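The diagnosis above rests on matching timestamps across two logs that use different formats. A small sketch of that comparison, assuming Python 3; the two sample lines are the ones pasted in the log, while the helper names and the 60-second window are hypothetical choices for illustration:

```python
from datetime import datetime, timedelta

PLACK_FMT = "%Y/%m/%d-%H:%M:%S"  # e.g. 2016/09/04-07:37:06
ZEBRA_FMT = "%Y%m%d %H:%M:%S"    # e.g. 20160904 07:37:06


def parse_plack(line: str) -> datetime:
    """Parse the 19-character timestamp at the start of a plack-error.log line."""
    return datetime.strptime(line[:19], PLACK_FMT)


def parse_zebra(line: str) -> datetime:
    """Parse the 17-character timestamp at the start of a zebra-error.log line."""
    return datetime.strptime(line[:17], ZEBRA_FMT)


def same_event(a: datetime, b: datetime,
               window: timedelta = timedelta(seconds=60)) -> bool:
    """Treat two timestamps as one event if they are within `window` of each other."""
    return abs(a - b) <= window


plack = parse_plack("2016/09/04-07:37:06 Server closing!")
zebra = parse_zebra(
    "20160904 07:37:06 nitc-koha-zebra: client (pid 19241) "
    "killed by signal 15, stopping"
)

print(same_event(plack, zebra))  # both services stopped in the same minute
print(plack.weekday())           # Monday is 0, so 6 means Sunday
```

Signal 15 is SIGTERM, an ordinary termination request, which is consistent with rangi's point that something stopped the services deliberately rather than them crashing.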
21:06 archer121 here is the syslog before the logrotate: https://paste.ubuntu.com/23134337/
21:07 archer121 I do not see anything useful in it, but my eyes are not that trained,
21:07 archer121 I now think that this crash occurs every sunday.
21:07 archer121 and sunday is a calendar holiday
21:09 rangi yeah thats logrotate running
21:17 archer121 at this point the only thing that I can think of doing is to manually run all the commands in koha's cron.daily and see if it fails.
21:19 rangi i guarantee it is logrotate doing it
21:19 archer121 okey, so now I am stuck. what should I do?
21:20 rangi what version are you running ?
21:20 rangi (of koha)
21:21 archer121 3.22.10
21:21 archer121 but the crash was there since 3.22.08, which was our first version
21:21 rangi it's not a crash
21:22 archer121 Why is this not occurring daily if logrotate is doing it?
21:22 rangi it will be a race condition
21:22 rangi it'll be trying to start it again, while it is still stopping
21:23 rangi so the start will fail, and it will continue stopping, and be stopped
21:24 rangi if you look in /etc/logrotate.d/
21:24 rangi there is a file koha-common
21:24 rangi that is what tells it what to do
21:26 archer121 and this happens weekly!
21:27 rangi right so logrotate probably tells it to rotate weekly
21:27 rangi weekly
21:27 rangi yep
21:27 archer121 and the fix?
21:27 wahanui the fix is https://www.youtube.com/watch?v=Pg_ArW8lrl0
21:28 archer121 is that a bot?
21:28 wizzyrea yes
21:28 wizzyrea also hi
21:28 archer121 hi
21:28 rangi https://lists.katipo.co.nz/public/koha/2016-July/045823.html
21:28 rangi maybe try that
21:28 rangi then check again next time it runs
21:29 rangi we now know when that is going to be
21:29 rangi 7.30am sunday morning
21:30 archer121 great! thanks a lot!
21:30 archer121 but will this change get overwritten on every update of koha?
21:31 rangi yep, but if it works, you can file a bug and say that is the fix, then it will go into koha
21:32 archer121 Will do that
21:37 archer122_ hey, I dot disconnected for a moment
21:37 archer122_ got*
21:38 cait1 nothing happened
21:38 archer122_ great
21:39 archer122_ I am planning to confirm if this is the issue and if so file the bug report by manually triggering a logrotate right now. is that okey, rangi?
21:41 rangi yeah you probably didnt want to do that
21:42 archer122_ oops, why? I already did that. is it going to create any problems?
21:43 archer122_ and koha is functioning properly even after I did the logrotate
21:44 archer122_ (without any sleep)
21:47 archer122_ okey, so maybe zebra needs to do something in the background if this issue is to be reproduced.
21:49 rangi yeah, it won't be a real test unless its running for real
21:50 archer122_ okey, gotta hit bed. I have to attend classes tomorrow!
21:50 archer122_ thank you again for helping.
22:09 eythian https://www.theguardian.com/us-news/2016/sep/03/borrowed-time-us-library-to-enforce-jail-sentences-for-overdue-books
22:15 rangi yeah, what a horrible idea
22:33 Archer121 rangi: I got this mail from radek siman that he solved the same issue by replacing anacron with cron.
22:34 rangi for some definition of the word solved
22:35 rangi :)
22:36 Archer121 What do you mean?
22:36 rangi it's not really something we can tell all users of koha to do
22:36 rangi fixing the actual logrotate job is a better way to actually fix it
22:37 Archer121 I see. So how does changing from anacron to cron fix it?
22:37 rangi who knows
22:37 Archer121 :-) bye!
22:37 rangi thats why i call it not an actual fix
22:43 eythian It would be interesting to see if having anacron installed causes the problem, but in theory it should be identical to just cron on a server.
23:54 * dcook waves