Time  Nick       Message
06:46 drojf      hi #koha
20:10 archer121  Hi, our koha server is crashing occasionally for some reason. Where can I look to get an idea as to why this is happening? I have described my issue in detail over here: http://stackoverflow.com/questions/683052/why-am-i-getting-an-apache-proxy-503-error Please see the answer to the post too.
20:11 archer121  Fellas, extremely sorry, I posted the wrong link. Here is the correct one: https://serverfault.com/questions/801029/apache-crashing-occasionally-with-error-503/801032
20:12 archer121  I don't see anything else in /var/log/koha
20:14 cait1      hi archer121
20:14 cait1      still sunday in some parts of the world - quiet here usually
20:15 archer121  monday morning 2 AM here... :-)
20:15 cait1      did you see the answer on the mailing list?
20:17 archer121  yeah, but in my case, the backed service, ie koha has completely crashed.
20:17 archer121  backend*
20:18 archer121  Do you still think it will solve my issue?
20:21 cait1      i am not a sysadmin
20:22 cait1      sorry, probably not much help there
20:22 cait1      how did you install Koha? which OS? I think that's what you will be asked first
20:22 archer121  me neither, but thanks anyway
20:22 archer121  Debian, apt
20:24 cait1      is there something special about your setup?
20:25 cait1      mostly feeding the chat logs here - we are running with packages on Debian too, but haven't experienced problems like that so far
20:25 archer121  We have it integrated to RFID hardware on our own, but in a safe manner
20:26 archer121  we have memcached and plack running
20:27 cait1      hm maybe you should add all that information to the mailing list thread
20:27 cait1      together with the exact version of koha you are using
20:28 archer121  It's been just 2 months since we migrated to koha, and the issue has been there since the beginning. In these two months, we have upgraded twice, and once purged and reinstalled koha.
20:28 archer121  yes I will.
20:29 cait1      hm sounds not like fun :(
20:30 rangi      hmm i haven't run into plack crashing, on any of the 50 or so sites i look after
20:30 rangi      is it OOMing?
20:30 archer121  restarting apache doesn't help; what the technicians do about it is restart the entire system.
20:30 archer121  nope
20:30 rangi      yeah restarting apache won't do anything to plack
20:30 archer121  4.2G free
20:30 rangi      i dont think it's anything to do with apache
20:30 archer121  why do you guys think it is plack?
20:31 rangi      because that is what the error is telling you
20:31 archer121  why do you guys think it has something to do with plack?
20:31 rangi      HTTP: attempt to connect to Unix domain socket /var/run/koha/nitc/plack.sock (localhost) failed
20:31 rangi      next time it happens run
20:32 archer121  I currently have it in the crashed state
20:32 rangi      sudo koha-plack --restart instancename
20:32 rangi      if that doesnt work
20:32 rangi      try --stop
20:32 rangi      then --start
20:34 archer121  alright, so plack was not running, so I could not restart or stop it (as it was not running)
20:34 rangi      but starting it worked?
20:35 archer121  half worked.
20:36 archer121  the 503 goes away, but it is as if the zebra indexing is not done, you know, like all searches on opac returning empty
20:36 rangi      restart zebra too then
20:37 rangi      sudo koha-restart-zebra instancename
20:38 archer121  yeah, that worked.
20:38 rangi      it really does feel like your machine OOMed at some point in the past and killed zebra and plack, but that is just a guess; you'd have to go back through the syslogs to check
20:38 rangi      really odd for both plack and zebra to have died
20:38 archer121  I don't think so, because OOMs should show up in dmesg
20:38 rangi      so yeah, i think you are going to need to do some forensics to find out what is killing those 2 things
20:39 rangi      bad ram, ooming, something else
20:39 rangi      id set up some monitoring with monit, or icinga2 or something
20:39 rangi      to monitor zebra and plack, see if you can pinpoint when it happens
20:40 rangi      but, at least restarting works, which is a zillion times better than rebooting
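A minimal sketch of the kind of check rangi suggests, assuming the plack socket path quoted in the Apache error above (/var/run/koha/nitc/plack.sock). The function name plack_socket_state is made up for illustration. It only tests that the socket file exists, which is a weaker signal than a real monit or icinga2 service check, since a stale socket file could outlive a dead process.

```shell
#!/bin/sh
# Hypothetical helper: report whether a Unix domain socket file exists at the given path.
# This does not prove the daemon is answering requests, only that the socket is present.
plack_socket_state() {
    if [ -S "$1" ]; then
        echo "up"
    else
        echo "down"
    fi
}

# Example: print a timestamped state line, e.g. from a cron job every minute,
# so the moment plack disappears can be narrowed down afterwards.
# The path is the one from the error message; the instance name (nitc) will differ elsewhere.
echo "$(date '+%Y-%m-%d %H:%M:%S') plack $(plack_socket_state /var/run/koha/nitc/plack.sock)"
```

Appending that output to a file gives a rough timeline to compare against syslog.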
20:40 archer121  I just found out when it happens from the plack-error.log: 2016/09/04-07:37:06 Server closing!
20:41 archer121  And at that time the library is closed.
20:41 rangi      yeah, so you will want to track down what is doing it, see if you can find out when zebra was turned off/crashed too
20:50 archer121  at the same time: zebra-error.log: 20160904 07:37:06 nitc-koha-zebra: client (pid 19241) killed by signal 15, stopping
20:53 rangi      yeah
20:53 rangi      something did that
20:54 rangi      thats not a crash
20:54 rangi      so you need to find what was running at 7.30ish
20:54 rangi      maybe logrotate
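The two log lines quoted above use different timestamp layouts (2016/09/04-07:37:06 in plack-error.log, 20160904 07:37:06 in zebra-error.log), which makes it awkward to line them up against syslog by eye. A small sketch that reduces both to the same digits-only form; normalize_ts is a hypothetical helper, and the two layouts are assumed only from the lines quoted in this conversation.

```shell
#!/bin/sh
# Hypothetical helper: turn a log line starting with either
#   YYYY/MM/DD-HH:MM:SS   (as seen in plack-error.log)
#   YYYYMMDD HH:MM:SS     (as seen in zebra-error.log)
# into YYYYMMDDHHMMSS. Lines that match neither layout are printed unchanged.
normalize_ts() {
    printf '%s\n' "$1" | sed -E 's#^([0-9]{4})/?([0-9]{2})/?([0-9]{2})[- ]([0-9]{2}):([0-9]{2}):([0-9]{2}).*#\1\2\3\4\5\6#'
}

normalize_ts '2016/09/04-07:37:06 Server closing!'
# prints 20160904073706
normalize_ts '20160904 07:37:06 nitc-koha-zebra: client (pid 19241) killed by signal 15, stopping'
# prints 20160904073706
```

Both events land on the same second, which fits something stopping the two services deliberately rather than two independent crashes.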
20:59 archer121  ah, I found something sweet in syslog!
20:59 archer121  cron.daily was executed at the same time.
20:59 rangi      yeah, so its most likely logrotate
21:00 rangi      that stops stuff, rotates the logs, and is supposed to restart it
21:00 archer121  here, take a look: https://paste.ubuntu.com/23134317/
21:02 rangi      look at the syslog before that
21:02 rangi      because syslog gets restarted as part of the logrotate too
21:03 cait1      archer121: i think what rangi is trying to tell you is that it doesn't crash
21:03 cait1      it's shut down intentionally
21:03 cait1      to do some system tasks - but it doesn't come back like it should
21:05 archer121  i see.
21:06 archer121  here is the syslog before the logrotate: https://paste.ubuntu.com/23134337/
21:07 archer121  I do not see anything useful in it, but my eyes are not that trained,
21:07 archer121  I now think that this crash occurs every sunday.
21:07 archer121  and sunday is a calendar holiday
21:09 rangi      yeah thats logrotate running
21:17 archer121  at this point the only thing that I can think of doing is to manually run all the commands in koha's cron.daily and see if it fails.
21:19 rangi      i guarantee it is logrotate doing it
21:19 archer121  okay, so now I am stuck. what should I do?
21:20 rangi      what version are you running ?
21:20 rangi      (of koha)
21:21 archer121  3.22.10
21:21 archer121  but the crash was there since 3.22.08, which was our first version
21:21 rangi      it's not a crash
21:22 archer121  Why is this not occurring daily if logrotate is doing it?
21:22 rangi      it will be a race condition
21:22 rangi      it'll be trying to start it again, while it is still stopping
21:23 rangi      so the start will fail, and it will continue stopping, and be stopped
21:24 rangi      if you look in /etc/logrotate.d/
21:24 rangi      there is a file koha-common
21:24 rangi      that is what tells it what to do
21:26 archer121  and this happens weekly!
21:27 rangi      right so logrotate probably tells it to rotate weekly
21:27 rangi      weekly
21:27 rangi      yep
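One way to check which rotation frequency the koha-common file asks for, sketched as a small function. daily, weekly, monthly, and yearly are standard logrotate directives; the function name rotation_freq is made up here. If the file sets no frequency of its own, logrotate falls back to the global default in /etc/logrotate.conf, so empty output does not mean "never".

```shell
#!/bin/sh
# Hypothetical helper: print every frequency directive that appears
# on a line of its own in the given logrotate config file.
rotation_freq() {
    sed -E -n 's/^[[:space:]]*(daily|weekly|monthly|yearly)[[:space:]]*$/\1/p' "$1"
}

# Only inspect the file if it is actually present on this machine.
if [ -r /etc/logrotate.d/koha-common ]; then
    rotation_freq /etc/logrotate.d/koha-common
fi
```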
21:27 archer121  and the fix?
21:27 wahanui    the fix is https://www.youtube.com/watch?v=Pg_ArW8lrl0
21:28 archer121  is that a bot?
21:28 wizzyrea   yes
21:28 wizzyrea   also hi
21:28 archer121  hi
21:28 rangi      https://lists.katipo.co.nz/public/koha/2016-July/045823.html
21:28 rangi      maybe try that
21:28 rangi      then check again next time it runs
21:29 rangi      we now know when that is going to be
21:29 rangi      7.30am sunday morning
21:30 archer121  great! thanks a lot!
21:30 archer121  but will this change get overwritten on every update of koha?
21:31 rangi      yep, but if it works, you can file a bug and say that is the fix, then it will go into koha
21:32 archer121  Will do that
21:37 archer122_ hey, I dot disconnected for a moment
21:37 archer122_ got*
21:38 cait1      nothing happened
21:38 archer122_ great
21:39 archer122_ I am planning to confirm whether this is the issue by manually triggering a logrotate right now, and if so file the bug report. is that okay, rangi?
21:41 rangi      yeah you probably didnt want to do that
21:42 archer122_ oops, why? I already did that. is it going to create any problems?
21:43 archer122_ and koha is functioning properly even after I did the logrotate
21:44 archer122_ (without any sleep)
21:47 archer122_ okay, so maybe zebra needs to be doing something in the background for this issue to be reproduced.
21:49 rangi      yeah, it won't be a real test unless its running for real
21:50 archer122_ okay, gotta hit bed. I have to attend classes tomorrow!
21:50 archer122_ thank you again for helping.
22:09 eythian    https://www.theguardian.com/us-news/2016/sep/03/borrowed-time-us-library-to-enforce-jail-sentences-for-overdue-books
22:15 rangi      yeah, what a horrible idea
22:33 Archer121  rangi: I got this mail from radek siman saying that he solved the same issue by replacing anacron with cron.
22:34 rangi      for some definition of the word solved
22:35 rangi      :)
22:36 Archer121  What do you mean?
22:36 rangi      it's not really something we can tell all users of koha to do
22:36 rangi      fixing the actual logrotate job is a better way to actually fix it
22:37 Archer121  I see. So how does changing from anacron to cron fix it?
22:37 rangi      who knows
22:37 Archer121  :-) bye!
22:37 rangi      thats why i call it not an actual fix
22:43 eythian    It would be interesting to see if having anacron installed causes the problem, but in theory it should be identical to just cron on a server.
23:54 * dcook    waves