2013-12-02 §
02:08 <LocalisationUpdate> completed (1.23wmf4) at Mon Dec 2 02:08:15 UTC 2013 [production]
01:16 <springle> restarting labsdb1002 mysqld processes with 25% smaller buffer pools. kernel OOM killer striking. needs investigation [production]
2013-12-01 §
06:23 <apergos> sq37 hardware errors, probably controller, rt #6418 [production]
06:04 <apergos> powercycled sq37, was giving 'i/o error' to all commands [production]
02:36 <LocalisationUpdate> ResourceLoader cache refresh completed at Sun Dec 1 02:36:06 UTC 2013 [production]
02:13 <LocalisationUpdate> completed (1.23wmf5) at Sun Dec 1 02:13:36 UTC 2013 [production]
02:08 <LocalisationUpdate> completed (1.23wmf4) at Sun Dec 1 02:07:54 UTC 2013 [production]
2013-11-30 §
05:54 <paravoid> rebooting cp1052 with the same kernel (control) [production]
02:47 <LocalisationUpdate> ResourceLoader cache refresh completed at Sat Nov 30 02:47:22 UTC 2013 [production]
02:18 <LocalisationUpdate> completed (1.23wmf5) at Sat Nov 30 02:18:36 UTC 2013 [production]
02:10 <LocalisationUpdate> completed (1.23wmf4) at Sat Nov 30 02:10:13 UTC 2013 [production]
2013-11-29 §
21:09 <ori> synchronized wmf-config/InitialiseSettings.php 'Id9c7321b8: Add a MassMessage-related user group on Meta' [production]
21:08 <ori> updated /a/common to {{Gerrit|Id9c7321b8}}: Add a MassMessage-related user group on Meta [production]
20:18 <paravoid> rebooting cp1065 with new kernel [production]
19:26 <paravoid> "swapoff -a" on all cache_text to deal with strange kernel issue with kswapd dropping the whole page cache on memory pressure [production]
14:49 <paravoid> restarted gmond on ms-fe1001/2, both were stuck 6h ago and we lost all swift eqiad's metrics for that period [production]
11:12 <Reedy> Created EducationProgram tables on arwiki [production]
05:51 <Tim> on cp1052 and cp1053: tweaked /proc/sys/net/core/rmem_default to see if that fixes the observed massive gmond packet loss [production]
02:08 <LocalisationUpdate> ResourceLoader cache refresh completed at Fri Nov 29 02:07:55 UTC 2013 [production]
02:02 <LocalisationUpdate> completed (1.23wmf5) at Fri Nov 29 02:02:25 UTC 2013 [production]
02:01 <LocalisationUpdate> completed (1.23wmf4) at Fri Nov 29 02:01:44 UTC 2013 [production]
01:22 <springle> synchronized wmf-config/db-eqiad.php 'repool pc1001 after upgrade, max_connections lowered during warm up' [production]
00:25 <springle> synchronized wmf-config/db-eqiad.php 'depool pc1001 for package upgrade' [production]
2013-11-28 §
10:49 <apergos> turned off logging for parsoid ( https://gerrit.wikimedia.org/r/#/c/98082/ ), old logs remain in place for folks to examine [production]
10:06 <apergos> stack traces filling up parsoid nohup.out logs (several gigs in only a few minutes once the parsoid gets into that state), sample on wtp1010 in /var/lib/parsoid/nohup.out.errors [production]
08:34 <apergos> and wtp1023 [production]
08:29 <apergos> /var/lib/parsoid/nohup.out on wtp 1005,11,12 was 6gb or more, causing / on these boxes to fill; moved it, restarted parsoid, removed it [production]
07:16 <apergos> powercycled sq80 [production]
05:41 <ori> synchronized wmf-config/CommonSettings.php 'Icdaa4c1b5: Configure parser cache databases in db-$realm file (3/3)' [production]
05:41 <ori> synchronized wmf-config/db-pmtpa.php 'Icdaa4c1b5: Configure parser cache databases in db-$realm file (2/3)' [production]
05:40 <ori> synchronized wmf-config/db-eqiad.php 'Icdaa4c1b5: Configure parser cache databases in db-$realm file (1/3)' [production]
05:37 <ori> updated /a/common to {{Gerrit|Icdaa4c1b5}}: Configure parser cache databases in db-$realm file [production]
03:37 <springle> synchronized wmf-config/db-eqiad.php 'repool slaves after package upgrade, (lvm snapshot boxes only, LB=0)' [production]
03:16 <springle> synchronized wmf-config/db-eqiad.php 'depool slaves for package upgrade' [production]
02:43 <LocalisationUpdate> ResourceLoader cache refresh completed at Thu Nov 28 02:42:58 UTC 2013 [production]
02:29 <springle> synchronized wmf-config/db-eqiad.php 'slaves to full steam after package upgrade' [production]
02:15 <LocalisationUpdate> completed (1.23wmf5) at Thu Nov 28 02:15:36 UTC 2013 [production]
02:09 <LocalisationUpdate> completed (1.23wmf4) at Thu Nov 28 02:09:38 UTC 2013 [production]
01:17 <springle> synchronized wmf-config/db-eqiad.php 'warm up slaves after package upgrade' [production]
01:02 <ori-l> started rsync of graphite data (~400gb) from professor.pmtpa to tungsten.eqiad [production]
00:40 <springle> synchronized wmf-config/db-eqiad.php 'depool slaves for package upgrade' [production]
2013-11-27 §
19:50 <demon> synchronized wmf-config/InitialiseSettings.php 'Fixes for Flow config, no-op in prod' [production]
19:49 <demon> synchronized wmf-config/CommonSettings.php 'Fixes for Flow config, no-op in prod' [production]
18:12 <paravoid> kill -9 gdb on cp3012, attached to varnish frontend [production]
11:28 <ori-l> faidon switched gdash.wm.o from professor.pmtpa -> tungsten.eqiad behind misc-varnish & rebooted ssl1 in tampa [production]
11:11 <apergos> ssl1 rebooted itself about 15 mins ago, no idea why [production]
10:20 <ariel> synchronized wmf-config/db-eqiad.php 'db1019 (s3) back to full weight in the pool' [production]
10:19 <ariel> updated /a/common to {{Gerrit|If5ebd6194}}: db1019 (s3) back to full weight in pool [production]
10:08 <apergos> shot some old puppet processes hogging memory on db9 (from march and earlier) [production]
09:49 <apergos> there was no mount /srv/pagecounts on labstore4, so rsync to /exp/pagecounts wrote to and filled /; did the mkdir and now things seem ok [production]