2017-12-11 §
11:11 <_joe_> restarting hhvm on mw1200, stuck in a kernel task [production]
11:08 <jdrewniak@tin> Synchronized portals: Wikimedia Portals Update: [[gerrit:397472|Bumping portals to master (T128546)]] (duration: 00m 45s) [production]
11:07 <jdrewniak@tin> Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:397472|Bumping portals to master (T128546)]] (duration: 00m 44s) [production]
10:49 <ema> cp4021: restart varnish-be due to mbox lag [production]
10:04 <godog> upgrade grafana to 4.6.2 on labmon1001 - T182294 [production]
10:00 <jynus> stopping dbstore2001:s5 and dbstore1002 (s5) mysql replication in sync [production]
09:28 <akosiaris> upload scap_3.7.4-1 to apt.wikimedia.org/jessie-wikimedia/main [production]
09:16 <gehel> cleaning old cassandra dumps on maps-test2001 servers [production]
09:15 <gehel> cleaning up old postgres logs on maps-test2001 [production]
09:05 <elukey> set notebook1002 as role::spare as prep step to reimage it to kafka1023 [production]
09:03 <jynus> dropping multiple leftover files from db1102 [production]
08:52 <marostegui> Stop replication in sync on db1034 and db1039 - T163190 [production]
08:12 <elukey> powercycle ganeti1008 - all vms stuck, console com2 showed a ton of printks without a clear indicator of the root cause [production]
07:49 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1034 - T182556 (duration: 00m 45s) [production]
07:44 <_joe_> restarting hhvm on mw1189,mw1229,mw1235,mw1282,mw1285,mw1315,mw1316, all stuck with a kernel hang [production]
06:59 <_joe_> restarted hhvm, nginx on mw1280, hanging kernel operations [production]
06:45 <marostegui> Deploy schema change on s2 db1060 with replication enabled, this will generate some lag on s2 on labs - T174569 [production]
06:45 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1060 - T174569 (duration: 00m 44s) [production]
06:22 <marostegui> Compress s6 on db1096 - T178359 [production]
06:21 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 to compress InnoDB there - T178359 (duration: 00m 45s) [production]
02:43 <l10nupdate@tin> scap sync-l10n completed (1.31.0-wmf.11) (duration: 09m 21s) [production]
2017-12-10 §
20:33 <elukey> execute restart-hhvm on mw1312 - hhvm stuck multiple times queueing requests [production]
20:01 <elukey> ran kafka preferred-replica-election for the kafka analytics cluster (1012->1022) to re-add kafka1012 to the kafka brokers acting as partition leaders (will spread the load in a better way) [production]
2017-12-09 §
17:00 <apergos> restarted hhvm on mw1276, the same old hang with the same old symptoms [production]
16:10 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (take 4!) (duration: 03m 01s) [production]
16:07 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (take 4!) [production]
16:02 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 05m 58s) [production]
15:56 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity [production]
15:55 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 00m 17s) [production]
15:55 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity [production]
15:53 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 00m 31s) [production]
15:53 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity [production]
15:53 <apergos> did same on scb1002,3,4 [production]
15:48 <awight> Making an emergency deployment to ORES logging config to reduce verbosity. [production]
15:45 <apergos> on scb1001 moved daemon.log out of the way, did "service rsyslog rotate", saved the last 5000 entries for use by ores team, removed the log [production]
11:44 <apergos> that server list: mw1278, 1277, 1226, 1234, 1230 [production]
11:42 <apergos> restarted hhvm on api servers after lockup [production]
11:19 <legoktm@tin> Synchronized wmf-config/InitialiseSettings.php: Disable ORES in fawiki - T182354 (duration: 00m 45s) [production]
00:11 <Jamesofur> removed 2FA from EVinente after verification T182373 [production]
2017-12-08 §
23:23 <hashar> force ran puppet on contint2001 [production]
22:15 <madhuvishy> Kicked off rsync of /data/xmldatadumps/public to labstore1006 & 7 [production]
22:05 <smalyshev@tin> Finished deploy [wdqs/wdqs@353b3cb]: temporary fix for T182464, better fix coming soon (duration: 05m 55s) [production]
21:59 <smalyshev@tin> Started deploy [wdqs/wdqs@353b3cb]: temporary fix for T182464, better fix coming soon [production]
20:22 <aaron@tin> Synchronized php-1.31.0-wmf.11/includes/Setup.php: a319c3e7ab61 - disable cpPosTime injection (duration: 00m 45s) [production]
18:00 <reedy@tin> Synchronized wmf-config/InitialiseSettings.php: Disable GlobalBlocking on fishbowl wikis (duration: 00m 45s) [production]
16:23 <urandom> starting cassandra, restbase1010 - T178177 [production]
16:22 <urandom> disabling smart path, restbase1010, arrays 'b'...'e' - T178177 [production]
16:20 <urandom> disabling smart path, restbase1010, array 'a' (canary) - T178177 [production]
16:15 <urandom> shutting down cassandra, restbase1010 - T178177 [production]
15:35 <marostegui> Fix dbstore1002 s5 replication [production]