2017-12-11
08:12 <elukey> powercycle ganeti1008 - all vms stuck, console com2 showed a ton of printks without a clear indicator of the root cause [production]
07:49 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1034 - T182556 (duration: 00m 45s) [production]
07:44 <_joe_> restarting hhvm on mw1189,mw1229,mw1235,mw1282,mw1285,mw1315,mw1316, all stuck with a kernel hang [production]
06:59 <_joe_> restarted hhvm, nginx on mw1280, hanging kernel operations [production]
06:45 <marostegui> Deploy schema change on s2 db1060 with replication enabled, this will generate some lag on s2 on labs - T174569 [production]
06:45 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1060 - T174569 (duration: 00m 44s) [production]
06:22 <marostegui> Compress s6 on db1096 - T178359 [production]
06:21 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 to compress InnoDB there - T178359 (duration: 00m 45s) [production]
02:43 <l10nupdate@tin> scap sync-l10n completed (1.31.0-wmf.11) (duration: 09m 21s) [production]
2017-12-10
20:33 <elukey> execute restart-hhvm on mw1312 - hhvm stuck multiple times queueing requests [production]
20:01 <elukey> ran kafka preferred-replica-election for the kafka analytics cluster (1012->1022) to re-add kafka1012 to the kafka brokers acting as partition leaders (will spread the load in a better way) [production]
2017-12-09
17:00 <apergos> restarted hhvm on mw1276, the same old hang with the same old symptoms [production]
16:10 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (take 4!) (duration: 03m 01s) [production]
16:07 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (take 4!) [production]
16:02 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 05m 58s) [production]
15:56 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity [production]
15:55 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 00m 17s) [production]
15:55 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity [production]
15:53 <awight@tin> Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 00m 31s) [production]
15:53 <awight@tin> Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity [production]
15:53 <apergos> did same on scb1002,3,4 [production]
15:48 <awight> Making an emergency deployment to ORES logging config to reduce verbosity. [production]
15:45 <apergos> on scb1001 moved daemon.log out of the way, did "service rsyslog rotate", saved the last 5000 entries for use by ores team, removed the log [production]
11:44 <apergos> that server list: mw1278, 1277, 1226, 1234, 1230 [production]
11:42 <apergos> restarted hhvm on api servers after lockup [production]
11:19 <legoktm@tin> Synchronized wmf-config/InitialiseSettings.php: Disable ORES in fawiki - T182354 (duration: 00m 45s) [production]
00:11 <Jamesofur> removed 2FA from EVinente after verification T182373 [production]
2017-12-08
23:23 <hashar> force ran puppet on contint2001 [production]
22:15 <madhuvishy> Kicked off rsync of /data/xmldatadumps/public to labstore1006 & 7 [production]
22:05 <smalyshev@tin> Finished deploy [wdqs/wdqs@353b3cb]: temporary fix for T182464, better fix coming soon (duration: 05m 55s) [production]
21:59 <smalyshev@tin> Started deploy [wdqs/wdqs@353b3cb]: temporary fix for T182464, better fix coming soon [production]
20:22 <aaron@tin> Synchronized php-1.31.0-wmf.11/includes/Setup.php: a319c3e7ab61 - disable cpPosTime injection (duration: 00m 45s) [production]
18:00 <reedy@tin> Synchronized wmf-config/InitialiseSettings.php: Disable GlobalBlocking on fishbowl wikis (duration: 00m 45s) [production]
16:23 <urandom> starting cassandra, restbase1010 - T178177 [production]
16:22 <urandom> disabling smart path, restbase1010, arrays 'b'...'e' - T178177 [production]
16:20 <urandom> disabling smart path, restbase1010, array 'a' (canary) - T178177 [production]
16:15 <urandom> shutting down cassandra, restbase1010 - T178177 [production]
15:35 <marostegui> Fix dbstore1002 s5 replication [production]
15:28 <gehel@tin> Finished deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 (duration: 00m 03s) [production]
15:28 <gehel@tin> Started deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 [production]
15:08 <gehel@tin> Finished deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 (duration: 02m 08s) [production]
15:06 <gehel@tin> Started deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 [production]
15:05 <gehel@tin> Finished deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 (duration: 00m 42s) [production]
15:05 <gehel@tin> Started deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 [production]
14:39 <gehel@tin> Finished deploy [tilerator/deploy@e52ea1d]: testing new tilerator packaging on maps-test2003 (duration: 02m 34s) [production]
14:36 <gehel@tin> Started deploy [tilerator/deploy@e52ea1d]: testing new tilerator packaging on maps-test2003 [production]
11:45 <elukey> updated prometheus-druid-exporter on druid* to 0.6 [production]
11:39 <elukey> upload prometheus-druid-exporter 0.6 to stretch/jessie wikimedia [production]
06:52 <marostegui> Fix broken replication on labsdb1004 [production]
06:43 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Fully pool db1099:3311 - T178359 (duration: 00m 55s) [production]