2019-10-24
18:55 <bblack@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:55 <Urbanecm> Morning SWAT done [production]
18:55 <bblack@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:46 <urandom> restbase cassandra rolling restart, rack 'b' -- T200803 [production]
18:44 <bblack@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
18:42 <bblack@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
18:42 <bblack@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:42 <bblack@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:31 <bblack> cr3-esams: add dns3001 to anycast4 neighbors [production]
18:30 <bblack> cr2-esams: add dns3001 to anycast4 neighbors [production]
18:29 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: 263fd0f: Enable Wikibase client access on commonswiki (T223792) (duration: 00m 52s) [production]
18:25 <urandom> restbase cassandra rolling restart, rack 'a' -- T200803 [production]
18:22 <robh> completing ps1-b6-eqiad setup, pdu will reboot twice, power output unaffected T227540 [production]
18:20 <robh> ps1-a6-eqiad setup complete, icinga errors should clear up T227142 [production]
18:15 <urbanecm@deploy1001> Synchronized wmf-config/: SWAT: 84c48df: rename service definition (T222851) (duration: 00m 53s) [production]
18:06 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: b20d6de: Reference Previews: full beta deployment (T235083) (duration: 00m 52s) [production]
18:03 <robh> setting ip info for ps1-a6-eqiad, it is rebooting. T227142 [production]
17:41 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:39 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:38 <ema> pool cp3059 (cache_upload) T233242 [production]
17:29 <bblack> asw2-esams - committing switch port/vlan config for new rack 14 hosts [production]
17:26 <mobrovac@deploy1001> Synchronized wmf-config/CommonSettings.php: Enable Parsoid/PHP in the whole wtp (a.k.a. Parsoid) cluster - T236388 (duration: 00m 53s) [production]
17:18 <bblack@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
17:15 <bblack@cumin1001> START - Cookbook sre.hosts.downtime [production]
16:54 <ema> depool cp3036 (cache_upload) T233242 [production]
16:39 <urandom> restarting cassandra, restbase2011 (canary for config changes) -- T200803 [production]
16:32 <urandom> restarting cassandra, restbase1016 (canary for config changes) -- T200803 [production]
16:28 <ema> depool cp3035 (cache_upload) T233242 [production]
16:07 <ema> pool cp3057 (cache_upload) T233242 [production]
15:51 <ema> depool cp3032 (cache_text) T233242 [production]
15:45 <ema> depool cp3034 (cache_upload) T233242 [production]
15:40 <ema> depool cp3030 (cache_text) T233242 [production]
15:27 <bblack> asw2-esams: configure port descriptions and vlan/lvs groupings for all rack16 hosts (lvs3007, ganeti3003, bast3004, cp3061-5) [production]
15:19 <ema> pool cp3058 (cache_text) T233242 [production]
15:18 <effie> Slowly reload apache across the fleet (as we are enabling puppet) - T229792 [production]
15:09 <effie> Remove hhvm packages and enable puppet across the fleet - T229792 [production]
15:09 <ema> pool cp3055 (cache_upload) T233242 [production]
15:04 <addshore@deploy1001> Synchronized wmf-config/InitialiseSettings.php: testcommonswiki, Enable Wikibase client access T223792 (duration: 00m 53s) [production]
15:00 <bblack> cr2-esams - add missing lvs3005 IP to bgp pybal neighbor list [production]
14:58 <bblack> cr3-esams - change fallback static route for high-traffic2 to lvs3006 [production]
14:58 <bblack> cr2-esams - change fallback static route for high-traffic2 to lvs3006 [production]
14:47 <effie> run puppet on all canaries and codfw - T229792 [production]
14:42 <ema@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
14:40 <effie> Remove hhvm hhvm-luasandbox hhvm-tidy hhvm-wikidiff2 hhvm-dbg from all canaries and codfw - T229792 [production]
14:40 <ema@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:26 <bblack> lvs3006 (upload, becoming active) - manual pybal med s/90/0/ (will take over from lvs3002, intended permanently). [production]
14:23 <bblack> lvs3006 (upload, inactive) - manual pybal med s/100/90/ (preferred to lvs3004 for fallback from lvs3002) [production]
14:22 <effie> enable puppet on mw app canaries [production]
14:16 <ema> power-cycle cp3056, stuck rebooting into d-i T233242 [production]
13:59 <ema> pool cp3060 T233242 [production]