2019-10-24
|
18:42 |
<bblack@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
18:42 |
<bblack@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
18:31 |
<bblack> |
cr3-esams: add dns3001 to anycast4 neighbors |
[production] |
18:30 |
<bblack> |
cr2-esams: add dns3001 to anycast4 neighbors |
[production] |
18:29 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: 263fd0f: Enable Wikibase client access on commonswiki (T223792) (duration: 00m 52s) |
[production] |
18:25 |
<urandom> |
restbase cassandra rolling restart, rack 'a' -- T200803 |
[production] |
18:22 |
<robh> |
completing ps1-b6-eqiad setup, pdu will reboot twice, power output unaffected T227540 |
[production] |
18:20 |
<robh> |
ps1-a6-eqiad setup complete, icinga errors should clear up T227142 |
[production] |
18:15 |
<urbanecm@deploy1001> |
Synchronized wmf-config/: SWAT: 84c48df: rename service definition (T222851) (duration: 00m 53s) |
[production] |
18:06 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: b20d6de: Reference Previews: full beta deployment (T235083) (duration: 00m 52s) |
[production] |
18:03 |
<robh> |
setting ip info for ps1-a6-eqiad, it is rebooting. T227142 |
[production] |
17:41 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
17:39 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
17:38 |
<ema> |
pool cp3059 (cache_upload) T233242 |
[production] |
17:29 |
<bblack> |
asw2-esams - committing switch port/vlan config for new rack 14 hosts |
[production] |
17:26 |
<mobrovac@deploy1001> |
Synchronized wmf-config/CommonSettings.php: Enable Parsoid/PHP in the whole wtp (a.k.a. Parsoid) cluster - T236388 (duration: 00m 53s) |
[production] |
17:18 |
<bblack@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
17:15 |
<bblack@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
16:54 |
<ema> |
depool cp3036 (cache_upload) T233242 |
[production] |
16:39 |
<urandom> |
restarting cassandra, restbase2011 (canary for config changes) -- T200803 |
[production] |
16:32 |
<urandom> |
restarting cassandra, restbase1016 (canary for config changes) -- T200803 |
[production] |
16:28 |
<ema> |
depool cp3035 (cache_upload) T233242 |
[production] |
16:07 |
<ema> |
pool cp3057 (cache_upload) T233242 |
[production] |
15:51 |
<ema> |
depool cp3032 (cache_text) T233242 |
[production] |
15:45 |
<ema> |
depool cp3034 (cache_upload) T233242 |
[production] |
15:40 |
<ema> |
depool cp3030 (cache_text) T233242 |
[production] |
15:27 |
<bblack> |
asw2-esams: configure port descriptions and vlan/lvs groupings for all rack16 hosts (lvs3007, ganeti3003, bast3004, cp3061-5) |
[production] |
15:19 |
<ema> |
pool cp3058 (cache_text) T233242 |
[production] |
15:18 |
<effie> |
Slowly reload apache across the fleet (as we are enabling puppet) - T229792 |
[production] |
15:09 |
<effie> |
Remove hhvm packages and enable puppet across the fleet - T229792 |
[production] |
15:09 |
<ema> |
pool cp3055 (cache_upload) T233242 |
[production] |
15:04 |
<addshore@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: testcommonswiki, Enable Wikibase client access T223792 (duration: 00m 53s) |
[production] |
15:00 |
<bblack> |
cr2-esams - add missing lvs3005 IP to bgp pybal neighbor list |
[production] |
14:58 |
<bblack> |
cr3-esams - change fallback static route for high-traffic2 to lvs3006 |
[production] |
14:58 |
<bblack> |
cr2-esams - change fallback static route for high-traffic2 to lvs3006 |
[production] |
14:47 |
<effie> |
run puppet on all canaries and codfw - T229792 |
[production] |
14:42 |
<ema@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
14:40 |
<effie> |
Remove hhvm hhvm-luasandbox hhvm-tidy hhvm-wikidiff2 hhvm-dbg from all canaries and codfw - T229792 |
[production] |
14:40 |
<ema@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:26 |
<bblack> |
lvs3006 (upload, becoming active) - manual pybal med s/90/0/ (will take over from lvs3002, intended permanently). |
[production] |
14:23 |
<bblack> |
lvs3006 (upload, inactive) - manual pybal med s/100/90/ (preferred to lvs3004 for fallback from lvs3002) |
[production] |
14:22 |
<effie> |
enable puppet on mw app canaries |
[production] |
14:16 |
<ema> |
power-cycle cp3056, stuck rebooting into d-i T233242 |
[production] |
13:59 |
<ema> |
pool cp3060 T233242 |
[production] |
13:36 |
<bblack> |
re-pooling esams in dns |
[production] |
13:34 |
<effie> |
enable puppet on mwdebug* |
[production] |
13:25 |
<XioNoX> |
enable transit4/6 on cr2-knams |
[production] |
13:24 |
<ema@puppetmaster1001> |
conftool action : set/weight=100; selector: service=varnish-be,name=cp30[56].* |
[production] |
13:24 |
<bblack@cumin1001> |
conftool action : set/weight=100; selector: name=cp30[56].*,service=varnish-be |
[production] |
13:23 |
<bblack@cumin1001> |
conftool action : set/weight=1; selector: name=cp30[56].*,cluster=cache_text,service=varnish-fe |
[production] |