2020-03-03
17:29 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:28 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:27 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:22 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
17:20 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:20 <hnowlan@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'changeprop' for release 'staging' . [production]
17:17 <hnowlan@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'changeprop' for release 'staging' . [production]
17:17 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:17 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:16 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:14 <otto@deploy1001> Started restart [changeprop/deploy@e2fe8ca]: Restart to pick up new LVS TLS port for eventgate T242224 [production]
17:14 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:13 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:06 <hnowlan@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . [production]
17:02 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:02 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:02 <hnowlan@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' . [production]
17:01 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
16:58 <vgutierrez> Re-enable BGP in lvs1013 - T245984 [production]
16:51 <bblack> lvs5003 - restart pybal, back to normal operations [production]
16:51 <hnowlan@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' . [production]
16:51 <krinkle@deploy1001> Synchronized multiversion/MWWikiversions.php: I9d658ff41b78 (duration: 01m 04s) [production]
16:50 <hnowlan@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'changeprop' for release 'staging' . [production]
16:49 <hnowlan@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'changeprop' for release 'staging' . [production]
16:49 <bblack> reload icinga config on icinga1001 [production]
16:48 <krinkle@deploy1001> Synchronized wmf-config/import.php: I9d658ff41b78 (duration: 01m 03s) [production]
16:47 <hnowlan@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . [production]
16:47 <otto@deploy1001> Started restart [restbase/deploy@bfdd342]: Restart to pick up new LVS TLS port for eventgate T242224 [production]
16:47 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
16:45 <hnowlan@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . [production]
16:44 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
16:35 <otto@deploy1001> Started restart [restbase/deploy@bfdd342] (dev-cluster): Restart (dev-cluster) to pick up new LVS TLS port for eventgate T242224 [production]
16:34 <krinkle@deploy1001> Synchronized multiversion/MWWikiversions.php: I8815be28d6a26a1 - T169821 (duration: 01m 04s) [production]
16:32 <vgutierrez> reimage lvs1013 with buster - T245984 [production]
16:28 <bblack> stopping pybal on lvs5003 to test the new icinga checks (will cause a BGP alert, among others) [production]
16:17 <Pchelolo> restart restbase on 2009 for T242224 [production]
16:14 <ottomata> switching restbase & change prop to new eventgate-main LVS TLS ports [production]
16:13 <vgutierrez> Re-enable BGP in lvs1014 - T245984 [production]
16:05 <vgutierrez> Starting pybal on lvs2009 - T246686 [production]
16:04 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1096:3315 and db1096:3316 after reimage to buster T246604', diff saved to https://phabricator.wikimedia.org/P10597 and previous config saved to /var/cache/conftool/dbconfig/20200303-160433-marostegui.json [production]
15:59 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:56 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:55 <liw@deploy1001> Finished scap: group0 to 1.35.0-wmf.22 (duration: 24m 29s) [production]
15:49 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1096:3315 and db1096:3316 after reimage to buster T246604', diff saved to https://phabricator.wikimedia.org/P10596 and previous config saved to /var/cache/conftool/dbconfig/20200303-154913-marostegui.json [production]
15:47 <hnowlan@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . [production]
15:45 <vgutierrez> Stopping pybal on lvs2009 to let lvs2010 get its traffic - T246686 [production]
15:45 <mutante> wtp1025 - scap pull as user cscott - testing sudo privs issue [production]
15:44 <vgutierrez> reimage lvs1014 with buster - T245984 [production]
15:43 <mutante> wtp1025 - scap pull [production]
15:35 <hnowlan@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . [production]