2020-08-18
10:07 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2125 - host down T260670', diff saved to https://phabricator.wikimedia.org/P12288 and previous config saved to /var/cache/conftool/dbconfig/20200818-100718-marostegui.json [production]
09:45 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
09:43 <jiji@cumin1001> conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet [production]
09:40 <oblivian@cumin1001> conftool action : set/pooled=yes; selector: cluster=api_appserver,dc=codfw,name=mw214[234].* [production]
09:40 <oblivian@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97) [production]
09:34 <kart_> Update cxserver to 2020-08-17-090424-production (T259980) [production]
09:32 <kartik@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' . [production]
09:29 <kartik@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' . [production]
09:28 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
09:28 <oblivian@cumin1001> conftool action : set/pooled=yes; selector: cluster=api_appserver,dc=codfw,name=mw214[02].* [production]
09:26 <volans> upgraded spicerack to v0.0.39 on cumin hosts [production]
09:25 <kartik@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' . [production]
09:21 <volans> uploaded spicerack_0.0.39-1+deb10u1 to apt.wikimedia.org buster-wikimedia [production]
09:05 <hashar> Restarting CI Jenkins [production]
08:44 <vgutierrez> restart ats-tls on cp5006 [production]
08:24 <oblivian@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97) [production]
08:17 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
08:16 <oblivian@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99) [production]
08:10 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
08:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1089', diff saved to https://phabricator.wikimedia.org/P12284 and previous config saved to /var/cache/conftool/dbconfig/20200818-080256-marostegui.json [production]
07:58 <oblivian@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99) [production]
07:53 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
07:45 <godog> VictorOps ack'd incidents will re-trigger after 24h if not resolved - T259465 [production]
07:44 <oblivian@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=1) [production]
07:43 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P12283 and previous config saved to /var/cache/conftool/dbconfig/20200818-074325-marostegui.json [production]
07:42 <_joe_> performing rolling reboot of all codfw api servers [production]
07:38 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
07:23 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P12282 and previous config saved to /var/cache/conftool/dbconfig/20200818-072349-marostegui.json [production]
07:19 <oblivian@cumin1001> conftool action : set/pooled=yes; selector: name=mw213[5-9].codfw.wmnet [production]
07:16 <jynus> update rest of phabricator passwords T250361 [production]
07:11 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P12281 and previous config saved to /var/cache/conftool/dbconfig/20200818-071121-marostegui.json [production]
07:08 <oblivian@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99) [production]
07:07 <godog> prometheus eqiad: add 100G to prometheus/global [production]
07:01 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
07:01 <oblivian@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99) [production]
07:01 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
06:53 <twentyafterfour> phabricator maintenance successful [production]
06:48 <jynus> deploy another password change to phabricator service (potentially disruptive) T250361 [production]
06:41 <XioNoX> add cloudflare PNI IPs in eqiad - T259036 [production]
06:21 <jynus> deploy password change to phabricator service T146055 [production]
06:06 <oblivian@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
06:01 <oblivian@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
05:52 <_joe_> running puppet on mc1020 T260622 [production]
05:02 <twentyafterfour> phabricator appears to be fully functional [production]
05:01 <twentyafterfour> phabricator read-only ended [production]
05:00 <twentyafterfour> phabricator is now read-only [production]
05:00 <marostegui> Failover m3 (phabricator) database master from db1128 to db1132 - T259589 [production]
04:32 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1088', diff saved to https://phabricator.wikimedia.org/P12279 and previous config saved to /var/cache/conftool/dbconfig/20200818-043241-marostegui.json [production]
01:54 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1376.eqiad.wmnet [production]
01:38 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1343.eqiad.wmnet [production]