2021-01-13
§
|
13:07 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'production' . |
[production] |
12:15 |
<dcausse> |
European mid-day backport window done |
[production] |
12:09 |
<dcausse@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: T239931: Revert "Disable sanity check cirrus jobs for Wikidata" (duration: 01m 16s) |
[production] |
11:52 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2029.codfw.wmnet with reason: REIMAGE |
[production] |
11:49 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1029.eqiad.wmnet with reason: REIMAGE |
[production] |
11:49 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2029.codfw.wmnet with reason: REIMAGE |
[production] |
11:47 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc1029.eqiad.wmnet with reason: REIMAGE |
[production] |
11:40 |
<kart_> |
Updated cxserver to 2021-01-12-095820-production (T234220, T270408) |
[production] |
11:37 |
<kartik@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
11:33 |
<kartik@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
11:23 |
<kartik@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' . |
[production] |
11:13 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: After restarting mysql', diff saved to https://phabricator.wikimedia.org/P13756 and previous config saved to /var/cache/conftool/dbconfig/20210113-111312-root.json |
[production] |
11:04 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove weight on es4 the master', diff saved to https://phabricator.wikimedia.org/P13755 and previous config saved to /var/cache/conftool/dbconfig/20210113-110419-marostegui.json |
[production] |
10:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1020 (re)pooling @ 75%: After restarting mysql', diff saved to https://phabricator.wikimedia.org/P13754 and previous config saved to /var/cache/conftool/dbconfig/20210113-105809-root.json |
[production] |
10:57 |
<volans> |
uploaded spicerack_0.0.47 to apt.wikimedia.org buster-wikimedia |
[production] |
10:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1020 (re)pooling @ 50%: After restarting mysql', diff saved to https://phabricator.wikimedia.org/P13753 and previous config saved to /var/cache/conftool/dbconfig/20210113-104305-root.json |
[production] |
10:35 |
<jbond42> |
puppet re-enabled on all cp-text hosts |
[production] |
10:28 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1020 (re)pooling @ 25%: After restarting mysql', diff saved to https://phabricator.wikimedia.org/P13751 and previous config saved to /var/cache/conftool/dbconfig/20210113-102802-root.json |
[production] |
10:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Reduce weight on es1021', diff saved to https://phabricator.wikimedia.org/P13750 and previous config saved to /var/cache/conftool/dbconfig/20210113-102245-marostegui.json |
[production] |
10:18 |
<jbond42> |
disable puppet on the cp::text to deploy block list changes 651174 + 651171 |
[production] |
10:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es1020', diff saved to https://phabricator.wikimedia.org/P13749 and previous config saved to /var/cache/conftool/dbconfig/20210113-101606-marostegui.json |
[production] |
10:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1020 (re)pooling @ 25%: After restarting mysql', diff saved to https://phabricator.wikimedia.org/P13748 and previous config saved to /var/cache/conftool/dbconfig/20210113-100253-root.json |
[production] |
09:59 |
<marostegui> |
Enable report_host on es1020 T271106 |
[production] |
09:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es1020', diff saved to https://phabricator.wikimedia.org/P13747 and previous config saved to /var/cache/conftool/dbconfig/20210113-095834-marostegui.json |
[production] |
09:49 |
<marostegui> |
Enable report_host on all codfw sby masters - T271106 |
[production] |
09:42 |
<godog> |
swift eqiad-prod: add weight to ms-be106[0-3] - T268435 |
[production] |
09:05 |
<ayounsi@deploy1001> |
Finished deploy [homer/deploy@723ebfe]: Netbox 2.9 changes (duration: 03m 11s) |
[production] |
09:03 |
<ryankemper@cumin1001> |
END (FAIL) - Cookbook sre.elasticsearch.rolling-restart (exit_code=99) |
[production] |
09:02 |
<ayounsi@deploy1001> |
Started deploy [homer/deploy@723ebfe]: Netbox 2.9 changes |
[production] |
09:02 |
<moritzm> |
installing efivar bugfix update |
[production] |
09:00 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
08:54 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
08:47 |
<moritzm> |
draining ganeti4003 for eventual reboot |
[production] |
08:46 |
<ema> |
cp5008: re-enable puppet to undo JIT tslua experiment T265625 |
[production] |
08:35 |
<moritzm> |
failover ganeti master in ulsfo to ganeti4002 |
[production] |
08:29 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
08:23 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
08:19 |
<moritzm> |
draining ganeti4002 for eventual reboot |
[production] |
08:17 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
08:13 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
08:04 |
<ryankemper> |
[WDQS Deploy] Deploy is complete, and the WDQS service is healthy |
[production] |
07:59 |
<moritzm> |
draining ganeti4001 for eventual reboot |
[production] |
07:29 |
<ryankemper> |
[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` |
[production] |
07:29 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` |
[production] |
07:28 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-updater` across all hosts simultaneously: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
07:28 |
<ryankemper@deploy1001> |
Finished deploy [wdqs/wdqs@fdd2c2f]: 0.3.59 (duration: 14m 23s) |
[production] |
07:15 |
<ryankemper> |
[WDQS Deploy] All tests passing on canary instance `wdqs1003` following canary deploy. Proceeding to rest of fleet... |
[production] |
07:13 |
<ryankemper@deploy1001> |
Started deploy [wdqs/wdqs@fdd2c2f]: 0.3.59 |
[production] |
07:13 |
<ryankemper> |
[WDQS Deploy] All tests passing on canary instance `wdqs1003` prior to start of deploy. Proceeding with canary deploy of version `0.3.59`... |
[production] |
07:04 |
<ryankemper> |
T266492 T268779 T265699 Restarting cloudelastic to apply new readahead changes; this will also verify that cloudelastic support works in our elasticsearch spicerack code. Only going one node at a time because cloudelastic elasticsearch indices only have 1 replica shard per index. |
[production] |