2022-07-07
| 08:08 | <jnuche@deploy1002> | rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.19 refs T308072 | [production] |
| 07:31 | <marostegui> | dbmaint s3@eqiad T312286 | [production] |
| 07:29 | <marostegui> | dbmaint s7@eqiad T312286 | [production] |
| 07:29 | <marostegui> | dbmaint s2@eqiad T312286 | [production] |
| 07:28 | <marostegui> | dbmaint s6@eqiad T312286 | [production] |
| 07:27 | <apergos> | UTC morning backport and config training window closed | [production] |
| 07:23 | <marostegui> | dbmaint s3@eqiad T312287 | [production] |
| 07:20 | <marostegui> | dbmaint s6@eqiad T312287 | [production] |
| 07:19 | <marostegui> | dbmaint s7@eqiad T312287 | [production] |
| 07:19 | <marostegui> | dbmaint s2@eqiad T312287 | [production] |
| 07:14 | <kartik@deploy1002> | Synchronized php-1.39.0-wmf.19/extensions/ContentTranslation/modules/mw.cx.MachineTranslationManager.js: Backport: [[gerrit:811806|Update MT label for Flores (T311411)]] (duration: 03m 20s) | [production] |
| 07:11 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] |
| 07:10 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] |
| 07:10 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] |
| 07:09 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] |
| 07:07 | <kartik@deploy1002> | Synchronized php-1.39.0-wmf.18/extensions/ContentTranslation/modules/mw.cx.MachineTranslationManager.js: Backport: [[gerrit:811425|Update MT label for Flores (T311411)]] (duration: 03m 41s) | [production] |
| 07:07 | <moritzm> | drain ganeti1020 T308331 | [production] |
| 07:07 | <marostegui> | dbmaint s3@eqiad T312288 | [production] |
| 07:04 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] |
| 07:03 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] |
| 07:03 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] |
| 07:03 | <marostegui> | dbmaint s6@eqiad T312288 | [production] |
| 07:02 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] |
| 07:00 | <marostegui> | dbmaint s2@eqiad T312288 | [production] |
| 06:56 | <marostegui> | dbmaint s7@eqiad T312288 | [production] |
| 06:31 | <bking@cumin1001> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1003.wikimedia.org with OS bullseye | [production] |
| 06:21 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance | [production] |
| 06:21 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance | [production] |
| 06:07 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depool db1160 T311611', diff saved to https://phabricator.wikimedia.org/P30937 and previous config saved to /var/cache/conftool/dbconfig/20220707-060743-ladsgroup.json | [production] |
| 06:01 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Promote db1138 to s4 primary and set section read-write T311611', diff saved to https://phabricator.wikimedia.org/P30936 and previous config saved to /var/cache/conftool/dbconfig/20220707-060112-ladsgroup.json | [production] |
| 06:00 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - T311611', diff saved to https://phabricator.wikimedia.org/P30935 and previous config saved to /var/cache/conftool/dbconfig/20220707-060037-ladsgroup.json | [production] |
| 06:00 | <Amir1> | Starting s4 eqiad failover from db1160 to db1138 - T311611 | [production] |
| 05:35 | <bking@cumin1001> | START - Cookbook sre.hosts.reimage for host cloudelastic1003.wikimedia.org with OS bullseye | [production] |
| 05:14 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Set db1138 with weight 0 T311611', diff saved to https://phabricator.wikimedia.org/P30933 and previous config saved to /var/cache/conftool/dbconfig/20220707-051406-ladsgroup.json | [production] |
| 05:13 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 31 hosts with reason: Primary switchover s4 T311611 | [production] |
| 05:12 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1:00:00 on 31 hosts with reason: Primary switchover s4 T311611 | [production] |
| 01:09 | <mutante> | gitlab1004 - systemctl reset-failed, clear icinga alerts about rsync to decom'ed machine | [production] |
| 00:58 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] |
| 00:57 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] |
| 00:57 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] |
| 00:56 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] |
| 00:25 | <dzahn@cumin2002> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts gitlab1001.wikimedia.org | [production] |
| 00:25 | <dzahn@cumin2002> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] |
2022-07-06
| 23:50 | <ebernhardson@deploy1002> | Finished deploy [wikimedia/discovery/analytics@5082f17]: increase subgraph_mapping_weekly executor memory (duration: 02m 05s) | [production] |
| 23:48 | <ebernhardson@deploy1002> | Started deploy [wikimedia/discovery/analytics@5082f17]: increase subgraph_mapping_weekly executor memory | [production] |
| 23:30 | <dzahn@cumin2002> | START - Cookbook sre.dns.netbox | [production] |
| 23:25 | <dzahn@cumin2002> | START - Cookbook sre.hosts.decommission for hosts gitlab1001.wikimedia.org | [production] |
| 23:00 | <mutante> | gitlab1004 - rm /lib/systemd/system/rsync-config-backup-gitlab1001* T307142 | [production] |
| 22:52 | <mutante> | etherpad - deleted 2 pads that had leaked information | [production] |
| 22:52 | <ebernhardson> | restart airflow-webserver and airflow-scheduler for plugins update on an-airflow1001 | [production] |