2022-08-22
§
|
13:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1166 (re)pooling @ 5%: Repooling after schema change', diff saved to https://phabricator.wikimedia.org/P32741 and previous config saved to /var/cache/conftool/dbconfig/20220822-133021-root.json |
[production] |
13:28 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P32740 and previous config saved to /var/cache/conftool/dbconfig/20220822-132808-root.json |
[production] |
13:25 |
<bking@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
13:25 |
<jayme@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubemaster1001.eqiad.wmnet |
[production] |
13:17 |
<jayme@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host kubemaster1001.eqiad.wmnet |
[production] |
13:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1023 (re)pooling @ 8%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32738 and previous config saved to /var/cache/conftool/dbconfig/20220822-131649-root.json |
[production] |
13:15 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
13:14 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
13:14 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
13:13 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
13:13 |
<ladsgroup@deploy1002> |
Synchronized php-1.39.0-wmf.25/includes: Backport: [[gerrit:825276|SiteStats: Make sure initSiteStats.php re-distribute values (T315693)]] (duration: 03m 32s) |
[production] |
13:09 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance |
[production] |
13:09 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance |
[production] |
13:07 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
13:07 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
13:07 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1179 (T312972)', diff saved to https://phabricator.wikimedia.org/P32737 and previous config saved to /var/cache/conftool/dbconfig/20220822-130732-marostegui.json |
[production] |
13:03 |
<jynus> |
disabled backup scheduling for backup1002, backup2002 T315864 |
[production] |
13:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1023 (re)pooling @ 5%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32735 and previous config saved to /var/cache/conftool/dbconfig/20220822-130144-root.json |
[production] |
12:52 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P32734 and previous config saved to /var/cache/conftool/dbconfig/20220822-125226-marostegui.json |
[production] |
12:52 |
<jayme@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubemaster2002.codfw.wmnet |
[production] |
12:48 |
<jayme@cumin1001> |
END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-worker-eqiad |
[production] |
12:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1023 (re)pooling @ 2%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32732 and previous config saved to /var/cache/conftool/dbconfig/20220822-124640-root.json |
[production] |
12:45 |
<jayme@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host kubemaster2002.codfw.wmnet |
[production] |
12:39 |
<jayme@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubemaster2001.codfw.wmnet |
[production] |
12:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P32731 and previous config saved to /var/cache/conftool/dbconfig/20220822-123720-marostegui.json |
[production] |
12:33 |
<jayme@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host kubemaster2001.codfw.wmnet |
[production] |
12:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1023 (re)pooling @ 1%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32730 and previous config saved to /var/cache/conftool/dbconfig/20220822-123135-root.json |
[production] |
12:26 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ldap-replica2006.wikimedia.org |
[production] |
12:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1179 (T312972)', diff saved to https://phabricator.wikimedia.org/P32729 and previous config saved to /var/cache/conftool/dbconfig/20220822-122214-marostegui.json |
[production] |
12:20 |
<jayme> |
kubernetes1016:~$ sudo systemctl reset-failed ifup@ens13.service - T273026 |
[production] |
12:20 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.reboot-vm for VM ldap-replica2006.wikimedia.org |
[production] |
12:20 |
<moritzm> |
fix up network config for ldap-replica2006 T273026 |
[production] |
12:17 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
12:16 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
12:16 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
12:16 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
12:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es1023 for reboot T315542', diff saved to https://phabricator.wikimedia.org/P32728 and previous config saved to /var/cache/conftool/dbconfig/20220822-121401-root.json |
[production] |
12:13 |
<marostegui@deploy1002> |
Synchronized wmf-config/db-production.php: Enable writes on es5 T315542 (duration: 03m 18s) |
[production] |
12:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote es1024 to es5 primary T315542', diff saved to https://phabricator.wikimedia.org/P32727 and previous config saved to /var/cache/conftool/dbconfig/20220822-120611-root.json |
[production] |
12:05 |
<marostegui> |
Starting es5 eqiad failover from es1023 to es1024 - T315542 |
[production] |
12:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set es1024 with weight 10 T315542', diff saved to https://phabricator.wikimedia.org/P32726 and previous config saved to /var/cache/conftool/dbconfig/20220822-120141-root.json |
[production] |
12:00 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
11:58 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
11:58 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
11:54 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
11:51 |
<marostegui@deploy1002> |
Synchronized wmf-config/db-production.php: Disable writes on es5 T315542 (duration: 03m 08s) |
[production] |
11:47 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Switchover es5 T315542 |
[production] |
11:47 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Switchover es5 T315542 |
[production] |
11:36 |
<moritzm> |
installing libdatetime-timezone-perl updates from SUA update |
[production] |
11:33 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32725 and previous config saved to /var/cache/conftool/dbconfig/20220822-113352-root.json |
[production] |