2021-11-25
|
17:12 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1148 (T296143)', diff saved to https://phabricator.wikimedia.org/P17864 and previous config saved to /var/cache/conftool/dbconfig/20211125-171202-ladsgroup.json |
[production] |
16:57 |
<volans@deploy1002> |
Finished deploy [netbox/deploy@87a36a7]: Deploy v2.10.4-wmf6 (duration: 06m 59s) |
[production] |
16:56 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1148 (T296143)', diff saved to https://phabricator.wikimedia.org/P17863 and previous config saved to /var/cache/conftool/dbconfig/20211125-165657-ladsgroup.json |
[production] |
16:50 |
<volans@deploy1002> |
Started deploy [netbox/deploy@87a36a7]: Deploy v2.10.4-wmf6 |
[production] |
16:49 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1163', diff saved to https://phabricator.wikimedia.org/P17862 and previous config saved to /var/cache/conftool/dbconfig/20211125-164941-jynus.json |
[production] |
16:46 |
<volans@deploy1002> |
Finished deploy [netbox/deploy@87a36a7]: Test v2.10.4-wmf6 on netbox-next (duration: 01m 04s) |
[production] |
16:45 |
<volans@deploy1002> |
Started deploy [netbox/deploy@87a36a7]: Test v2.10.4-wmf6 on netbox-next |
[production] |
16:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1148 (T296143)', diff saved to https://phabricator.wikimedia.org/P17861 and previous config saved to /var/cache/conftool/dbconfig/20211125-164153-ladsgroup.json |
[production] |
16:18 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1163++', diff saved to https://phabricator.wikimedia.org/P17860 and previous config saved to /var/cache/conftool/dbconfig/20211125-161833-jynus.json |
[production] |
16:14 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1163+', diff saved to https://phabricator.wikimedia.org/P17859 and previous config saved to /var/cache/conftool/dbconfig/20211125-161404-jynus.json |
[production] |
16:10 |
<klausman> |
restarting pybal on lvs2009 T289835 |
[production] |
15:57 |
<vgutierrez> |
restarting pybal on lvs2010 - T289835 |
[production] |
15:55 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1163', diff saved to https://phabricator.wikimedia.org/P17856 and previous config saved to /var/cache/conftool/dbconfig/20211125-155538-jynus.json |
[production] |
15:47 |
<jynus> |
reenable gtid on db1163 |
[production] |
15:29 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1148 (T296143)', diff saved to https://phabricator.wikimedia.org/P17853 and previous config saved to /var/cache/conftool/dbconfig/20211125-152906-ladsgroup.json |
[production] |
15:29 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1148.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
15:29 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1148.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
15:28 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17852 and previous config saved to /var/cache/conftool/dbconfig/20211125-152858-ladsgroup.json |
[production] |
15:22 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ping1001.eqiad.wmnet |
[production] |
15:19 |
<klausman@cumin1001> |
conftool action : set/pooled=yes:weight=1; selector: cluster=ml_serve,service=kubesvc |
[production] |
15:13 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17851 and previous config saved to /var/cache/conftool/dbconfig/20211125-151354-ladsgroup.json |
[production] |
15:13 |
<ayounsi@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts ping1001.eqiad.wmnet |
[production] |
15:12 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ping3001.esams.wmnet |
[production] |
15:05 |
<ayounsi@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts ping3001.esams.wmnet |
[production] |
15:04 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ping2001.codfw.wmnet |
[production] |
14:58 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17850 and previous config saved to /var/cache/conftool/dbconfig/20211125-145849-ladsgroup.json |
[production] |
14:54 |
<ayounsi@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts ping2001.codfw.wmnet |
[production] |
14:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17849 and previous config saved to /var/cache/conftool/dbconfig/20211125-144344-ladsgroup.json |
[production] |
14:42 |
<XioNoX> |
Update ping redirect to point to new ping VMs - T295767 |
[production] |
14:25 |
<jayme> |
uncordoned kubestage1003.eqiad.wmnet kubestage1004.eqiad.wmnet - T293729 |
[production] |
14:17 |
<klausman@deploy1002> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . |
[production] |
14:16 |
<klausman@deploy1002> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . |
[production] |
14:12 |
<klausman@deploy1002> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . |
[production] |
13:40 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ping1002.eqiad.wmnet |
[production] |
13:32 |
<ayounsi@cumin1001> |
START - Cookbook sre.ganeti.makevm for new host ping1002.eqiad.wmnet |
[production] |
13:30 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ping2002.codfw.wmnet |
[production] |
13:28 |
<Amir1> |
killing lingering process from mwmaint to depooled db1147 |
[production] |
13:20 |
<ayounsi@cumin1001> |
START - Cookbook sre.ganeti.makevm for new host ping2002.codfw.wmnet |
[production] |
13:14 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ping3002.esams.wmnet |
[production] |
13:05 |
<ayounsi@cumin1001> |
START - Cookbook sre.ganeti.makevm for new host ping3002.esams.wmnet |
[production] |
12:27 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase202[1-3].codfw.wmnet: Restarting for certificate updates - hnowlan@cumin1001 |
[production] |
12:14 |
<arturo> |
update repo bullseye-wikimedia/thirdparty/ceph-octopus (T296175) |
[production] |
12:14 |
<jynus> |
disable temp. gtid on db1163 |
[production] |
12:11 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Temp. depool db1163 fully', diff saved to https://phabricator.wikimedia.org/P17847 and previous config saved to /var/cache/conftool/dbconfig/20211125-121138-jynus.json |
[production] |
12:04 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Reduce db1163 load even more', diff saved to https://phabricator.wikimedia.org/P17846 and previous config saved to /var/cache/conftool/dbconfig/20211125-120435-jynus.json |
[production] |
11:56 |
<hnowlan@cumin1001> |
START - Cookbook sre.cassandra.roll-restart for nodes matching restbase202[1-3].codfw.wmnet: Restarting for certificate updates - hnowlan@cumin1001 |
[production] |
11:56 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Reduce db1163 load', diff saved to https://phabricator.wikimedia.org/P17845 and previous config saved to /var/cache/conftool/dbconfig/20211125-115602-jynus.json |
[production] |
11:04 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17844 and previous config saved to /var/cache/conftool/dbconfig/20211125-110443-ladsgroup.json |
[production] |
11:04 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1147.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
11:04 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1147.eqiad.wmnet with reason: Maintenance T296143 |
[production] |