2022-04-21
13:02 |
<vgutierrez> |
restart ats-be and varnish-fe on cp2036 to clear restarted service alerts |
[production] |
12:55 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host thumbor2005.codfw.wmnet |
[production] |
12:55 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2004.codfw.wmnet |
[production] |
12:48 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P25960 and previous config saved to /var/cache/conftool/dbconfig/20220421-124852-ladsgroup.json |
[production] |
12:45 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host thumbor2004.codfw.wmnet |
[production] |
12:44 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2003.codfw.wmnet |
[production] |
12:34 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host thumbor2003.codfw.wmnet |
[production] |
12:33 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1110 (T298565)', diff saved to https://phabricator.wikimedia.org/P25959 and previous config saved to /var/cache/conftool/dbconfig/20220421-123347-ladsgroup.json |
[production] |
12:30 |
<moritzm> |
installing fribidi security updates |
[production] |
12:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 100%: After maintenance', diff saved to https://phabricator.wikimedia.org/P25958 and previous config saved to /var/cache/conftool/dbconfig/20220421-122859-root.json |
[production] |
12:27 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1110 (T298565)', diff saved to https://phabricator.wikimedia.org/P25957 and previous config saved to /var/cache/conftool/dbconfig/20220421-122722-ladsgroup.json |
[production] |
12:27 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance |
[production] |
12:27 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance |
[production] |
12:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161 (T298565)', diff saved to https://phabricator.wikimedia.org/P25956 and previous config saved to /var/cache/conftool/dbconfig/20220421-122627-ladsgroup.json |
[production] |
12:25 |
<moritzm> |
installing flac security updates |
[production] |
12:24 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance |
[production] |
12:24 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance |
[production] |
12:24 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance |
[production] |
12:23 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance |
[production] |
12:20 |
<moritzm> |
installing openjpeg2 security updates |
[production] |
12:13 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 75%: After maintenance', diff saved to https://phabricator.wikimedia.org/P25955 and previous config saved to /var/cache/conftool/dbconfig/20220421-121355-root.json |
[production] |
12:11 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P25954 and previous config saved to /var/cache/conftool/dbconfig/20220421-121122-ladsgroup.json |
[production] |
12:10 |
<moritzm> |
installing subversion security updates |
[production] |
11:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 50%: After maintenance', diff saved to https://phabricator.wikimedia.org/P25953 and previous config saved to /var/cache/conftool/dbconfig/20220421-115851-root.json |
[production] |
11:56 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P25952 and previous config saved to /var/cache/conftool/dbconfig/20220421-115617-ladsgroup.json |
[production] |
11:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 25%: After maintenance', diff saved to https://phabricator.wikimedia.org/P25951 and previous config saved to /var/cache/conftool/dbconfig/20220421-114347-root.json |
[production] |
11:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161 (T298565)', diff saved to https://phabricator.wikimedia.org/P25950 and previous config saved to /var/cache/conftool/dbconfig/20220421-114112-ladsgroup.json |
[production] |
11:39 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance |
[production] |
11:39 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance |
[production] |
11:35 |
<moritzm> |
installing zlib security updates on stretch (buster/bullseye already fixed) |
[production] |
11:34 |
<kart_> |
Updated cxserver to 2022-04-21-081331-production (T287655, T304855, T304862, T304866, T305115) |
[production] |
11:30 |
<kartik@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/cxserver: apply |
[production] |
11:29 |
<kartik@deploy1002> |
helmfile [eqiad] START helmfile.d/services/cxserver: apply |
[production] |
11:28 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 10%: After maintenance', diff saved to https://phabricator.wikimedia.org/P25949 and previous config saved to /var/cache/conftool/dbconfig/20220421-112843-root.json |
[production] |
11:28 |
<kartik@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/cxserver: apply |
[production] |
11:27 |
<kartik@deploy1002> |
helmfile [codfw] START helmfile.d/services/cxserver: apply |
[production] |
11:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1109 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P25948 and previous config saved to /var/cache/conftool/dbconfig/20220421-112648-root.json |
[production] |
11:23 |
<kartik@deploy1002> |
helmfile [staging] DONE helmfile.d/services/cxserver: apply |
[production] |
11:22 |
<kartik@deploy1002> |
helmfile [staging] START helmfile.d/services/cxserver: apply |
[production] |
11:14 |
<jynus@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2004.codfw.wmnet with OS bullseye |
[production] |
11:13 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 5%: After maintenance', diff saved to https://phabricator.wikimedia.org/P25947 and previous config saved to /var/cache/conftool/dbconfig/20220421-111340-root.json |
[production] |
11:13 |
<marostegui> |
dbmaint s2@codfw T306604 |
[production] |
11:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1109 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P25946 and previous config saved to /var/cache/conftool/dbconfig/20220421-111144-root.json |
[production] |
11:05 |
<jynus@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1002.eqiad.wmnet with OS bullseye |
[production] |
10:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 1%: After maintenance', diff saved to https://phabricator.wikimedia.org/P25945 and previous config saved to /var/cache/conftool/dbconfig/20220421-105835-root.json |
[production] |
10:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1109 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P25944 and previous config saved to /var/cache/conftool/dbconfig/20220421-105638-root.json |
[production] |
10:55 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
10:55 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
10:54 |
<jynus@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2004.codfw.wmnet with reason: host reimage |
[production] |
10:52 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping3002.esams.wmnet |
[production] |