2021-11-26
|
16:11 |
<arnoldokoth> |
drain kubestage1002 node in prep for decommissioning |
[production] |
16:05 |
<arnoldokoth> |
drain kubestage1001 node in prep for decommissioning |
[production] |
15:46 |
<elukey> |
move /var/tmp/core/* to /srv/coredumps on ores1008 to free root space |
[production] |
14:30 |
<jelto@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' . |
[production] |
14:25 |
<jelto@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' . |
[production] |
14:21 |
<jelto@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . |
[production] |
13:48 |
<jelto@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . |
[production] |
13:46 |
<jelto@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . |
[production] |
13:25 |
<akosiaris@deploy1002> |
helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
13:25 |
<akosiaris@deploy1002> |
helmfile [staging-codfw] START helmfile.d/admin 'apply'. |
[production] |
12:21 |
<vgutierrez> |
restarting HAProxy on O:cache::upload_haproxy - T290005 |
[production] |
11:41 |
<akosiaris> |
T296303 cleanup weird state of calico-codfw cluster |
[production] |
11:41 |
<akosiaris@deploy1002> |
helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
11:41 |
<akosiaris@deploy1002> |
helmfile [staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
11:39 |
<akosiaris@deploy1002> |
helmfile [staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
11:25 |
<vgutierrez> |
restarting HAProxy on O:cache::(text|upload)_haproxy - T290005 |
[production] |
10:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repool after fixing users T296274', diff saved to https://phabricator.wikimedia.org/P17880 and previous config saved to /var/cache/conftool/dbconfig/20211126-102340-ladsgroup.json |
[production] |
10:17 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1111 (T296274)', diff saved to https://phabricator.wikimedia.org/P17879 and previous config saved to /var/cache/conftool/dbconfig/20211126-101714-ladsgroup.json |
[production] |
10:17 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1111.eqiad.wmnet with reason: Maintenance T296274 |
[production] |
10:17 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1111.eqiad.wmnet with reason: Maintenance T296274 |
[production] |
10:14 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repool after fixing users T296274', diff saved to https://phabricator.wikimedia.org/P17878 and previous config saved to /var/cache/conftool/dbconfig/20211126-101423-ladsgroup.json |
[production] |
10:05 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1177 (T296274)', diff saved to https://phabricator.wikimedia.org/P17877 and previous config saved to /var/cache/conftool/dbconfig/20211126-100547-ladsgroup.json |
[production] |
10:05 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance T296274 |
[production] |
10:05 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance T296274 |
[production] |
10:04 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
10:04 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
08:28 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17876 and previous config saved to /var/cache/conftool/dbconfig/20211126-082834-ladsgroup.json |
[production] |
08:13 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17875 and previous config saved to /var/cache/conftool/dbconfig/20211126-081329-ladsgroup.json |
[production] |
07:58 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17874 and previous config saved to /var/cache/conftool/dbconfig/20211126-075824-ladsgroup.json |
[production] |
07:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17873 and previous config saved to /var/cache/conftool/dbconfig/20211126-074320-ladsgroup.json |
[production] |
06:28 |
<Amir1> |
killing extensions/MachineVision/maintenance/fetchSuggestions.php in mwmaint |
[production] |
06:19 |
<Amir1> |
killing lingering process from mwmaint to depooled db (db1160) that was depooled nine hours ago |
[production] |
2021-11-25
|
20:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17872 and previous config saved to /var/cache/conftool/dbconfig/20211125-204357-ladsgroup.json |
[production] |
20:43 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
20:43 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
19:28 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
19:28 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
19:28 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1149 (T296143)', diff saved to https://phabricator.wikimedia.org/P17871 and previous config saved to /var/cache/conftool/dbconfig/20211125-192850-ladsgroup.json |
[production] |
19:13 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1149 (T296143)', diff saved to https://phabricator.wikimedia.org/P17870 and previous config saved to /var/cache/conftool/dbconfig/20211125-191345-ladsgroup.json |
[production] |
18:58 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1149 (T296143)', diff saved to https://phabricator.wikimedia.org/P17869 and previous config saved to /var/cache/conftool/dbconfig/20211125-185841-ladsgroup.json |
[production] |
18:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1149 (T296143)', diff saved to https://phabricator.wikimedia.org/P17868 and previous config saved to /var/cache/conftool/dbconfig/20211125-184336-ladsgroup.json |
[production] |
17:27 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1149 (T296143)', diff saved to https://phabricator.wikimedia.org/P17867 and previous config saved to /var/cache/conftool/dbconfig/20211125-172714-ladsgroup.json |
[production] |
17:27 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1149.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
17:27 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1149.eqiad.wmnet with reason: Maintenance T296143 |
[production] |
17:27 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1148 (T296143)', diff saved to https://phabricator.wikimedia.org/P17866 and previous config saved to /var/cache/conftool/dbconfig/20211125-172707-ladsgroup.json |
[production] |
17:12 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=true; selector: dnsdisc=inference |
[production] |