2021-02-22 §
11:26 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cloudvirt-wdqs[1001-1003].eqiad.wmnet with reason: Restarting cloudcanary instances [production]
11:26 <dcaro@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on cloudvirt-wdqs[1001-1003].eqiad.wmnet with reason: Restarting cloudcanary instances [production]
11:22 <godog> roll restart prometheus on cloudmetrics* [production]
11:21 <godog> roll restart prometheus on prometheus* [production]
11:12 <godog> restart prometheus on prometheus2004 to apply changes - T273278 [production]
11:10 <marostegui@cumin1001> dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Slowly repool db1166', diff saved to https://phabricator.wikimedia.org/P14433 and previous config saved to /var/cache/conftool/dbconfig/20210222-111032-root.json [production]
10:55 <marostegui@cumin1001> dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Slowly repool db1166', diff saved to https://phabricator.wikimedia.org/P14432 and previous config saved to /var/cache/conftool/dbconfig/20210222-105528-root.json [production]
10:49 <_joe_> removing stray old builds from compiler1003 [production]
10:40 <marostegui@cumin1001> dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: Slowly repool db1166', diff saved to https://phabricator.wikimedia.org/P14431 and previous config saved to /var/cache/conftool/dbconfig/20210222-104025-root.json [production]
10:36 <_joe_> manually removed the restbase-http ipvs entry from the load balancers [production]
10:30 <akosiaris@cumin1001> conftool action : set/pooled=false; selector: name=codfw,dnsdisc=sessionstore [production]
10:29 <akosiaris> depool sessionstore in codfw for sessionstore certificate refresh. T274564 [production]
10:25 <marostegui@cumin1001> dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Slowly repool db1166', diff saved to https://phabricator.wikimedia.org/P14430 and previous config saved to /var/cache/conftool/dbconfig/20210222-102521-root.json [production]
10:16 <_joe_> restarting pybal on lvs1015 to pick up restbase http removal [production]
10:12 <_joe_> restarting pybal on lvs1016 to pick up restbase http removal [production]
10:10 <marostegui@cumin1001> dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Slowly repool db1166', diff saved to https://phabricator.wikimedia.org/P14429 and previous config saved to /var/cache/conftool/dbconfig/20210222-101018-root.json [production]
10:06 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1166 for schema change', diff saved to https://phabricator.wikimedia.org/P14428 and previous config saved to /var/cache/conftool/dbconfig/20210222-100653-marostegui.json [production]
09:51 <_joe_> restarting low-traffic pybals in codfw to remove the restbase http endpoint [production]
09:35 <marostegui> Deploy schema change on s3 codfw master, there will be lag on s3 codfw - T273359 [production]
09:30 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1008.eqiad.wmnet [production]
09:20 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host stat1008.eqiad.wmnet [production]
09:08 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1005.eqiad.wmnet [production]
09:04 <moritzm> installing screen security updates on Buster [production]
09:00 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host stat1005.eqiad.wmnet [production]
08:40 <godog> swift codfw-prod: more weight to ms-be20[58-61] - T269337 [production]
08:39 <gehel> depool elastic2045 and ban from clusters - T275345 [production]
08:12 <urbanecm@deploy1001> Synchronized wmf-config/flaggedrevs.php: cea41a2f7736aa29dee8f10de4c0c17353ece963: fiwiki: Assign stablesettings to reviewers in IS.php rather than FR-specific file (T275017; 2/2) (duration: 00m 55s) [production]
08:11 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: cea41a2f7736aa29dee8f10de4c0c17353ece963: fiwiki: Assign stablesettings to reviewers in IS.php rather than FR-specific file (T275017; 1/2) (duration: 01m 08s) [production]
07:54 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove db1090* from dbctl T274333', diff saved to https://phabricator.wikimedia.org/P14426 and previous config saved to /var/cache/conftool/dbconfig/20210222-075437-marostegui.json [production]
07:38 <moritzm> installing openldap security updates on LDAP replicas [production]
07:29 <hashar> Restarting CI Jenkins to downgrade plugin # T271683 [production]
07:14 <hashar> Restarting CI Jenkins for plugin upgrade # T271683 [production]
07:11 <elukey> powercycle elastic2045 - com2 available, no ssh, no root login (hangs indefinitely), no prometheus metrics reported [production]
2021-02-21 §
16:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1162 - crashed', diff saved to https://phabricator.wikimedia.org/P14424 and previous config saved to /var/cache/conftool/dbconfig/20210221-160258-marostegui.json [production]
10:07 <ariel@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1008.eqiad.wmnet with reason: REIMAGE [production]
10:05 <ariel@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1008.eqiad.wmnet with reason: REIMAGE [production]
09:32 <ariel@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1008.eqiad.wmnet with reason: REIMAGE [production]
09:30 <ariel@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1008.eqiad.wmnet with reason: REIMAGE [production]
09:29 <ariel@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1002.eqiad.wmnet [production]
09:23 <ariel@cumin1001> START - Cookbook sre.hosts.reboot-single for host dumpsdata1002.eqiad.wmnet [production]
2021-02-20 §
00:17 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1317.eqiad.wmnet [production]
00:16 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1317.eqiad.wmnet [production]
00:15 <ebernhardson> start batch processing images through MachineVision fetchSuggestions.php for T274220 on mwmaint1002 [production]
00:15 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1333.eqiad.wmnet [production]
00:13 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1333.eqiad.wmnet [production]
00:13 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1339.eqiad.wmnet [production]
00:13 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1342.eqiad.wmnet [production]
00:11 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1342.eqiad.wmnet [production]
2021-02-19 §
23:09 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1339.eqiad.wmnet [production]
23:05 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1317.eqiad.wmnet with reason: REIMAGE [production]