2021-02-05
11:08 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host ncredir1001.eqiad.wmnet [production]
11:06 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir1002.eqiad.wmnet [production]
10:56 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host ncredir1002.eqiad.wmnet [production]
10:53 <marostegui@cumin1001> dbctl commit (dc=all): 'db1075 (re)pooling @ 100%: Slowly pooling db1075 after cloning db1157', diff saved to https://phabricator.wikimedia.org/P14222 and previous config saved to /var/cache/conftool/dbconfig/20210205-105345-root.json [production]
10:38 <marostegui@cumin1001> dbctl commit (dc=all): 'db1075 (re)pooling @ 75%: Slowly pooling db1075 after cloning db1157', diff saved to https://phabricator.wikimedia.org/P14221 and previous config saved to /var/cache/conftool/dbconfig/20210205-103841-root.json [production]
10:32 <godog> swift codfw-prod decrease HDD weight for ms-be20[16-27] - T272837 [production]
10:27 <vgutierrez@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99) [production]
10:27 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
10:23 <marostegui@cumin1001> dbctl commit (dc=all): 'db1075 (re)pooling @ 50%: Slowly pooling db1075 after cloning db1157', diff saved to https://phabricator.wikimedia.org/P14220 and previous config saved to /var/cache/conftool/dbconfig/20210205-102338-root.json [production]
10:08 <marostegui@cumin1001> dbctl commit (dc=all): 'db1075 (re)pooling @ 25%: Slowly pooling db1075 after cloning db1157', diff saved to https://phabricator.wikimedia.org/P14219 and previous config saved to /var/cache/conftool/dbconfig/20210205-100834-root.json [production]
10:06 <gehel> repooling wdqs1013 - caught up on lag [production]
09:53 <marostegui@cumin1001> dbctl commit (dc=all): 'db1075 (re)pooling @ 10%: Slowly pooling db1075 after cloning db1157', diff saved to https://phabricator.wikimedia.org/P14218 and previous config saved to /var/cache/conftool/dbconfig/20210205-095331-root.json [production]
09:45 <dcausse> reloading categories from scratch on wdqs1010 [production]
09:38 <marostegui@cumin1001> dbctl commit (dc=all): 'db1075 (re)pooling @ 5%: Slowly pooling db1075 after cloning db1157', diff saved to https://phabricator.wikimedia.org/P14217 and previous config saved to /var/cache/conftool/dbconfig/20210205-093827-root.json [production]
08:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1094 T273710', diff saved to https://phabricator.wikimedia.org/P14214 and previous config saved to /var/cache/conftool/dbconfig/20210205-084625-marostegui.json [production]
08:29 <dcausse> reloading categories from scratch on wdqs1009 [production]
07:55 <gehel> cleanup of left over ttl dumps on wdqs1009 and wdqs1010 [production]
07:47 <gehel> depooling wdqs1013 and restarting blazegraph [production]
07:28 <oblivian@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
07:28 <oblivian@cumin1001> START - Cookbook sre.network.cf [production]
06:36 <marostegui> Stop MySQL on db1075 to clone db1157 T258361 [production]
06:35 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1075 T258361', diff saved to https://phabricator.wikimedia.org/P14212 and previous config saved to /var/cache/conftool/dbconfig/20210205-063554-marostegui.json [production]
03:42 <aaron@deploy1001> Synchronized wmf-config/mc.php: af5b0effb5e88ac4ca4a06c2c409d303ec405305 (duration: 01m 06s) [production]
03:34 <aaron@deploy1001> Synchronized php-1.36.0-wmf.27/includes/libs/rdbms: 4b386661a9820a002b43bfcef3e18241ea883870 (duration: 01m 12s) [production]
02:03 <Krinkle> krinkle@mwmaint1002 Prune globalimagelinks references on s4 database for the deleted ukwikimedia wiki, ref T218170. [production]
01:01 <ebernhardson@deploy1001> Finished deploy [wikimedia/discovery/analytics@85713c1]: restore data range specifier in extract job partition spec (duration: 01m 12s) [production]
00:59 <ebernhardson@deploy1001> Started deploy [wikimedia/discovery/analytics@85713c1]: restore data range specifier in extract job partition spec [production]
00:36 <legoktm@cumin1001> conftool action : set/pooled=no; selector: name=mw1278.eqiad.wmnet [production]
00:35 <legoktm> enabled remote IPMI access on mw1349.mgmt.eqiad.wmnet and mw1380.mgmt.eqiad.wmnet [production]
00:24 <ebernhardson@deploy1001> Finished deploy [wikimedia/discovery/analytics@9858513]: transfer_to_es: Wait for link reco, and write to weighted_tags as well (duration: 02m 43s) [production]
00:21 <ebernhardson@deploy1001> Started deploy [wikimedia/discovery/analytics@9858513]: transfer_to_es: Wait for link reco, and write to weighted_tags as well [production]
2021-02-04
23:59 <ebernhardson@deploy1001> Finished deploy [wikimedia/discovery/analytics@93bf374]: correct hql in ores_predictions_init_v3 (duration: 01m 06s) [production]
23:58 <ebernhardson@deploy1001> Started deploy [wikimedia/discovery/analytics@93bf374]: correct hql in ores_predictions_init_v3 [production]
23:26 <legoktm@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1278.eqiad.wmnet with reason: REIMAGE [production]
23:24 <legoktm@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1278.eqiad.wmnet with reason: REIMAGE [production]
23:05 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1397.eqiad.wmnet [production]
23:05 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1396.eqiad.wmnet [production]
23:02 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1396.eqiad.wmnet [production]
23:02 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1397.eqiad.wmnet [production]
23:01 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1311.eqiad.wmnet [production]
22:55 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1263.eqiad.wmnet [production]
22:39 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1263.eqiad.wmnet [production]
22:38 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1311.eqiad.wmnet [production]
22:38 <ebernhardson@deploy1001> Finished deploy [wikimedia/discovery/analytics@700cd49]: partition ores staging tables by data source (duration: 01m 19s) [production]
22:37 <ebernhardson@deploy1001> Started deploy [wikimedia/discovery/analytics@700cd49]: partition ores staging tables by data source [production]
22:31 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1396.eqiad.wmnet with reason: REIMAGE [production]
22:29 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1397.eqiad.wmnet with reason: REIMAGE [production]
22:28 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1396.eqiad.wmnet with reason: REIMAGE [production]
22:27 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1397.eqiad.wmnet with reason: REIMAGE [production]
21:59 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1399.eqiad.wmnet [production]