2024-03-07
22:17 |
<rzl@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2124.codfw.wmnet with reason: index corruption |
[production] |
22:16 |
<rzl@cumin2002> |
START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on db2124.codfw.wmnet with reason: index corruption |
[production] |
22:10 |
<rzl@cumin2002> |
dbctl commit (dc=all): 'Depool db2124', diff saved to https://phabricator.wikimedia.org/P58659 and previous config saved to /var/cache/conftool/dbconfig/20240307-221056-rzl.json |
[production] |
22:08 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'db2156 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P58658 and previous config saved to /var/cache/conftool/dbconfig/20240307-220824-ladsgroup.json |
[production] |
21:53 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'db2156 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P58657 and previous config saved to /var/cache/conftool/dbconfig/20240307-215319-ladsgroup.json |
[production] |
21:38 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'db2156 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P58656 and previous config saved to /var/cache/conftool/dbconfig/20240307-213814-ladsgroup.json |
[production] |
21:19 |
<brennen@deploy2002> |
Finished scap: Backport for [[gerrit:1009337|Fixes: Less_Exception_Compiler (T359414 T357740)]] (duration: 14m 41s) |
[production] |
21:09 |
<brennen@deploy2002> |
brennen and jdlrobson: Continuing with sync |
[production] |
21:07 |
<brennen@deploy2002> |
brennen and jdlrobson: Backport for [[gerrit:1009337|Fixes: Less_Exception_Compiler (T359414 T357740)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
21:04 |
<brennen@deploy2002> |
Started scap: Backport for [[gerrit:1009337|Fixes: Less_Exception_Compiler (T359414 T357740)]] |
[production] |
20:50 |
<dancy@deploy2002> |
Finished deploy [cassandra/logstash-logback-encoder@c200e79]: (no justification provided) (duration: 00m 35s) |
[production] |
20:50 |
<dancy@deploy2002> |
Started deploy [cassandra/logstash-logback-encoder@c200e79]: (no justification provided) |
[production] |
20:49 |
<dancy@deploy2002> |
Finished deploy [cassandra/logstash-logback-encoder@162f72f]: (no justification provided) (duration: 00m 56s) |
[production] |
20:49 |
<dancy@deploy2002> |
Started deploy [cassandra/logstash-logback-encoder@162f72f]: (no justification provided) |
[production] |
18:49 |
<btullis> |
running a wikidata dump manually on snapshot1009 for partitions 25,27 |
[production] |
18:22 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on wdqs[1022-1025].eqiad.wmnet with reason: T337013 |
[production] |
18:22 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on wdqs[1022-1025].eqiad.wmnet with reason: T337013 |
[production] |
18:19 |
<bearloga@deploy2002> |
Finished deploy [airflow-dags/analytics_product@15edf4a]: (no justification provided) (duration: 00m 08s) |
[production] |
18:19 |
<bearloga@deploy2002> |
Started deploy [airflow-dags/analytics_product@15edf4a]: (no justification provided) |
[production] |
17:43 |
<cwhite> |
set aside WAL for prometheus@k8s in codfw and restart - T354399 |
[production] |
17:28 |
<cwhite> |
set aside WAL for prometheus@k8s in eqiad and restart - T354399 |
[production] |
17:25 |
<dancy@deploy2002> |
Finished scap: testing T358117 (duration: 11m 15s) |
[production] |
17:22 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P58654 and previous config saved to /var/cache/conftool/dbconfig/20240307-172227-ladsgroup.json |
[production] |
17:14 |
<dancy@deploy2002> |
Started scap: testing T358117 |
[production] |
17:07 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P58653 and previous config saved to /var/cache/conftool/dbconfig/20240307-170720-ladsgroup.json |
[production] |
16:52 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P58652 and previous config saved to /var/cache/conftool/dbconfig/20240307-165213-ladsgroup.json |
[production] |
16:48 |
<cgoubert@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply |
[production] |
16:47 |
<cgoubert@deploy2002> |
helmfile [codfw] START helmfile.d/services/mw-parsoid: apply |
[production] |
16:47 |
<cgoubert@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply |
[production] |
16:47 |
<cgoubert@deploy2002> |
helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply |
[production] |
16:44 |
<dancy@deploy2002> |
Installation of scap version "4.70.0" completed for 373 hosts |
[production] |
16:43 |
<dancy@deploy2002> |
Installing scap version "4.70.0" for 373 hosts |
[production] |
16:38 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host dbprov2006.codfw.wmnet with OS bullseye |
[production] |
16:38 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye |
[production] |
16:37 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P58651 and previous config saved to /var/cache/conftool/dbconfig/20240307-163706-ladsgroup.json |
[production] |
16:29 |
<cdanis> |
T343529 ✔ cdanis@prometheus2005.codfw.wmnet ~ 🕦☕sudo systemctl restart thanos-sidecar@k8s.service |
[production] |
16:20 |
<jnuche@deploy2002> |
rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.21 refs T354439 |
[production] |
16:19 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2112.codfw.wmnet with reason: Maintenance |
[production] |
16:19 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2112.codfw.wmnet with reason: Maintenance |
[production] |
16:19 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: Maintenance |
[production] |
16:19 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: Maintenance |
[production] |
16:18 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: Maintenance |
[production] |
16:18 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: Maintenance |
[production] |
16:18 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: Maintenance |
[production] |
16:18 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: Maintenance |
[production] |
16:17 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P58650 and previous config saved to /var/cache/conftool/dbconfig/20240307-161720-arnaudb.json |
[production] |
16:06 |
<claime> |
bouncing prometheus@k8s.service - T343529 |
[production] |
16:04 |
<dzahn@cumin2002> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts etherpad2001.codfw.wmnet |
[production] |
16:04 |
<dzahn@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:02 |
<dzahn@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |