2024-07-17
|
08:58 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P66705 and previous config saved to /var/cache/conftool/dbconfig/20240717-085857-root.json |
[production] |
08:57 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4037.ulsfo.wmnet |
[production] |
08:48 |
<elukey@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host cp4037.ulsfo.wmnet |
[production] |
08:47 |
<elukey@puppetserver1001> |
conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet |
[production] |
08:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P66704 and previous config saved to /var/cache/conftool/dbconfig/20240717-084351-root.json |
[production] |
08:06 |
<kartik@deploy1002> |
Finished scap: Backport for [[gerrit:1054699|TranslatablePageState: Check if banner namespaces are configured (T370219)]] (duration: 14m 26s) |
[production] |
08:00 |
<kartik@deploy1002> |
abi, kartik: Continuing with sync |
[production] |
07:54 |
<kartik@deploy1002> |
abi, kartik: Backport for [[gerrit:1054699|TranslatablePageState: Check if banner namespaces are configured (T370219)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
07:51 |
<kartik@deploy1002> |
Started scap sync-world: Backport for [[gerrit:1054699|TranslatablePageState: Check if banner namespaces are configured (T370219)]] |
[production] |
07:50 |
<jayme@deploy1002> |
helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
07:50 |
<jayme@deploy1002> |
helmfile [staging-eqiad] START helmfile.d/admin 'apply'. |
[production] |
07:50 |
<jayme@deploy1002> |
helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
07:49 |
<elukey> |
restart hadoop-mapreduce-historyserver.service on an-master1003 - failed for Java OOM |
[production] |
07:49 |
<jayme@deploy1002> |
helmfile [staging-codfw] START helmfile.d/admin 'apply'. |
[production] |
07:38 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-d1-codfw |
[production] |
07:37 |
<jayme> |
imported helm3 3.11.3 to bullseye-wikimedia and buster-wikimedia |
[production] |
07:36 |
<elukey@cumin1002> |
START - Cookbook sre.network.tls for network device lsw1-d1-codfw |
[production] |
06:48 |
<ayounsi@cumin1002> |
END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'clear' for AS: 17072 |
[production] |
06:48 |
<ayounsi@cumin1002> |
START - Cookbook sre.network.peering with action 'clear' for AS: 17072 |
[production] |
05:36 |
<marostegui> |
Deploy schema change on s7 eqiad db1181 dbmaint T367856 |
[production] |
05:35 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Long schema change |
[production] |
05:35 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Long schema change |
[production] |
05:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db1181 T370121', diff saved to https://phabricator.wikimedia.org/P66703 and previous config saved to /var/cache/conftool/dbconfig/20240717-053359-marostegui.json |
[production] |
05:33 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T370121', diff saved to https://phabricator.wikimedia.org/P66702 and previous config saved to /var/cache/conftool/dbconfig/20240717-053302-root.json |
[production] |
05:32 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T370121', diff saved to https://phabricator.wikimedia.org/P66701 and previous config saved to /var/cache/conftool/dbconfig/20240717-053230-root.json |
[production] |
05:32 |
<marostegui> |
Starting s7 eqiad failover from db1181 to db1236 - T370121 |
[production] |
05:14 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T370121 |
[production] |
05:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set db1236 with weight 0 T370121', diff saved to https://phabricator.wikimedia.org/P66700 and previous config saved to /var/cache/conftool/dbconfig/20240717-051419-root.json |
[production] |
05:14 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s7 T370121 |
[production] |
02:56 |
<eileen> |
civicrm upgraded from 4f919c1e to 1ac3e7be |
[production] |
00:42 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_eqiad |
[production] |
00:42 |
<bking@cumin2002> |
START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_eqiad |
[production] |
2024-07-16
|
23:33 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2209 (T367781)', diff saved to https://phabricator.wikimedia.org/P66699 and previous config saved to /var/cache/conftool/dbconfig/20240716-233336-arnaudb.json |
[production] |
23:25 |
<cstone> |
civicrm upgraded from 8dbcdfb7 to 4f919c1e |
[production] |
23:18 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P66698 and previous config saved to /var/cache/conftool/dbconfig/20240716-231829-arnaudb.json |
[production] |
23:04 |
<eileen> |
config revision changed from a1ed167f to 85336766 |
[production] |
23:03 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P66697 and previous config saved to /var/cache/conftool/dbconfig/20240716-230322-arnaudb.json |
[production] |
22:48 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2209 (T367781)', diff saved to https://phabricator.wikimedia.org/P66696 and previous config saved to /var/cache/conftool/dbconfig/20240716-224815-arnaudb.json |
[production] |
22:40 |
<tzatziki> |
removing 9 files for legal compliance |
[production] |
22:37 |
<eileen> |
* civicrm upgraded from 3287ced0 to 8dbcdfb7 |
[production] |
22:26 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depooling db2209 (T367781)', diff saved to https://phabricator.wikimedia.org/P66695 and previous config saved to /var/cache/conftool/dbconfig/20240716-222638-arnaudb.json |
[production] |
22:26 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance |
[production] |
22:26 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance |
[production] |
22:26 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2194 (T367781)', diff saved to https://phabricator.wikimedia.org/P66694 and previous config saved to /var/cache/conftool/dbconfig/20240716-222616-arnaudb.json |
[production] |
22:11 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P66693 and previous config saved to /var/cache/conftool/dbconfig/20240716-221109-arnaudb.json |
[production] |
21:59 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy2008.codfw.wmnet with OS bookworm |
[production] |
21:56 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P66692 and previous config saved to /var/cache/conftool/dbconfig/20240716-215601-arnaudb.json |
[production] |
21:40 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2194 (T367781)', diff saved to https://phabricator.wikimedia.org/P66691 and previous config saved to /var/cache/conftool/dbconfig/20240716-214054-arnaudb.json |
[production] |
21:19 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depooling db2194 (T367781)', diff saved to https://phabricator.wikimedia.org/P66690 and previous config saved to /var/cache/conftool/dbconfig/20240716-211914-arnaudb.json |
[production] |
21:19 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2194.codfw.wmnet with reason: Maintenance |
[production] |