2024-07-17
07:50 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
07:49 <elukey> restart hadoop-mapreduce-historyserver.service on an-master1003 - failed for Java OOM [production]
07:49 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
07:38 <elukey@cumin1002> END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-d1-codfw [production]
07:37 <jayme> imported helm3 3.11.3 to bullseye-wikimedia and buster-wikimedia [production]
07:36 <elukey@cumin1002> START - Cookbook sre.network.tls for network device lsw1-d1-codfw [production]
06:48 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'clear' for AS: 17072 [production]
06:48 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'clear' for AS: 17072 [production]
05:36 <marostegui> Deploy schema change on s7 eqiad db1181 dbmaint T367856 [production]
05:35 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Long schema change [production]
05:35 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Long schema change [production]
05:34 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db1181 T370121', diff saved to https://phabricator.wikimedia.org/P66703 and previous config saved to /var/cache/conftool/dbconfig/20240717-053359-marostegui.json [production]
05:33 <marostegui@cumin1002> dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T370121', diff saved to https://phabricator.wikimedia.org/P66702 and previous config saved to /var/cache/conftool/dbconfig/20240717-053302-root.json [production]
05:32 <marostegui@cumin1002> dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T370121', diff saved to https://phabricator.wikimedia.org/P66701 and previous config saved to /var/cache/conftool/dbconfig/20240717-053230-root.json [production]
05:32 <marostegui> Starting s7 eqiad failover from db1181 to db1236 - T370121 [production]
05:14 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T370121 [production]
05:14 <marostegui@cumin1002> dbctl commit (dc=all): 'Set db1236 with weight 0 T370121', diff saved to https://phabricator.wikimedia.org/P66700 and previous config saved to /var/cache/conftool/dbconfig/20240717-051419-root.json [production]
05:14 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s7 T370121 [production]
02:56 <eileen> civicrm upgraded from 4f919c1e to 1ac3e7be [production]
00:42 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_eqiad [production]
00:42 <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_eqiad [production]
2024-07-16
23:33 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2209 (T367781)', diff saved to https://phabricator.wikimedia.org/P66699 and previous config saved to /var/cache/conftool/dbconfig/20240716-233336-arnaudb.json [production]
23:25 <cstone> civicrm upgraded from 8dbcdfb7 to 4f919c1e [production]
23:18 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P66698 and previous config saved to /var/cache/conftool/dbconfig/20240716-231829-arnaudb.json [production]
23:04 <eileen> config revision changed from a1ed167f to 85336766 [production]
23:03 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P66697 and previous config saved to /var/cache/conftool/dbconfig/20240716-230322-arnaudb.json [production]
22:48 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2209 (T367781)', diff saved to https://phabricator.wikimedia.org/P66696 and previous config saved to /var/cache/conftool/dbconfig/20240716-224815-arnaudb.json [production]
22:40 <tzatziki> removing 9 files for legal compliance [production]
22:37 <eileen> civicrm upgraded from 3287ced0 to 8dbcdfb7 [production]
22:26 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2209 (T367781)', diff saved to https://phabricator.wikimedia.org/P66695 and previous config saved to /var/cache/conftool/dbconfig/20240716-222638-arnaudb.json [production]
22:26 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance [production]
22:26 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance [production]
22:26 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2194 (T367781)', diff saved to https://phabricator.wikimedia.org/P66694 and previous config saved to /var/cache/conftool/dbconfig/20240716-222616-arnaudb.json [production]
22:11 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P66693 and previous config saved to /var/cache/conftool/dbconfig/20240716-221109-arnaudb.json [production]
21:59 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy2008.codfw.wmnet with OS bookworm [production]
21:56 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P66692 and previous config saved to /var/cache/conftool/dbconfig/20240716-215601-arnaudb.json [production]
21:40 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2194 (T367781)', diff saved to https://phabricator.wikimedia.org/P66691 and previous config saved to /var/cache/conftool/dbconfig/20240716-214054-arnaudb.json [production]
21:19 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2194 (T367781)', diff saved to https://phabricator.wikimedia.org/P66690 and previous config saved to /var/cache/conftool/dbconfig/20240716-211914-arnaudb.json [production]
21:19 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2194.codfw.wmnet with reason: Maintenance [production]
21:18 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2194.codfw.wmnet with reason: Maintenance [production]
21:18 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2190 (T367781)', diff saved to https://phabricator.wikimedia.org/P66689 and previous config saved to /var/cache/conftool/dbconfig/20240716-211852-arnaudb.json [production]
21:03 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P66688 and previous config saved to /var/cache/conftool/dbconfig/20240716-210345-arnaudb.json [production]
20:54 <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:1050083|[July 16th] Enable dark mode for logged out users (tier 1) (T367150)]] (duration: 08m 43s) [production]
20:49 <urbanecm@deploy1002> urbanecm, jdlrobson: Continuing with sync [production]
20:48 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P66687 and previous config saved to /var/cache/conftool/dbconfig/20240716-204838-arnaudb.json [production]
20:48 <urbanecm@deploy1002> urbanecm, jdlrobson: Backport for [[gerrit:1050083|[July 16th] Enable dark mode for logged out users (tier 1) (T367150)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
20:45 <urbanecm@deploy1002> Started scap sync-world: Backport for [[gerrit:1050083|[July 16th] Enable dark mode for logged out users (tier 1) (T367150)]] [production]
20:39 <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:1054558|Ensure every test-config has valid defaults]], [[gerrit:1054553|Merge partial config with defaults (T368606)]], [[gerrit:1054554|Merge partial config with defaults (T368606)]] (duration: 09m 55s) [production]
20:38 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host dbproxy2008.codfw.wmnet with OS bookworm [production]
20:34 <urbanecm@deploy1002> urbanecm, migr: Continuing with sync [production]