2023-04-12
21:35 <urandom> restarting Cassandra —sessionstore1001— to reenable native transport — T327954 [production]
21:35 <brett@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2007.codfw.wmnet with reason: host reimage [production]
21:33 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2169:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46616 and previous config saved to /var/cache/conftool/dbconfig/20230412-213325-ladsgroup.json [production]
21:33 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance [production]
21:33 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance [production]
21:33 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46615 and previous config saved to /var/cache/conftool/dbconfig/20230412-213301-ladsgroup.json [production]
21:17 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P46614 and previous config saved to /var/cache/conftool/dbconfig/20230412-211755-ladsgroup.json [production]
21:16 <brett@cumin2002> START - Cookbook sre.hosts.reimage for host lvs2007.codfw.wmnet with OS bullseye [production]
21:04 <mutante> gerrit1001 - pushing data over to gerrit1003 via rsync, with bwlimit option: rsync -avp --bwlimit=1m /srv/gerrit/ rsync://gerrit1003.wikimedia.org/gerrit-data/ (T326368) [production]
21:02 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P46613 and previous config saved to /var/cache/conftool/dbconfig/20230412-210249-ladsgroup.json [production]
21:01 <brett@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host lvs2007.codfw.wmnet with OS bullseye [production]
21:01 <brett@cumin2002> START - Cookbook sre.hosts.reimage for host lvs2007.codfw.wmnet with OS bullseye [production]
20:58 <brett> Disable Puppet/PyBal on lvs2007 in preparation for reimaging - T321309 [production]
20:47 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46612 and previous config saved to /var/cache/conftool/dbconfig/20230412-204742-ladsgroup.json [production]
20:47 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye [production]
20:45 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46611 and previous config saved to /var/cache/conftool/dbconfig/20230412-204508-ladsgroup.json [production]
20:45 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance [production]
20:44 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance [production]
20:44 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46610 and previous config saved to /var/cache/conftool/dbconfig/20230412-204445-ladsgroup.json [production]
20:38 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye [production]
20:29 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P46609 and previous config saved to /var/cache/conftool/dbconfig/20230412-202939-ladsgroup.json [production]
20:15 <zabe@deploy2002> Finished scap: Backport for [[gerrit:907511|Drop unused VectorPageTools feature flag (T332090)]], [[gerrit:907539|Set Vector 2022 as default skin on Welsh Wikipedia (T334279)]] (duration: 10m 19s) [production]
20:14 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P46608 and previous config saved to /var/cache/conftool/dbconfig/20230412-201432-ladsgroup.json [production]
20:06 <zabe@deploy2002> zabe and jdlrobson: Backport for [[gerrit:907511|Drop unused VectorPageTools feature flag (T332090)]], [[gerrit:907539|Set Vector 2022 as default skin on Welsh Wikipedia (T334279)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet [production]
20:05 <zabe@deploy2002> Started scap: Backport for [[gerrit:907511|Drop unused VectorPageTools feature flag (T332090)]], [[gerrit:907539|Set Vector 2022 as default skin on Welsh Wikipedia (T334279)]] [production]
19:59 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46606 and previous config saved to /var/cache/conftool/dbconfig/20230412-195926-ladsgroup.json [production]
19:54 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46605 and previous config saved to /var/cache/conftool/dbconfig/20230412-195453-ladsgroup.json [production]
19:54 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance [production]
19:54 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance [production]
19:54 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance [production]
19:54 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance [production]
19:54 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2150 (T333332)', diff saved to https://phabricator.wikimedia.org/P46604 and previous config saved to /var/cache/conftool/dbconfig/20230412-195423-ladsgroup.json [production]
19:51 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye [production]
19:43 <zabe@deploy2002> Finished scap: Backport for [[gerrit:908292|Revert "Ensure ApiHelp correctly types values in TOCData objects"]], [[gerrit:908293|Revert "Ensure ApiHelp correctly types values in TOCData objects"]] (duration: 06m 40s) [production]
19:41 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye [production]
19:41 <otto@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
19:40 <otto@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
19:39 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P46603 and previous config saved to /var/cache/conftool/dbconfig/20230412-193917-ladsgroup.json [production]
19:38 <zabe@deploy2002> zabe: Backport for [[gerrit:908292|Revert "Ensure ApiHelp correctly types values in TOCData objects"]], [[gerrit:908293|Revert "Ensure ApiHelp correctly types values in TOCData objects"]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet [production]
19:37 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye [production]
19:37 <zabe@deploy2002> Started scap: Backport for [[gerrit:908292|Revert "Ensure ApiHelp correctly types values in TOCData objects"]], [[gerrit:908293|Revert "Ensure ApiHelp correctly types values in TOCData objects"]] [production]
19:37 <urandom> sessionstore1001: systemctl stop cassandra-a.service && systemctl start cassandra-a.service — T327954 [production]
19:36 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye [production]
19:35 <zabe@deploy2002> Sync cancelled. [production]
19:32 <zabe@deploy2002> jforrester and zabe: Backport for [[gerrit:908291|composer.json: Explicitly pin psr/http-message to 1.0.1 (T333993)]], [[gerrit:908290|Ensure ApiHelp correctly types values in TOCData objects (T334551)]], [[gerrit:908289|Ensure ApiHelp correctly types values in TOCData objects (T334551)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002. [production]
19:30 <zabe@deploy2002> Started scap: Backport for [[gerrit:908291|composer.json: Explicitly pin psr/http-message to 1.0.1 (T333993)]], [[gerrit:908290|Ensure ApiHelp correctly types values in TOCData objects (T334551)]], [[gerrit:908289|Ensure ApiHelp correctly types values in TOCData objects (T334551)]] [production]
19:28 <urandom> restart Cassandra —sessionstore1001— to disable native transport for testing — T327954 [production]
19:24 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P46602 and previous config saved to /var/cache/conftool/dbconfig/20230412-192411-ladsgroup.json [production]
19:17 <eevans@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on sessionstore1001.eqiad.wmnet with reason: Reproducing dissonant cluster state [production]
19:16 <eevans@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on sessionstore1001.eqiad.wmnet with reason: Reproducing dissonant cluster state [production]