2024-07-23
22:03 <jclark@cumin1002> START - Cookbook sre.dns.netbox [production]
21:53 <ladsgroup@cumin1002> dbctl commit (dc=all): 'db1203 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P66896 and previous config saved to /var/cache/conftool/dbconfig/20240723-215338-ladsgroup.json [production]
21:53 <ladsgroup@cumin1002> dbctl commit (dc=all): 'db1202 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P66895 and previous config saved to /var/cache/conftool/dbconfig/20240723-215309-ladsgroup.json [production]
21:52 <ladsgroup@cumin1002> dbctl commit (dc=all): 'db1195 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P66894 and previous config saved to /var/cache/conftool/dbconfig/20240723-215225-ladsgroup.json [production]
20:54 <tgr|away> UTC late deploys done [production]
20:53 <tgr@deploy1002> Finished scap: Backport for [[gerrit:1056204|Respect wgTranslateNumerals in Cite footnote markers (T370585)]], [[gerrit:1056205|Respect wgTranslateNumerals in Cite footnote markers (T370585)]] (duration: 09m 34s) [production]
20:48 <tgr@deploy1002> wmde-fisch, tgr: Continuing with sync [production]
20:46 <tgr@deploy1002> wmde-fisch, tgr: Backport for [[gerrit:1056204|Respect wgTranslateNumerals in Cite footnote markers (T370585)]], [[gerrit:1056205|Respect wgTranslateNumerals in Cite footnote markers (T370585)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
20:44 <tgr@deploy1002> Started scap sync-world: Backport for [[gerrit:1056204|Respect wgTranslateNumerals in Cite footnote markers (T370585)]], [[gerrit:1056205|Respect wgTranslateNumerals in Cite footnote markers (T370585)]] [production]
20:38 <ryankemper@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_eqiad [production]
20:38 <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_eqiad [production]
20:21 <tgr@deploy1002> Finished scap: Backport for [[gerrit:1030590|debug: Enable Special:WikimediaDebug (T350094)]] (duration: 09m 28s) [production]
20:16 <tgr@deploy1002> tgr: Continuing with sync [production]
20:14 <tgr@deploy1002> tgr: Backport for [[gerrit:1030590|debug: Enable Special:WikimediaDebug (T350094)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
20:12 <tgr@deploy1002> Started scap sync-world: Backport for [[gerrit:1030590|debug: Enable Special:WikimediaDebug (T350094)]] [production]
18:59 <milimetric@deploy1002> Finished deploy [airflow-dags/analytics@01e1952]: (no justification provided) (duration: 00m 30s) [production]
18:58 <milimetric@deploy1002> Started deploy [airflow-dags/analytics@01e1952]: (no justification provided) [production]
18:45 <mutante> puppetmaster1001/puppetmaster2001 - rm /var/run/confd-template/*.err to clear pybal icinga alerts after T367949 [production]
18:42 <mutante> puppetmaster1001/puppetmaster2001 - rm /var/run/confd-template/_srv_config-master_pybal_codfw_api-https.err to clear pybal icinga alerts after T367949 [production]
18:40 <pt1979@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephmon1004.eqiad.wmnet with OS bullseye [production]
18:14 <dduvall@deploy1002> rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.15 refs T366960 [production]
18:13 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsadm --delete-service --tcp-service 10.2.2.1:443' (appservers-https eqiad) - T367949 [production]
18:12 <aokoth@cumin1002> END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1001.eqiad.wmnet [production]
18:11 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsadm --delete-service --tcp-service 10.2.2.22:443' (api-https eqiad) - T367949 [production]
18:11 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsa [production]
18:10 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts1001.eqiad.wmnet [production]
18:10 <swfrench-wmf> sudo cumin 'A:lvs-secondary-codfw or A:lvs-low-traffic-codfw' 'ipvsa [production]
18:08 <swfrench-wmf> sudo cumin 'A:lvs-secondary-codfw or A:lvs-low-traffic-codfw' 'ipvsa [production]
18:01 <aokoth@cumin1002> END (FAIL) - Cookbook sre.vrts.upgrade (exit_code=99) on VRTS host vrts1001.eqiad.wmnet [production]
18:01 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts1001.eqiad.wmnet [production]
17:58 <swfrench-wmf> sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service' - T367949 [production]
17:51 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad' 'systemctl restart pybal.service' - T367949 [production]
17:46 <logmsgbot> nshahquinn-wmf@deploy1002 Finished deploy [airflow-dags/analytics_product@ebd9e13]: (no justification provided) (duration: 00m 07s) [production]
17:46 <logmsgbot> nshahquinn-wmf@deploy1002 Started deploy [airflow-dags/analytics_product@ebd9e13]: (no justification provided) [production]
17:44 <swfrench-wmf> sudo cumin 'A:lvs-low-traffic-codfw' 'systemctl restart pybal.service' - T367949 [production]
17:41 <sukhe@cumin1002> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2014.codfw.wmnet [production]
17:41 <sukhe@cumin1002> START - Cookbook sre.hosts.remove-downtime for lvs2014.codfw.wmnet [production]
17:40 <swfrench@cumin2002> END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-codfw (T367949) [production]
17:37 <pt1979@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephmon1004.eqiad.wmnet with OS bullseye [production]
17:33 <swfrench@cumin2002> START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-codfw (T367949) [production]
17:28 <swfrench-wmf> run-puppet-agent on O:lvs::balancer to pick up switch to service_setup, removal of profile::lvs::realserver::pools - T367949 [production]
17:17 <swfrench-wmf> run-puppet-agent on A:dnsbox to pick up switch to lvs_setup - T367949 [production]
17:06 <swfrench-wmf> ran authdns-update on dns1004 to pick up removal of appservers / api records - T367949 [production]
17:04 <dancy@deploy1002> sync-world aborted: testing (duration: 00m 51s) [production]
17:03 <dancy@deploy1002> Started scap sync-world: testing [production]
17:02 <pt1979@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephmon1004.eqiad.wmnet with OS bullseye [production]
16:59 <jhathaway> applying varnish change on cp4037, 1030591 [production]
16:58 <hnowlan@deploy1002> helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply [production]
16:57 <hnowlan@deploy1002> helmfile [eqiad] START helmfile.d/services/shellbox-video: apply [production]
16:16 <pt1979@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephmon1004.eqiad.wmnet with OS bullseye [production]