2024-02-07
19:34 <wmbot~superpes@tools-sgebastion-10> Restarted StewardBot not feeding on IRC [tools.stewardbots]
19:32 <joal@deploy2002> Started deploy [analytics/refinery@80b329b]: Analytics Hotfix [analytics/refinery@80b329b5] [production]
19:30 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1202 (T355609)', diff saved to https://phabricator.wikimedia.org/P56470 and previous config saved to /var/cache/conftool/dbconfig/20240207-193016-marostegui.json [production]
19:30 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance [production]
19:29 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance [production]
19:29 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1194 (T355609)', diff saved to https://phabricator.wikimedia.org/P56469 and previous config saved to /var/cache/conftool/dbconfig/20240207-192953-marostegui.json [production]
19:19 <mutante> people1004 systemctl stop confd; running puppet; checking to remove confd remnants from people* hosts - T356296 [production]
19:14 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P56468 and previous config saved to /var/cache/conftool/dbconfig/20240207-191446-marostegui.json [production]
19:01 <brennen> train 1.42.0-wmf.17 (T354435): a couple of blockers currently, waiting on resolution before rolling [production]
18:59 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P56467 and previous config saved to /var/cache/conftool/dbconfig/20240207-185940-marostegui.json [production]
18:49 <btullis@cumin1002> END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons. [production]
18:44 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1194 (T355609)', diff saved to https://phabricator.wikimedia.org/P56466 and previous config saved to /var/cache/conftool/dbconfig/20240207-184433-marostegui.json [production]
18:42 <wmbot~bd808@tools-sgebastion-11> Restarted webservice after report of down/hang via irc [tools.mix-n-match]
18:39 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1194 (T355609)', diff saved to https://phabricator.wikimedia.org/P56465 and previous config saved to /var/cache/conftool/dbconfig/20240207-183912-marostegui.json [production]
18:39 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance [production]
18:38 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance [production]
18:38 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56464 and previous config saved to /var/cache/conftool/dbconfig/20240207-183849-marostegui.json [production]
18:30 <btullis@cumin1002> START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons. [production]
18:25 <btullis@cumin1002> END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad [production]
18:23 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P56463 and previous config saved to /var/cache/conftool/dbconfig/20240207-182342-marostegui.json [production]
18:08 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P56462 and previous config saved to /var/cache/conftool/dbconfig/20240207-180835-marostegui.json [production]
18:00 <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.reboot for all workers [tools]
17:59 <wmbot~lucaswerkmeister@tools-sgebastion-10> started webservice again (and patched the startup probe into it); took a while to come up but now it seems to be working [tools.lexeme-forms]
17:58 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-9 [tools]
17:58 <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-9 [tools]
17:53 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56461 and previous config saved to /var/cache/conftool/dbconfig/20240207-175328-marostegui.json [production]
17:52 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw [production]
17:52 <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw [production]
17:49 <wmbot~lucaswerkmeister@tools-sgebastion-10> stopped webservice, restart wasn’t working so let’s try harder [tools.lexeme-forms]
17:48 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56460 and previous config saved to /var/cache/conftool/dbconfig/20240207-174807-marostegui.json [production]
17:48 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance [production]
17:47 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance [production]
17:47 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1174 (T355609)', diff saved to https://phabricator.wikimedia.org/P56459 and previous config saved to /var/cache/conftool/dbconfig/20240207-174745-marostegui.json [production]
17:45 <wmbot~lucaswerkmeister@tools-sgebastion-10> restarted webservice, log was full of various errors [tools.lexeme-forms]
17:32 <jgiannelos@deploy2002> Finished deploy [restbase/deploy@1007273]: Disabling storage for jawiki (duration: 07m 19s) [production]
17:32 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P56458 and previous config saved to /var/cache/conftool/dbconfig/20240207-173238-marostegui.json [production]
17:26 <btullis> roll-restarting kafka-jumbo for T356382 [analytics]
17:26 <btullis@cumin1002> START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad [production]
17:25 <jgiannelos@deploy2002> Started deploy [restbase/deploy@1007273]: Disabling storage for jawiki [production]
17:24 <taavi@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=99) for all workers [tools]
17:23 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.reboot for all workers [tools]
17:17 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P56457 and previous config saved to /var/cache/conftool/dbconfig/20240207-171732-marostegui.json [production]
17:11 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes:weight=10; selector: service=thumbor [production]
17:05 <taavi@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=99) for all workers [tools]
17:05 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.reboot for all workers [tools]
17:04 <sbailey@deploy2002> helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply [production]
17:04 <sbailey@deploy2002> helmfile [codfw] START helmfile.d/services/wikifeeds: apply [production]
17:03 <sbailey@deploy2002> helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply [production]
17:03 <taavi@cloudcumin1001> END (ERROR) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=97) for all workers [tools]
17:03 <sbailey@deploy2002> helmfile [eqiad] START helmfile.d/services/wikifeeds: apply [production]