production SAL

2025-05-14
13:20	<Lucas_WMDE>	UTC afternoon backport+config window done	[production]
13:19	<lucaswerkmeister-wmde@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1139489|manage-dblist: Rename to manage-dblist.php (T392819)]] (duration: 12m 48s)	[production]
13:15	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1258 (re)pooling @ 40%: Repooling', diff saved to https://phabricator.wikimedia.org/P76148 and previous config saved to /var/cache/conftool/dbconfig/20250514-131510-root.json	[production]
13:13	<aqu@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply	[production]
13:13	<aqu@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply	[production]
13:12	<lucaswerkmeister-wmde@deploy1003>	lucaswerkmeister-wmde: Continuing with sync	[production]
13:11	<lucaswerkmeister-wmde@deploy1003>	lucaswerkmeister-wmde: Backport for [[gerrit:1139489|manage-dblist: Rename to manage-dblist.php (T392819)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
13:07	<godog>	correction, restart grafana-server on grafana1002	[production]
13:06	<lucaswerkmeister-wmde@deploy1003>	Started scap sync-world: Backport for [[gerrit:1139489|manage-dblist: Rename to manage-dblist.php (T392819)]]	[production]
13:05	<godog>	reboot grafana1002 - hard down	[production]
13:01	<stevemunene@cumin1002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1068.eqiad.wmnet	[production]
13:00	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1258 (re)pooling @ 30%: Repooling', diff saved to https://phabricator.wikimedia.org/P76146 and previous config saved to /var/cache/conftool/dbconfig/20250514-130004-root.json	[production]
12:55	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
12:54	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
12:44	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1258 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P76145 and previous config saved to /var/cache/conftool/dbconfig/20250514-124458-root.json	[production]
12:29	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1258 (re)pooling @ 20%: Repooling', diff saved to https://phabricator.wikimedia.org/P76144 and previous config saved to /var/cache/conftool/dbconfig/20250514-122952-root.json	[production]
12:28	<joal@deploy1003>	Finished deploy [analytics/refinery@9d620d0] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@9d620d06] (duration: 00m 46s)	[production]
12:28	<joal@deploy1003>	Started deploy [analytics/refinery@9d620d0] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@9d620d06]	[production]
12:27	<joal@deploy1003>	Finished deploy [analytics/refinery@9d620d0] (thin): Analytics webrequest migration THIN [analytics/refinery@9d620d06] (duration: 01m 35s)	[production]
12:26	<joal@deploy1003>	Started deploy [analytics/refinery@9d620d0] (thin): Analytics webrequest migration THIN [analytics/refinery@9d620d06]	[production]
12:25	<joal@deploy1003>	Finished deploy [analytics/refinery@9d620d0]: Regular analytics weekly train [analytics/refinery@9d620d06] (duration: 02m 17s)	[production]
12:23	<joal@deploy1003>	Started deploy [analytics/refinery@9d620d0]: Regular analytics weekly train [analytics/refinery@9d620d06]	[production]
12:14	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1258 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P76143 and previous config saved to /var/cache/conftool/dbconfig/20250514-121446-root.json	[production]
11:47	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Remove db2243 from s8 (T351820)', diff saved to https://phabricator.wikimedia.org/P76142 and previous config saved to /var/cache/conftool/dbconfig/20250514-114724-ladsgroup.json	[production]
11:47	<moritzm>	installing librabbitmq security updates	[production]
11:41	<ladsgroup@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1145844|Move production term store traffic to x3 (T351820)]] (duration: 20m 48s)	[production]
11:41	<stevemunene@cumin1002>	START - Cookbook sre.hosts.reboot-single for host an-worker1068.eqiad.wmnet	[production]
11:38	<jmm@cumin2002>	END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad	[production]
11:35	<ladsgroup@deploy1003>	ladsgroup: Continuing with sync	[production]
11:27	<ladsgroup@deploy1003>	ladsgroup: Backport for [[gerrit:1145844|Move production term store traffic to x3 (T351820)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
11:21	<ladsgroup@deploy1003>	Started scap sync-world: Backport for [[gerrit:1145844|Move production term store traffic to x3 (T351820)]]	[production]
11:17	<kcvelaga@deploy1003>	Finished deploy [airflow-dags/analytics_product@22aa307]: T393561 (duration: 01m 10s)	[production]
11:17	<kcvelaga@deploy1003>	Started deploy [airflow-dags/analytics_product@22aa307]: T393561	[production]
11:15	<jmm@cumin2002>	START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad	[production]
11:15	<mvolz@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/citoid: apply	[production]
11:14	<mvolz@deploy1003>	helmfile [eqiad] START helmfile.d/services/citoid: apply	[production]
11:13	<mvolz@deploy1003>	helmfile [codfw] DONE helmfile.d/services/citoid: apply	[production]
11:12	<mvolz@deploy1003>	helmfile [codfw] START helmfile.d/services/citoid: apply	[production]
11:12	<btullis@cumin1002>	END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling restart_daemons on A:cephosd	[production]
11:10	<mvolz@deploy1003>	helmfile [staging] DONE helmfile.d/services/citoid: apply	[production]
11:10	<mvolz@deploy1003>	helmfile [staging] START helmfile.d/services/citoid: apply	[production]
11:03	<dcausse@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
11:01	<dcausse@deploy1003>	helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply	[production]
10:50	<dcausse@deploy1003>	helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
10:49	<dcausse@deploy1003>	helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply	[production]
10:47	<btullis@cumin1002>	START - Cookbook sre.ceph.roll-restart-reboot-server rolling restart_daemons on A:cephosd	[production]
10:44	<dcausse@deploy1003>	helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
10:43	<dcausse@deploy1003>	helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply	[production]
10:41	<dcausse@deploy1003>	helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
10:41	<dcausse@deploy1003>	helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply	[production]