production SAL

2024-09-19
09:44	<jiji@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/ipoid: apply	[production]
09:44	<jiji@deploy1003>	helmfile [eqiad] START helmfile.d/services/ipoid: apply	[production]
09:42	<jiji@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/ipoid: apply	[production]
09:42	<jiji@deploy1003>	helmfile [eqiad] START helmfile.d/services/ipoid: apply	[production]
09:42	<jiji@deploy1003>	helmfile [codfw] DONE helmfile.d/services/ipoid: apply	[production]
09:41	<jiji@deploy1003>	helmfile [codfw] START helmfile.d/services/ipoid: apply	[production]
09:23	<arnaudb@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1246.eqiad.wmnet with OS bookworm	[production]
09:23	<btullis@cumin1002>	START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster	[production]
09:10	<arnaudb@cumin1002>	START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm	[production]
09:10	<arnaudb@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1246.eqiad.wmnet with OS bookworm	[production]
08:51	<arnaudb@cumin1002>	START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm	[production]
08:50	<arnaudb@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1246.eqiad.wmnet with OS bookworm	[production]
08:45	<hashar>	Restarting CI Jenkins with Java 17 # T359795	[production]
08:31	<arnaudb@cumin1002>	START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm	[production]
08:31	<_joe_>	deployed conftool 3.2.4 T375059 T373449	[production]
08:30	<ayounsi@cumin1002>	END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox	[production]
08:29	<ayounsi@cumin1002>	START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox	[production]
08:16	<jnuche@deploy1003>	rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.23 refs T373642	[production]
08:04	<ayounsi@cumin1002>	END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary	[production]
08:04	<ayounsi@cumin1002>	START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary	[production]
07:59	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2018.codfw.wmnet	[production]
07:48	<jmm@cumin2002>	START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2018.codfw.wmnet	[production]
07:43	<ayounsi@cumin1002>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16509	[production]
07:38	<ayounsi@cumin1002>	START - Cookbook sre.network.peering with action 'configure' for AS: 16509	[production]
07:35	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db2229 (re)pooling @ 100%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69318 and previous config saved to /var/cache/conftool/dbconfig/20240919-073543-arnaudb.json	[production]
07:20	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db2229 (re)pooling @ 75%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69317 and previous config saved to /var/cache/conftool/dbconfig/20240919-072037-arnaudb.json	[production]
07:19	<ayounsi@cumin1002>	START - Cookbook sre.network.peering with action 'configure' for AS: 16509	[production]
07:05	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db2229 (re)pooling @ 50%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69316 and previous config saved to /var/cache/conftool/dbconfig/20240919-070532-arnaudb.json	[production]
06:53	<moritzm>	adding Tiziano to pwstore	[production]
06:50	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db2229 (re)pooling @ 25%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69315 and previous config saved to /var/cache/conftool/dbconfig/20240919-065026-arnaudb.json	[production]
06:47	<moritzm>	cleanup some old Bacula restores (4G) on seaborgium	[production]
06:35	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db2229 (re)pooling @ 15%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69314 and previous config saved to /var/cache/conftool/dbconfig/20240919-063521-arnaudb.json	[production]
06:20	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db2229 (re)pooling @ 10%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69313 and previous config saved to /var/cache/conftool/dbconfig/20240919-062016-arnaudb.json	[production]
06:05	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db2229 (re)pooling @ 5%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69312 and previous config saved to /var/cache/conftool/dbconfig/20240919-060510-arnaudb.json	[production]
05:01	<eileen>	civicrm upgraded from ac29ff45 to 8af371aa	[production]
01:25	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
01:25	<pt1979@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns name for frack new switches - pt1979@cumin2002"	[production]
01:24	<pt1979@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns name for frack new switches - pt1979@cumin2002"	[production]
01:21	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
00:46	<sukhe>	sudo cumin 'puppetserver1003* or puppetserver2003*' 'systemctl start sync-puppet-volatile.service'	[production]
00:45	<sukhe>	sukhe@puppetserver1002:~$ sudo systemctl start sync-puppet-volatile.service	[production]
00:41	<swfrench-wmf>	force-reboot of puppetserver1001 via ipmitool (unresponsive for over 30m)	[production]
2024-09-18
22:43	<swfrench@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync	[production]
22:43	<swfrench@deploy1003>	helmfile [eqiad] START helmfile.d/services/eventstreams: sync	[production]
22:19	<jynus>	inserting without binlog missing heartbeat record on x1 codfw hosts	[production]
22:11	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from eqiad to codfw	[production]
21:55	<mutante>	seaborgium - apt-get clean (disk space before: 98% used, now: 76% used, was alerting)	[production]
20:59	<ladsgroup@cumin1002>	START - Cookbook sre.switchdc.databases.prepare for the switch from eqiad to codfw	[production]
20:45	<toyofuku@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1073836|Enable dark mode for all logged in users on all projects (T370099)]], [[gerrit:1073835|Deploy Vector 2022 on several Wikimedia wikis (T374255)]], [[gerrit:1073839|Limit quick surveys to wikis with messages defined (T374654)]] (duration: 12m 52s)	[production]
20:40	<toyofuku@deploy1003>	toyofuku, jdlrobson: Continuing with sync	[production]