2021-04-28
11:06 <dcaro> All Ceph server-side daemons upgraded to Octopus! \o/ (T280641) [admin]
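  A quick way to confirm the upgrade fully landed (a sketch, run from any ceph admin host):
    ceph versions   # tallies running daemons by version; all should now report Octopus (15.2.x)
    ceph -s         # overall cluster health after the upgrade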
11:05 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: REIMAGE [production]
11:03 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: REIMAGE [production]
11:03 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: REIMAGE [production]
11:01 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: REIMAGE [production]
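  The downtime entries above come from the sre.hosts.downtime cookbook; a manual invocation from a cumin host would look roughly like this (the duration/reason flags here are assumptions and may differ between cookbook versions):
    sudo cookbook sre.hosts.downtime --hours 2 -r "REIMAGE" cloudgw1001.eqiad.wmnet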
10:57 <dcaro> Got a PG stuck in 'remapping' after the OSD came up; had to unset norebalance and then set it again to get it unstuck (T280641) [admin]
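  The flag toggle described here maps to roughly these ceph CLI calls (a sketch, from a mon/admin host):
    ceph pg dump_stuck unclean    # spot PGs stuck remapping
    ceph osd unset norebalance    # let the stuck PG finish remapping
    ceph osd set norebalance      # re-apply the flag once it settles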
10:44 <jbond42> updated the check-raid nrpe script to python3 [production]
10:34 <dcaro> Slow/blocked ops from cloudcephmon03, "osd_failure(failed timeout osd.32..." (cloudcephosd1005); unset the cluster noout/norebalance and they went away in a few seconds, setting them again and continuing... (T280641) [admin]
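  For reference, the warning and the flag handling can be inspected like this (a sketch):
    ceph health detail           # shows the osd_failure / slow ops entries in full
    ceph osd unset noout
    ceph osd unset norebalance   # then set both again once the ops drain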
09:40 <moritzm> restarting Tomcat on idp-test1001 to pick up Java security updates [production]
09:21 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P15618 and previous config saved to /var/cache/conftool/dbconfig/20210428-092103-root.json [production]
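  These staged (re)pooling entries follow the usual dbctl pattern; a sketch from a cumin host (the -p percentage flag matches how the 25/50/75/100% steps are expressed, to the best of my knowledge):
    dbctl instance db1098:3316 pool -p 100
    dbctl config commit -m 'Repool db1098:3316'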
09:19 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint1001.wikimedia.org [production]
09:12 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host contint1001.wikimedia.org [production]
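  A sketch of the corresponding invocation, assuming the cookbook takes the FQDN as a positional argument:
    sudo cookbook sre.hosts.reboot-single contint1001.wikimedia.org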
09:09 <moritzm> restarting jenkins* on releases to pick up Java security updates [production]
09:08 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint2001.wikimedia.org [production]
09:06 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P15617 and previous config saved to /var/cache/conftool/dbconfig/20210428-090559-root.json [production]
09:03 <dcaro> Waiting for slow heartbeats from osd.58 (cloudcephosd1002) to recover... (T280641) [admin]
08:59 <dcaro> During the upgrade, started getting the warning 'slow OSD heartbeats on the back network', meaning that pings between OSDs are really slow (up to 190s), all from osd.58, currently on cloudcephosd1002 (T280641) [admin]
08:59 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host contint2001.wikimedia.org [production]
08:58 <dcaro> During the upgrade, started getting the warning 'slow OSD heartbeats on the back network', meaning that pings between OSDs are really slow (up to 190s), all from osd.58 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting the warning 'slow OSD heartbeats on the back network', meaning that pings between OSDs are really slow (up to 190s) (T280641) [admin]
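  Recent ceph releases can report the measured heartbeat ping times directly (a sketch, run on the host carrying osd.58; dump_osd_network exists from Nautilus 14.2.5 onward):
    ceph daemon osd.58 dump_osd_network   # per-peer front/back network ping times
    ceph health detail                    # lists which OSDs the slow heartbeats involve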
08:50 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P15616 and previous config saved to /var/cache/conftool/dbconfig/20210428-085056-root.json [production]
08:42 <urbanecm@deploy1002> Synchronized wmf-config/InterwikiSortOrders.php: 96ad0d4ad294c442b4936a63ae1cd9de9c098aa9: Add alt, bcl, diq, mad, mni, mnw, nia, skr, tay and trv to InterwikiSortOrders (duration: 01m 08s) [production]
08:41 <urbanecm@deploy1002> sync-file aborted: 96ad0d4ad294c442b4936a63ae1cd9de9c098aa9: Add alt, bcl, diq, mad, mni, mnw, nia, skr, tay and trv to InterwikiSortOrders (duration: 00m 02s) [production]
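  For context, the aborted and retried sync above is plain scap usage from the deploy host (a sketch):
    scap sync-file wmf-config/InterwikiSortOrders.php 'Add alt, bcl, diq, mad, mni, mnw, nia, skr, tay and trv to InterwikiSortOrders'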
08:36 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1098:3317', diff saved to https://phabricator.wikimedia.org/P15615 and previous config saved to /var/cache/conftool/dbconfig/20210428-083625-marostegui.json [production]
08:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 25%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P15614 and previous config saved to /var/cache/conftool/dbconfig/20210428-083552-root.json [production]
08:34 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 25%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P15613 and previous config saved to /var/cache/conftool/dbconfig/20210428-083458-root.json [production]
08:26 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 100%: Repool db1098:3317', diff saved to https://phabricator.wikimedia.org/P15612 and previous config saved to /var/cache/conftool/dbconfig/20210428-082625-root.json [production]
08:25 <effie> update php7.2 on jobrunners and parsoid servers && rolling php7.2-fpm restarts [production]
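  A rolling php-fpm restart like this is typically driven from a cumin host in small batches; a minimal sketch (the A:mw-jobrunner alias is an assumption, and in practice a safe-restart wrapper that depools each server first is preferable to a bare systemctl restart):
    sudo cumin -b 2 -s 30 'A:mw-jobrunner' 'systemctl restart php7.2-fpm'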
08:21 <dcaro> Upgrading all the ceph osds on eqiad (T280641) [admin]
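  The per-host OSD upgrade loop generally looks like this (a sketch following the upstream Octopus upgrade notes; the package names and apt step are assumptions about this setup):
    ceph osd set noout && ceph osd set norebalance       # once, before the first host
    apt-get update && apt-get install -y ceph-osd        # per host: pull in Octopus packages
    systemctl restart ceph-osd.target                    # per host: restart all local OSDs
    ceph -s                                              # wait for PGs active+clean before the next host
    ceph osd unset norebalance && ceph osd unset noout   # once, after the last host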
08:21 <dcaro> The clock skew seems intermittent; there's another task to follow it, T275860 (T280641) [admin]
08:18 <dcaro> All eqiad ceph mons and mgrs upgraded (T280641) [admin]
08:18 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002 and cloudcephmon1001; both are back in sync (T280641) [admin]
08:15 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002; it went away, I'm guessing systemd-timesyncd fixed it (T280641) [admin]
08:14 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, looking into it (T280641) [admin]
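  Checking (and nudging) a mon clock-skew warning (a sketch):
    ceph time-sync-status                  # per-mon skew as seen by the quorum leader
    timedatectl status                     # on the affected mon, confirm NTP sync
    systemctl restart systemd-timesyncd    # prod the sync daemon if it drifted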
08:11 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 75%: Repool db1098:3317', diff saved to https://phabricator.wikimedia.org/P15611 and previous config saved to /var/cache/conftool/dbconfig/20210428-081121-root.json [production]
07:58 <dcaro> Upgrading ceph services on eqiad, starting with mons/managers (T280641) [admin]
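  Mons and mgrs go first in a Nautilus-to-Octopus upgrade; per mon host the step is roughly (a sketch; the package step is an assumption):
    apt-get install -y ceph-mon ceph-mgr
    systemctl restart ceph-mon.target ceph-mgr.target
    ceph mon stat    # confirm the mon rejoined quorum before moving on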
07:56 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 50%: Repool db1098:3317', diff saved to https://phabricator.wikimedia.org/P15610 and previous config saved to /var/cache/conftool/dbconfig/20210428-075618-root.json [production]
07:52 <effie> update php7.2 on api servers && rolling php7.2-fpm restarts [production]
07:41 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 25%: Repool db1098:3317', diff saved to https://phabricator.wikimedia.org/P15609 and previous config saved to /var/cache/conftool/dbconfig/20210428-074114-root.json [production]
07:40 <marostegui> Deploy schema change on db1098:3316 and db1098:3317 T266486 T268392 T273360 [production]
07:27 <effie> update php7.2 on appservers && rolling php7.2-fpm restarts [production]
07:26 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1098 for schema change and kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15608 and previous config saved to /var/cache/conftool/dbconfig/20210428-072609-marostegui.json [production]
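  The depool itself is a short dbctl sequence (a sketch from a cumin host; db1098 is multi-instance, so each instance is depooled):
    dbctl instance db1098:3316 depool
    dbctl instance db1098:3317 depool
    dbctl config commit -m 'Depool db1098 for schema change and kernel upgrade'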
07:26 <hashar> contint2001: sudo -u jenkins find *quibble* -path '*/archive/log/rawSeleniumVideoGrabs/*' -delete # T249268 [releng]
07:19 <elukey@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
07:19 <hashar> contint2001: sudo -u jenkins find /srv/jenkins/builds/mediawiki-fresnel-patch-docker -name "*trace.json" -exec gzip {} \+ # T249268 [releng]
07:14 <elukey@cumin1001> START - Cookbook sre.dns.netbox [production]
07:12 <elukey> add AAAA record for kafka-main200[3,4,5].codfw.wmnet [production]
07:10 <elukey@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
07:05 <elukey@cumin1001> START - Cookbook sre.dns.netbox [production]
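  Once the sre.dns.netbox cookbook has run, the records it propagated can be verified with a plain lookup (a sketch; query against the production resolvers):
    dig +short AAAA kafka-main2003.codfw.wmnet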