__all__ SAL

2021-03-03
17:29	<jayme@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1001.eqiad.wmnet with reason: REIMAGE	[production]
17:26	<andrewbogott>	resizing deployment-maps08 to (identical) flavor g2.cores4.ram8.disk80	[maps]
17:16	<andrewbogott>	restarting rabbitmq-server on cloudcontrol1003,1004,1005; trying to explain amqp errors in scheduler logs	[admin]
17:16	<dwisehaupt>	correction for last log with correct host - stopping mysql replication on frdb1004 and starting utf8mb4 table alters under a root screen session	[production]
17:15	<dwisehaupt>	stopping mysql replication on frdb2001 and starting utf8mb4 table alters under a root screen session	[production]
17:13	<otto@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Set destination_event_serivce: eventgate-main for rdf-streaming-updater streams - T273901 (duration: 01m 08s)	[production]
17:13	<elukey@cumin1001>	END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0)	[production]
17:10	<elukey>	update druid datasource on aqs (roll restart of aqs on aqs100*)	[analytics]
17:09	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps1009.eqiad.wmnet with reason: Resyncing database from scratch	[production]
17:09	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps1009.eqiad.wmnet with reason: Resyncing database from scratch	[production]
17:09	<elukey@cumin1001>	START - Cookbook sre.aqs.roll-restart	[production]
17:06	<razzi>	rebalance kafka partitions for webrequest_upload partition 8	[analytics]
16:49	<James_F>	Zuul: [mediawiki/extensions/DiscussionTools] Run phan with Echo	[releng]
16:40	<dzahn@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
16:36	<dzahn@cumin1001>	START - Cookbook sre.dns.netbox	[production]
16:33	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts gitlab1001.eqiad.wmnet	[production]
16:30	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts gitlab1001.eqiad.wmnet	[production]
16:29	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts gitlab1002.eqiad.wmnet	[production]
16:28	<otto@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: canary_events_enabled: true for rdf-streaming-updater streams - T273901 (duration: 01m 49s)	[production]
16:26	<mutante>	deleting gitlab VMs - we have to start over and decom old VMs, then create new VMs with public IPs (T274459)	[production]
16:25	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts gitlab1002.eqiad.wmnet	[production]
16:23	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on gitlab1002.eqiad.wmnet with reason: decom	[production]
16:23	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on gitlab1002.eqiad.wmnet with reason: decom	[production]
16:23	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on gitlab1001.eqiad.wmnet with reason: decom	[production]
16:23	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on gitlab1001.eqiad.wmnet with reason: decom	[production]
16:18	<jayme@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubestagetcd1006.eqiad.wmnet	[production]
16:15	<jayme@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubestagetcd1005.eqiad.wmnet	[production]
16:14	<jayme@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubestagetcd1006.eqiad.wmnet	[production]
16:12	<jayme@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubestagetcd1005.eqiad.wmnet	[production]
16:11	<jayme@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubestagetcd1004.eqiad.wmnet	[production]
16:09	<jayme@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubestagetcd1004.eqiad.wmnet	[production]
16:07	<jayme@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts neon.eqiad.wmnet	[production]
16:05	<jayme@cumin1001>	START - Cookbook sre.hosts.decommission for hosts neon.eqiad.wmnet	[production]
16:03	<dcaro>	draining cloudvirt1022 for T275753	[admin]
16:03	<dcaro>	draining cloudvirt1022 for TT275753	[admin]
16:00	<arturo>	move cloudvirt1013 into the 'toobusy' host aggregate, it has 221% cpu subscription and 82% MEM subscription	[admin]
15:55	<aborrero@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt1021.eqiad.wmnet	[production]
15:34	<arturo>	rebooting cloudvirt1021 for T275753	[admin]
15:34	<aborrero@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cloudvirt1021.eqiad.wmnet	[production]
15:27	<jayme>	staging.svc.eqiad.wmnet now (temporarily) points to the staging-codfw kubernetes cluster (during upgrade in eqiad)	[production]
15:27	<jayme@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' .	[production]
15:26	<jayme@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .	[production]
15:24	<jayme@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .	[production]
15:23	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1027.eqiad.wmnet	[production]
15:19	<liw@deploy1002>	Synchronized php: group1 wikis to 1.36.0-wmf.33 (duration: 01m 08s)	[production]
15:18	<jiji@cumin1001>	START - Cookbook sre.hosts.reboot-single for host mc1027.eqiad.wmnet	[production]
15:18	<liw@deploy1002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.33	[production]
15:17	<arturo>	shutting down tools-sgebastion-07 in an attempt to fix nova state and finish hypervisor migration	[tools]
15:13	<urbanecm@deploy1002>	Synchronized php-1.36.0-wmf.33/extensions/CentralAuth/: af899b6818223928e2da421122c19e64126370da: Transform the first parameter to string (T276316) (duration: 01m 11s)	[production]
15:11	<arturo>	tools-sgebastion-07 triggered a neutron exception (unauthorized) while being live-migrated from cloudvirt1021 to 1029. Resetting nova state with `nova reset-state bd685d48-1011-404e-a755-372f6022f345 --active` and trying again	[tools]