__all__ SAL

2022-05-12
19:57	<hashar>	Restarting Gerrit	[production]
19:53	<mutante>	gitlab2001 - systemctl start backup-restore - systemd[1]: Started GitLab Backup Restore. after gerrit:791410 for T308089	[production]
19:53	<hashar>	gerrit: triggering full replication to gerrit2001 to test T307137	[releng]
19:36	<wm-bot2>	Set cloudvirt 'cloudvirt1022.eqiad.wmnet' maintenance. - cookbook ran by andrew@buster	[admin]
19:35	<wm-bot2>	Draining 'cloudvirt1022.eqiad.wmnet'. - cookbook ran by andrew@buster	[admin]
19:35	<wm-bot2>	Safe rebooting 'cloudvirt1022.eqiad.wmnet'. - cookbook ran by andrew@buster	[admin]
18:57	<jelto>	restart gitlab2001	[production]
18:30	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
18:26	<krinkle@deploy1002>	Synchronized w/static.php: Ic0a5eae4f721a16403071d1b2136cf23d78e4fa9 (duration: 00m 49s)	[production]
18:26	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4001.ulsfo.wmnet with OS bullseye	[production]
18:26	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
18:26	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
18:25	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
18:11	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4001.ulsfo.wmnet with reason: host reimage	[production]
18:08	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4001.ulsfo.wmnet with reason: host reimage	[production]
17:52	<cmooney@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
17:51	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host ganeti4001.ulsfo.wmnet with OS bullseye	[production]
17:50	<jgiannelos@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mobileapps: apply	[production]
17:50	<razzi@deploy1002>	Finished deploy [analytics/turnilo/deploy@5047d7d]: (no justification provided) (duration: 00m 08s)	[production]
17:50	<razzi@deploy1002>	Started deploy [analytics/turnilo/deploy@5047d7d]: (no justification provided)	[production]
17:50	<razzi@deploy1002>	Finished deploy [analytics/turnilo/deploy@9cfdfaf]: (no justification provided) (duration: 29m 32s)	[production]
17:50	<jgiannelos@deploy1002>	helmfile [codfw] START helmfile.d/services/mobileapps: apply	[production]
17:47	<jgiannelos@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply	[production]
17:46	<jgiannelos@deploy1002>	helmfile [eqiad] START helmfile.d/services/mobileapps: apply	[production]
17:45	<jgiannelos@deploy1002>	helmfile [staging] DONE helmfile.d/services/mobileapps: apply	[production]
17:44	<jgiannelos@deploy1002>	helmfile [staging] START helmfile.d/services/mobileapps: apply	[production]
17:43	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
17:31	<klausman@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1006.eqiad.wmnet with OS buster	[production]
17:26	<jmm@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti4001.ulsfo.wmnet with OS bullseye	[production]
17:21	<razzi@deploy1002>	Started deploy [analytics/turnilo/deploy@9cfdfaf]: (no justification provided)	[production]
17:08	<jmm@cumin1001>	START - Cookbook sre.hosts.reimage for host ganeti4001.ulsfo.wmnet with OS bullseye	[production]
17:00	<klausman@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1006.eqiad.wmnet with reason: host reimage	[production]
16:57	<klausman@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on ores1006.eqiad.wmnet with reason: host reimage	[production]
16:53	<razzi@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Attempting OS upgrade	[production]
16:53	<razzi@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Attempting OS upgrade	[production]
16:35	<klausman@cumin1001>	START - Cookbook sre.hosts.reimage for host ores1006.eqiad.wmnet with OS buster	[production]
16:22	<TheresNoTime>	Deployed b30b346 & restarted SULWatcher	[tools.stewardbots]
16:21	<mutante>	gitlab2001 - trying to stop 'puma' for debugging T308089	[production]
16:14	<cmooney@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
16:07	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
16:06	<cmooney@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
16:05	<andrew@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host labstore1006.wikimedia.org	[production]
16:00	<hashar>	contint2001 and contint1001 now automatically run `docker system prune --force` every day and `docker system prune --force` on Sunday | https://gerrit.wikimedia.org/r/c/operations/puppet/+/773784/	[releng]
15:57	<andrew@cumin1001>	START - Cookbook sre.hosts.reboot-single for host labstore1006.wikimedia.org	[production]
15:57	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:56	<andrew@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host labstore1007.wikimedia.org	[production]
15:53	<andrew@cumin1001>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host labstore1005.eqiad.wmnet	[production]
15:52	<TheresNoTime>	Deployed ef01194 (within the last hour)	[tools.stewardbots]
15:06	<razzi@cumin1001>	conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet	[production]
15:06	<andrewbogott>	stopping nfs-server on labstore1004 in preparation for reboot	[admin]