production SAL

2022-01-21
11:38	<elukey@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.	[production]
11:34	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
11:34	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
11:31	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
11:31	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
11:18	<hnowlan@cumin1001>	START - Cookbook sre.hosts.reimage for host restbase1016.eqiad.wmnet with OS buster	[production]
11:18	<hnowlan@cumin1001>	START - Cookbook sre.hosts.reimage for host restbase2024.codfw.wmnet with OS buster	[production]
11:17	<hnowlan@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=restbase2023.codfw.wmnet	[production]
11:15	<vgutierrez>	pool cp3063 running envoy as TLS termination layer - T271421	[production]
11:14	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2023.codfw.wmnet with OS buster	[production]
10:58	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3063.esams.wmnet with OS buster	[production]
10:33	<moritzm>	migrate primary/secondary instances off ganeti1013	[production]
10:14	<moritzm>	switch kubetcd1006 back to plain disks	[production]
10:14	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: Switch back to plain disks	[production]
10:14	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: Switch back to plain disks	[production]
10:09	<moritzm>	switch kubetcd1005 back to plain disks	[production]
10:08	<hnowlan@cumin1001>	START - Cookbook sre.hosts.reimage for host restbase2023.codfw.wmnet with OS buster	[production]
10:07	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: Switch back to plain disks	[production]
10:07	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: Switch back to plain disks	[production]
09:51	<moritzm>	switch kubetcd1004 back to plain disks	[production]
09:50	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1004.eqiad.wmnet with reason: Switch back to plain disks	[production]
09:50	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1004.eqiad.wmnet with reason: Switch back to plain disks	[production]
09:41	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.reimage for host cp3063.esams.wmnet with OS buster	[production]
09:40	<vgutierrez@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3063.esams.wmnet with OS buster	[production]
09:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18970 and previous config saved to /var/cache/conftool/dbconfig/20220121-093120-root.json	[production]
09:19	<jayme@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
09:19	<jayme@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
09:16	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18969 and previous config saved to /var/cache/conftool/dbconfig/20220121-091617-root.json	[production]
09:11	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
09:07	<ayounsi@cumin1001>	START - Cookbook sre.dns.netbox	[production]
09:06	<ayounsi@cumin1001>	END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)	[production]
09:06	<ayounsi@cumin1001>	START - Cookbook sre.dns.netbox	[production]
09:04	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.reimage for host cp3063.esams.wmnet with OS buster	[production]
09:01	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 60%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18968 and previous config saved to /var/cache/conftool/dbconfig/20220121-090113-root.json	[production]
09:00	<vgutierrez>	depool cp3063 to be reimaged as cache::upload_envoy - T271421	[production]
08:46	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18967 and previous config saved to /var/cache/conftool/dbconfig/20220121-084609-root.json	[production]
08:37	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1018.eqiad.wmnet to ganeti01.svc.eqiad.wmnet	[production]
08:35	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti1018.eqiad.wmnet to ganeti01.svc.eqiad.wmnet	[production]
08:31	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1018.eqiad.wmnet	[production]
08:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18966 and previous config saved to /var/cache/conftool/dbconfig/20220121-083106-root.json	[production]
08:27	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti1018.eqiad.wmnet	[production]
08:16	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18965 and previous config saved to /var/cache/conftool/dbconfig/20220121-081602-root.json	[production]
08:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 20%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18964 and previous config saved to /var/cache/conftool/dbconfig/20220121-080058-root.json	[production]
07:58	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18963 and previous config saved to /var/cache/conftool/dbconfig/20220121-075801-root.json	[production]
07:45	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18962 and previous config saved to /var/cache/conftool/dbconfig/20220121-074555-root.json	[production]
07:42	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18961 and previous config saved to /var/cache/conftool/dbconfig/20220121-074257-root.json	[production]
07:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1032 (re)pooling @ 5%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18960 and previous config saved to /var/cache/conftool/dbconfig/20220121-073051-root.json	[production]
07:30	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1032.eqiad.wmnet with OS bullseye	[production]
07:27	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1022 (re)pooling @ 60%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18959 and previous config saved to /var/cache/conftool/dbconfig/20220121-072754-root.json	[production]
07:26	<elukey>	elukey@stat1007:~$ sudo systemctl reset-failed product-analytics-movement-metrics.service	[production]