production SAL

2021-11-24
23:44	<mutante>	[puppetmaster1001:~] $ sudo puppet cert sign gitlab-runner1001.eqiad.wmnet | sudo install_console gitlab-runner1001.eqiad.wmnet (T295481)	[production]
23:26	<mutante>	ganeti - bringing up new VM - sudo gnt-instance start gitlab-runner1001.eqiad.wmnet ; ran puppet on install1003; installing OS T295481	[production]
23:22	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host elastic2065.codfw.wmnet with OS buster	[production]
23:11	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2064.codfw.wmnet with OS buster	[production]
23:09	<mutante>	mwmaint1002 - sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size 1M -delete - to fix Icinga alert about large files in client bucket	[production]
23:08	<dzahn@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host gitlab-runner1001.eqiad.wmnet	[production]
23:03	<mutante>	wcqs1001 - sudo systemctl restart wcqs-blazegraph - after <+jinxer-wm> (BlazegraphFreeAllocatorsDecreasingRapidly) firing: Blazegraph instance wcqs1001:9195 is burning free allocators	[production]
22:52	<dzahn@cumin1001>	START - Cookbook sre.ganeti.makevm for new host gitlab-runner1001.eqiad.wmnet	[production]
22:50	<mutante>	Creating a new Ganeti VM and wondering which row to put it? [ganeti1009:~] $ for row in A B C D; do echo "row ${row}: $(sudo gnt-instance list -o name -F "pnode.group == 'row_${row}'" | wc -l) VMs"; done	[production]
22:43	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts gitlab-runner1001.wikimedia.org	[production]
22:41	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host elastic2064.codfw.wmnet with OS buster	[production]
22:39	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2063.codfw.wmnet with OS buster	[production]
22:38	<mutante>	running decom cookbook on gitlab-runner1001.wikimedia.org VM which was in state "ADMIN_down" and not used yet. to make room to recreate it as gitlab-runner1001.eqiad.wmnet T295481	[production]
22:36	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts gitlab-runner1001.wikimedia.org	[production]
22:08	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host elastic2063.codfw.wmnet with OS buster	[production]
22:03	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2062.codfw.wmnet with OS buster	[production]
21:40	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
21:37	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
21:35	<legoktm@deploy1002>	Synchronized wmf-config/: Improve docs on $wmgUseGlobalAbuseFilters and sort list of wikis (duration: 00m 57s)	[production]
21:33	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host elastic2062.codfw.wmnet with OS buster	[production]
21:21	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2061.codfw.wmnet with OS buster	[production]
21:00	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
20:58	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
20:54	<legoktm@deploy1002>	Synchronized wmf-config/: Update configuration related to disabling Score functionality (duration: 00m 57s)	[production]
20:51	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host elastic2061.codfw.wmnet with OS buster	[production]
19:48	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1144:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17834 and previous config saved to /var/cache/conftool/dbconfig/20211124-194857-ladsgroup.json	[production]
19:38	<razzi>	`sudo maintain-views --all-databases --replace-all` on clouddb1018 for T292594	[production]
19:33	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1144:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17833 and previous config saved to /var/cache/conftool/dbconfig/20211124-193352-ladsgroup.json	[production]
19:19	<razzi>	run `maintain-views --all-databases --replace-all` on clouddb1013 for T292594	[production]
19:18	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1144:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17832 and previous config saved to /var/cache/conftool/dbconfig/20211124-191847-ladsgroup.json	[production]
19:03	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1144:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17831 and previous config saved to /var/cache/conftool/dbconfig/20211124-190343-ladsgroup.json	[production]
18:57	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir2002.codfw.wmnet	[production]
18:51	<vgutierrez@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM ncredir2002.codfw.wmnet	[production]
18:48	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir2001.codfw.wmnet	[production]
18:43	<vgutierrez@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM ncredir2001.codfw.wmnet	[production]
18:42	<vgutierrez@cumin1001>	END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM ncredir2001.codfw.wmnet	[production]
18:42	<vgutierrez@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM ncredir2001.codfw.wmnet	[production]
18:42	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief-test2001.codfw.wmnet	[production]
18:36	<vgutierrez@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM acmechief-test2001.codfw.wmnet	[production]
18:36	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2001.codfw.wmnet	[production]
18:30	<vgutierrez@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM acmechief2001.codfw.wmnet	[production]
17:47	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1144:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17830 and previous config saved to /var/cache/conftool/dbconfig/20211124-174723-ladsgroup.json	[production]
17:47	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1144.eqiad.wmnet with reason: Maintenance T296143	[production]
17:47	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on db1144.eqiad.wmnet with reason: Maintenance T296143	[production]
17:46	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1143 (T296143)', diff saved to https://phabricator.wikimedia.org/P17829 and previous config saved to /var/cache/conftool/dbconfig/20211124-174615-ladsgroup.json	[production]
17:35	<ladsgroup@deploy1002>	Synchronized php-1.38.0-wmf.9/includes/libs/rdbms/: Backport: [[gerrit:741134|rdbms: Add full query to transaction profiler (T295706)]] (duration: 00m 56s)	[production]
17:34	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
17:34	<jhathaway@cumin1001>	conftool action : set/pooled=true; selector: name=codfw,dnsdisc=puppetboard	[production]
17:31	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
17:31	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1143 (T296143)', diff saved to https://phabricator.wikimedia.org/P17828 and previous config saved to /var/cache/conftool/dbconfig/20211124-173110-ladsgroup.json	[production]