2020-06-01

09:05 <filippo@cumin1001> START - Cookbook sre.hosts.decommission [production]
09:04 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
09:03 <filippo@cumin1001> START - Cookbook sre.hosts.decommission [production]
08:58 <godog> prometheus eqiad lvextend --resizefs --size +100G vg-ssd/prometheus-ops [production]
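The lvextend above grows the prometheus-ops logical volume by 100G and resizes its filesystem in the same step. A minimal follow-up check, sketched here with an assumed mount point that is not taken from the log:

  sudo lvs vg-ssd/prometheus-ops    # confirm the new LV size
  df -h /srv/prometheus             # confirm the filesystem grew (path assumed)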
08:43 <mutante> deneb - apt-get remove --purge apt-listchanges (package was in status "rc", causing a DPKG alert; it should be removed but its config was not purged) [production]
08:41 <mutante> deneb - apt-get remove python3-debconf (package was in status "ri", causing a DPKG Icinga alert; "ri" means it should be removed but is not) [production]
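Both status codes above come from dpkg -l, where the first letter is the desired action and the second the current state: "rc" is removed with config files left behind, "ri" is marked for removal but still installed. A minimal sketch for finding and purging such leftovers (commands assumed, not taken from the log):

  dpkg -l | awk '/^rc/ {print $2}'         # list packages left in "rc" state
  sudo apt-get remove --purge <package>    # purge the remaining config files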
08:33 <XioNoX> restart cr1-codfw:fpc0 - T254110 [production]
08:22 <mutante> mw1331 re-enabled puppet (SAL told me about an experiment a little while ago) [production]
08:19 <jynus> disabling puppet on all db/es/pc hosts for deploy of gerrit:599596 [production]
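A minimal sketch of how a fleet-wide puppet disable like the one above is typically issued from a cumin host; the alias and the exact reason string are assumptions, not taken from the log. enable-puppet expects the same reason that was used to disable:

  sudo cumin 'A:db-all' 'disable-puppet "deploy of gerrit:599596"'
  sudo cumin 'A:db-all' 'enable-puppet "deploy of gerrit:599596"'    # once the change is rolled out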
07:05 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1142 to clone db1147 T252512', diff saved to https://phabricator.wikimedia.org/P11339 and previous config saved to /var/cache/conftool/dbconfig/20200601-070519-marostegui.json [production]
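A minimal sketch of the depool-and-commit flow behind dbctl entries like the ones above and below; the exact subcommands and flags are assumptions, not taken from the log:

  sudo dbctl instance db1142 depool
  sudo dbctl config commit -m 'Depool db1142 to clone db1147 T252512'
  # and later, to put the host back in service:
  sudo dbctl instance db1142 pool
  sudo dbctl config commit -m 'Repool db1142 T252512'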
05:03 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool enwiki db2071 slave to test new index - T238966', diff saved to https://phabricator.wikimedia.org/P11338 and previous config saved to /var/cache/conftool/dbconfig/20200601-050354-marostegui.json [production]
04:54 <marostegui> Drop testreduce_0715 from m5 master T245408 [production]
04:44 <marostegui> Depool db1141 from Analytics role - T249188 [production]

2020-05-29

22:32 <bstorm_> updated views on labsdb1010 T252219 [production]
20:55 <bstorm_> updating views on labsdb1011 T252219 [production]
19:27 <ryankemper> Successfully finished a rolling restart of the `cloudelastic` clusters (chi, psi, omega) as part of the elasticsearch plugins upgrade. Host and service checks re-enabled. [production]
17:28 <bstorm_> updating views on labsdb1009 T252219 [production]
16:50 <ryankemper> Performing a rolling restart of the `cloudelastic` clusters (chi, psi, omega) as part of the elasticsearch plugins upgrade. Host and service checks disabled. [production]
16:00 <bstorm_> Updating views on labsdb1012 T252219 [production]
15:59 <ryankemper> Concluded rolling restart of the `relforge` clusters as part of the elasticsearch plugins upgrade. Both hosts `relforge1001` and `relforge1002` are back up. Downtime lifted. [production]
15:29 <ryankemper> Performing a rolling restart of the `relforge` clusters as part of the elasticsearch plugins upgrade [production]
14:59 <cdanis> disabling puppet on netflow* to deploy Ic71e96f0 T253128 [production]
14:47 <akosiaris@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller'. [production]
14:47 <akosiaris@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'coredns'. [production]
14:41 <akosiaris@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller'. [production]
14:41 <akosiaris@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'coredns'. [production]
14:35 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller'. [production]
14:35 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns'. [production]
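A minimal sketch of the kind of per-environment helmfile invocation behind the admin sync entries above; the working directory, environment names, and selectors are assumptions, not taken from the log:

  cd /srv/deployment-charts/helmfile.d/admin    # path assumed
  helmfile -e codfw --selector name=coredns sync
  helmfile -e codfw --selector name=calico-policy-controller sync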
14:27 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:24 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime [production]
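A minimal sketch of how a downtime cookbook like the one above is usually invoked from a cumin host; the target host, duration, reason, and flag names are hypothetical, not taken from the log:

  sudo cookbook sre.hosts.downtime --hours 2 -r 'maintenance' 'restbase2009.codfw.wmnet'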
14:15 <mdholloway> ran extensions/MachineVision/maintenance/removeBlacklistedSuggestions.php on commonswiki (T253821) [production]
12:49 <hnowlan> reimaging restbase2009 after disk replacement [production]
12:37 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:35 <elukey@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:15 <godog> roll-restart to upgrade thanos to 0.13.0rc0 - T252186 T233956 [production]
11:32 <moritzm> installing cups security updates (client-side libs/tools) [production]
11:01 <ema> upload prometheus-rdkafka-exporter 0.2 to buster-wikimedia T253551 [production]
10:53 <moritzm> updating mwdebug2002 to 7.2.31 [production]
10:02 <marostegui> Compress InnoDB on db1138 T232446 [production]
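A minimal sketch of the per-table InnoDB compression that an entry like the one above typically involves; the schema, table name, and block size are hypothetical, not taken from the log:

  sudo mysql -e "ALTER TABLE enwiki.revision ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;"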
08:30 <godog> update swift uid/gid on thanos hosts - T123918 [production]
08:04 <mutante> phabricator - restarted apache2 - back for me now [production]
08:03 <XioNoX> add new AMS-IX link to LACP bundle [production]
08:01 <mutante> phabricator - broken due to "PhabricatorRepositoryMirrorEngine::pushToGitRepository" starting a git process that uses 100% CPU; stopped phd service [production]
07:56 <mutante> phabricator - killed pid 25070 (git) which was using 100% CPU; restarted phd service [production]
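A minimal sketch of the triage pattern in the two phabricator entries above: find the runaway git process, kill it, and bounce the daemon service. The systemd unit name is an assumption, not taken from the log:

  ps aux --sort=-%cpu | head -n 5    # identify the git process pinning a CPU
  sudo kill 25070                    # pid from the log entry above
  sudo systemctl restart phd         # restart the phabricator daemons (unit name assumed)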
07:25 <moritzm> updating perf on buster systems to the new version from the 10.4 point release [production]
07:15 <moritzm> installing el-api update from latest Buster point release [production]
07:12 <moritzm> installing xdg-utils update from latest Buster point release [production]
07:11 <mutante> mw1293 (canary jobrunner) replaced apache2.conf with the version from mwdebug1001, restarted apache, to debug T190111 [production]
07:00 <moritzm> installing rake security updates [production]