production SAL

2020-12-14
12:43	<lucaswerkmeister-wmde@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:643874|Add log channel Wikibase.IdGenerator (T268625)]] (duration: 00m 54s)	[production]
12:39	<lucaswerkmeister-wmde@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:649307|Enable QuickSurveys on commonswiki (T258419)]] (duration: 00m 55s)	[production]
12:09	<lucaswerkmeister-wmde@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:640934|Add Media Search survey (T258419)]] (duration: 00m 55s)	[production]
11:34	<jdrewniak@deploy1001>	Synchronized portals: Wikimedia Portals Update: [[gerrit:649304| Bumping portals to master (T128546)]] (duration: 00m 54s)	[production]
11:33	<jdrewniak@deploy1001>	Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:649304| Bumping portals to master (T128546)]] (duration: 00m 56s)	[production]
10:34	<godog>	add 100G to prometheus 'global' in codfw	[production]
10:32	<akosiaris>	Adding kubernetes codfw staging cluster configuration to cr*-codfw	[production]
10:17	<marostegui>	Stop mysql on db2131 to clone db2142	[production]
10:16	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db2131 to clone db2142', diff saved to https://phabricator.wikimedia.org/P13542 and previous config saved to /var/cache/conftool/dbconfig/20201214-101611-marostegui.json	[production]
10:12	<ladsgroup@deploy1001>	Synchronized php-1.36.0-wmf.21/extensions/Wikibase/client/includes: [[gerrit:648283|Avoid loading the whole item in every client page view (T269960)]] (duration: 00m 25s)	[production]
10:03	<ladsgroup@deploy1001>	Scap failed!: 4/9 canaries failed their endpoint checks(https://en.wikipedia.org)	[production]
09:51	<godog>	swift codfw-prod: more weight to ms-be20[58-61] - T269337	[production]
09:45	<aborrero@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on cloudvirt1024.eqiad.wmnet with reason: T269419	[production]
09:45	<aborrero@cumin1001>	START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on cloudvirt1024.eqiad.wmnet with reason: T269419	[production]
08:40	<godog>	swift eqiad-prod: add weight to ms-be106[0-3] - T268435	[production]
2020-12-11
22:05	<dduvall@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .	[production]
22:02	<dduvall@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .	[production]
21:59	<dduvall@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .	[production]
21:57	<akosiaris>	add docker-ce_18.06.3~ce~3-0~debian_amd64.deb to apt.wikimedia.org stretch-wikimedia/thirdparty/k8s	[production]
21:46	<Amir1>	Running schema changes on wikitech database for T269348	[production]
21:45	<akosiaris@deploy1001>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
21:42	<akosiaris@deploy1001>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
21:41	<dduvall@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .	[production]
21:38	<dduvall@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .	[production]
21:35	<akosiaris@deploy1001>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
21:33	<dduvall@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .	[production]
20:27	<razzi@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)	[production]
20:11	<otto@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Un-migrate Growth EventLogging schema HomepageVisit back to EventLogging-backend on all wikis (this is a server side event which is not yet ready to migrate) - T267333 (duration: 00m 58s)	[production]
19:28	<razzi@cumin1001>	START - Cookbook sre.ganeti.makevm	[production]
19:18	<razzi@cumin1001>	END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)	[production]
18:47	<razzi@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)	[production]
18:30	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
18:19	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
18:19	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
18:13	<elukey@cumin1001>	START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
18:13	<mutante>	doc1001 restarted apache2 just in case after DOC_PATH change	[production]
17:53	<razzi@cumin1001>	START - Cookbook sre.hosts.decommission	[production]
17:52	<razzi@cumin1001>	START - Cookbook sre.ganeti.makevm	[production]
17:48	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
17:41	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
16:40	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
16:28	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
16:15	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
16:10	<elukey@cumin1001>	START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
15:35	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE	[production]
15:33	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE	[production]
15:20	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
15:15	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
15:12	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
15:10	<jayme@deploy1001>	helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.	[production]