production SAL

2021-05-07
05:15	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1087 T282093', diff saved to https://phabricator.wikimedia.org/P15840 and previous config saved to /var/cache/conftool/dbconfig/20210507-051519-marostegui.json	[production]
05:08	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1130 (re)pooling @ 25%: Repool db1130', diff saved to https://phabricator.wikimedia.org/P15839 and previous config saved to /var/cache/conftool/dbconfig/20210507-050839-root.json	[production]
04:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1130 for schema change', diff saved to https://phabricator.wikimedia.org/P15837 and previous config saved to /var/cache/conftool/dbconfig/20210507-043350-marostegui.json	[production]
2021-05-06
23:50	<brennen@deploy1002>	rebuilt and synchronized wikiversions files: Rollback group1 and group2 to 1.37.0-wmf.3 (T282193)	[production]
22:52	<legoktm>	upgrading mailman3 and hyperkitty on lists1001 (T282092)	[production]
22:11	<brennen@deploy1002>	Synchronized php-1.37.0-wmf.4/includes/specials/SpecialWatchlist.php: Backport: [[gerrit:685890|Reorder tables in SpecialWatchlist (T282181)]] (duration: 00m 57s)	[production]
21:48	<legoktm>	upgraded mailman3 and hyperkitty on lists1002 (T282092)	[production]
21:46	<legoktm>	uploaded new mailman3 and hyperkitty packages to apt.wm.o (T282092)	[production]
21:11	<hashar>	restarted CI Jenkins due to T281737	[production]
19:05	<brennen@deploy1002>	rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.4	[production]
19:04	<ejegg>	updated fundraising CiviCRM from 8034e47008 to 2052d79248	[production]
18:58	<otto@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:685906|Migrate WikidataCompletionSearchClicks to event platform on all wikis (T282140)]] (duration: 01m 04s)	[production]
18:55	<urbanecm@deploy1002>	Synchronized wmf-config/Wikibase.php: 338d1df5903cdc963b9eef22ec2c1750b7b3a02b: Wikibase: Use wikidataclient-test dblist for testwikidata localClientDatabases (T282160) (duration: 01m 05s)	[production]
18:46	<urbanecm@deploy1002>	Synchronized wmf-config/Wikibase.php: 7e21cf0d96541d0ab5cb18cd7741756ab1dfe7b8: NO-OP: Wikibase: Use wikidataclient dblist directly for repo localClientDatabases (T282160) (duration: 01m 04s)	[production]
18:31	<otto@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Declare WikidataCompletionSearchClicks stream and migrate on testwiki - T282140 (duration: 01m 06s)	[production]
17:59	<volans@cumin2001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cumin1001.eqiad.wmnet	[production]
17:59	<volans@cumin2001>	START - Cookbook sre.hosts.remove-downtime for cumin1001.eqiad.wmnet	[production]
17:47	<volans@cumin2001>	END (FAIL) - Cookbook sre.hosts.remove-downtime (exit_code=99) for cumin1001.eqiad.wmnet	[production]
17:47	<volans@cumin2001>	START - Cookbook sre.hosts.remove-downtime for cumin1001.eqiad.wmnet	[production]
17:35	<jgiannelos@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .	[production]
17:33	<jgiannelos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .	[production]
17:27	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp203[34].codfw.wmnet	[production]
17:20	<jgiannelos@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .	[production]
17:15	<volans>	upgrade spicerack on cumin* to 0.0.52	[production]
17:15	<ryankemper>	[Elastic] Set `elastic2043` as the only banned node in Cirrussearch Elasticsearch clusters (`elastic2058-production-search-codfw`, `elastic2058-production-search-omega-codfw`, `elastic2058-production-search-psi-codfw`)	[production]
17:13	<papaul>	powerdown ms-be2057 for relocation	[production]
17:13	<jgiannelos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .	[production]
17:12	<volans>	uploaded spicerack_0.0.52 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia	[production]
17:00	<papaul>	powerdown elastic2058 for relocation	[production]
16:43	<vgutierrez>	Enforce Puppet Internal CA validation on trafficserver@ulsfo - T281673	[production]
16:12	<papaul>	powerdown mc-gp2002 for relocation	[production]
16:09	<ryankemper>	[Elastic] Set `elastic2058` as the only banned node in Cirrussearch Elasticsearch clusters (`elastic2058-production-search-codfw`, `elastic2058-production-search-omega-codfw`, `elastic2058-production-search-psi-codfw`)	[production]
15:58	<Amir1>	starting upgrade of public mailing lists in group d and e (T280322)	[production]
15:50	<ryankemper@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1012.eqiad.wmnet with reason: REIMAGE	[production]
15:47	<ryankemper@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1012.eqiad.wmnet with reason: REIMAGE	[production]
15:42	<papaul>	powerdown logstash2027 for relocation	[production]
15:41	<mvolz@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .	[production]
15:40	<ryankemper@cumin1001>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw reboot - ryankemper@cumin1001 - T280563	[production]
15:34	<XioNoX>	push cloud-gw-transport-eqiad to asw2-b-eqiad and cloudsw	[production]
15:33	<ryankemper@cumin1001>	START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw reboot - ryankemper@cumin1001 - T280563	[production]
15:32	<ryankemper>	T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs1012.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `reimage`	[production]
15:32	<ryankemper>	T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs2003.codfw.wmnet` on `ryankemper@cumin1001` tmux session `reimage`	[production]
15:31	<mvolz@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .	[production]
15:29	<cdanis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin1001.eqiad.wmnet with reason: quiz	[production]
15:29	<cdanis@cumin1001>	START - Cookbook sre.hosts.downtime for 0:05:00 on cumin1001.eqiad.wmnet with reason: quiz	[production]
15:26	<ryankemper>	T280382 [WDQS] Pooled `wdqs1007` and `wdqs2004`	[production]
15:26	<ryankemper>	T280382 `wdqs2004.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2 2.6T 998G 1.5T 40% /srv`	[production]
15:26	<ryankemper>	T280382 `wdqs1007.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2 2.6T 998G 1.5T 40% /srv`	[production]
15:20	<mvolz@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .	[production]
15:16	<mvolz@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .	[production]