production SAL

2017-04-26
13:58	<andrewbogott>	put labservices1001 into downtime to minimize (but probably not totally eliminate) alert spam	[production]
13:56	<andrewbogott>	disabled instance creation on Horizon via https://gerrit.wikimedia.org/r/#/c/350414/ and on wikitech via a strategic edit in extensions/OpenStackManager/special/SpecialNovaInstance.php	[production]
13:56	<godog>	downtime and poweroff ms-be 21 26 27 37 38 39 before switch relocation - T148506	[production]
13:54	<gehel>	downtime "ElasticSearch health check for shards" checks for logstash and elasticsearch eqiad - T148506	[production]
13:53	<elukey>	stop kafka on kafka1020 and kafka1018 for row-d extended maintenance (D2)	[production]
13:44	<_joe_>	shutting down mc1013-18 for row D maintenance	[production]
13:40	<aude@naos>	Synchronized wmf-config/CommonSettings-labs.php: (no justification provided) (duration: 00m 57s)	[production]
13:32	<aude@naos>	Synchronized wmf-config/Wikibase-production.php: disable tabular-data for now on wikidata and enable echo notification on test wikis (duration: 01m 06s)	[production]
13:29	<marostegui>	Deploy alter table on db1069 (wikidatawiki) https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548	[production]
13:27	<marostegui>	Deploy alter table labsdb1001 https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548	[production]
13:23	<marostegui>	Deploy alter table db1045 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548	[production]
13:22	<elukey>	restart HDFS on analytics100[12] (Hadoop master nodes) to pick up recent topology changes for the cluster	[production]
13:10	<aude@naos>	Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 23s)	[production]
13:02	<ema@neodymium>	conftool action : set/pooled=yes; selector: name=cp2014.codfw.wmnet,service=varnish-be	[production]
13:00	<ema>	cp2017: restart varnish-be	[production]
12:56	<marostegui>	Shutdown db1092 for maintenance - https://phabricator.wikimedia.org/T162681	[production]
12:55	<gehel>	restart elasticsearch on relforge1001 to validate new config - T161830	[production]
12:46	<moritzm>	installing mysql security updates (5.5 as packaged in Debian jessie)	[production]
12:43	<ema@neodymium>	conftool action : set/pooled=no; selector: name=cp2014.codfw.wmnet,service=varnish-be	[production]
11:32	<jynus>	applying new events_coredb_slave.sql on db2055 T160984	[production]
11:31	<moritzm>	rebooting mwlog2001 for update to Linux 4.9	[production]
10:47	<ladsgroup@naos>	Synchronized wmf-config/Wikibase-labs.php: T142104, part II (duration: 00m 56s)	[production]
10:45	<ladsgroup@naos>	Synchronized static/images/wikibase/echoIcon.svg: T142104, part I (duration: 01m 04s)	[production]
10:44	<marostegui>	Deploy alter table on s5, on db1063 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166	[production]
10:39	<jynus@naos>	Synchronized wmf-config/db-eqiad.php: switch s5 eqiad master from db1049 to db1063 (duration: 01m 24s)	[production]
09:48	<jynus>	migrating s5 eqiad replicas under db1063	[production]
09:42	<jynus>	restarting mariadb at db1063	[production]
09:24	<marostegui>	Shutdown db1094, db1093, db1091 for maintenance - T162681	[production]
09:16	<marostegui>	Shutdown es1019 for maintenance - T162681	[production]
08:32	<elukey>	Gracefully stopping hadoop daemons on Hadoop nodes affected by Row-D maintenance	[production]
08:29	<marostegui>	Deploy alter table on change_tag and tag_summary on silver and labtestweb2001 - T147166	[production]
08:27	<marostegui@naos>	Synchronized wmf-config/db-eqiad.php: Depool hosts that need to be moved for the network maintenance - T162681 (duration: 02m 25s)	[production]
08:22	<moritzm>	reimaging terbium to jessie	[production]
07:59	<jynus>	shutting down mariadb on db1040 as a backup before decommissioning	[production]
07:48	<marostegui>	Deploy alter table on s1, on db1052 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166	[production]
07:30	<marostegui>	Deploy alter table on s7, on db1062 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166	[production]
07:24	<marostegui>	Deploy alter table on s4, on db1068 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166	[production]
07:09	<marostegui>	Deploy alter table on s6, on db1061 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166	[production]
06:56	<marostegui@naos>	Synchronized wmf-config/db-eqiad.php: Repool db1071 - T162539 T163548 (duration: 02m 24s)	[production]
06:45	<marostegui>	Deploy alter table on s2, on db1054 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166	[production]
06:10	<marostegui>	Deploy alter table on s3, on db1075 (eqiad master) for tables: change_tag and tag_summary - T147166	[production]
05:57	<marostegui>	Deploy alter table enwiki.revision on labsdb1011 - T132416	[production]
00:20	<catrope@naos>	Synchronized php-1.29.0-wmf.21/extensions/Flow/modules/flow/ui/widgets/mw.flow.ui.ReplyWidget.js: T163749 (duration: 01m 24s)	[production]
2017-04-25
22:24	<mutante>	mediawiki maintenance servers: last log entry was _before_ merging https://gerrit.wikimedia.org/r/#/c/342777/ and making a change	[production]
22:23	<andrewbogott>	re-enabling dns on labservices1001	[production]
22:22	<mutante>	mediawiki maintenance servers: making wasat identical to terbium. wasat is currently the active server running crons. no change there at all. on terbium where crons are inactive, some log files were removed	[production]
22:13	<twentyafterfour@naos>	rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.21	[production]
22:08	<madhuvishy>	Reenabled labs instance creation and deletion on horizon	[production]
22:05	<twentyafterfour@naos>	Finished scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #5) (duration: 21m 52s)	[production]
22:02	<andrewbogott>	causing an intentional outage of labs-ns0 and labs-recursor0 to make sure we're properly girded for tomorrow's switch replacement.	[production]