production SAL

2020-09-04
01:23	<ryankemper>	(Following the restart of blazegraph, service has been restored to `wdqs2003`. See https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&from=1599182219699&to=1599182547699)	[production]
01:16	<ryankemper>	Glancing at https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&from=1599170628749&to=1599182011243, looks like `wdqs2003`'s blazegraph isn't happy based off the null data entries. Restarting blazegraph: `ryankemper@wdqs2003:~$ sudo systemctl restart wdqs-blazegraph`	[production]
00:48	<milimetric@deploy1001>	Started deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints	[production]
2020-09-03
23:31	<urbanecm@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: 93947391e97be11a9cd7eb4713b274b05d5b371a: Start logging log-ins on select wikis (T253802) (duration: 00m 56s)	[production]
21:18	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
21:15	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
19:55	<milimetric@deploy1001>	deploy aborted: AQS: Deploying new geoeditors endpoints (duration: 00m 13s)	[production]
19:54	<milimetric@deploy1001>	Started deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints	[production]
19:07	<milimetric@deploy1001>	Finished deploy [analytics/refinery@e4d5149] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d5149] (duration: 00m 08s)	[production]
19:07	<milimetric@deploy1001>	Started deploy [analytics/refinery@e4d5149] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d5149]	[production]
19:06	<milimetric@deploy1001>	Finished deploy [analytics/refinery@e4d5149]: Regular analytics weekly train [analytics/refinery@e4d5149] (duration: 09m 06s)	[production]
18:57	<milimetric@deploy1001>	Started deploy [analytics/refinery@e4d5149]: Regular analytics weekly train [analytics/refinery@e4d5149]	[production]
17:50	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
17:48	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
17:47	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
17:46	<cmjohnson@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
17:46	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
17:45	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
17:44	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
17:43	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
17:43	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
17:41	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
17:36	<mholloway-shell@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .	[production]
17:36	<mholloway-shell@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .	[production]
17:32	<mholloway-shell@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .	[production]
17:32	<mholloway-shell@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .	[production]
17:28	<mholloway-shell@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .	[production]
17:19	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
17:16	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
17:02	<papaul>	power down ores2009 for DIMM upgrade	[production]
16:45	<papaul>	power down ores2008 for DIMM upgrade	[production]
16:33	<papaul>	power down ores2007 for DIMM upgrade	[production]
16:24	<elukey>	roll restart aqs on aqs1* to pick up new druid settings	[production]
16:05	<papaul>	power down ores2006 for DIMM upgrade	[production]
15:51	<papaul>	power down ores2005 for DIMM upgrade	[production]
15:33	<papaul>	power down ores2004 for DIMM upgrade	[production]
15:30	<moritzm>	installing nginx updates on apt* and htmldumper1001	[production]
15:25	<moritzm>	installing firejail update (along with restarts) on thumbor1001, maps1001, restbase1016 (and -dev)	[production]
15:21	<papaul>	power down ores2003 for DIMM upgrade	[production]
15:17	<moritzm>	installing firejail security updates on parsoid servers	[production]
15:08	<papaul>	power down ores2002 for DIMM upgrade	[production]
14:53	<papaul>	power down ores2001 for DIMM upgrade	[production]
14:36	<hnowlan@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .	[production]
14:30	<hnowlan@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .	[production]
14:29	<jmm@deploy1001>	Finished deploy [debmonitor/deploy@fb64c52]: deploy to new buster host (duration: 00m 06s)	[production]
14:29	<jmm@deploy1001>	Started deploy [debmonitor/deploy@fb64c52]: deploy to new buster host	[production]
14:13	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:11	<filippo@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:04	<marostegui@cumin1001>	dbctl commit (dc=all): 'Set wikitech back to RW after maintenance T260324', diff saved to https://phabricator.wikimedia.org/P12490 and previous config saved to /var/cache/conftool/dbconfig/20200903-140451-marostegui.json	[production]
14:04	<marostegui@cumin1001>	dbctl commit (dc=all): 'Promote db1128 to wikitech master T260324', diff saved to https://phabricator.wikimedia.org/P12489 and previous config saved to /var/cache/conftool/dbconfig/20200903-140436-marostegui.json	[production]