production SAL

2016-05-02 §
22:13	<cwd>	updated crm from b386a6821c71310950ccdcdcf2616add727e1af4 to f5e8f98d07a2280118b7153bc342bf52ee67edd5	[production]
22:10	<papaul>	restbase200[7-9]- signing puppet certs, salt-key, initial run	[production]
21:26	<cscott>	updated OCG to version b775e612520f9cd4acaea42226bcf34df07439f7	[production]
21:21	<cscott>	starting OCG deploy (a little late)	[production]
20:23	<gehel>	restarting elasticsearch server elastic2001.codfw.wmnet (T110236)	[production]
20:21	<gehel>	starting rolling restart of elasticsearch codfw cluster to disable multicast (T110236)	[production]
20:15	<subbu>	finished deploying parsoid version 0a26f3a4	[production]
20:09	<subbu>	synced code + restarted parsoid on wtp1001 as canary	[production]
20:05	<subbu>	starting deploy of parsoid version 0a26f3a4	[production]
19:27	<aaron@tin>	Synchronized php-1.27.0-wmf.22/includes/filebackend/FileBackendMultiWrite.php: 63b2d7b2eae (duration: 00m 32s)	[production]
19:17	<mutante>	manually removing 2fa from my own wikitech account, adding it back ..	[production]
18:24	<gehel>	deploying latest WDQS version	[production]
17:23	<robh>	restbase2004 offline for next few hours for comparison work for new systems T132976	[production]
16:01	<krenair@tin>	Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/286286/ (duration: 00m 26s)	[production]
15:53	<krenair@tin>	Synchronized wikiversions-labs.json: https://gerrit.wikimedia.org/r/#/c/283689/ (duration: 00m 25s)	[production]
15:53	<krenair@tin>	Synchronized dblists/all-labs.dblist: https://gerrit.wikimedia.org/r/#/c/283689/ (duration: 00m 26s)	[production]
15:44	<krenair@tin>	Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/286287/ (duration: 00m 25s)	[production]
15:40	<krenair@tin>	Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/286285/ (duration: 00m 25s)	[production]
15:32	<krenair@tin>	Synchronized php-1.27.0-wmf.22/extensions/Wikidata: https://gerrit.wikimedia.org/r/#/c/286434/2 (duration: 02m 02s)	[production]
15:28	<bblack>	re-pooling esams	[production]
15:22	<jynus>	restarting db1040 for reimage	[production]
15:21	<krenair@tin>	Synchronized php-1.27.0-wmf.22/extensions/Math/MathRestbaseInterface.php: https://gerrit.wikimedia.org/r/#/c/286412/ (duration: 00m 26s)	[production]
15:07	<krenair@tin>	Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/285700/ (duration: 00m 42s)	[production]
14:52	<moritzm>	rolling restart of zookeeper to pick up Java update	[production]
14:22	<bblack>	starting gdnsd on esams (esams is marked down there)	[production]
14:20	<bblack>	stopped gdnsd on eeden	[production]
13:13	<jynus>	stopping db1040 mysql for backup before cloning	[production]
12:15	<elukey>	deployed Varnish change to force HTTP 503 for datasets.wikimedia.org, stats.wikimedia.org, metrics.wikimedia.org as prep-step for OS reimage.	[production]
12:13	<elukey>	deployed Varnish cache::misc change to force HTTP 503 for datasets.wikimedia.org, stats.wikimedia.org, metrics.wikimedia.org as prep-step for OS reimage.	[production]
12:12	<elukey>	Merged Varnish cache::misc change to force HTTP 503 for datasets.wikimedia.org, stats.wikimedia.org, metrics.wikimedia.org as prep-step for OS reimage.	[production]
11:21	<elukey>	deployed the latest version of Event Logging from tin. Service also restarted.	[production]
11:06	<moritzm>	rolling restart of hhvm in eqiad for pcre security update	[production]
10:42	<moritzm>	rolling restart of hhvm in codfw for pcre security update	[production]
09:58	<moritzm>	uploaded openldap 2.4.41+wmf1 for jessie-wikimedia to carbon (T130593)	[production]
08:14	<hashar>	Restarted stuck Jenkins (due to IRC plugin)	[production]
07:44	<moritzm>	rebooting hassaleh/hassium for kernel upgrade to 4.4	[production]
07:10	<moritzm>	installing poppler security updates	[production]
06:46	<_joe_>	rebooting serpens from ganeti, unreachable	[production]
02:30	<l10nupdate@tin>	ResourceLoader cache refresh completed at Mon May 2 02:30:33 UTC 2016 (duration 9m 18s)	[production]
02:21	<mwdeploy@tin>	sync-l10n completed (1.27.0-wmf.22) (duration: 09m 31s)	[production]
2016-05-01 §
19:37	<SMalyshev>	enabled wdqs1002, put wdqs1001 in maintenance mode for reload	[production]
16:20	<volans>	changing live configuration of db1042 thread_pool_stall_limit to 10 to avoid connection timeout errors	[production]
16:18	<volans>	changing live configuration of db1042 thread_pool_stall_limit back to 100 to test impact on connection timeout	[production]
16:08	<volans>	changing live configuration of db1042 thread_pool_stall_limit to 10 to test impact on connection timeout	[production]
15:24	<jynus>	alter table puppet.fact_values to a bigint unsigned for m1 T107753	[production]
15:07	<volans@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1040 for investigation T134114 (duration: 01m 22s)	[production]
14:44	<volans>	truncated puppet.fact_values table to fix puppet (as documented on wikitech)	[production]
10:58	<godog>	reboot furud.codfw.wmnet, ganeti instance with increasing load and 100% iowait, kvm/ganeti idle instance bug likely T134098	[production]
2016-04-30 §
13:41	<elukey>	disabled puppet on analytics1047 and scheduled downtime for the host due to IO errors in dmesg for /dev/sdd. Also stopped Hadoop daemons to remove it from the cluster temporarily (not sure how to do it properly, will write docs).	[production]
10:45	<volans>	Reset slave on sanitarium:3311 due to corrupted relay log after skipping query for duplicate key T132416	[production]