production SAL

2021-04-15
04:14	<ryankemper@cumin2001>	START - Cookbook sre.wdqs.data-transfer	[production]
04:14	<ryankemper>	T280108 T267927 `wdqs2008` (source) caught up on lag, xfering to `wdqs1004`: `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs1004.eqiad.wmnet --reason "transferring wikidata journal following reload from dumps" --blazegraph_instance blazegraph --task-id T267927`	[production]
04:06	<ryankemper>	T280108 T267927 Merged https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/679320, will verify correct behavior of `data-transfer` cookbook	[production]
01:19	<Amir1>	mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php wikidatawiki --property-id P8671 --new-data-type external-id (T278427)	[production]
00:50	<ejegg>	updated fundraising CiviCRM from c3342aa4ea to 35a8dd33ba	[production]
2021-04-14
23:27	<ladsgroup@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:679350|Disable legacy javascript global variables in ruwiki (T72470)]] (duration: 01m 16s)	[production]
21:44	<legoktm>	manually started debmonitor-client.service on ml-serve2004 after 502 Bad gateway error	[production]
20:55	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wtp[1037-1039].eqiad.wmnet with reason: reimage	[production]
20:54	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wtp[1037-1039].eqiad.wmnet with reason: reimage	[production]
20:38	<mutante>	wtp1037, wtp1038, wtp1039 - scap pull	[production]
19:52	<dzahn@cumin1001>	conftool action : set/weight=20; selector: name=mw2395.codfw.wmnet,cluster=jobrunner	[production]
19:52	<dzahn@cumin1001>	conftool action : set/weight=20; selector: name=mw2394.codfw.wmnet,cluster=jobrunner	[production]
19:51	<dzahn@cumin1001>	conftool action : set/weight=20; selector: name=mw2410.codfw.wmnet,cluster=videoscaler	[production]
19:51	<dzahn@cumin1001>	conftool action : set/weight=20; selector: name=mw2411.codfw.wmnet,cluster=videoscaler	[production]
19:50	<cstone>	civicrm revision changed from ec2a3bcff6 to c3342aa4ea	[production]
19:50	<dzahn@cumin1001>	conftool action : set/weight=15; selector: name=mw2411.codfw.wmnet,cluster=videoscaler	[production]
19:50	<dzahn@cumin1001>	conftool action : set/weight=15; selector: name=mw2410.codfw.wmnet,cluster=videoscaler	[production]
19:49	<dzahn@cumin1001>	conftool action : set/weight=10; selector: name=mw2395.codfw.wmnet,cluster=videoscaler	[production]
19:48	<dzahn@cumin1001>	conftool action : set/weight=10; selector: name=mw2394.codfw.wmnet,cluster=videoscaler	[production]
19:47	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw2411.codfw.wmnet,cluster=jobrunner	[production]
19:42	<herron>	migrating kafka-logging broker logstash1011 to kafka-logging1002 T279342	[production]
19:06	<jhuneidi@deploy1002>	Synchronized php: group1 wikis to 1.37.0-wmf.1 refs T278345 (duration: 02m 03s)	[production]
19:04	<jhuneidi@deploy1002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.1 refs T278345	[production]
18:58	<mutante>	urldownloader1002 - icinga alerted about disk space, ran 'apt-get clean' which is my usual go to in that case. it reduced usage from 97% to 89%	[production]
17:56	<urbanecm@deploy1002>	Synchronized php-1.37.0-wmf.1/extensions/GrowthExperiments/: ce44792: 84107c5: GrowthExperiments backports related to DatabaseMentorStore (T279957; T279959) (duration: 01m 55s)	[production]
15:00	<shdubsh>	run new curator actions on codfw - T274394	[production]
14:48	<shdubsh>	O:logstash::elasticsearch7 update elasticsearch-curator to 5.8.1	[production]
14:13	<rzl>	mcrouter cert renewal complete, puppet re-enabled T276029	[production]
14:11	<zpapierski@deploy1002>	Finished deploy [wikimedia/discovery/analytics@8ae53e3]: T273847 export queries to relforge dag deployment - start date update (duration: 02m 14s)	[production]
14:11	<moritzm>	installing intel-microcode updates on Buster	[production]
14:09	<zpapierski@deploy1002>	Started deploy [wikimedia/discovery/analytics@8ae53e3]: T273847 export queries to relforge dag deployment - start date update	[production]
13:48	<rzl>	disabling puppet on C:mcrouter for cert renewal	[production]
13:43	<marostegui@cumin1001>	dbctl commit (dc=all): 'Remove weight from es5 master', diff saved to https://phabricator.wikimedia.org/P15342 and previous config saved to /var/cache/conftool/dbconfig/20210414-134331-marostegui.json	[production]
13:34	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15341 and previous config saved to /var/cache/conftool/dbconfig/20210414-133411-root.json	[production]
13:29	<zpapierski@deploy1002>	Finished deploy [wikimedia/discovery/analytics@825c60a]: T273847 export queries to relforge dag deployment - schedule change (duration: 02m 08s)	[production]
13:27	<zpapierski@deploy1002>	Started deploy [wikimedia/discovery/analytics@825c60a]: T273847 export queries to relforge dag deployment - schedule change	[production]
13:19	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15340 and previous config saved to /var/cache/conftool/dbconfig/20210414-131908-root.json	[production]
13:12	<moritzm>	installing OpenSSL updates on buster	[production]
13:12	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)	[production]
13:04	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15339 and previous config saved to /var/cache/conftool/dbconfig/20210414-130404-root.json	[production]
13:02	<hnowlan@cumin1001>	START - Cookbook sre.cassandra.roll-restart	[production]
13:01	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)	[production]
13:01	<godog>	extend prometheus global @ codfw by 100G	[production]
12:49	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15338 and previous config saved to /var/cache/conftool/dbconfig/20210414-124901-root.json	[production]
12:39	<elukey>	update kafka term for analytics-in{4,6} on cr{1,2}-eqiad to include kafka-logging1001 - ref: https://gerrit.wikimedia.org/r/c/operations/homer/public/+/679296	[production]
12:36	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1039.eqiad.wmnet with reason: REIMAGE	[production]
12:34	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1039.eqiad.wmnet with reason: REIMAGE	[production]
12:34	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1038.eqiad.wmnet with reason: REIMAGE	[production]
12:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15337 and previous config saved to /var/cache/conftool/dbconfig/20210414-123357-root.json	[production]
12:32	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1037.eqiad.wmnet with reason: REIMAGE	[production]