production SAL

2017-03-06 §
12:44	<moritzm>	upgrading apache on graphite*	[production]
11:49	<moritzm>	installing imagemagick security updates	[production]
11:36	<moritzm>	upgrading apache on krypton	[production]
11:30	<moritzm>	upgrading apache on planet.wikimedia.org	[production]
11:05	<elukey>	reimage the first Hadoop worker node (an1040) to Debian Jessie	[production]
10:46	<moritzm>	upgrading apache on mediawiki servers in codfw	[production]
10:36	<gehel>	upgrade to elasticsearch 5.2.2 on relforge cluster - T156150	[production]
10:24	<elukey>	(shamefully) replaced /etc/init.d/hadoop-hdfs-datanode script with "exit 0" to prevent the HDFS datanode daemon from starting on analytics1028 (broken disk) and leave the rest running (puppet included) - T159632	[production]
10:12	<gehel>	postgresql upgrade on maps* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4)	[production]
10:06	<ariel@tin>	Finished deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var (duration: 00m 01s)	[production]
10:06	<ariel@tin>	Started deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var	[production]
09:46	<gehel>	postgresql upgrade on maps-test* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4)	[production]
09:14	<ariel@tin>	Finished deploy [dumps/dumps@04794df]: move default config into a file and clean up (duration: 00m 02s)	[production]
09:14	<ariel@tin>	Started deploy [dumps/dumps@04794df]: move default config into a file and clean up	[production]
09:09	<gehel>	killing stuck tilerator notification on maps-test2001 - T145534	[production]
07:22	<marostegui>	Resume pt-table-checksum on plwiki (s2) - T154485	[production]
06:59	<marostegui>	Deploy ALTER table on db2046 (s6) for the revision table - T159414	[production]
06:46	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Depool db2046 - T159414 (duration: 00m 51s)	[production]
02:24	<l10nupdate@tin>	ResourceLoader cache refresh completed at Mon Mar 6 02:24:24 UTC 2017 (duration 5m 19s)	[production]
02:19	<l10nupdate@tin>	scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 15s)	[production]
01:29	<cwd>	updated staging civicrm database and triggers	[production]
2017-03-05 §
22:23	<Reedy>	Generating some more captchas again T159581	[production]
10:19	<elukey>	disabled puppet on analytics1028 to prevent puppet from starting the HDFS daemon (T159632)	[production]
02:24	<l10nupdate@tin>	ResourceLoader cache refresh completed at Sun Mar 5 02:24:02 UTC 2017 (duration 5m 20s)	[production]
02:18	<l10nupdate@tin>	scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 07s)	[production]
2017-03-04 §
16:43	<Reedy>	Manually generating even more captchas (going up to 10k total) in screen as reedy on terbium T159581	[production]
16:35	<Reedy>	Manually generating some more captchas T159581	[production]
03:28	<legoktm>	pausing refreshLinks.php run due to increase in job queue	[production]
03:05	<mutante>	planet2001 - and this time it just worked and i can't reproduce the issue. install finished. re-adding to puppet, signing certs...	[production]
03:00	<mutante>	planet2001 - reinstalling once more (T159432)	[production]
02:36	<l10nupdate@tin>	ResourceLoader cache refresh completed at Sat Mar 4 02:36:25 UTC 2017 (duration 5m 19s)	[production]
02:31	<l10nupdate@tin>	scap sync-l10n completed (1.29.0-wmf.14) (duration: 12m 10s)	[production]
00:52	<mutante>	conf2002 - ran "systemctl reset-failed" to fix Icinga alert about broken systemd state due to formerly existing but failed service etcdmirror-eqiad-wmnet. turns out you need this to remove missing units. found on http://serverfault.com/questions/606520/how-to-remove-missing-systemd-units (T131959)	[production]
2017-03-03 §
23:23	<RainbowSprinkles>	phabricator: restarted apache 1 last time, removed hack	[production]
23:19	<mutante>	icinga: for special external hosts benefactorevents and eventdonations, "submit passive check result for this host" -> "check_tcp -p 80" to avoid "crit hosts" that just don't respond to ICMP (http://www.htmlgraphic.com/nagios-check-host-without-ping/)	[production]
23:12	<RainbowSprinkles>	phabricator: restarting apache real quick	[production]
22:03	<hashar>	rebooting contint2001	[production]
21:54	<hashar>	restarting Jenkins	[production]
21:51	<hashar>	enabling puppet on contint1001 and puppet-run	[production]
21:05	<hashar>	disabled puppet on contint1001	[production]
20:26	<mattflaschen@tin>	Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 00m 40s)	[production]
19:35	<ebernhardson>	restart elasticsearch on relforge1002 to update remote reindex whitelist	[production]
19:33	<ebernhardson>	restart elasticsearch on relforge1001 to update remote reindex whitelist	[production]
19:11	<legoktm>	running refreshLinks.php across small wikis	[production]
18:43	<addshore@tin>	Synchronized php-1.29.0-wmf.14/extensions/RevisionSlider/modules/ext.RevisionSlider.css: T159428 [[gerrit:340794|Quick fix for misplaced tooltips on RTL wikis]] (duration: 00m 42s)	[production]
17:34	<hashar>	CI is mostly recovered. It could not spawn instances anymore. The queue is being processed and will take a while to be completed. Check status on https://integration.wikimedia.org/zuul/ | T159543	[production]
16:17	<hashar>	Stopped Jenkins from processing builds while instances are being recycled	[production]
13:37	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Repool db2067 - T159414 (duration: 00m 50s)	[production]
13:12	<elukey>	removed apache2 (rc state) and apache2-utils from analytics1027	[production]
11:11	<elukey@tin>	Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 00m 14s)	[production]