2016-10-18
|
07:40 |
<yuvipanda> |
completed moving all general tools exec nodes to tools-puppetmaster-02 |
[tools] |
07:33 |
<yuvipanda> |
restarted puppetmaster on tools-puppetmaster-01 |
[tools] |
06:16 |
<elukey> |
created the oozie coordinator 0035451-160922102909979-oozie-oozi-C for webrequest-load-check_sequence_statistics-wf-upload-2016-10-18-5 |
[analytics] |
05:51 |
<elukey> |
created the oozie coordinator 0035415-160922102909979-oozie-oozi-C for webrequest-load-check_sequence_statistics-wf-upload-2016-10-18-[234] |
[analytics] |
03:19 |
<mutante> |
restarted grrrit-wm |
[production] |
03:18 |
<mutante> |
gerrit has logs now in /var/log/gerrit/ |
[production] |
03:15 |
<mutante> |
restarting gerrit for logging config change |
[production] |
02:37 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Tue Oct 18 02:37:01 UTC 2016 (duration 5m 49s) |
[production] |
02:31 |
<mwdeploy@tin> |
scap sync-l10n completed (1.28.0-wmf.22) (duration: 10m 20s) |
[production] |
00:48 |
<bblack> |
restarting API hhvms with >40% mem usage via salt every 10 minutes in a loop from here forward. screen session on neodymium, named api-hhvm-restarts |
[production] |
00:39 |
<mutante> |
restarted hhvm on mw1281 (was at 47.7% usage) |
[production] |
00:31 |
<bblack> |
restarting hhvm on API nodes where it's using >30% mem |
[production] |
00:22 |
<bblack> |
restarting hhvm on *API* nodes where it's using >50% mem |
[production] |
00:22 |
<bblack> |
restarting hhvm on nodes where it's using >50% mem |
[production] |
00:05 |
<mutante> |
restarted hhvm on mw1194,mw1197,mw1198 |
[production] |
2016-10-17
|
23:27 |
<Pchelolo> |
running import deletions script on restbase1007 |
[production] |
22:26 |
<mutante> |
restarted gerrit on cobalt |
[production] |
22:07 |
<Pchelolo> |
running restriction import script on restbase1007 |
[production] |
20:58 |
<mutante> |
tegmen - stopped duplicate icinga-wm (ircecho) |
[production] |
20:53 |
<mutante> |
maintenance servers, terbium and wasat, now have IPv6 connectivity |
[production] |
20:33 |
<bearND> |
deployed mobileapps 13fa4b4 |
[production] |
20:32 |
<Krenair> |
updated status.wm.o apache config on wikitech-static box to correctly serve static assets again (T148438) |
[production] |
20:30 |
<bearND> |
starting mobileapps deploy |
[production] |
19:31 |
<cwd> |
disabled all dedupe jobs besides "contacts" |
[production] |
18:38 |
<gehel> |
deploying latest gui and binaries for wdqs |
[production] |
18:35 |
<Jeff_Green> |
switch payments-listener back to eqiad |
[production] |
18:17 |
<_joe_> |
dumping core on mw1194 |
[production] |
17:32 |
<Jeff_Green> |
switch payments-listener to codfw |
[production] |
17:20 |
<_joe_> |
restarting lvs on lvs1003/1006 for the api change |
[production] |
17:04 |
<Krenair> |
Restarted wm-bot instance, was not responding to HTTP or SSH |
[bots] |
16:57 |
<elukey> |
started the oozie coordinator 0034720-160922102909979-oozie-oozi-C to re-execute webrequest-load-wf-upload-2016-10-17-14 |
[analytics] |
16:42 |
<ottomata> |
restarting hadoop nodemanagers 1 at a time |
[analytics] |
16:42 |
<ottomata> |
restarting hadoop nodemanagers 1 at a time |
[production] |
16:18 |
<ori> |
Restarted HHVM on API cluster EQIAD |
[production] |
15:54 |
<bd808> |
Stopped and started meetbot job to move from precise to trusty |
[tools.meetbot] |
15:32 |
<ottomata> |
rebooting analytics1030 |
[production] |
15:32 |
<ottomata> |
rebooting analytics1030 |
[analytics] |
15:13 |
<elukey> |
ran kafka preferred-replica-election to allow kafka1018 to be back as broker replica leader |
[production] |
14:38 |
<elukey> |
mw1167 back in service after reimage (MW Jobrunner) |
[production] |
14:37 |
<chasemp> |
remove bdsync-deb and bdsync-deb-2 erroneously created in Tools and now defunct anyway |
[tools] |
14:30 |
<ori@tin> |
Synchronized php-1.28.0-wmf.22/extensions/EducationProgram/includes/Events/EditEventCreator.php: Id02366ef: Fix-up for Ia3d767e86 (duration: 00m 52s) |
[production] |
14:06 |
<ori@tin> |
Synchronized wmf-config/InitialiseSettings.php: I8562f8e1: Enable AbuseFilterCachingParser on metawiki and commonswiki (duration: 00m 56s) |
[production] |
14:05 |
<chasemp> |
restart puppetmaster on tools-puppetmaster-01 (instances sticking on puppet runs for a long time) |
[tools] |
14:01 |
<chasemp> |
reboot tools-exec-1215 and tools-exec-1410 as unresponsive |
[tools] |
13:06 |
<elukey> |
reimage mw1167 to Debian (MW Jobrunner) |
[production] |
12:31 |
<marostegui> |
Stopping MySQL db2055 (S1-codfw) to import S1 to dbstore2001 - T146261 |
[production] |
11:39 |
<akosiaris> |
T148830 poweroff sca1001, sca1002, sca2001, sca2002 |
[production] |
11:38 |
<jynus> |
stopping db1048 for general upgrade & reconfiguration |
[production] |
10:57 |
<godog> |
deploy thumbor 0.1.28 to thumbor100[12] |
[production] |
10:38 |
<moritzm> |
uploaded openssl 1.1.0b1+wmf1 for jessie-wikimedia to carbon (patched to be co-installable with our default 1.0.2 packages, build against libssl11-dev to use openssl 1.1) |
[production] |