2016-01-11
16:03 |
<thcipriani@tin> |
thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add enwiki as transwiki import source for ta.wikipedia [[gerrit:262352]] (duration: 00m 33s) |
[production] |
15:05 |
<godog> |
repool restbase1004 in pybal, fully bootstrapped and running latest code |
[production] |
11:14 |
<_joe_> |
upgrading etcd to 2.2.1 in production |
[production] |
10:36 |
<_joe_> |
updating nodejs on restbase-test2002 |
[production] |
07:17 |
<_joe_> |
restarting HHVM on a few jobrunners |
[production] |
02:32 |
<l10nupdate@tin> |
l10nupdate@tin ResourceLoader cache refresh completed at Mon Jan 11 02:32:37 UTC 2016 (duration 6m 55s) |
[production] |
02:25 |
<mwdeploy@tin> |
mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 39s) |
[production] |
01:11 |
<paravoid> |
deactivating eqiad<->GTT BGP peering, reported network issues (P2469) |
[production] |
2016-01-10
22:00 |
<gwicke> |
restbase: 1005-1009 now on node 4.2 |
[production] |
19:44 |
<paravoid> |
powercycling mw1004, mw1008, mw1012 |
[production] |
19:38 |
<paravoid> |
restarting hhvm on jobrunners again |
[production] |
12:40 |
<mwdeploy@tin> |
mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 626m 20s) |
[production] |
10:13 |
<ori> |
disabled categoryMembershipChange on mw1165 too, then restart jobrunner / jobchron / hhvm on mw1165 and mw1164 |
[production] |
08:55 |
<ori> |
mw1166 -- disabled puppet; disabled categoryMembershipChange jobs |
[production] |
08:48 |
<ori> |
mw1167 -- disabled puppet; disabled deleteLinks and refreshLinks* jobs |
[production] |
08:45 |
<ori> |
mw1168 -- disabled puppet; disabled restbase jobs |
[production] |
08:41 |
<ori> |
mw1169 -- disabled cirrus jobs. |
[production] |
08:33 |
<ori> |
Attempting to isolate cause of T122069 by toggling job types on mw1169. Disabling Puppet to prevent it from clobbering config changes. |
[production] |
08:29 |
<paravoid> |
restarting hhvm on jobrunners again |
[production] |
04:58 |
<paravoid> |
powercycling mw1005, mw1008, mw1009 -- unresponsive due to OOM |
[production] |
04:56 |
<paravoid> |
restarting HHVM on eqiad jobrunners, OOM, memleak faster than the 24h restarts |
[production] |
2016-01-07
23:24 |
<akosiaris> |
repooled scb1002 for mobileapps |
[production] |
23:24 |
<akosiaris> |
enabled puppet,salt on scb1001 |
[production] |
23:23 |
<mobrovac> |
mobileapps deploying 58b371a on scb1001 |
[production] |
23:09 |
<mobrovac> |
mobileapps deploying 58b371a on scb1002 |
[production] |
23:01 |
<akosiaris> |
apt-mark hold nodejs on scb1001, etherpad1001 and maps-test200{1,2,3,4} |
[production] |
22:58 |
<akosiaris> |
disable puppet and salt on scb1001 for nodejs 4.2 transition |
[production] |
22:57 |
<akosiaris> |
depool scb1002 for mobileapps. Transition to nodejs 4.2 ongoing |
[production] |
19:21 |
<YuviPanda> |
started tools / maps backup on labstore1001 |
[production] |
19:13 |
<YuviPanda> |
remove snapshots others20150815030010, others20150815030010, maps20151216040005 and maps20151028040004 that were all stale and should've been removed anyway (on labstore2001) |
[production] |
19:13 |
<YuviPanda> |
remove snapshots others20150815030010, others20150815030010, maps20151216040005 and maps20151028040004 that were all stale and should've been removed anyway |
[production] |
19:11 |
<YuviPanda> |
run sudo lvremove backup/tools20151216020005 on labstore2001 to clean up full snapshot |
[production] |
19:11 |
<jynus> |
setting up watchdog process killing long running queries on db1051 |
[production] |
18:54 |
<_joe_> |
also resetting the drac |
[production] |
18:53 |
<_joe_> |
powercycling ms-be1013 |
[production] |
02:32 |
<l10nupdate@tin> |
l10nupdate@tin ResourceLoader cache refresh completed at Thu Jan 7 02:32:04 UTC 2016 (duration 6m 54s) |
[production] |
02:25 |
<mwdeploy@tin> |
mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 33s) |
[production] |