2018-01-17
§
|
15:41 |
<_joe_> |
dropping ruwiki htmlCacheUpdate records stuck in the old jobqueue |
[production] |
15:36 |
<moritzm> |
upgrading nginx on mw servers in codfw to 1.13.6-2+wmf1~jessie1 |
[production] |
15:32 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Fully repool db1104 (duration: 01m 12s) |
[production] |
14:57 |
<moritzm> |
resetting RAC on labsdb1007 (serial console inaccessible) |
[production] |
14:53 |
<moritzm> |
resetting RAC on labsdb1006 (serial console inaccessible) |
[production] |
14:38 |
<chasemp> |
labstore1001:~# /sbin/reboot |
[production] |
14:27 |
<zeljkof> |
EU SWAT finished |
[production] |
14:23 |
<zfilipin@tin> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:404327|Create "eliminator" user group on ur.wikipedia (T184607)]] (duration: 01m 12s) |
[production] |
14:14 |
<moritzm> |
repooling chromium |
[production] |
14:14 |
<zfilipin@tin> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:404624|Add Draft Namespace in enwikiversity (T184957)]] (duration: 01m 12s) |
[production] |
14:07 |
<moritzm> |
rebooting chromium for kernel security update |
[production] |
14:04 |
<gehel> |
restart of elasticsearch / cirrus eqiad completed (cluster still recovering) |
[production] |
14:03 |
<moritzm> |
depooling chromium |
[production] |
13:51 |
<chasemp> |
reboot labstore2003 |
[production] |
13:46 |
<akosiaris> |
reboot sca2003 webperf2001 planet2001 poolcounter2002 mx2001 kubetcd200{1,2,3} install2002 dbmonitor2001 alsafi acrux hassaleh diadem nihal pybal-test200{1,2,3} releases2001 tureis for PCID, INVPCID |
[production] |
13:45 |
<chasemp> |
labstore2002:~# sudo update-grub && /sbin/reboot |
[production] |
13:40 |
<chasemp> |
labstore2001:~# /sbin/reboot |
[production] |
13:39 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 (duration: 01m 13s) |
[production] |
13:31 |
<akosiaris> |
reboot acrab for PCID,INVPCID enabling |
[production] |
13:22 |
<marostegui> |
Deploy schema change on db1099:3318 - https://phabricator.wikimedia.org/T174569 |
[production] |
13:22 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 - T174569 (duration: 01m 12s) |
[production] |
13:17 |
<moritzm> |
upgrading app server canaries to 3.18.5+wmf4 |
[production] |
13:12 |
<marostegui> |
Fixing drifts on db1065 - T162807 |
[production] |
12:28 |
<moritzm> |
uploading HHVM 3.18.5+wmf4 for jessie-wikimedia to apt.wikimedia.org (3.18.7 with the patch https://github.com/facebook/hhvm/commit/bd7b2bcfe70b053a3a001480653012f68599250f backed out) |
[production] |
12:10 |
<moritzm> |
updating HHVM in deployment-prep to 3.18.5+wmf4 |
[production] |
11:40 |
<godog> |
bootstrap cassandra-b on restbase1016 |
[production] |
11:28 |
<moritzm> |
rearmed keyholder on neodymium |
[production] |
11:24 |
<moritzm> |
rebooting neodymium for kernel security update |
[production] |
11:19 |
<_joe_> |
restarted nginx on mw1346, was in a bad state |
[production] |
10:51 |
<moritzm> |
reset RAC on chromium, serial console is inaccessible |
[production] |
10:42 |
<moritzm> |
repooling hydrogen |
[production] |
10:39 |
<moritzm> |
rebooting hydrogen for kernel security update |
[production] |
10:34 |
<moritzm> |
depooling hydrogen again |
[production] |
10:22 |
<moritzm> |
repooling hydrogen (and pdns-recursor restarted), experiment concluded |
[production] |
10:14 |
<moritzm> |
depooling hydrogen (and keeping pdns-recursor stopped for a few minutes to check whether problems with load-balanced recdns traffic are still an issue) |
[production] |
10:11 |
<moritzm> |
reset RAC on hydrogen, serial console was inaccessible |
[production] |
10:01 |
<godog> |
start cassandra-a on restbase1016 |
[production] |
09:52 |
<elukey> |
reboot druid1005 for kernel upgrades |
[production] |
09:46 |
<elukey> |
removed upstart config for brrd on eventlog1001 (failing and spamming syslog, old leftover?) |
[production] |
09:34 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Full repool db1101:3318 (duration: 01m 11s) |
[production] |
09:30 |
<moritzm> |
rebooting flerovium and furud for kernel security update |
[production] |
09:17 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101:3318 (duration: 01m 12s) |
[production] |
09:14 |
<godog> |
reimage restbase1016 - T184100 |
[production] |
09:06 |
<elukey> |
reboot analytics1003 for kernel upgrades |
[production] |
09:00 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 01m 11s) |
[production] |
08:56 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Slowly repool db1101:3318 (duration: 15m 42s) |
[production] |
08:44 |
<elukey> |
reboot stat100[456] for kernel upgrades |
[production] |
07:48 |
<elukey> |
restart varnish backend on cp4024 (ton of 503s, icinga alerting for mailbox lag) |
[production] |
07:46 |
<oblivian@neodymium> |
conftool action : set/pooled=inactive; selector: cluster=appserver,name=mw12([0-1][0-9]|20)\.eqiad\.wmnet |
[production] |
07:45 |
<_joe_> |
depooling mw1209-1220 from the appserver cluster for decommissioning, T185004 |
[production] |