2017-07-17
07:05 |
<marostegui@tin> |
scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details) |
[production] |
07:00 |
<marostegui> |
Rename labsdb1011 main replication thread to a specific one - T153743 |
[production] |
06:50 |
<marostegui> |
Stop replication on db1095 for maintenance - T153743 |
[production] |
06:48 |
<marostegui> |
Deploy alter table on s1 - db1073 - T166204 |
[production] |
06:47 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1073 - T166204 (duration: 01m 04s) |
[production] |
05:21 |
<marostegui> |
Add 50G to /srv on db1069 |
[production] |
05:09 |
<marostegui> |
Restart MySQL on labsdb1009 for maintenance - T170657 |
[production] |
03:13 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Mon Jul 17 03:13:51 UTC 2017 (duration 7m 16s) |
[production] |
03:06 |
<l10nupdate@tin> |
scap sync-l10n completed (1.30.0-wmf.9) (duration: 12m 48s) |
[production] |
02:31 |
<l10nupdate@tin> |
scap sync-l10n completed (1.30.0-wmf.7) (duration: 09m 13s) |
[production] |
2017-07-14
21:48 |
<mutante> |
netmon1003 - reinstalled with jessie - saw nothing on ganeti console at all, which was a bit confusing, but install finished anyway - adding to puppet / signing cert (T170655) |
[production] |
20:47 |
<bblack> |
mailbox lag: restarting cp1074 backend |
[production] |
19:50 |
<mutante> |
wikitech-static: re-enabled HSTS - line was commented out in Apache config, activated it again |
[production] |
18:54 |
<herron> |
added exim from/subject filter for spam observed from qq.com - T170601 |
[production] |
16:36 |
<herron> |
lowered mailman/lists spam_score exim acl to 6 - T170601 |
[production] |
11:41 |
<marostegui> |
Add 50G to /srv/ on dbstore1002 - T168303 |
[production] |
11:35 |
<jynus> |
stop db2062 and db2072 for cloning |
[production] |
10:43 |
<jynus> |
altering wmde_analytics_betafeature_users_today table to ENGINE=InnoDB |
[production] |
10:17 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Depool db2062 (duration: 00m 47s) |
[production] |
09:57 |
<moritzm> |
uploaded nodejs_6.11.0~dfsg-1+wmf to apt.wikimedia.org (for jessie and stretch) (T170548) |
[production] |
07:22 |
<marostegui> |
Stop replication on labsdb1011 for maintenance - T153743 |
[production] |
06:59 |
<marostegui> |
Create views for dinwiki on labsdb1009, 1010 and 1011 - T169193 |
[production] |
05:54 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Repool db1072 - T166204 (duration: 00m 46s) |
[production] |
04:21 |
<mutante> |
netmon1002/netmon2001 - change UID/GID for rancid to universal 445/445, use find -exec to chown existing files for cleaner data syncing, define UID on wikitech page UID (T166180) |
[production] |
2017-07-13
23:47 |
<thcipriani@tin> |
rebuilt wikiversions.php and synchronized wikiversions files: revert all wikis to php-1.30.0-wmf.9, again |
[production] |
23:19 |
<thcipriani@tin> |
rebuilt wikiversions.php and synchronized wikiversions files: all wikis to php-1.30.0-wmf.9 |
[production] |
23:08 |
<thcipriani@tin> |
rebuilt wikiversions.php and synchronized wikiversions files: revert all wikis to php-1.30.0-wmf.9 |
[production] |
22:57 |
<thcipriani@tin> |
rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.9 |
[production] |
22:04 |
<bd808> |
Stashbot working after backend ElasticSearch cluster upgrade |
[production] |
21:28 |
<robh@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=wdqs2002.codfw.wmnet |
[production] |
21:28 |
<robh@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=wdqs2003.codfw.wmnet |
[production] |
20:56 |
<demon@tin> |
Synchronized wmf-config/InitialiseSettings.php: MinervaNeue on testwiki (duration: 00m 47s) |
[production] |
20:01 |
<smalyshev@tin> |
Finished deploy [wdqs/wdqs@a32dbeb]: Redeploy GUI due to breakage in T165228 (duration: 02m 19s) |
[production] |
19:59 |
<smalyshev@tin> |
Started deploy [wdqs/wdqs@a32dbeb]: Redeploy GUI due to breakage in T165228 |
[production] |
18:39 |
<dzahn@neodymium> |
conftool action : set/pooled=yes; selector: name=mw2202.codfw.wmnet |
[production] |
18:38 |
<dzahn@neodymium> |
conftool action : set/pooled=yes; selector: name=mw2201.codfw.wmnet |
[production] |
18:31 |
<arlolra> |
Updated Parsoid to 71c07681 (T169293) |
[production] |
18:29 |
<bblack> |
upgrading nginx on +wmf1 hosts: conf[1001-1003].eqiad.wmnet,cp1048.eqiad.wmnet,cp3036.esams.wmnet,elastic2020.codfw.wmnet,hassaleh.codfw.wmnet,hassium.eqiad.wmnet |
[production] |
18:22 |
<arlolra@tin> |
Finished deploy [parsoid/deploy@d0041f2]: Updating Parsoid to 71c07681 (duration: 11m 12s) |
[production] |
18:11 |
<arlolra@tin> |
Started deploy [parsoid/deploy@d0041f2]: Updating Parsoid to 71c07681 |
[production] |
17:46 |
<volans> |
re-enabling puppet and force run on 'R:Package = nginx-common' |
[production] |
17:38 |
<bblack> |
restarting varnish-be on cp1049 (mailbox lag) |
[production] |
17:36 |
<bblack> |
restarting puppetmasters, staggered |
[production] |
17:06 |
<volans> |
disabled puppet on nitrogen |
[production] |
16:34 |
<chasemp> |
labstore2001:~# systemctl disable lvm2-activation && systemctl disable lvm2-activation-early && systemctl reset-failed (slated to be reimaged by madhu -- this alert is non-actionable) |
[production] |
16:19 |
<urandom> |
Starting cassandra-a, restbase2007 (OOM) |
[production] |