2018-10-25
|
15:02 |
<godog> |
test rsyslog 8.38 upgrade on lithium - T136312 |
[production] |
14:28 |
<elukey> |
upgrade druid on druid100[4-6] to Druid 0.12.3 |
[analytics] |
14:28 |
<elukey> |
upgrade druid on druid100[4-6] to Druid 0.12.3 |
[production] |
14:24 |
<elukey> |
added AAAA DNS records to all the druid nodes |
[analytics] |
14:20 |
<banyek> |
running dns update (gerrit patch: 467711) |
[production] |
13:48 |
<anomie@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Setting comment table migration stage to write-new/read-both on all wikis (T166733) (duration: 00m 55s) |
[production] |
13:46 |
<godog> |
reformat ms-be2043 xfs filesystems - T199198 |
[production] |
13:29 |
<XioNoX> |
test successful, rollback add term return-tcp permit on cr2-codfw |
[production] |
13:28 |
<XioNoX> |
test add term return-tcp permit on cr2-codfw |
[production] |
12:14 |
<volans> |
rebooting cumin1001 to pick new kernel and clear any potential weird state after OOMs |
[production] |
12:01 |
<zeljkof> |
EU SWAT finished |
[production] |
11:17 |
<zfilipin@deploy1001> |
Synchronized wmf-config/throttle.php: SWAT: [[gerrit:469261|New throttle rule for Johannesburg Event on 2018-10-27 (T207742)]] (duration: 00m 55s) |
[production] |
11:08 |
<zfilipin@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:465418|Stop collecting data CitationUsage and CitationUsagePageLoad (T191086 T203253)]] (duration: 00m 57s) |
[production] |
10:57 |
<volans> |
restart pdfrender on scb1003 |
[production] |
10:36 |
<joal> |
Resuming oozie webrequest and pageview druid hourly indexation jobs |
[analytics] |
10:35 |
<elukey> |
upgraded Druid on druid100[1-3] to 0.12.3-1 |
[analytics] |
10:11 |
<elukey> |
upgrade druid100[1-3] to druid 0.12.3 |
[production] |
09:51 |
<gehel> |
resetting deployment directory on wdqs1003 |
[production] |
09:16 |
<elukey> |
upgrade turnilo to 1.8.1 |
[analytics] |
09:15 |
<elukey@deploy1001> |
Finished deploy [analytics/turnilo/deploy@84bf1ad]: Upgrade to 1.8.1 (duration: 00m 10s) |
[production] |
09:15 |
<elukey@deploy1001> |
Started deploy [analytics/turnilo/deploy@84bf1ad]: Upgrade to 1.8.1 |
[production] |
09:10 |
<ema> |
resume cache hosts rolling reboots for kernel/microcode updates T203011 |
[production] |
08:56 |
<elukey> |
restart hive-server on an-coord1001 to pick up new prometheus settings |
[analytics] |
08:10 |
<joal> |
Suspend webrequest-druid-hourly and pageview-druid-hourly oozie jobs |
[analytics] |
07:52 |
<joal> |
Manually add za.wikimedia to pageview-whitelist (patch merged: https://gerrit.wikimedia.org/r/469557) |
[analytics] |
07:50 |
<hashar> |
enabling puppet again on deployment-deploy01. Was disabled by _joe_ for apache-fast-test hacking |
[releng] |
07:16 |
<vgutierrez> |
Uploaded certcentral 0.3 to apt.wikimedia.org (stretch) - T207737 T207478 |
[production] |
07:11 |
<moritzm> |
installing requests security updates on trusty |
[production] |
06:17 |
<SMalyshev> |
depooling wdqs1003 again, it's not catching up like the other hosts |
[production] |
06:06 |
<elukey> |
upload druid 0.12.3-1 debs to stretch-wikimedia |
[production] |
2018-10-24
|
23:24 |
<maxsem@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/469495/ (duration: 00m 54s) |
[production] |
23:15 |
<maxsem@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/462040/ (duration: 00m 55s) |
[production] |
23:08 |
<bawolff@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Deploy csp report-only to small.dblist wikis T207900 (duration: 00m 56s) |
[production] |
22:38 |
<bawolff@deploy1001> |
Synchronized wmf-config/CommonSettings.php: Deploy csp report-only to outreachwiki T207900 (duration: 00m 54s) |
[production] |
22:36 |
<bawolff@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Deploy csp report-only to outreachwiki T207900 (duration: 00m 54s) |
[production] |
22:33 |
<bawolff@deploy1001> |
scap failed: average error rate on 8/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) |
[production] |
22:27 |
<eileen_> |
civicrm revision changed from 1c0a1b2406 to 97506677e8, config revision is c0a8be03a1 |
[production] |
21:33 |
<banyek> |
compressing tables in s1@dbstore2002 (T204930) |
[production] |
21:26 |
<banyek> |
pausing replication on dbstore2002 (T204930) |
[production] |
20:44 |
<hasharDinner> |
Rebuilding CI containers for Quibble 0.0.28 |
[releng] |
20:39 |
<gtirloni> |
disabled puppet temporarily on shinken-02 |
[shinken] |
20:27 |
<hasharDinner> |
tagged Quibble 0.0.28 at 1ac8fe353a8b7bbf5b31e6094ef06ee459c553a3 |
[releng] |
19:38 |
<twentyafterfour> |
The train is now blocked by database lock contention of unknown origin |
[production] |
19:31 |
<twentyafterfour> |
the errors were all coming from wmf.26 but the error rate skyrocketed after deploying 1.33.0-wmf.1 to group1 so there is some query in the new branch which is holding a lock. T207881 |
[production] |
19:19 |
<twentyafterfour@deploy1001> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.1 refs T206655 |
[production] |
18:16 |
<XioNoX> |
enable BGP sessions to transit/peering on cr2-eqord - T204170 |
[production] |
17:20 |
<gehel> |
repooling all elasticsearch servers in eqiad |
[production] |
17:12 |
<cmjohnson1> |
rebooting cloudvirt1019 |
[production] |
17:04 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s) |
[production] |
17:03 |
<jforrester@deploy1001> |
scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) |
[production] |