2020-12-16
|
12:21 |
<urbanecm@deploy1001> |
Synchronized wmf-config/MetaContactPages.php: 0c651a6adc2d07b4163fba47109a5070884e7f54: MetaContactPages: Remove licenseabuse contact page (T269781) (duration: 01m 03s) |
[production] |
12:21 |
<hnowlan@puppetmaster1001> |
conftool action : set/weight=5; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1265.eqiad.wmnet |
[production] |
12:20 |
<hnowlan@puppetmaster1001> |
conftool action : set/weight=5; selector: dc=eqiad,cluster=appserver,service=apache2,name=mw1265.eqiad.wmnet |
[production] |
12:20 |
<jayme> |
imported kubernetes 1.16.15-2 into component/kubernetes-future stretch-wikimedia |
[production] |
11:52 |
<marostegui> |
Stop s1, s3, s5 and s8 on db1124 to copy it to db1154 (this will generate lag on wikireplicas) T268742 |
[production] |
11:19 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2001.codfw.wmnet with reason: REIMAGE |
[production] |
11:19 |
<jiji@deploy1001> |
Synchronized wmf-config/ProductionServices.php: Swap mc1019 with mc1031 for Redis lock manager - T265643 (duration: 01m 17s) |
[production] |
11:17 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2022.codfw.wmnet with reason: REIMAGE |
[production] |
11:15 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1022.eqiad.wmnet with reason: REIMAGE |
[production] |
11:15 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2022.codfw.wmnet with reason: REIMAGE |
[production] |
11:14 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on pki2001.codfw.wmnet with reason: REIMAGE |
[production] |
11:13 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc1022.eqiad.wmnet with reason: REIMAGE |
[production] |
11:10 |
<jynus> |
stopping and restarting dbstore1004 to mitigate (short term) T270112 |
[production] |
10:37 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
10:37 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE |
[production] |
10:35 |
<jbond42> |
reboot rpki2001 |
[production] |
10:35 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
10:35 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE |
[production] |
10:34 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
10:30 |
<jbond42> |
reboot rpki1001 |
[production] |
10:30 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
10:05 |
<gehel@cumin1001> |
END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) |
[production] |
10:02 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE |
[production] |
10:00 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE |
[production] |
09:49 |
<godog> |
swift eqiad-prod: add weight to ms-be106[0-3] - T268435 |
[production] |
09:32 |
<_joe_> |
reset-failed for docker report jobs on deneb, failed because of a registry gateway timeout |
[production] |
09:29 |
<elukey> |
force execution of cumin-check-aliases.service on cumin[12]001 hosts to clear alarms |
[production] |
08:35 |
<gehel@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
08:23 |
<vgutierrez> |
acme-chief and acme-chief-api restarts for openssl upgrades (CVE-2020-1971) |
[production] |
07:55 |
<gehel> |
depool wdqs1005 (catching up on lag) |
[production] |
07:20 |
<marostegui> |
Stop mysql on db2142 to clone db1151 - T269324 |
[production] |
2020-12-15
|
23:47 |
<dduvall@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' . |
[production] |
23:45 |
<dduvall@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' . |
[production] |
23:34 |
<dduvall@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . |
[production] |
22:10 |
<mholloway-shell@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: WikimediaEvents: Promote SessionTick to group1 T248987 (duration: 01m 04s) |
[production] |
20:29 |
<marxarelli> |
group0 to 1.36.0-wmf.22 complete. no new errors or concerning rates (refs T267415) |
[production] |
20:26 |
<tzatziki> |
reset email for User:Cnk1220 |
[production] |
20:06 |
<dduvall@deploy1001> |
rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.22 |
[production] |
19:32 |
<joal@deploy1001> |
Finished deploy [analytics/refinery@2202db5] (thin): Regular analytics weekly train - THIN [analytics/refinery@2202db5] (duration: 00m 08s) |
[production] |
19:32 |
<joal@deploy1001> |
Started deploy [analytics/refinery@2202db5] (thin): Regular analytics weekly train - THIN [analytics/refinery@2202db5] |
[production] |
19:31 |
<joal@deploy1001> |
Finished deploy [analytics/refinery@2202db5]: Regular analytics weekly train [analytics/refinery@2202db5] (duration: 16m 36s) |
[production] |
19:14 |
<joal@deploy1001> |
Started deploy [analytics/refinery@2202db5]: Regular analytics weekly train [analytics/refinery@2202db5] |
[production] |
18:48 |
<dduvall@deploy1001> |
Pruned MediaWiki: 1.36.0-wmf.20 (duration: 04m 19s) |
[production] |
18:41 |
<dduvall@deploy1001> |
Finished scap: testwikis wikis to 1.36.0-wmf.22 (duration: 46m 41s) |
[production] |
17:55 |
<dduvall@deploy1001> |
Started scap: testwikis wikis to 1.36.0-wmf.22 |
[production] |
16:47 |
<ottomata> |
bumped eventgate-main memory limits from 300M to 600M - T249745 |
[production] |
16:47 |
<otto@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . |
[production] |
16:47 |
<otto@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . |
[production] |
16:45 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1265.eqiad.wmnet with reason: REIMAGE |
[production] |
16:44 |
<otto@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . |
[production] |