2021-07-23
|
19:02 |
<topranks> |
De-pooling eqiad again after successful replacement of linecard in cr2-codfw T287110 |
[production] |
18:26 |
<legoktm@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' . |
[production] |
18:24 |
<legoktm@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' . |
[production] |
18:14 |
<topranks> |
Turning up et-0/0/[0-1] and et-0/2/[0-1] interfaces on cr2-codfw after line card replacement slot 0. |
[production] |
18:12 |
<legoktm@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' . |
[production] |
16:15 |
<effie> |
enable puppet on mc-gp* hosts |
[production] |
15:47 |
<papaul> |
powerdown wdqs2002 for IDRAC reset |
[production] |
15:45 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
15:44 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
15:11 |
<elukey> |
stop ml-serve-ctrl1001 + gnt-instance modify -t plain ml-serve-ctrl1001.eqiad.wmnet on ganeti1009 + start instance back - T287238 |
[production] |
14:36 |
<_joe_> |
rebuilding httpd-fcgi, mediawiki-http fixing logging T285384 |
[production] |
14:16 |
<brennen> |
gitlab1001: running ansible to deploy [[gerrit:707236|fix puma exporter listen address]] (T275170) |
[production] |
13:35 |
<otto@deploy1002> |
Finished deploy [analytics/refinery@15521b3]: Add property disabling gobblin lock - T271232 (duration: 03m 32s) |
[production] |
13:31 |
<otto@deploy1002> |
Started deploy [analytics/refinery@15521b3]: Add property disabling gobblin lock - T271232 |
[production] |
12:16 |
<jelto@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mw[1440-1442].eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309 |
[production] |
12:16 |
<jelto@cumin1001> |
START - Cookbook sre.hosts.downtime for 3:00:00 on mw[1440-1442].eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309 |
[production] |
12:15 |
<jelto@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mw1439.eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309 |
[production] |
12:15 |
<jelto@cumin1001> |
START - Cookbook sre.hosts.downtime for 3:00:00 on mw1439.eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309 |
[production] |
11:50 |
<marostegui> |
Change innodb_checksum_algorithm to full_crc32 on pc1011-1014 and pc2011-2014 - T287244 |
[production] |
11:17 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1446.eqiad.wmnet |
[production] |
11:17 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1445.eqiad.wmnet |
[production] |
11:11 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1443.eqiad.wmnet |
[production] |
11:11 |
<dzahn@cumin1001> |
conftool action : set/weight=30; selector: name=mw144[3-6].eqiad.wmnet |
[production] |
11:00 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[1443,1445-1446].eqiad.wmnet with reason: new host |
[production] |
11:00 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[1443,1445-1446].eqiad.wmnet with reason: new host |
[production] |
10:58 |
<arturo> |
adding packages to buster-wikimedia/thirdparty/kubeadm-k8s-1-19 @ apt1001 |
[production] |
10:02 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1442.eqiad.wmnet |
[production] |
09:57 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1441.eqiad.wmnet |
[production] |
09:49 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1440.eqiad.wmnet |
[production] |
09:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1439.eqiad.wmnet |
[production] |
09:20 |
<hashar@deploy1002> |
Finished deploy [integration/docroot@edae2b4]: doc: add footer link to wikitech documentation (duration: 00m 11s) |
[production] |
09:20 |
<hashar@deploy1002> |
Started deploy [integration/docroot@edae2b4]: doc: add footer link to wikitech documentation |
[production] |
08:59 |
<dzahn@cumin1001> |
conftool action : set/weight=30; selector: name=mw144[0-2].eqiad.wmnet |
[production] |
08:58 |
<dzahn@cumin1001> |
conftool action : set/weight=30; selector: name=mw1439.eqiad.wmnet |
[production] |
08:56 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[1439-1442].eqiad.wmnet with reason: new host |
[production] |
08:56 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[1439-1442].eqiad.wmnet with reason: new host |
[production] |
08:24 |
<elukey> |
run 'gnt-instance modify -t plain ml-serve-ctrl1002.eqiad.wmnet' on ganeti1009 as test to track down latency/perf issues with kubelets |
[production] |
03:11 |
<ryankemper> |
T287223 Installed `nginx-light` on all of `cloudelastic*`, and it looks like `relforge` didn't need the upgrade. This operation is done. |
[production] |
03:09 |
<ryankemper> |
T287223 Installed `nginx-light` on all of `elastic1*` (eqiad) |
[production] |
03:06 |
<ryankemper> |
T287223 Installed `nginx-light` on all of `elastic2*` (codfw) |
[production] |
02:53 |
<ejegg> |
updated Fundraising CiviCRM from 819c11307d to 739c936298 |
[production] |
02:26 |
<ryankemper> |
[WDQS] Pooled `wdqs1004` (all caught up on its mountain of lag) |
[production] |
01:28 |
<ejegg> |
updated payments-wiki from 844b59ee42 to cc5d14ea7f |
[production] |
01:20 |
<legoktm> |
legoktm@deneb:~$ docker rmi docker-registry.wikimedia.org/mwcachedir:0.0.1 # T287222 |
[production] |
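
Several entries above name groups of hosts in a compact bracket notation, for example mw[1443,1445-1446].eqiad.wmnet, mw144[3-6].eqiad.wmnet and et-0/0/[0-1]. As a reading aid, the following is a minimal sketch of how such an expression could be expanded into individual names. It is not the parser used by the tools that wrote these entries; the function name expand_hosts is hypothetical, and the sketch assumes at most one bracket group containing comma-separated numbers or numeric ranges.

```python
import re


def expand_hosts(expr: str) -> list[str]:
    """Expand one bracket group, e.g. 'mw[1443,1445-1446].eqiad.wmnet'.

    Hypothetical helper for illustration only: handles a single [...]
    group holding comma-separated numbers or 'start-end' ranges.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", expr)
    if match is None:
        # No bracket group present: treat the input as a single name.
        return [expr]

    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        start, _, end = part.partition("-")
        end = end or start          # a bare number is a range of one
        width = len(start)          # preserve zero padding, if any
        for n in range(int(start), int(end) + 1):
            hosts.append(f"{prefix}{n:0{width}d}{suffix}")
    return hosts


if __name__ == "__main__":
    for name in expand_hosts("mw[1443,1445-1446].eqiad.wmnet"):
        print(name)
```

Under these assumptions, mw144[3-6].eqiad.wmnet expands to the four hosts mw1443 through mw1446, which matches the three pooled=yes entries for mw1443, mw1445 and mw1446 plus one host (mw1444) that was weighted but does not appear as pooled in this day's log.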