2021-06-02
|
05:32 |
<razzi@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
05:31 |
<ladsgroup@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:697671|Fix pageterms API call for Special:Nearby in Wikidata (T281639)]] (duration: 00m 56s) [REPLAY FROM 2021-06-01 21:44:06] |
[production] |
05:30 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [REPLAY FROM 2021-06-01 19:42:38] |
[production] |
05:30 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox [REPLAY FROM 2021-06-01 19:29:26] |
[production] |
05:28 |
<razzi@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1183.eqiad.wmnet |
[production] |
05:19 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1144:3314', diff saved to https://phabricator.wikimedia.org/P16251 and previous config saved to /var/cache/conftool/dbconfig/20210602-051919-marostegui.json |
[production] |
05:18 |
<razzi@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts db1183.eqiad.wmnet |
[production] |
05:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 100%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P16250 and previous config saved to /var/cache/conftool/dbconfig/20210602-051738-root.json |
[production] |
05:15 |
<volans|off> |
restart tcpircbot-logmsgbot on alert1001 - T284123 |
[production] |
04:56 |
<marostegui> |
Test |
[production] |
2021-06-01
|
21:09 |
<andrewbogott> |
dropping a bunch of tables from the labswiki db as per T284108 |
[production] |
17:23 |
<Amir1> |
starting deletion of mbox files on lists1001 for mailman2, first reading-web-team.mbox, then smallest lists (T282303) |
[production] |
16:31 |
<moritzm> |
updating debmonitor clients to 0.3.0 (along with cleanup of sysuser UID allocation) |
[production] |
15:38 |
<legoktm> |
stopped mailman2 service on lists1001 (T52864) |
[production] |
15:23 |
<ryankemper@cumin1001> |
END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) reboot without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - T283223 |
[production] |
15:16 |
<ryankemper> |
T283223 `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic reboot" --reboot --nodes-per-run 1 --start-datetime 2021-05-20T05:16:40 --task-id T283223` on `ryankemper@cumin1001` tmux session `restart_cloudelastic` |
[production] |
15:16 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - T283223 |
[production] |
14:59 |
<topranks> |
Restoring Lumen CCT 442550293 to normal metric / bring back into service (T274234) |
[production] |
13:56 |
<marostegui> |
Stop mysql on db2079 (codfw master) - T283743 |
[production] |
13:53 |
<topranks> |
Draining Lumen CCT 442550293 to do some comparative bandwidth tests from eqiad to codfw (T274234) |
[production] |
13:53 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 3f757748a14ac8c205f6a5fac0611216c01ceb1c: cawiki: Fix help panel links (T280673) (duration: 00m 58s) |
[production] |
13:48 |
<otto@deploy1002> |
Finished deploy [analytics/refinery@c0a02e5] (hadoop-test): deploy to an-test-coord1001 to get airflow/dags/hello_world.py - T272973 (duration: 02m 58s) |
[production] |
13:45 |
<otto@deploy1002> |
Started deploy [analytics/refinery@c0a02e5] (hadoop-test): deploy to an-test-coord1001 to get airflow/dags/hello_world.py - T272973 |
[production] |
13:43 |
<topranks> |
Restoring Telia CT IC-307235 to normal metric / bring back into service (T274234) |
[production] |
13:08 |
<jynus@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2098.codfw.wmnet with reason: REIMAGE |
[production] |
13:06 |
<jynus@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2098.codfw.wmnet with reason: REIMAGE |
[production] |
12:12 |
<dcausse> |
re-pooling wdqs1005 (caught up on lag) |
[production] |
12:06 |
<moritzm> |
installing djvulibre security updates |
[production] |
11:16 |
<jbond@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE |
[production] |
11:14 |
<jbond@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE |
[production] |
11:04 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: e4989d2b19e07d2a816cd7f6afae077f86aca54e: Enable "Diff" RSS feed on meta (T283380) (duration: 00m 58s) |
[production] |
11:04 |
<jiji@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn'. |
[production] |
10:39 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps1009.eqiad.wmnet with reason: Postgis version juggling |
[production] |
10:39 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on maps1009.eqiad.wmnet with reason: Postgis version juggling |
[production] |
10:38 |
<jiji@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn'. |
[production] |
09:37 |
<topranks> |
Draining Telia CT IC-307235 to do some comparative bandwidth tests from eqiad to codfw (T274234) |
[production] |
08:03 |
<hashar> |
Restarted Gerrit on gerrit1001 for Java 11 upgrade # T268225 |
[production] |
08:02 |
<hashar> |
Restarted Gerrit on gerrit2001 for Java 11 upgrade # T268225 |
[production] |
07:26 |
<dcausse> |
depooling wdqs1005 (lag) |
[production] |
07:14 |
<moritzm> |
installing nginx security updates |
[production] |
05:56 |
<legoktm> |
restarting mailman3 on lists1001 |
[production] |
05:37 |
<legoktm> |
uploaded django-allauth_0.44.0+ds-1~bpo10+1 mailman3_3.3.3-1~bpo10+4 to apt.wm.o |
[production] |
05:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1146:3314', diff saved to https://phabricator.wikimedia.org/P16242 and previous config saved to /var/cache/conftool/dbconfig/20210601-053137-marostegui.json |
[production] |
05:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1147 (re)pooling @ 100%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16241 and previous config saved to /var/cache/conftool/dbconfig/20210601-052349-root.json |
[production] |
05:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1147 (re)pooling @ 75%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16240 and previous config saved to /var/cache/conftool/dbconfig/20210601-050845-root.json |
[production] |
04:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1147 (re)pooling @ 50%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16239 and previous config saved to /var/cache/conftool/dbconfig/20210601-045341-root.json |
[production] |
04:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1147 (re)pooling @ 25%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16238 and previous config saved to /var/cache/conftool/dbconfig/20210601-043837-root.json |
[production] |
00:46 |
<legoktm@deploy1002> |
Synchronized logos/config.yaml: Revert "Use eswiki 20th anniversary logos" (T280908) (duration: 01m 07s) |
[production] |
00:43 |
<legoktm@deploy1002> |
Synchronized wmf-config/logos.php: Revert "Use eswiki 20th anniversary logos" (T280908) (duration: 01m 00s) |
[production] |