2021-09-06 §
09:06 <gehel> restart blazegraph and updater on wdqs1007 [production]
08:46 <jbond> update networking fact - gerrit:715943 [production]
07:57 <godog> fail sdw on ms-be1062, reported errors [production]
07:51 <moritzm> installing libssh security updates [production]
07:45 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
07:45 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
07:44 <moritzm> installing squashfs-tools security updates [production]
06:56 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
06:56 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
06:28 <marostegui> Optimize table mkwiki.flaggedtemplates in eqiad T290057 [production]
06:26 <marostegui> Optimize table bewiki.flaggedtemplates in eqiad T290057 [production]
06:23 <marostegui> Optimize table dewiki.flaggedtemplates in eqiad T290057 [production]
05:34 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2090.codfw.wmnet with reason: REIMAGE [production]
05:32 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db2090.codfw.wmnet with reason: REIMAGE [production]
05:07 <marostegui> Stop replication on db2090 (old s4 master) T289650 T288803 [production]
05:05 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2110 (current master) from API T289650', diff saved to https://phabricator.wikimedia.org/P17223 and previous config saved to /var/cache/conftool/dbconfig/20210906-050502-marostegui.json [production]
05:04 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2090 T289650', diff saved to https://phabricator.wikimedia.org/P17222 and previous config saved to /var/cache/conftool/dbconfig/20210906-050419-marostegui.json [production]
05:01 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db2110 to s4 primary and set section read-write T289650', diff saved to https://phabricator.wikimedia.org/P17221 and previous config saved to /var/cache/conftool/dbconfig/20210906-050140-root.json [production]
05:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - T289650', diff saved to https://phabricator.wikimedia.org/P17220 and previous config saved to /var/cache/conftool/dbconfig/20210906-050048-root.json [production]
05:00 <marostegui> Starting s4 codfw failover from db2090 to db2110 - T289650 [production]
04:07 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2110 with weight 0 T289650', diff saved to https://phabricator.wikimedia.org/P17219 and previous config saved to /var/cache/conftool/dbconfig/20210906-040740-root.json [production]
04:07 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 33 hosts with reason: Primary switchover s4 T289650 [production]
04:06 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on 33 hosts with reason: Primary switchover s4 T289650 [production]
2021-09-05 §
18:54 <urbanecm> wikiadmin@10.192.0.119(ptwiki)> update protected_titles set pt_create_perm='editautoreviewprotected' where pt_create_perm='autoreviewer'; # T290396 [production]
15:15 <andrewbogott> changing the puppetmaster for integration-puppetmaster-02 to the default puppetmaster. It's a lot easier to have a project-local puppetmaster if it isn't its own master; otherwise there's epic cert confusion. I've confirmed that there aren't any custom features applied to this master. [integration]
2021-09-04 §
19:50 <wm-bot> <lokal-profil> Deploy latest from Git master: b4d3e0e, 339838b (T289929), 7816a36 (T289930) [tools.heritage]
13:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 100%: Slowly repool T290374', diff saved to https://phabricator.wikimedia.org/P17217 and previous config saved to /var/cache/conftool/dbconfig/20210904-133532-root.json [production]
13:20 <marostegui@cumin1001> dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 75%: Slowly repool T290374', diff saved to https://phabricator.wikimedia.org/P17216 and previous config saved to /var/cache/conftool/dbconfig/20210904-132029-root.json [production]
13:05 <marostegui@cumin1001> dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 50%: Slowly repool T290374', diff saved to https://phabricator.wikimedia.org/P17215 and previous config saved to /var/cache/conftool/dbconfig/20210904-130525-root.json [production]
12:50 <marostegui@cumin1001> dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 25%: Slowly repool T290374', diff saved to https://phabricator.wikimedia.org/P17214 and previous config saved to /var/cache/conftool/dbconfig/20210904-125021-root.json [production]
12:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 10%: Slowly repool T290374', diff saved to https://phabricator.wikimedia.org/P17213 and previous config saved to /var/cache/conftool/dbconfig/20210904-123518-root.json [production]
12:20 <marostegui@cumin1001> dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 5%: Slowly repool T290374', diff saved to https://phabricator.wikimedia.org/P17212 and previous config saved to /var/cache/conftool/dbconfig/20210904-122014-root.json [production]
09:03 <elukey> restart wmf_auto_restart_rsyslog.service on puppetdb1002 [production]
09:00 <elukey> `systemctl reset-failed ifup@ens6.service` on puppetdb2002 - T273026 [production]
03:02 <rzl@cumin2001> dbctl commit (dc=all): 'Depool db2137:3314', diff saved to https://phabricator.wikimedia.org/P17210 and previous config saved to /var/cache/conftool/dbconfig/20210904-030231-rzl.json [production]
2021-09-03 §
23:02 <Krinkle> Creating integration-agent-qemu-1002 (Debian 11 Bullseye, g3.cores8.ram24.disk20.ephemeral40.4xiops), ref T284774 [releng]
22:36 <bstorm> backfilling quotas in screen for T286784 [tools]
22:34 <bstorm> backfilled quotas for T286784 [toolsbeta]
21:49 <bd808@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
20:30 <bd808@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
19:33 <krinkle@deploy1002> Finished deploy [integration/docroot@6492b3d]: I48480e89e5f6 (duration: 00m 10s) [production]
19:33 <krinkle@deploy1002> Started deploy [integration/docroot@6492b3d]: I48480e89e5f6 [production]
19:26 <bd808@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
19:19 <bstorm> adding config group validation rules for postgresql and mysql T290349 [trove]
19:14 <bstorm> adding config group validation rules for mariadb 10.5.10 T290349 [trove]
19:04 <ryankemper> T290330 `ryankemper@cumin1001:~$ sudo -E cumin 'P{wdqs2*}' 'sudo rm -fv /etc/cron.hourly/restart-blazegraph'` (Cleaned up manually created crons now that we have [somewhat hacky] systemd timers doing the same job) [production]
17:42 <dduvall@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
17:42 <dduvall> deploying blubberoid:2021-09-03-160524-production to eqiad/codfw (https://gerrit.wikimedia.org/r/c/blubber/+/716519) (T289367) [releng]
17:40 <dduvall@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
17:39 <andrewbogott> restarting celery workers and reloading web UI to pick up timeout changes [quarry]