2021-03-10
§
|
19:15 |
<ryankemper> |
T266470 `sudo puppet cert clean wdqs.discovery.wmnet` |
[production] |
19:15 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: REIMAGE |
[production] |
19:14 |
<ryankemper> |
T266470 on `ryankemper@cumin1001`: `sudo -E cumin 'A:wdqs-all' 'sudo disable-puppet "revoking old cert and generating new one with new alt_names - T266470"'` |
[production] |
19:14 |
<ryankemper> |
T266470 Temporarily disabling puppet on all `wdqs*` hosts in preparation for `wdqs.discovery.wmnet` certificate revocation |
[production] |
19:06 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: fe99c312b3ce635342cbd690c34e2610184b74b0: Remove unused config for InukaPageView (T265921) (duration: 01m 26s) |
[production] |
18:56 |
<dwisehaupt> |
all fundraising servers are now running buster - T254198 |
[production] |
18:56 |
<wm-bot> |
<lucaswerkmeister> deployed c86ef3f7a5 (better error handling) |
[tools.pagepile-visual-filter] |
18:46 |
<Majavah> |
switch floating ip 185.15.56.34 to deployment-ircd02 T277081 |
[releng] |
18:44 |
<mforns> |
finished deployment of refinery (session length oozie job) |
[analytics] |
18:37 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@7fbc3c7] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@7fbc3c700ccb3c598690da9a38990ef7cb187656] (duration: 04m 12s) |
[production] |
18:33 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@7fbc3c7] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@7fbc3c700ccb3c598690da9a38990ef7cb187656] |
[production] |
18:33 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@7fbc3c7] (thin): Regular analytics weekly train THIN [analytics/refinery@7fbc3c700ccb3c598690da9a38990ef7cb187656] (duration: 00m 07s) |
[production] |
18:33 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@7fbc3c7] (thin): Regular analytics weekly train THIN [analytics/refinery@7fbc3c700ccb3c598690da9a38990ef7cb187656] |
[production] |
18:32 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@7fbc3c7]: Regular analytics weekly train [analytics/refinery@7fbc3c700ccb3c598690da9a38990ef7cb187656] (duration: 14m 30s) |
[production] |
18:18 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@7fbc3c7]: Regular analytics weekly train [analytics/refinery@7fbc3c700ccb3c598690da9a38990ef7cb187656] |
[production] |
18:16 |
<mforns> |
starting deployment of refinery (session length oozie job) |
[analytics] |
18:05 |
<Majavah> |
create deployment-ircd02 for T277081 |
[releng] |
17:48 |
<mutante> |
new Wikimedia project language "trv" added - Seediq is an Atayalic language spoken in the mountains of Northern Taiwan by the Seediq and Taroko people. |
[production] |
17:45 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: REIMAGE |
[production] |
17:42 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: REIMAGE |
[production] |
17:26 |
<marxarelli> |
`rm -rf /srv/dump` on deployment-db06 and reenabling puppet |
[releng] |
17:25 |
<marxarelli> |
`rm -rf /srv/restore` on deployment-db08 and reenabling puppet |
[releng] |
17:24 |
<marxarelli> |
`rm -rf /srv/backup /srv/restore` on deployment-db07 and reenabling puppet |
[releng] |
17:19 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: REIMAGE |
[production] |
17:17 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: REIMAGE |
[production] |
17:09 |
<Majavah> |
set beta cluster mediawiki as read write on mw config (T276968) |
[releng] |
17:03 |
<Majavah> |
make deployment-db06 read-write T276968 |
[releng] |
16:56 |
<aborrero@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt1030.eqiad.wmnet |
[production] |
16:54 |
<razzi> |
rebalance kafka partitions for webrequest_upload partition 15 |
[analytics] |
16:52 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: REIMAGE |
[production] |
16:51 |
<arturo> |
rebooting cloudvirt1030 for T275753 |
[admin] |
16:50 |
<aborrero@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host cloudvirt1030.eqiad.wmnet |
[production] |
16:50 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: REIMAGE |
[production] |
16:50 |
<Majavah> |
`reset slave;` on new master deployment-db06 T276968 |
[releng] |
16:49 |
<Majavah> |
add deployment-db07 as a replica of db06 for T276968 |
[releng] |
16:48 |
<arturo> |
briefly stopping VM content-similarity-prototype to migrate hypervisor |
[wmf-research-tools] |
16:48 |
<arturo> |
briefly stopping VM toolsbeta-test-k8s-etcd-8 to migrate hypervisor |
[toolsbeta] |
16:48 |
<arturo> |
briefly stopping VM toolhub-beta01 to migrate hypervisor |
[toolhub] |
16:48 |
<arturo> |
briefly stopping VM maps-beta-1 to migrate hypervisor |
[entity-detection] |
16:47 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: REIMAGE |
[production] |
16:45 |
<Urbanecm> |
root@deployment-db07:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1 # T276968 |
[releng] |
16:45 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: REIMAGE |
[production] |
16:20 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: REIMAGE |
[production] |
16:18 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: REIMAGE |
[production] |
16:12 |
<Majavah> |
deployment-db08 CHANGE MASTER to MASTER_USER='repl', MASTER_PASSWORD='redacted', MASTER_PORT=3306, MASTER_HOST='deployment-db06.deployment-prep.eqiad1.wikimedia.cloud', MASTER_LOG_FILE='deployment-db06-bin.000059', MASTER_LOG_POS=522469730; (T276968) |
[releng] |
16:06 |
<Urbanecm> |
start root@deployment-db07:/srv/sqldata.db06# rsync --progress -r deployment-db06:/srv/sqldata/ . (T276968) |
[releng] |
15:57 |
<Majavah> |
set deployment-db06 as readonly from mysql side T276968 |
[releng] |
15:54 |
<Urbanecm> |
Start `root@deployment-db08:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1` (T276968) |
[releng] |
15:54 |
<Urbanecm> |
Start mariadb on db08 (T276968) |
[releng] |
15:33 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: Repool db1127 after schema change', diff saved to https://phabricator.wikimedia.org/P14744 and previous config saved to /var/cache/conftool/dbconfig/20210310-153324-root.json |
[production] |