2021-03-10
|
16:45 |
<Urbanecm> |
root@deployment-db07:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1 # T276968 |
[releng] |
16:12 |
<Majavah> |
deployment-db08 CHANGE MASTER to MASTER_USER='repl', MASTER_PASSWORD='redacted', MASTER_PORT=3306, MASTER_HOST='deployment-db06.deployment-prep.eqiad1.wikimedia.cloud', MASTER_LOG_FILE='deployment-db06-bin.000059', MASTER_LOG_POS=522469730; (T276968) |
[releng] |
16:06 |
<Urbanecm> |
start root@deployment-db07:/srv/sqldata.db06# rsync --progress -r deployment-db06:/srv/sqldata/ . (T276968) |
[releng] |
15:57 |
<Majavah> |
set deployment-db06 to read-only from the MySQL side (T276968) |
[releng] |
15:54 |
<Urbanecm> |
Start `root@deployment-db08:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1` (T276968) |
[releng] |
15:54 |
<Urbanecm> |
Start mariadb on db08 (T276968) |
[releng] |
15:22 |
<Urbanecm> |
rsync deployment-db06:/srv/sqldata to deployment-db08:/srv/sqldata in a tmux session on deployment-db08 (T276968) |
[releng] |
14:52 |
<Majavah> |
delete deployment-db08 /srv/sqldata to attempt procedure in https://phabricator.wikimedia.org/T276968#6900199 |
[releng] |
10:16 |
<arturo> |
briefly stopping deployment-puppetdb03 to disable VMX CPU flag |
[releng] |
00:28 |
<marxarelli> |
mariadb successfully started on db07 following transfer/extraction using mariabackup and following mysql_upgrade (T276968) |
[releng] |
00:10 |
<marxarelli> |
restore of db06 failed yet again. trying mariabackup db06 -> db07 instead of mysqldump (after fixing docs/usage of the former) (T276968) |
[releng] |
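The 15:22 to 16:12 entries above amount to rebuilding deployment-db08 as a replica by copying the data directory from deployment-db06. The following is a minimal dry-run sketch of that sequence, not the procedure actually run: it only prints commands. Hosts, paths, the `mysql_upgrade` invocation, and the binlog coordinates come from the entries. The rsync flags for db08 (borrowed from the db07 entry), `systemctl start mariadb`, `SET GLOBAL read_only = ON`, and `START SLAVE` are assumptions, since the log does not record how those steps were performed.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the deployment-db08 rebuild: prints each step, runs nothing.
set -eu

PRIMARY="deployment-db06"
PRIMARY_FQDN="${PRIMARY}.deployment-prep.eqiad1.wikimedia.cloud"
BINLOG_FILE="deployment-db06-bin.000059"   # coordinates from the 16:12 entry
BINLOG_POS=522469730

# Build the replication statement; the password stays the log's placeholder.
change_master_sql() {
  printf "CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='redacted', "
  printf "MASTER_PORT=3306, MASTER_HOST='%s', " "$PRIMARY_FQDN"
  printf "MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;\n" "$BINLOG_FILE" "$BINLOG_POS"
}

# Print the steps in the order the log records them (15:22 to 16:12).
replica_steps() {
  echo "[db08] rsync --progress -r ${PRIMARY}:/srv/sqldata/ /srv/sqldata/  # flags assumed from the db07 entry"
  echo "[db08] systemctl start mariadb  # assumption: the log only says 'Start mariadb'"
  echo "[db08] /opt/wmf-mariadb104/bin/mysql_upgrade -h 127.0.0.1"
  echo "[db06] mysql -e \"SET GLOBAL read_only = ON;\"  # assumption: one way to set read-only"
  echo "[db08] mysql -e \"$(change_master_sql)\""
  echo "[db08] mysql -e \"START SLAVE;\"  # assumption: not recorded in the log"
}

replica_steps
```

The sketch keeps the logged order, in which deployment-db06 was set read-only only after the copy to db08 had been made; it does not claim that this order is the recommended one. Also note that `rsync -r` on its own does not preserve ownership or permissions, and the log does not say whether these were adjusted afterwards.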
2021-03-09
|
21:54 |
<marxarelli> |
restoring from db06 dump on db07 and db08 following `DROP VIEW IF EXISTS user` workaround (T276968) |
[releng] |
20:53 |
<marxarelli> |
restore on db07 failed. appears to be a bug w/ mariadb/mysqldump 10.4 compat https://jira.mariadb.org/browse/MDEV-22127 (T276968) |
[releng] |
20:39 |
<marxarelli> |
doing `--skip-grant-tables` on deployment-db08 and creating a new root@127.0.0.1 user (T276968) |
[releng] |
20:33 |
<Majavah> |
install mariadb on deployment-db08 T276968 |
[releng] |
19:59 |
<marxarelli> |
creating new instance deployment-db08 to use as new beta replica db (T276968) |
[releng] |
19:56 |
<marxarelli> |
deleting deployment-db05 to free up quota for new replica (T276968) |
[releng] |
19:50 |
<marxarelli> |
restoring database dump on deployment-db07 (T276968) |
[releng] |
18:49 |
<marxarelli> |
restarting db dump on db06 `mysqldump -h 127.0.0.1 --events --routines --triggers --all-databases -f --single-transaction` (T276968) |
[releng] |
18:38 |
<Majavah> |
installing mariadb 10.4 via role::mariadb::beta to db07 T276968 |
[releng] |
18:25 |
<marxarelli> |
"View 'labswiki.tag_summary' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them when using LOCK TABLES" during mysqldump on db06 (T276968) |
[releng] |
18:21 |
<Majavah> |
create deployment-db07 as g2.cores8.ram16.disk160 Buster T276968 |
[releng] |
18:20 |
<marxarelli> |
disabled puppet on deployment-db06 and started mysqldump (T276968) |
[releng] |
18:09 |
<Majavah> |
set deployment-db05 to read-only to avoid issues with T276968 |
[releng] |
18:04 |
<marxarelli> |
deleting shut down memc* deployment-prep instances to free up quota for replacement db instances (T276968) |
[releng] |
17:25 |
<marxarelli> |
seeing "[ 2886.337845] EXT4-fs error (device vda3): ext4_validate_block_bitmap:" for deployment-db05 |
[releng] |
17:22 |
<marxarelli> |
restarting deployment-db05 via horizon |
[releng] |
17:22 |
<marxarelli> |
deployment-db05 seems to be acting up (intermittent connection failures) which is causing issues with beta-update-databases-eqiad, which is (possibly) causing post-merge jobs to pile up |
[releng] |
16:47 |
<marxarelli> |
still seeing "JobOffer[deployment-deploy01 #3] rejected beta-scap-eqiad: Waiting for next available executor on ‘deployment-deploy01’" despite available executors |
[releng] |
16:26 |
<marxarelli> |
builds once again being scheduled on deployment-deploy01 |
[releng] |
16:24 |
<marxarelli> |
cycling gearman plugin on integration.wikimedia.org |
[releng] |
16:16 |
<marxarelli> |
taking deployment-deploy01 agent offline to mitigate stuck post-merge jobs |
[releng] |
13:32 |
<arturo> |
hard-reboot deployment-db05 because of issues related to T276922 |
[releng] |
12:34 |
<arturo> |
briefly rebooting VM deployment-db05: we need to reboot its hypervisor cloudvirt1038, and migrating the VM to another hypervisor failed |
[releng] |
2021-03-07
|
17:46 |
<James_F> |
Deleting deployment-snapshot01, shut off since 2020-10-03. |
[releng] |
17:43 |
<James_F> |
Deleting deployment-cumin02, shut off since 2020-10-16. |
[releng] |
17:18 |
<Majavah> |
shutdown deployment-memc[04-05] T276707 |
[releng] |
16:51 |
<Majavah> |
cherry pick 669436 and 669436 to deployment-puppetmaster04 T276707 |
[releng] |
15:52 |
<Majavah> |
redis::shards change shard01 from deployment-memc04 to deployment-memc08, shard02 from deployment-memc05 to deployment-memc10 T276707 |
[releng] |
15:44 |
<Majavah> |
create deployment-memc10 on Buster T276707; beta cluster is almost at full quota, which will improve once the old shut-down Jessie instances are deleted |
[releng] |
15:28 |
<Majavah> |
remove shard04 (deployment-memc07) from redis::shards and switch shard03 from deployment-memc06 to deployment-memc09; [06-07] are both already shut down, and 09 is a new Buster machine, still being set up, that replaces 06, T276707 T250585 |
[releng] |
13:14 |
<Majavah> |
create deployment-memc09 on Buster T276707 |
[releng] |