| 2021-03-10 |
    
  | 17:25 | <marxarelli> | `rm -rf /srv/restore` on deployment-db08 and reenabling puppet | [releng] | 
            
  | 17:24 | <marxarelli> | `rm -rf /srv/backup /srv/restore` on deployment-db07 and reenabling puppet | [releng] | 
            
  | 17:09 | <Majavah> | set beta cluster mediawiki as read write on mw config (T276968) | [releng] | 
            
  | 17:03 | <Majavah> | make deployment-db06 read-write T276968 | [releng] | 
            
  | 16:50 | <Majavah> | `reset slave;` on new master deployment-db06 T276968 | [releng] | 
            
  | 16:49 | <Majavah> | add deployment-db07 as a replica of db06 for T276968 | [releng] | 
            
  | 16:45 | <Urbanecm> | root@deployment-db07:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1 # T276968 | [releng] | 
            
  | 16:12 | <Majavah> | deployment-db08 CHANGE MASTER to MASTER_USER='repl', MASTER_PASSWORD='redacted', MASTER_PORT=3306, MASTER_HOST='deployment-db06.deployment-prep.eqiad1.wikimedia.cloud', MASTER_LOG_FILE='deployment-db06-bin.000059', MASTER_LOG_POS=522469730; (T276968) | [releng] | 
            
  | 16:06 | <Urbanecm> | start root@deployment-db07:/srv/sqldata.db06# rsync --progress -r deployment-db06:/srv/sqldata/ . (T276968) | [releng] | 
            
  | 15:57 | <Majavah> | set deployment-db06 as readonly from mysql side T276968 | [releng] | 
            
  | 15:54 | <Urbanecm> | Start `root@deployment-db08:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1` (T276968) | [releng] | 
            
  | 15:54 | <Urbanecm> | Start mariadb on db08 (T276968) | [releng] | 
            
  | 15:22 | <Urbanecm> | rsync deployment-db06:/srv/sqldata to deployment-db08:/srv/sqldata in a tmux session on deployment-db08 (T276968) | [releng] | 
            
  | 14:52 | <Majavah> | delete deployment-db08 /srv/sqldata to attempt procedure in https://phabricator.wikimedia.org/T276968#6900199 | [releng] | 
            
  | 10:16 | <arturo> | briefly stopping deployment-puppetdb03 to disable VMX CPU flag | [releng] | 
            
  | 00:28 | <marxarelli> | mariadb successfully started on db07 following transfer/extraction using mariabackup and following mysql_upgrade (T276968) | [releng] | 
            
  | 00:10 | <marxarelli> | restore of db06 failed yet again. trying mariabackup db06 -> db07 instead of mysqldump (after fixing docs/usage of the former) (T276968) | [releng] | 
            
  
    | 2021-03-09 |
    
  | 21:54 | <marxarelli> | restoring from db06 dump on db07 and db08 following `DROP VIEW IF EXISTS user` workaround (T276968) | [releng] | 
            
  | 20:53 | <marxarelli> | restore on db07 failed. appears to be a bug w/ mariadb/mysqldump 10.4 compat https://jira.mariadb.org/browse/MDEV-22127 (T276968) | [releng] | 
            
  | 20:39 | <marxarelli> | doing `--skip-grant-tables` on deployment-db08 and creating a new root@127.0.0.1 user (T276968) | [releng] | 
            
  | 20:33 | <Majavah> | install mariadb on deployment-db08 T276968 | [releng] | 
            
  | 19:59 | <marxarelli> | creating new instance deployment-db08 to use as new beta replica db (T276968) | [releng] | 
            
  | 19:56 | <marxarelli> | deleting deployment-db05 to free up quota for new replica (T276968) | [releng] | 
            
  | 19:50 | <marxarelli> | restoring database dump on deployment-db07 (T276968) | [releng] | 
            
  | 18:49 | <marxarelli> | restarting db dump on db06 `mysqldump -h 127.0.0.1 --events --routines --triggers --all-databases -f --single-transaction` (T276968) | [releng] | 
            
  | 18:38 | <Majavah> | installing mariadb 10.4 via role::mariadb::beta to db07 T276968 | [releng] | 
            
  | 18:25 | <marxarelli> | "View 'labswiki.tag_summary' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them when using LOCK TABLES" during mysqldump on db06 (T276968) | [releng] | 
            
  | 18:21 | <Majavah> | create deployment-db07 as g2.cores8.ram16.disk160 Buster T276968 | [releng] | 
            
  | 18:20 | <marxarelli> | disabled puppet on deployment-db06 and started mysqldump (T276968) | [releng] | 
            
  | 18:09 | <Majavah> | set deployment-db05 to read-only to avoid issues with T276968 | [releng] | 
            
  | 18:04 | <marxarelli> | deleting shut down memc* deployment-prep instances to free up quota for replacement db instances (T276968) | [releng] | 
            
  | 17:25 | <marxarelli> | seeing "[ 2886.337845] EXT4-fs error (device vda3): ext4_validate_block_bitmap:" for deployment-db05 | [releng] | 
            
  | 17:22 | <marxarelli> | restarting deployment-db05 via horizon | [releng] | 
            
  | 17:22 | <marxarelli> | deployment-db05 seems to be acting up (intermittent connection failures) which is causing issues with beta-update-databases-eqiad, which is (possibly) causing post-merge jobs to pile up | [releng] | 
            
  | 16:47 | <marxarelli> | still seeing "JobOffer[deployment-deploy01 #3] rejected beta-scap-eqiad: Waiting for next available executor on ‘deployment-deploy01’" despite available executors | [releng] | 
            
  | 16:26 | <marxarelli> | builds once again being scheduled on deployment-deploy01 | [releng] | 
            
  | 16:24 | <marxarelli> | cycling gearman plugin on integration.wikimedia.org | [releng] | 
            
  | 16:16 | <marxarelli> | taking deployment-deploy01 agent offline to mitigate stuck post-merge jobs | [releng] | 
            
  | 13:32 | <arturo> | hard-reboot deployment-db05 because of issues related to T276922 | [releng] | 
            
  | 12:34 | <arturo> | briefly rebooting VM deployment-db05; we need to reboot its hypervisor cloudvirt1038 and the VM failed to migrate to another one | [releng] | 
            
  
    | 2021-03-07 |
    
  | 17:46 | <James_F> | Deleting deployment-snapshot01, shut off since 2020-10-03. | [releng] | 
            
  | 17:43 | <James_F> | Deleting deployment-cumin02, shut off since 2020-10-16. | [releng] | 
            
  | 17:18 | <Majavah> | shutdown deployment-memc[04-05] T276707 | [releng] | 
            
  | 16:51 | <Majavah> | cherry pick 669436 and 669436 to deployment-puppetmaster04 T276707 | [releng] | 
            
  | 15:52 | <Majavah> | redis::shards change shard01 from deployment-memc04 to deployment-memc08, shard02 from deployment-memc05 to deployment-memc10 T276707 | [releng] | 
            
  | 15:44 | <Majavah> | create deployment-memc10 on Buster T276707; beta cluster is almost at full quota, but this will improve once the old shut-down Jessie instances are deleted | [releng] | 
            
  | 15:28 | <Majavah> | remove shard04 (deployment-memc07) from redis::shards, switch shard03 from deployment-memc06 to deployment-memc09; [06-07] are both already shut down and 09 is a new Buster machine, still being set up, to replace them, T276707 T250585 | [releng] |