2021-06-15
17:46 <razzi> remove hdfs namenode backup on stat1004 [analytics]
17:45 <razzi> enable puppet on an-launcher [analytics]
17:45 <razzi> sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
17:21 <mutante> new Wikimedia language "shi" added - Shilha /ˈʃɪlhə/ is a Berber language native to Shilha people. The endonym is Taclḥit /taʃlʜijt/, and in recent English publications the language is often rendered Tashelhiyt or Tashelhit. [production]
17:17 <mutante> new Wikimedia language "dag" added - Dagbani (or Dagbane), also known as Dagbanli and Dagbanle, is a Gur language spoken in Ghana. [production]
17:11 <razzi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE [production]
17:09 <razzi@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE [production]
16:55 <razzi> sudo -i wmf-auto-reimage-host -p T278423 an-master1002.eqiad.wmnet [analytics]
16:53 <razzi> run uid script on an-master1002 [analytics]
16:33 <elukey> restart hadoop-yarn-resourcemanager on an-master1001 [analytics]
16:31 <bstorm> truncated 26GB error.log T284964 [tools.stimmberechtigung]
16:16 <razzi> sudo systemctl stop 'hadoop-*' on an-master1002 [analytics]
16:15 <majavah> deleting unused shutdown nodes: tools-checker-03 tools-k8s-haproxy-1 tools-k8s-haproxy-2 [tools]
16:14 <razzi> sudo systemctl stop hadoop-* on an-master1001, then realize I meant to do this on an-master1002, so start hadoop-* [analytics]
16:12 <balloons> add 8 CPU/16G RAM to quota T284973 [metricsinfra]
16:11 <razzi> downtime an-master1002 [analytics]
16:11 <razzi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye [production]
16:11 <razzi@cumin1001> START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye [production]
16:09 <majavah> set toolsbeta-bastion-05 as grid submit host [toolsbeta]
16:08 <bstorm> truncated 28GB person_bkl2.out T284964 [tools.persondata]
15:55 <razzi> sudo transfer.py an-master1001.eqiad.wmnet:/srv/hadoop/backup/hdfs-namenode-snapshot-buster-reimage-2021-06-15.tar.gz stat1004.eqiad.wmnet:/home/razzi/hdfs-namenode-fsimage [analytics]
15:54 <bstorm> truncated 42GB virgule.err file T284964 [tools.robokobot]
15:42 <razzi> tar -czf /srv/hadoop/backup/hdfs-namenode-snapshot-buster-reimage-$(date --iso-8601).tar.gz current on an-master1001 [analytics]
15:38 <razzi> backup /srv/hadoop/name/current to /home/razzi/hdfs-namenode-snapshot-buster-reimage-2021-06-15.tar.gz on an-master1001 [analytics]
15:33 <razzi> sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -saveNamespace [analytics]
15:28 <MacFan4000> copied freenode channel config for #wikimedia-fundraising to libera [wm-bot]
15:27 <razzi> sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode enter [analytics]
15:25 <razzi> kill running yarn applications via for loop [analytics]
15:11 <razzi> sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
15:09 <razzi> disable puppet on an-masters [analytics]
15:08 <razzi> run puppet on an-masters to update capacity-scheduler.xml [analytics]
15:02 <razzi> disable puppet on an-masters [analytics]
15:01 <razzi> sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues to stop queues [analytics]
14:55 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
14:51 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
14:35 <razzi> disable jobs that use hadoop on an-launcher1002 following https://phabricator.wikimedia.org/T278423#7094641 [analytics]
14:25 <XioNoX> re-enable cr1-codfw:xe-5/1/2 [production]
13:23 <marostegui> Upgrade clouddb1018 [production]
13:15 <effie> enable puppet on canaries [production]
13:10 <effie> disable puppet on canaries to deploy 699908 [production]
12:54 <MacFan4000> killed a few lingering connections to freenode (wm-bot on freenode is now discontinued) [wm-bot]
10:45 <XioNoX> re-enable cr1-codfw:xe-5/1/2 [production]
09:42 <XioNoX> cr1-codfw# set interfaces xe-5/1/2 disable [production]
09:25 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2080', diff saved to https://phabricator.wikimedia.org/P16533 and previous config saved to /var/cache/conftool/dbconfig/20210615-092511-marostegui.json [production]
09:24 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2086:3318, db2082', diff saved to https://phabricator.wikimedia.org/P16532 and previous config saved to /var/cache/conftool/dbconfig/20210615-092409-marostegui.json [production]
09:08 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2086:3318', diff saved to https://phabricator.wikimedia.org/P16531 and previous config saved to /var/cache/conftool/dbconfig/20210615-090802-marostegui.json [production]
09:06 <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db2083', diff saved to https://phabricator.wikimedia.org/P16530 and previous config saved to /var/cache/conftool/dbconfig/20210615-090650-marostegui.json [production]
09:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db2084', diff saved to https://phabricator.wikimedia.org/P16529 and previous config saved to /var/cache/conftool/dbconfig/20210615-090243-marostegui.json [production]
09:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db2081', diff saved to https://phabricator.wikimedia.org/P16528 and previous config saved to /var/cache/conftool/dbconfig/20210615-090206-marostegui.json [production]
08:59 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2082', diff saved to https://phabricator.wikimedia.org/P16527 and previous config saved to /var/cache/conftool/dbconfig/20210615-085953-marostegui.json [production]