2021-06-15
|
17:46 |
<razzi> |
remove hdfs namenode backup on stat1004 |
[analytics] |
17:45 |
<razzi> |
enable puppet on an-launcher |
[analytics] |
17:45 |
<razzi> |
sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues |
[analytics] |
17:21 |
<mutante> |
new Wikimedia language "shi" added - Shilha /ˈʃɪlhə/ is a Berber language native to the Shilha people. The endonym is Taclḥit /taʃlʜijt/, and in recent English publications the language is often rendered Tashelhiyt or Tashelhit. |
[production] |
17:17 |
<mutante> |
new Wikimedia language "dag" added - Dagbani (or Dagbane), also known as Dagbanli and Dagbanle, is a Gur language spoken in Ghana. |
[production] |
17:11 |
<razzi@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE |
[production] |
17:09 |
<razzi@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE |
[production] |
16:55 |
<razzi> |
sudo -i wmf-auto-reimage-host -p T278423 an-master1002.eqiad.wmnet |
[analytics] |
16:53 |
<razzi> |
run uid script on an-master1002 |
[analytics] |
16:33 |
<elukey> |
restart hadoop-yarn-resourcemanager on an-master1001 |
[analytics] |
16:31 |
<bstorm> |
truncated 26GB error.log T284964 |
[tools.stimmberechtigung] |
16:16 |
<razzi> |
sudo systemctl stop 'hadoop-*' on an-master1002 |
[analytics] |
16:15 |
<majavah> |
deleting unused shutdown nodes: tools-checker-03 tools-k8s-haproxy-1 tools-k8s-haproxy-2 |
[tools] |
16:14 |
<razzi> |
sudo systemctl stop hadoop-* on an-master1001, then realized I meant to do this on an-master1002, so started hadoop-* again |
[analytics] |
16:12 |
<balloons> |
add 8 CPU/16G RAM to quota T284973 |
[metricsinfra] |
16:11 |
<razzi> |
downtime an-master1002 |
[analytics] |
16:11 |
<razzi@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye |
[production] |
16:11 |
<razzi@cumin1001> |
START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye |
[production] |
16:09 |
<majavah> |
set toolsbeta-bastion-05 as grid submit host |
[toolsbeta] |
16:08 |
<bstorm> |
truncated 28GB person_bkl2.out T284964 |
[tools.persondata] |
15:55 |
<razzi> |
sudo transfer.py an-master1001.eqiad.wmnet:/srv/hadoop/backup/hdfs-namenode-snapshot-buster-reimage-2021-06-15.tar.gz stat1004.eqiad.wmnet:/home/razzi/hdfs-namenode-fsimage |
[analytics] |
15:54 |
<bstorm> |
truncated 42GB virgule.err file T284964 |
[tools.robokobot] |
15:42 |
<razzi> |
tar -czf /srv/hadoop/backup/hdfs-namenode-snapshot-buster-reimage-$(date --iso-8601).tar.gz current on an-master1001 |
[analytics] |
15:38 |
<razzi> |
backup /srv/hadoop/name/current to /home/razzi/hdfs-namenode-snapshot-buster-reimage-2021-06-15.tar.gz on an-master1001 |
[analytics] |
15:33 |
<razzi> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -saveNamespace |
[analytics] |
15:28 |
<MacFan4000> |
copied freenode channel config for #wikimedia-fundraising to libera |
[wm-bot] |
15:27 |
<razzi> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode enter |
[analytics] |
15:25 |
<razzi> |
kill running yarn applications via for loop |
[analytics] |
15:11 |
<razzi> |
sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues |
[analytics] |
15:09 |
<razzi> |
disable puppet on an-masters |
[analytics] |
15:08 |
<razzi> |
run puppet on an-masters to update capacity-scheduler.xml |
[analytics] |
15:02 |
<razzi> |
disable puppet on an-masters |
[analytics] |
15:01 |
<razzi> |
sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues to stop queues |
[analytics] |
14:55 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
14:51 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
14:35 |
<razzi> |
disable jobs that use hadoop on an-launcher1002 following https://phabricator.wikimedia.org/T278423#7094641 |
[analytics] |
14:25 |
<XioNoX> |
re-enable cr1-codfw:xe-5/1/2 |
[production] |
13:23 |
<marostegui> |
Upgrade clouddb1018 |
[production] |
13:15 |
<effie> |
enable puppet on canaries |
[production] |
13:10 |
<effie> |
disable puppet on canaries to deploy 699908 |
[production] |
12:54 |
<MacFan4000> |
killed a few lingering connections to freenode (wm-bot on freenode is now discontinued) |
[wm-bot] |
10:45 |
<XioNoX> |
re-enable cr1-codfw:xe-5/1/2 |
[production] |
09:42 |
<XioNoX> |
cr1-codfw# set interfaces xe-5/1/2 disable |
[production] |
09:25 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2080', diff saved to https://phabricator.wikimedia.org/P16533 and previous config saved to /var/cache/conftool/dbconfig/20210615-092511-marostegui.json |
[production] |
09:24 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2086:3318, db2082', diff saved to https://phabricator.wikimedia.org/P16532 and previous config saved to /var/cache/conftool/dbconfig/20210615-092409-marostegui.json |
[production] |
09:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db2086:3318', diff saved to https://phabricator.wikimedia.org/P16531 and previous config saved to /var/cache/conftool/dbconfig/20210615-090802-marostegui.json |
[production] |
09:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db2083', diff saved to https://phabricator.wikimedia.org/P16530 and previous config saved to /var/cache/conftool/dbconfig/20210615-090650-marostegui.json |
[production] |
09:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db2084', diff saved to https://phabricator.wikimedia.org/P16529 and previous config saved to /var/cache/conftool/dbconfig/20210615-090243-marostegui.json |
[production] |
09:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db2081', diff saved to https://phabricator.wikimedia.org/P16528 and previous config saved to /var/cache/conftool/dbconfig/20210615-090206-marostegui.json |
[production] |
08:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db2082', diff saved to https://phabricator.wikimedia.org/P16527 and previous config saved to /var/cache/conftool/dbconfig/20210615-085953-marostegui.json |
[production] |