2021-05-25
§
|
18:14 |
<razzi> |
sudo systemctl start eventlogging_to_druid_navigationtiming_hourly.service |
[analytics] |
18:08 |
<krinkle@deploy1002> |
Synchronized wmf-config/CommonSettings.php: I2ebe9674fb109f (duration: 00m 56s) |
[production] |
18:01 |
<razzi> |
manually edit /etc/hadoop/conf/capacity-scheduler.xml to set the queues to running, then sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues |
[analytics] |
17:52 |
<razzi> |
sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues on an-master1001 and an-master1002 |
[analytics] |
17:34 |
<Krinkle> |
mwmaint1002: Running purge-parsercache-now.php on server 2/4 (pc1007, depooled spare). Ref P16060, T280605, T282761. |
[production] |
17:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16207 and previous config saved to /var/cache/conftool/dbconfig/20210525-173031-root.json |
[production] |
17:28 |
<razzi> |
sudo systemctl restart refine_eventlogging_legacy |
[analytics] |
17:28 |
<razzi> |
sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues to enable submitting jobs once again |
[analytics] |
17:22 |
<effie> |
disable puppet on mc2019 (for tests) |
[production] |
17:15 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16206 and previous config saved to /var/cache/conftool/dbconfig/20210525-171527-root.json |
[production] |
17:14 |
<andrewbogott> |
deleting old ingress controllers toolsbeta-test-k8s-ingress-1 and toolsbeta-test-k8s-ingress-2 |
[toolsbeta] |
17:13 |
<andrewbogott> |
created two new ingress nodes, toolsbeta-test-k8s-ingress-4 and toolsbeta-test-k8s-ingress-5 |
[toolsbeta] |
17:07 |
<razzi> |
re-enabled puppet on an-masters and an-launcher |
[analytics] |
17:04 |
<razzi> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode leave |
[analytics] |
17:03 |
<razzi> |
sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet |
[analytics] |
17:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16205 and previous config saved to /var/cache/conftool/dbconfig/20210525-170024-root.json |
[production] |
16:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16203 and previous config saved to /var/cache/conftool/dbconfig/20210525-164520-root.json |
[production] |
16:43 |
<razzi> |
sudo systemctl restart hadoop-hdfs-namenode on an-master1001 |
[analytics] |
16:38 |
<razzi> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -saveNamespace |
[analytics] |
16:35 |
<razzi> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode enter |
[analytics] |
16:28 |
<razzi> |
sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet |
[analytics] |
16:23 |
<razzi> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode leave |
[analytics] |
16:14 |
<bd808> |
Closed #wikimedia-cloud-admin on f***node |
[admin] |
16:11 |
<bd808> |
Closed #wikimedia-cloud-feed on f***node |
[admin] |
16:06 |
<razzi> |
sudo systemctl restart hadoop-hdfs-namenode |
[analytics] |
15:52 |
<razzi> |
checkpoint hdfs with sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -saveNamespace |
[analytics] |
15:51 |
<razzi> |
enable safe mode on an-master1001 with sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode enter |
[analytics] |
15:36 |
<razzi> |
disable puppet on an-master1001.eqiad.wmnet and an-master1002.eqiad.wmnet again |
[analytics] |
15:35 |
<razzi> |
re-enable puppet on an-masters, run puppet, and sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues |
[analytics] |
15:32 |
<razzi> |
disable puppet on an-master1001.eqiad.wmnet and an-master1002.eqiad.wmnet |
[analytics] |
15:19 |
<dcaro> |
rebooted cloudvirt1020, starting VMs (T275893) |
[admin] |
15:13 |
<dcaro> |
rebooting cloudvirt1020 (T275893) |
[admin] |
15:09 |
<dcaro> |
turning off VM toolsbeta-test-k8s-etcd-14 to be able to reboot cloudvirt1020 |
[toolsbeta] |
14:42 |
<dcaro> |
taking cloudvirt1020 out for maintenance (openstack wise) so no new VMs are scheduled on it (T275893) |
[admin] |
14:39 |
<razzi> |
stop puppet on an-launcher and stop hadoop-related timers |
[analytics] |
14:38 |
<wm-bot> |
<bd808> Restart to fix irc connections. This is getting really boring. |
[tools.bridgebot] |
14:35 |
<dcaro> |
taking down clouddb1002 replica for reboot of cloudvirt1020 (T275893) |
[clouddb-services] |
12:55 |
<urbanecm@deploy1002> |
Synchronized static/images/project-logos/: 63ad5fda: Revert "Add svwiki 20th anniversary logos" (T282389) (duration: 00m 56s) |
[production] |
12:52 |
<urbanecm@deploy1002> |
Synchronized wmf-config/logos.php: 94ede526: Revert "Use svwiki 20th anniversary logos" (T282389) (duration: 00m 56s) |
[production] |
12:21 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1164', diff saved to https://phabricator.wikimedia.org/P16200 and previous config saved to /var/cache/conftool/dbconfig/20210525-122127-marostegui.json |
[production] |
12:07 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'remove db1124 from dbctl', diff saved to https://phabricator.wikimedia.org/P16199 and previous config saved to /var/cache/conftool/dbconfig/20210525-120718-marostegui.json |
[production] |
11:35 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1124 will be moved to the test cluster', diff saved to https://phabricator.wikimedia.org/P16198 and previous config saved to /var/cache/conftool/dbconfig/20210525-113521-marostegui.json |
[production] |
11:26 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport |
[production] |
11:26 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport |
[production] |
11:21 |
<Lucas_WMDE> |
EU backport&config window done |
[production] |
11:20 |
<lucaswerkmeister-wmde@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:679327|Change HTTP to HTTPS for concept URIs on Commons (T258590)]] (duration: 00m 56s) |
[production] |
11:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16196 and previous config saved to /var/cache/conftool/dbconfig/20210525-111719-root.json |
[production] |
11:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16195 and previous config saved to /var/cache/conftool/dbconfig/20210525-110215-root.json |
[production] |
10:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16194 and previous config saved to /var/cache/conftool/dbconfig/20210525-104711-root.json |
[production] |
10:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16193 and previous config saved to /var/cache/conftool/dbconfig/20210525-103208-root.json |
[production] |