2021-07-20 §
20:53 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: caa5a076f39b051b01622aa3e4c9d716a8643eef: Set wgGEMentorDashboardBackendEnabled properly (T285811) (duration: 00m 57s) [production]
20:49 <urbanecm@deploy1002> Synchronized php-1.37.0-wmf.14/extensions/GrowthExperiments/maintenance/updateMenteeData.php: dafd953eb5cd35bddbd2fd348b03066420a42362: updateMenteeData: Make it possible to disable script per-wiki (T285811) (duration: 00m 58s) [production]
20:30 <joal> rerun webrequest timed-out instances [analytics]
19:46 <wm-bot> <lucaswerkmeister> deployed f309990d5b (update config loading code; also upgraded venv, e.g. Flask v2) [tools.ranker]
19:36 <wm-bot> <lucaswerkmeister> (and now also upgraded the venv accordingly, Flask v2 etc.) [tools.pagepile-visual-filter]
19:33 <wm-bot> <lucaswerkmeister> deployed 93add9ed8b (update config loading code) [tools.pagepile-visual-filter]
19:22 <wm-bot> <lucaswerkmeister> deployed 1d17bc8e87 (update config loading code) [tools.quickcategories]
19:09 <bstorm> upgraded version of maintain-kubeusers to the latest in master branch T285011 [toolsbeta]
19:07 <wm-bot> <lucaswerkmeister> deployed b1f23a7801 (minor updates) [tools.quickcategories]
18:58 <mforns> starting refinery deployment [analytics]
18:57 <urbanecm> Start server-side upload for 4 large PNG files (T285708) [production]
18:42 <majavah> deploying systemd security tools on toolforge public stretch machines T287004 [tools]
18:40 <razzi> razzi@an-launcher1002:~$ sudo puppet agent --enable [analytics]
18:39 <razzi> razzi@an-master1001:/var/log/hadoop-hdfs$ sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
18:39 <hashar> Rolling back Jenkins jobs from Quibble 1.0.0 to 0.0.47 # T287001 [releng]
18:37 <razzi> razzi@an-master1002:~$ sudo -i puppet agent --enable [analytics]
18:34 <razzi> razzi@an-master1002:~$ sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
18:32 <razzi> razzi@an-master1002:~$ sudo systemctl start hadoop-yarn-resourcemanager.service [analytics]
18:31 <razzi> razzi@an-master1002:~$ sudo systemctl stop hadoop-yarn-resourcemanager.service [analytics]
18:22 <razzi> sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet [analytics]
18:21 <razzi> re-enable yarn queues by merging puppet patch https://gerrit.wikimedia.org/r/c/operations/puppet/+/705732 [analytics]
18:05 <Jeff_Green> authdns-update to point fundraising.wm.o CNAME to a new server [production]
17:57 <razzi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-master1001.eqiad.wmnet with reason: REIMAGE [production]
17:55 <razzi@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-master1001.eqiad.wmnet with reason: REIMAGE [production]
17:45 <arturo> pushed new toolforge-jobs-framework-api docker image into the registry (3a6ae38d51202c5c765c8d800cb8380e2a20b998) (T286126) [tools]
17:37 <arturo> added toolforge-jobs-framework-cli v3 to aptly buster-tools and buster-toolsbeta [tools]
17:27 <razzi> razzi@cumin1001:~$ sudo -i wmf-auto-reimage-host -p T278423 an-master1001.eqiad.wmnet [analytics]
17:17 <razzi> stop all hadoop processes on an-master1001 [analytics]
17:07 <andrewbogott> reloading haproxy on dbproxy1018 for T286598 [admin]
17:06 <rzl> enabled puppet on A:mw [production]
16:54 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 20 hosts with reason: dealing with an-master1001 rebuild issue [production]
16:54 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 20 hosts with reason: dealing with an-master1001 rebuild issue [production]
16:53 <rzl> disabled puppet on A:mw to test https://gerrit.wikimedia.org/r/676508 [production]
16:53 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 64 hosts with reason: dealing with an-master1001 rebuild issue [production]
16:53 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 64 hosts with reason: dealing with an-master1001 rebuild issue [production]
16:52 <razzi> starting hadoop processes on an-master1001 since they didn't failover cleanly [analytics]
16:44 <dcausse@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . [production]
16:37 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw1297.eqiad.wmnet [production]
16:31 <razzi> sudo bash gid_script.bash on an-master1001 [analytics]
16:29 <razzi> razzi@alert1001:~$ sudo icinga-downtime -h an-master1001 -d 7200 -r "an-master1001 debian upgrade" [analytics]
16:25 <razzi> razzi@an-master1001:~$ sudo systemctl stop hadoop-mapreduce-historyserver [analytics]
16:25 <razzi> sudo systemctl stop hadoop-hdfs-zkfc.service on an-master1001 again [analytics]
16:25 <dcausse@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . [production]
16:25 <razzi> sudo systemctl stop hadoop-yarn-resourcemanager on an-master1001 again [analytics]
16:24 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission for hosts mw1297.eqiad.wmnet [production]
16:23 <razzi> sudo systemctl stop hadoop-hdfs-namenode on an-master1001 [analytics]
16:21 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw1290.eqiad.wmnet [production]
16:19 <razzi> razzi@an-master1001:~$ sudo systemctl stop hadoop-hdfs-zkfc [analytics]
16:19 <razzi> razzi@an-master1001:~$ sudo systemctl stop hadoop-yarn-resourcemanager [analytics]
16:18 <razzi> sudo systemctl stop hadoop-hdfs-namenode [analytics]